graph neural networks: Nonlinear Function
Created: March 08, 2020
Modified: June 06, 2020

graph neural networks

This page is from my personal notes, and has not been specifically reviewed for public consumption. It might be incomplete, wrong, outdated, or stupid. Caveat lector.
  • A 'graph neural net' is a differentiable, parameterized function whose input or output (or both) is a graph.
  • Discriminative: graph as input.
    • Typically we want networks that are invariant to graph isomorphism, meaning that they will give the same outputs given isomorphic inputs. You might think that this is hard, since testing for graph isomorphism is nontrivial (but quasipolynomial now). But invariance is a much weaker condition. For example, consider the constant function that always returns zero: it is trivially invariant to isomorphism (and all other transformations!).
    • Less trivially, invariance can be ensured by defining the computation in terms of local operations at each node and/or edge (e.g., aggregating neighbor features with a permutation-invariant reduction like sum or max). The overall computation is then determined by the graph structure alone, so isomorphic graphs yield the same result.
    • Uses:
      • predicting properties of molecules.
        • I could imagine: polarity, acidity, charge at each atom, 3D conformation(s) / bond angles / pairwise atomic distances, chirality (specific rotation), boiling point, melting point, density.
        • Given two molecule inputs: can they bond to each other? if so, where?
        • From https://arxiv.org/abs/1704.01212: atomization energy at 0K and room temperature, atomization enthalpy, atomization free energy…
          • Q: why aren't all of these just bond-dissociation enthalpy?
        • … highest fundamental frequency of vibration, zero point vibrational energy (energy of vibration at absolute zero from pure quantum uncertainty), energy of the highest occupied molecular orbital (HOMO), energy of the lowest unoccupied molecular orbital (LUMO), energy gap (LUMO-HOMO: at absolute zero the orbitals are filled in order of energy, so this must be positive), electronic spatial extent, norm of the dipole moment, norm of the static polarizability.
        • It seems useful to frame predicting 3D structure (as in AlphaFold) as a structured prediction problem.
    • relevant papers:
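A minimal sketch of the local-operations idea (my own toy example, not from any particular paper): one message-passing layer with sum aggregation over neighbors, followed by sum pooling into a graph embedding. Both reductions are permutation-invariant, so relabeling the nodes leaves the embedding unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

def mp_layer(A, H, W):
    """One message-passing layer: each node sums its neighbors' features
    (A @ H), then applies a shared linear map W and a ReLU. Sum
    aggregation commutes with any relabeling of the nodes."""
    return np.maximum(0, (A @ H) @ W)

def graph_embedding(A, H, W):
    # Sum-pool node features into a single graph vector (order-independent).
    return mp_layer(A, H, W).sum(axis=0)

# Toy graph: a path 0-1-2, with random node features and weights.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 4))

# An isomorphic copy: relabel nodes with permutation matrix P,
# giving adjacency P A P^T and features P H.
P = np.eye(3)[[2, 0, 1]]
emb1 = graph_embedding(A, H, W)
emb2 = graph_embedding(P @ A @ P.T, P @ H, W)
assert np.allclose(emb1, emb2)  # identical embeddings for isomorphic inputs
```

The invariance is mechanical: mp_layer applied to the relabeled graph returns the permuted node features P·relu(AHW), and sum pooling erases the permutation.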
  • Generative:
    • The tricky thing about generating graphs is that likelihoods are hard to compute: for any ordered representation (adjacency matrix, node/edge sequence, etc.), the likelihood of the underlying graph requires summing over all of the combinatorially many orderings that encode the same isomorphic graph. So usually we just pick one ordered representation and hope for the best.
    • The Junction tree VAE provides a somewhat 'hierarchical' approach to generation. TODO read further and brainstorm
    • relevant papers
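A toy illustration of that combinatorics (hypothetical helper, not from a paper): counting how many distinct adjacency matrices encode the same unlabeled graph. For a path on four nodes the only nontrivial automorphism is the reversal, so there are 4!/2 = 12 matrices a likelihood would have to sum over.

```python
import itertools
import numpy as np

def distinct_adjacency_matrices(A):
    """Count the distinct adjacency matrices P A P^T obtained by
    relabeling nodes. A likelihood over ordered representations would
    need to sum over all of these to score the unlabeled graph."""
    n = A.shape[0]
    mats = set()
    for perm in itertools.permutations(range(n)):
        P = np.eye(n, dtype=int)[list(perm)]
        mats.add((P @ A @ P.T).tobytes())  # bytes as a hashable key
    return len(mats)

# Path graph on 4 nodes: 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
print(distinct_adjacency_matrices(A))  # → 12
```

Brute-force enumeration only works for tiny graphs, which is exactly why generative models fall back on a single canonical or arbitrary ordering.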
  • For projects:
    • the QM9 dataset contains ~134k small organic molecules, each with up to nine heavy atoms (C, N, O, F).