Graph ML Lab

From-scratch implementations of graph neural networks, conditional graph generation, variational autoencoders, spatial 3D tree generation, and geometric mesh interpolation — all built with PyTorch, no external graph libraries.

PyTorch · FastAPI · D3.js · From Scratch

Semi-Supervised Node Classification

GCN and GAT layers implemented from scratch using only PyTorch tensor operations. Evaluated on Zachary's Karate Club graph (34 nodes, 78 edges) with 50% labeled nodes.

Communities: Mr. Hi's group vs. the Officer's group.
34 nodes · 78 edges · 90.5% test accuracy · 0.897 macro F1

Architecture

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    # Kipf & Welling (ICLR 2017)
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(in_dim, out_dim))
        nn.init.xavier_uniform_(self.weight)

    # Message passing: aggregate then transform
    def forward(self, x, adj):
        a_hat = normalize_adj(adj)
        return a_hat @ x @ self.weight
```
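The `normalize_adj` helper referenced in the layer is not shown. A minimal sketch of the symmetric normalisation D^{-1/2}(A + I)D^{-1/2} from the GCN paper (the function name and exact handling of isolated nodes are assumptions, not necessarily the repo's code):

```python
import torch

def normalize_adj(adj: torch.Tensor) -> torch.Tensor:
    """Symmetric normalisation D^{-1/2} (A + I) D^{-1/2} (Kipf & Welling)."""
    a_tilde = adj + torch.eye(adj.size(0))      # add self-loops
    deg = a_tilde.sum(dim=1)                    # degrees including self-loops
    d_inv_sqrt = deg.pow(-0.5)
    d_inv_sqrt[torch.isinf(d_inv_sqrt)] = 0.0   # guard against isolated nodes
    return d_inv_sqrt.unsqueeze(1) * a_tilde * d_inv_sqrt.unsqueeze(0)
```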

Training Curve

Conditional Graph Generation

A conditional VAE that generates small graphs matching target structural properties (node count, density, clustering). Trained on mixtures of Erdos-Renyi, Barabasi-Albert, and Watts-Strogatz random graphs.
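One standard way to condition a graph VAE on target properties is to concatenate the condition vector with the latent code before decoding. Whether the lab does exactly this is an assumption; the class name and dimensions below are illustrative:

```python
import torch
import torch.nn as nn

class CondGraphDecoder(nn.Module):
    """Hypothetical sketch: decode adjacency logits from latent z
    concatenated with a condition c = (node count, density, clustering)."""
    def __init__(self, latent_dim=8, cond_dim=3, max_nodes=20, hidden=64):
        super().__init__()
        self.max_nodes = max_nodes
        self.net = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, max_nodes * max_nodes),
        )

    def forward(self, z, c):
        # Conditioning by concatenation: the decoder sees the target properties
        logits = self.net(torch.cat([z, c], dim=-1))
        return logits.view(-1, self.max_nodes, self.max_nodes)
```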


Graph Variational Autoencoder

Variational autoencoder that learns a continuous latent representation of graph structure, enabling smooth interpolation between topologically distinct graphs and generation of novel network structures.

Model Architecture

Input: graph (A, X) → Encoder: 2-layer GCN + mean readout → Latent: μ, σ, z ~ N(μ, σ²) → Decoder: MLP → node embeddings → Output: σ(ZZ⊤) = adjacency matrix

Geometric Graph Interpolation

Spectral layout computes deterministic 2D node positions from graph topology. Greedy node matching and edge blending produce smooth morphing sequences between structurally different graphs.
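A standard construction for the spectral layout uses the two smallest non-trivial eigenvectors of the combinatorial Laplacian as 2D coordinates; a minimal NumPy sketch (the function name is illustrative):

```python
import numpy as np

def spectral_layout(adj: np.ndarray) -> np.ndarray:
    """Deterministic 2D positions from the two smallest non-trivial
    eigenvectors of the combinatorial Laplacian L = D - A."""
    lap = np.diag(adj.sum(axis=1)) - adj
    # eigh sorts eigenvalues ascending; column 0 is the trivial constant mode
    _, vecs = np.linalg.eigh(lap)
    return vecs[:, 1:3]   # (n, 2) coordinates, fixed up to sign
```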

VAE Training Loss (ELBO)
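The ELBO tracked in this curve combines adjacency reconstruction with a KL term against the prior. A minimal PyTorch sketch, assuming Bernoulli edges and a diagonal Gaussian posterior (function and argument names are assumptions):

```python
import torch
import torch.nn.functional as F

def vae_loss(adj_logits, adj_target, mu, logvar):
    """Negative ELBO = reconstruction BCE + KL(q(z|G) || N(0, I))."""
    recon = F.binary_cross_entropy_with_logits(adj_logits, adj_target)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```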

Autoregressive Spatial Tree Generation

A variational autoencoder for trees embedded in continuous 3D space. At each step the decoder selects a parent node (discrete attention), predicts a 3D offset (learned Gaussian), and decides whether to stop — combining discrete structural decisions with continuous spatial positions.

Decoder Loop

```python
# Autoregressive tree construction
for t in range(max_nodes):
    parent = attention(h, nodes)    # discrete: pick a parent node
    offset = gaussian(h, parent)    # continuous: sample a 3D offset
    pos[t] = pos[parent] + offset
    stop = sigmoid(stop_head(h))
    if stop > 0.5:
        break
```
25–40 nodes · 2–6 branch points · Strahler order 2–3 · path length 23–41

Geometric Mesh Interpolation

Smooth interpolation between low-poly 3D meshes with different topology and vertex counts. Uses greedy nearest-neighbour node correspondence, linear position blending, and gradual edge topology transition — no learned decoder required.

Procedural Mesh Library

Six distinct low-poly shapes spanning Platonic solids, prisms, stellated forms, and a toroidal surface. Each mesh has fixed topology with randomized vertex perturbation for training diversity.

Interpolation Method

1. Node Correspondence: Centre both meshes at origin. Pad the smaller mesh with phantom nodes at the centroid. Greedily match nodes by nearest neighbour in 3D.

2. Position Blending: Linearly interpolate matched node positions: p(t) = (1−t) · pA + t · pB

3. Edge Topology Transition: Classify edges as shared, source-only, or target-only. Remove source-only edges proportionally to t and add target-only edges as t increases.
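The position-blending and edge-transition steps can be sketched in a few lines, assuming node correspondence has already produced equal-length, index-aligned position arrays (names and the exact rounding rule are assumptions):

```python
import numpy as np

def interpolate_mesh(pos_a, pos_b, edges_a, edges_b, t):
    """Blend matched vertex positions and transition the edge sets at time t."""
    pos = (1.0 - t) * pos_a + t * pos_b           # p(t) = (1 - t) pA + t pB
    shared = edges_a & edges_b                    # edges present in both meshes
    src_only = sorted(edges_a - edges_b)
    tgt_only = sorted(edges_b - edges_a)
    # remove source-only edges proportionally to t; add target-only edges as t grows
    keep = set(src_only[: round((1.0 - t) * len(src_only))])
    add = set(tgt_only[: round(t * len(tgt_only))])
    return pos, shared | keep | add
```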

Interpolation Results

Three shape pairs demonstrating interpolation across varying topology complexity: same vertex count, different vertex count, and vertex-count reduction.

Mesh VAE Architecture

A graph VAE with inner-product decoder (VGAE, Kipf & Welling 2016) extended to jointly predict 3D node positions. Used for generative sampling from the learned latent space.

Input: mesh (A, pos) → Encoder: position-aware GNN → Latent: μ, σ, z ~ N(μ, σ²) → Decoder: inner-product adjacency + MLP positions → Output: mesh (Â, p̂)

Mesh VAE Training Loss

Discrete Differential Geometry on Graphs

Physics-informed graph neural networks that embed geometric priors directly into the message-passing framework. Cotangent Laplacian edge weights, discrete curvature encodings, reaction-diffusion dynamics for pattern formation, and energy-based regularisation (Dirichlet, total variation, elastic) ensure that learned representations respect the underlying differential geometry of the domain.

Architecture

1. Cotangent Laplacian: Compute geometric edge weights from vertex positions using cotangent weights, replacing the combinatorial Laplacian with a mesh-aware operator.

2. Curvature Encoding: Discrete mean, Gaussian, and principal curvatures computed per vertex and injected as positional node features into the GNN encoder.

3. Reaction-Diffusion: Gray-Scott dynamics on graph structure for pattern formation: coupled diffusion of two species with nonlinear reaction terms drives Turing-like instabilities.

4. Energy Regularisation: Dirichlet energy for smoothness, total variation for edge-preserving gradients, and elastic energy for shape preservation — all computed on the graph Laplacian.
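The Gray-Scott dynamics translate directly to graphs by replacing the spatial Laplacian with the graph Laplacian L = D − A; a minimal explicit-Euler sketch (parameter names and defaults are assumptions):

```python
import numpy as np

def gray_scott_step(u, v, lap, f=0.055, k=0.062, du=0.16, dv=0.08, dt=1.0):
    """One Euler step of Gray-Scott reaction-diffusion on a graph.
    lap is the combinatorial Laplacian L = D - A, so -L u diffuses u."""
    uvv = u * v * v                                          # nonlinear reaction term
    u_next = u + dt * (-du * (lap @ u) - uvv + f * (1.0 - u))
    v_next = v + dt * (-dv * (lap @ v) + uvv - (f + k) * v)
    return u_next, v_next
```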

49 nodes · 84 edges · 6 curvatures · 3 energies
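Two of the three energy terms are one-liners on the graph Laplacian and edge list; a sketch assuming a scalar signal x per node (function names are illustrative):

```python
import numpy as np

def dirichlet_energy(x, lap):
    """x^T L x = sum over edges of w_ij (x_i - x_j)^2; zero for a constant signal."""
    return float(x @ lap @ x)

def total_variation(x, edges):
    """Sum of |x_i - x_j| over edges; an edge-preserving roughness measure."""
    return float(sum(abs(x[i] - x[j]) for i, j in edges))
```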

Graph Neural Networks in Hyperbolic Space

Embedding hierarchical data in the Poincaré ball model of hyperbolic space, where trees embed with zero distortion. Exponential volume growth matches the exponential branching of hierarchical structures, enabling faithful low-dimensional representations.

Method

1. Poincaré Ball Model: Map data to hyperbolic space where distance grows exponentially near the boundary. The metric tensor g_x = (2 / (1 − ‖x‖²))² · I warps Euclidean distances.

2. Hyperbolic Message Passing: Aggregate neighbours via logmap → tangent space → expmap. Operations stay on the manifold throughout the forward pass.

3. Riemannian Optimization: RiemannianAdam preserves manifold constraints during gradient descent by rescaling gradients with the inverse metric tensor.

4. Geodesic Visualization: Curved arcs orthogonal to the boundary circle — the shortest paths in hyperbolic geometry rendered in the Poincaré disk.
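The logmap/expmap round trip in the message-passing step can be sketched using maps at the origin of the Poincaré ball — a common simplification; whether the lab maps at each point or only at the origin is not stated here, so treat this as illustrative:

```python
import numpy as np

def logmap0(x, eps=1e-12):
    """Log map at the origin: Poincaré ball -> tangent space."""
    n = np.linalg.norm(x)
    return x if n < eps else np.arctanh(min(n, 1 - 1e-9)) * x / n

def expmap0(v, eps=1e-12):
    """Exp map at the origin: tangent space -> Poincaré ball."""
    n = np.linalg.norm(v)
    return v if n < eps else np.tanh(n) * v / n

def hyperbolic_mean(points):
    """Aggregate neighbours: logmap to tangent space, average, expmap back."""
    return expmap0(np.mean([logmap0(p) for p in points], axis=0))
```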

15 nodes · 14 edges · 4 levels · 0 distortion

1-WL Color Refinement & GNN Expressivity

The 1-dimensional Weisfeiler-Leman test iteratively refines node colors by aggregating neighbor multisets. It is provably equivalent in discriminative power to message-passing GNNs (Xu et al., 2019; Morris et al., 2019), making it a fundamental tool for understanding GNN expressivity limits.

Algorithm

1. Initialize: Assign each node a color based on its degree. Nodes with the same degree get the same initial color.

2. Refine: For each node, hash (color, sorted neighbor colors) and relabel to consecutive integers. Repeat for k iterations.

3. Compare: If color histograms of two graphs differ at any iteration, they are provably non-isomorphic. If they always match, WL cannot distinguish them.
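The three steps fit in a dozen lines of plain Python. A sketch assuming graphs are dicts mapping each node to its neighbour list (representation and names are assumptions):

```python
from collections import Counter

def wl_test(adj_a, adj_b, iters=3):
    """1-WL colour refinement on two graphs with a shared colour table.
    Returns True if the graphs are provably non-isomorphic."""
    col_a = {v: len(n) for v, n in adj_a.items()}   # init colours by degree
    col_b = {v: len(n) for v, n in adj_b.items()}
    for _ in range(iters):
        if Counter(col_a.values()) != Counter(col_b.values()):
            return True
        # signature = (own colour, sorted multiset of neighbour colours)
        sigs_a = {v: (col_a[v], tuple(sorted(col_a[u] for u in adj_a[v]))) for v in adj_a}
        sigs_b = {v: (col_b[v], tuple(sorted(col_b[u] for u in adj_b[v]))) for v in adj_b}
        # shared relabelling so colours stay comparable across the two graphs
        table = {s: i for i, s in enumerate(sorted(set(sigs_a.values()) | set(sigs_b.values())))}
        col_a = {v: table[sigs_a[v]] for v in adj_a}
        col_b = {v: table[sigs_b[v]] for v in adj_b}
    return Counter(col_a.values()) != Counter(col_b.values())
```

Note the classic failure case: a 6-cycle and two disjoint triangles are both 2-regular, so 1-WL (and hence any message-passing GNN) can never tell them apart.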

Instructions & What to Observe

A walkthrough of every module in the Graph ML Lab. Each card explains the workflow, expected outputs, and key observations for understanding graph neural networks.

1. GNN Explorer

What it does: Node classification on Zachary’s Karate Club (34 nodes, 78 edges) using GCN or GAT layers built from scratch.

Steps

  1. Click Load Karate Club to see the force-directed graph coloured by community.
  2. Choose GCN or GAT, set hidden dim (16), layers (2), and train for 200 epochs.
  3. Watch the loss curve descend. Test accuracy appears when training finishes.
  4. Enter node IDs (e.g. 0, 16, 33) and click Predict to see per-node class probabilities.

Observe

  • Expect 90–97% test accuracy. GAT often edges out GCN on bridge nodes like node 2.
  • The loss curve should drop steeply in the first 50 epochs, then plateau.
  • After training, graph colours update to predicted classes — misclassified nodes near the community boundary stand out.

2. Graph Generator

What it does: Conditional graph VAE that generates graphs matching target properties (node count, density, clustering).

Steps

  1. Train the generator (500 epochs). It learns to reconstruct and sample small graphs.
  2. Set target conditions with three sliders: nodes (5–20), density (0.05–0.95), clustering (0–1).
  3. Generate 6 samples. A gallery shows each graph with its measured node count (n), edge count (e), and density (d).

Observe

  • Generated graphs should approximately match the requested conditions — exact match is unlikely.
  • High density (>0.7) produces near-complete graphs; low density (<0.15) produces sparse trees.
  • High clustering + moderate density produces triangle-rich “community” structures.

3. VAE / Diffusion

What it does: Graph VAE encodes graphs into a latent space; diffusion model denoises random noise into coherent graphs.

Steps

  1. Train the VAE (latent dim 8, hidden dim 32). Generate from VAE draws samples from the prior N(0, I).
  2. Interpolate between two training graphs — a strip shows smooth geometric morphing.
  3. Train diffusion (100 timesteps). Generate to see iterative denoising produce graphs.

Observe

  • VAE loss = reconstruction + KL divergence. Watch for posterior collapse (KL near zero early on).
  • Interpolation strip shows edges gradually appearing/disappearing and positions blending.
  • Diffusion graphs often look more realistic (fewer isolated nodes) but train slower.

4. Spatial 3D

What it does: Autoregressive tree VAE for 3D spatial trees and neuron morphologies, plus geometric mesh interpolation.

Steps

  1. Choose “Branching tree” or “Neuron morphology” and generate synthetic training data.
  2. Train the spatial VAE, then generate novel trees from the learned distribution.
  3. For meshes: view 6 canonical shapes, train a mesh VAE, and interpolate between shapes.

Observe

  • Isometric projection with x/y/z axis indicators. Large nodes = soma, medium = branch points, small = tips.
  • Tree cards show morphological stats: n (nodes), br (branch points), tips, S (Strahler order).
  • Mesh interpolation strip (t=0 to t=1): positions blend linearly while edge topology transitions.
  • Neuron morphologies should have 3–6 primary branches with Strahler order 2–4.
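The Strahler order reported in the tree cards has a simple recursive definition; a sketch over child lists (the representation is an assumption, not the lab's data format):

```python
def strahler(children, node=0):
    """Strahler order: leaves are 1; a node whose two largest child orders
    tie gets that order + 1, otherwise it inherits the maximum."""
    kids = children.get(node, [])
    if not kids:
        return 1
    orders = sorted((strahler(children, c) for c in kids), reverse=True)
    if len(orders) >= 2 and orders[0] == orders[1]:
        return orders[0] + 1
    return orders[0]
```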

5. Physics GNN

What it does: Physics-informed GNN with cotangent Laplacians, discrete curvatures, reaction-diffusion dynamics, and energy regularisation.

Steps

  1. Choose a scenario: Heat Diffusion (grid), Curvature Classification (3D points), Elastic Stress, or Molecular.
  2. Generate the system, toggle physics components (curvature, diffusion, reaction-diffusion, attention, energy), and train.
  3. Run ablation study to compare 7 configurations. Visualise heat diffusion and Gray-Scott reaction-diffusion patterns.

Observe

  • Full model: 90–97% on heat diffusion, 85–92% on curvature, 80–90% on stress.
  • Ablation: removing curvature features causes the largest accuracy drop (~5–10%).
  • Heat diffusion strip: heat spreads concentrically in a diamond wavefront on grids.
  • Gray-Scott: f=0.055, k=0.062 → spots. Try f=0.04, k=0.06 for stripes.

6. Hyperbolic GNN

What it does: Embeds graphs in the Poincaré ball model of hyperbolic space, where hierarchical structures like trees can be represented with zero distortion.

Steps

  1. Generate a binary tree graph. It appears in the Poincaré disk with root at centre, leaves near boundary.
  2. Run the spring-mass simulation to optimise the layout in hyperbolic space.
  3. Train a hyperbolic GNN with message passing via logmap/expmap.
  4. Compare Euclidean vs Poincaré embeddings to see distortion differences.

Observe

  • Trees embed with near-zero distortion in hyperbolic space; Euclidean distortion is always higher.
  • Geodesic arcs curve towards the disk centre — the shortest paths in hyperbolic geometry.
  • Nodes near the boundary appear close but are actually far apart in the hyperbolic metric.

7. WL Test

What it does: Interactive 1-Weisfeiler-Leman colour refinement for testing graph isomorphism and understanding GNN expressivity limits.

Steps

  1. Select a graph pair (star vs path, chords, or cycles) and set iterations (1–10).
  2. Run the test. Step through iterations with the playback controls.

Observe

  • Node colours split by neighbourhood structure at each iteration.
  • When histogram bars diverge, a “Graphs distinguished!” badge appears.
  • The minimum WL iterations needed = minimum GNN depth required (Xu et al., 2019).
  • Star vs Path: separated in 1 iteration (degree-5 hub). Chords: 2–3 iterations.