Diffusion models

Diffusion models are generative models that produce samples by iteratively denoising Gaussian noise. Training uses the forward process, which converts data to noise; inference runs the reverse process, which converts noise back to data. The reverse process can be guided at inference time to respect specific geometric constraints.

Details

Diffusion models aim to recover samples $X_{0}$ drawn from $p_{\text{data}}$, starting from Gaussian noise $X_{T}$ and iteratively denoising across $T$ time steps. The forward process is defined as

$$\overbrace{dX = \underbrace{f(X, t)}_{\text{drift}} \, dt + \underbrace{g(t)}_{\text{diffusion}} \, dB_{t}}^{\text{total SDE}}$$

$B_{t}$: standard Wiener process (the source of the added noise)

Typically these drift and diffusion functions are used:

$$f(X, t) = -\frac{\beta(t)}{2} X$$

$$g(t) = \sqrt{\beta(t)}$$
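To make the forward process concrete, here is a minimal sketch that simulates it with Euler-Maruyama steps. It assumes the variance-preserving form, with drift $-\frac{\beta(t)}{2} X$ and diffusion coefficient $\sqrt{\beta(t)}$, and a linear schedule for $\beta(t)$ on $t \in [0, 1]$. The schedule endpoints, the step count, and the function names are illustrative choices, not taken from the text above.

```python
import math
import random

def beta(t, beta_min=0.1, beta_max=20.0):
    """Linear noise schedule on t in [0, 1] (an assumed, illustrative choice)."""
    return beta_min + t * (beta_max - beta_min)

def forward_sde(x0, n_steps=500, rng=None):
    """Euler-Maruyama simulation of dX = f(X, t) dt + g(t) dB_t from t=0 to t=1."""
    rng = rng or random.Random(0)
    dt = 1.0 / n_steps
    x = x0
    for i in range(n_steps):
        t = i * dt
        drift = -0.5 * beta(t) * x          # f(X, t)
        diffusion = math.sqrt(beta(t))      # g(t), variance-preserving form (assumption)
        # Brownian increment over dt has standard deviation sqrt(dt).
        x += drift * dt + diffusion * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x

# Push many copies of the same data point through the forward process.
rng = random.Random(0)
samples = [forward_sde(5.0, rng=rng) for _ in range(1000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean={mean:.3f}, var={var:.3f}")
```

Under these assumptions the starting value is forgotten by $t = 1$: the samples should have mean close to 0 and variance close to 1, which is the Gaussian noise $X_{T}$ that the reverse process starts from.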

The reverse process:

$$dX = \left[ f(X, t) - g(t)^{2} \underbrace{\nabla_{x} \log p_{t}(X)}_{\text{score function}} \right] dt + g(t) \, dB_{t}$$

$$\nabla_{x} \log p_{t}(x) = \frac{E[X_{0} \mid X_{t} = x] - x}{\sigma_{t}^{2}}$$

where $\sigma_{t}^{2}$ is the conditional variance of the distribution $p_{t}(X \mid X_{0} = x_{0})$.

The score function is approximated by a neural network, $s_{\theta}(x, t)$.
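The reverse process can be sketched on a toy problem where the score is known in closed form, so no network is needed. The example below assumes one-dimensional Gaussian data, a linear $\beta(t)$ schedule, and a diffusion coefficient of $\sqrt{\beta(t)}$; the function `score` stands in for a trained $s_{\theta}(x, t)$. All names and constants are hypothetical.

```python
import math
import random

BETA_MIN, BETA_MAX = 0.1, 20.0   # assumed linear schedule endpoints
MU, S = 3.0, 0.5                 # toy data distribution: X_0 ~ N(MU, S^2)

def beta(t):
    return BETA_MIN + t * (BETA_MAX - BETA_MIN)

def int_beta(t):
    """Integral of beta from 0 to t."""
    return BETA_MIN * t + 0.5 * (BETA_MAX - BETA_MIN) * t * t

def score(x, t):
    """Exact score of p_t for Gaussian data; a stand-in for a learned s_theta(x, t)."""
    alpha = math.exp(-0.5 * int_beta(t))           # mean scaling of X_t given X_0
    var_t = alpha ** 2 * S ** 2 + (1.0 - alpha ** 2)
    return -(x - alpha * MU) / var_t

def reverse_sample(rng, n_steps=500):
    """Integrate the reverse SDE from t=1 down to t=0 with Euler-Maruyama steps."""
    dt = 1.0 / n_steps
    x = rng.gauss(0.0, 1.0)                        # start from Gaussian noise X_T
    for i in range(n_steps, 0, -1):
        t = i * dt
        f = -0.5 * beta(t) * x
        g2 = beta(t)                               # g(t)^2
        # Time runs backward, so the drift term enters with a minus sign.
        x = x - (f - g2 * score(x, t)) * dt + math.sqrt(g2 * dt) * rng.gauss(0.0, 1.0)
    return x

rng = random.Random(0)
samples = [reverse_sample(rng) for _ in range(1000)]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
print(f"mean={mean:.3f}, std={std:.3f}")
```

If the sketch is correct, the generated samples should roughly match the toy data distribution (mean near `MU`, standard deviation near `S`). In practice the closed-form `score` would be replaced by the neural network.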

Diffusion models can also be zero-shot classifiers (1).
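The idea in (1) is to classify an input by asking which class label lets a conditional diffusion model best predict the noise that was added to it. Below is a minimal sketch of that selection rule, assuming a toy one-dimensional problem where each class is Gaussian, so the optimal noise predictor has a closed form. `eps_pred`, the noise levels in `ALPHA_BARS`, and the class means are hypothetical stand-ins for a trained conditional network and its schedule.

```python
import math
import random

ALPHA_BARS = [0.9, 0.5, 0.1]            # hypothetical signal levels (1 = no noise)
CLASS_MEANS = {"a": -2.0, "b": 2.0}     # toy class-conditional data: N(mean, DATA_STD^2)
DATA_STD = 0.5

def eps_pred(x_t, alpha_bar, label):
    """Optimal noise prediction E[eps | x_t, label] for Gaussian class data.

    Stands in for a trained conditional noise-prediction network.
    """
    var_t = alpha_bar * DATA_STD ** 2 + (1.0 - alpha_bar)
    centered = x_t - math.sqrt(alpha_bar) * CLASS_MEANS[label]
    return math.sqrt(1.0 - alpha_bar) * centered / var_t

def classify(x0, n_trials=200, seed=0):
    """Pick the label with the lowest accumulated noise-prediction error."""
    rng = random.Random(seed)
    errors = {label: 0.0 for label in CLASS_MEANS}
    for _ in range(n_trials):
        alpha_bar = rng.choice(ALPHA_BARS)
        eps = rng.gauss(0.0, 1.0)
        # Noise the input once, then score every candidate label on the same draw.
        x_t = math.sqrt(alpha_bar) * x0 + math.sqrt(1.0 - alpha_bar) * eps
        for label in CLASS_MEANS:
            errors[label] += (eps - eps_pred(x_t, alpha_bar, label)) ** 2
    return min(errors, key=errors.get)

print(classify(1.7), classify(-2.3))
```

Reusing the same noise draw for every label is a deliberate choice here: it makes the per-label errors directly comparable and reduces the variance of the comparison.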

Diffusion models have been combined with replica-exchange molecular dynamics (2).

Types of sequence-based diffusion

For protein sequences, which are fundamentally discrete, Yang et al. (3) describe three approaches to generation with diffusion models:

Diffusion in pre-trained latent space (e.g., continuous diffusion)
Diffusion in discrete space with uniform noise matrices
Diffusion in discrete space via absorbing matrices, e.g., one-at-a-time masking/unmasking of individual tokens
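The two discrete-space variants differ only in the transition matrix used to corrupt tokens. The sketch below builds both for a toy four-letter alphabet, assuming the common formulation where a token is kept with probability $1 - \beta$ and otherwise either resampled uniformly or replaced by a mask token. The alphabet, the value of $\beta$, and the function names are illustrative and are not taken from (3).

```python
import random

def uniform_matrix(k, beta):
    """Keep a token with prob. 1 - beta, else resample uniformly over all k tokens."""
    return [[(1.0 - beta) * (i == j) + beta / k for j in range(k)] for i in range(k)]

def absorbing_matrix(k, beta, mask):
    """Keep a token with prob. 1 - beta, else replace it with the mask token.

    The mask row maps to itself with probability 1, so masking is never undone.
    """
    q = [[(1.0 - beta) * (i == j) for j in range(k)] for i in range(k)]
    for i in range(k):
        q[i][mask] += beta
    return q

def corrupt(tokens, q, rng):
    """Apply one forward noising step: sample each token's next state from its row of q."""
    k = len(q)
    return [rng.choices(range(k), weights=q[tok])[0] for tok in tokens]

ALPHABET = "ACDE"            # toy alphabet (a real protein model would use 20 amino acids)
MASK = len(ALPHABET)         # extra index reserved for the mask token
rng = random.Random(0)
seq = [0, 1, 2, 3, 0, 1, 2, 3]

q_uni = uniform_matrix(len(ALPHABET), beta=0.5)
q_abs = absorbing_matrix(len(ALPHABET) + 1, beta=0.5, mask=MASK)

scrambled = corrupt(seq, q_uni, rng)   # tokens drift toward a uniform distribution
masked = seq
for _ in range(50):                    # repeated steps drive every token to the mask
    masked = corrupt(masked, q_abs, rng)
print(scrambled, masked)
```

With uniform noise the fully corrupted sequence is a random string over the alphabet, whereas with absorbing noise it is an all-mask sequence, so generation amounts to progressively unmasking tokens.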

1. Li AC, Prabhudesai M, Duggal S, Brown E, Pathak D. Your Diffusion Model is Secretly a Zero-Shot Classifier. In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE; 2023. p. 2206–17. Available from: https://doi.org/10.1109/iccv51070.2023.00210

2. Wang Y, Herron L, Tiwary P. From data to noise to data for mixing physics across temperatures with generative artificial intelligence. Proceedings of the National Academy of Sciences. 2022;119(32). Available from: https://doi.org/10.1073/pnas.2203656119

3. Yang J, Chu W, Khalil D, Astudillo R, Wittmann BJ, Arnold FH, et al. Steering Generative Models with Experimental Data for Protein Fitness Optimization. 2025. Available from: https://arxiv.org/abs/2505.15093

