Memorization in protein structure prediction

Summary

Training set memorization seems to occur more frequently in all-atom protein structure prediction.

Protein-ligand co-folding methods do not generalize beyond their training set
All-atom structure prediction of RNA is driven by memorization
AlphaFold3 performs comparably to AlphaFold2 when predicting multiple conformations of fold-switching proteins
Correct antibody-antigen prediction in AF3 and related models is partially determined by training set similarity
Lazou et al. and Chakravarty et al. (1,2) note that AlphaFold2 (AF2) memorizes conformations.

1. Lazou M, Khan O, Nguyen T, Padhorny D, Kozakov D, Joseph-McCarthy D, et al. Predicting multiple conformations of ligand binding sites in proteins suggests that AlphaFold2 may remember too much. Proceedings of the National Academy of Sciences. 2024;121(48). Available from: https://doi.org/10.1073/pnas.2412719121

2. Chakravarty D, Schafer JW, Chen EA, Thole JF, Ronish LA, Lee M, et al. AlphaFold predictions of fold-switched conformations are driven by structure memorization. Nature Communications. 2024;15(1). Available from: https://doi.org/10.1038/s41467-024-51801-z
