Summary
Training-set memorization appears to be common across all-atom structure prediction methods, showing up in protein-ligand co-folding, RNA, fold-switching proteins, and antibody-antigen complexes.
Related notes
- Protein-ligand co-folding methods do not generalize beyond their training set
- All-atom structure prediction of RNA is driven by memorization
- AlphaFold3 performs comparably to AlphaFold2 when predicting multiple conformations of fold-switching proteins
- Correct antibody-antigen prediction in AF3 and related models is partially determined by training set similarity
- AlphaFold2 (AF2) memorizes conformations, as noted in (1,2)
1. Lazou M, Khan O, Nguyen T, Padhorny D, Kozakov D, Joseph-McCarthy D, et al. Predicting multiple conformations of ligand binding sites in proteins suggests that AlphaFold2 may remember too much. Proceedings of the National Academy of Sciences. 2024;121(48). Available from: https://doi.org/10.1073/pnas.2412719121
2. Chakravarty D, Schafer JW, Chen EA, Thole JF, Ronish LA, Lee M, et al. AlphaFold predictions of fold-switched conformations are driven by structure memorization. Nature Communications. 2024;15(1). Available from: https://doi.org/10.1038/s41467-024-51801-z