ESM

ESM is a series of protein language models (introduced in (1), (2), and (3)) trained with the BERT masked-token objective. Embeddings from a 3B-parameter ESM2 model (which was sequence-only) were used to train ESMFold (2). ESM3 instead takes structure tokens as both inputs and prediction targets during training (3).
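To make the masking objective concrete, here is a minimal sketch of BERT-style corruption applied to a protein sequence. The 15% selection rate and the 80/10/10 split (mask token, random residue, unchanged) follow the original BERT recipe; whether each ESM model uses exactly these values is an assumption here. The function name `bert_mask` and the `<mask>` token string are illustrative, not taken from the ESM codebase.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MASK = "<mask>"  # illustrative token string (assumption)


def bert_mask(sequence, mask_rate=0.15, seed=0):
    """Return (corrupted_tokens, targets) for one sequence.

    targets[i] holds the original residue at selected positions and None
    elsewhere, so a loss would only be computed where targets[i] is set.
    """
    rng = random.Random(seed)
    tokens = list(sequence)
    targets = [None] * len(tokens)
    if not tokens:
        return tokens, targets

    # Select roughly mask_rate of the positions, at least one.
    n_select = min(len(tokens), max(1, round(mask_rate * len(tokens))))
    for i in rng.sample(range(len(tokens)), n_select):
        targets[i] = tokens[i]
        roll = rng.random()
        if roll < 0.8:
            tokens[i] = MASK                     # 80%: replace with mask token
        elif roll < 0.9:
            tokens[i] = rng.choice(AMINO_ACIDS)  # 10%: random residue
        # remaining 10%: leave the residue unchanged
    return tokens, targets


if __name__ == "__main__":
    corrupted, targets = bert_mask("MKTAYIAKQRQISFVKSHFSRQ")
    print(corrupted)
    print(targets)
```

The model sees `corrupted` and is trained to predict the residues stored in `targets`; positions with `None` contribute nothing to the loss.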

Details

ESM1b used dropout in the attention matrices during training, but ESM2 did not.
Verkuil et al. (4) argue that ESM2 generalizes beyond natural proteins, even though its training set excludes artificial proteins.
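ESM-1v is a masked language model used for zero-shot variant effect prediction: mutations are scored from its output logits, and (5) evaluates the MSA Transformer with the same scoring approach.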
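Zero-shot variant scoring with a masked language model, as in (5), is commonly done by comparing the log-probability of the mutant residue with that of the wild-type residue at the mutated position. The sketch below shows only that arithmetic on a hand-made logit vector. The names `log_softmax` and `mutation_score` and the 20-letter alphabet ordering are assumptions for illustration; a real model has its own vocabulary and would supply the logits.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the logit vector


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def mutation_score(position_logits, wild_type, mutant):
    """Log-odds of the mutant vs. the wild-type residue at one position.

    Negative values mean the model finds the mutant less likely than the
    wild type; zero means no preference.
    """
    log_probs = log_softmax(position_logits)
    wt_idx = AMINO_ACIDS.index(wild_type)
    mt_idx = AMINO_ACIDS.index(mutant)
    return log_probs[mt_idx] - log_probs[wt_idx]


if __name__ == "__main__":
    # Toy logits: the model strongly prefers leucine at this position.
    logits = [0.0] * len(AMINO_ACIDS)
    logits[AMINO_ACIDS.index("L")] = 2.0
    print(mutation_score(logits, wild_type="L", mutant="A"))
```

Because the softmax normalizer cancels in the difference, the score equals the difference of the raw logits; the log-softmax step is kept so the intermediate values can be read as log-probabilities.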

1. Rives A, Meier J, Sercu T, Goyal S, Lin Z, Liu J, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proceedings of the National Academy of Sciences. 2021;118(15). Available from: https://doi.org/10.1073/pnas.2016239118

2. Lin Z, Akin H, Rao R, Hie B, Zhu Z, Lu W, et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science. 2023;379(6637):1123–30. Available from: https://doi.org/10.1126/science.ade2574

3. Hayes T, Rao R, Akin H, Sofroniew NJ, Oktay D, Lin Z, et al. Simulating 500 million years of evolution with a language model. Science. 2025;387(6736):850–8. Available from: https://doi.org/10.1126/science.ads0018

4. Verkuil R, Kabeli O, Du Y, Wicky BIM, Milles LF, Dauparas J, et al. Language models generalize beyond natural proteins. bioRxiv; 2022. Available from: https://doi.org/10.1101/2022.12.21.521521

5. Meier J, Rao R, Verkuil R, Liu J, Sercu T, Rives A. Language models enable zero-shot prediction of the effects of mutations on protein function. bioRxiv; 2021. Available from: https://doi.org/10.1101/2021.07.09.450648
