Summary
Protein language models can predict epistasis in a zero-shot setting, but their predictions must be nonlinearly transformed to achieve meaningful accuracy (1). The study used the ESM2 model and found the same scaling dependence that has been observed when predicting other properties.
Details
Epistasis is defined from the experimental fitness of the double mutant and the experimental fitnesses of the two corresponding single mutants. The nonlinear transform has two fit parameters; it takes the log-likelihood predictions from the PLM as input and returns the transformed prediction as output.
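The following is a minimal sketch of the two steps described above, written under stated assumptions because the exact formulas are not reproduced in this note. It assumes an additive epistasis model (double-mutant fitness minus the sum of the single-mutant fitnesses) and a two-parameter sigmoid as the nonlinear transform. Both choices, along with the names `additive_epistasis`, `sigmoid_transform`, `fit_transform`, `a`, and `b`, are illustrative and may differ from what Ref (1) uses. The fit is a plain grid search on toy data, not the authors' fitting procedure.

```python
import math


def additive_epistasis(f_double, f_single_a, f_single_b):
    """Epistasis under an ASSUMED additive model: F_AB - (F_A + F_B)."""
    return f_double - (f_single_a + f_single_b)


def sigmoid_transform(x, a, b):
    """ASSUMED functional form: a sigmoid with slope a and midpoint b.

    x is a log-likelihood score from the language model; the return value
    is the transformed prediction.
    """
    return 1.0 / (1.0 + math.exp(-a * (x - b)))


def fit_transform(xs, ys, a_grid, b_grid):
    """Pick the (a, b) pair on the grid with the smallest squared error."""
    best = None
    for a in a_grid:
        for b in b_grid:
            sse = sum((sigmoid_transform(x, a, b) - y) ** 2
                      for x, y in zip(xs, ys))
            if best is None or sse < best[0]:
                best = (sse, a, b)
    return best[1], best[2]


# Toy data (not from the paper): scores and targets generated from a
# known parameter pair so the grid search has something to recover.
toy_scores = [-6.0, -4.5, -3.0, -1.5, 0.0]
toy_targets = [sigmoid_transform(x, 1.0, -3.0) for x in toy_scores]

a_fit, b_fit = fit_transform(
    toy_scores,
    toy_targets,
    a_grid=[0.5, 1.0, 1.5, 2.0],
    b_grid=[-5.0, -4.0, -3.0, -2.0, -1.0],
)

# Epistasis for one hypothetical double mutant and its two single mutants.
eps = additive_epistasis(f_double=0.2, f_single_a=0.5, f_single_b=0.4)
```

In practice a continuous optimizer would replace the grid search; the grid keeps the sketch dependency-free and makes the role of the two fit parameters explicit.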
Figures
Ref (1)
1. Nambiar A, Littlefield SB, Cuellar C, Khorana R, Maslov S. Protein Language Models Capture Structural and Functional Epistasis in a Zero-Shot Setting. openRxiv; 2025. Available from: https://doi.org/10.1101/2025.09.14.676130