Summary
Protein language models (PLMs) are biased against sequences that carry multiple mutations (1).
Details
The authors propose normalizing PLM scores by generating large quantities () of mutants with the same number of mutations and ranking each sequence against that depth-matched set.
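The depth-matched ranking idea can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `score_fn` is a hypothetical stand-in for whatever PLM score is being normalized, the background mutants are drawn by uniformly random substitution (an assumption), the percentile rank is one plausible reading of "ranking", and the default `n_background` is arbitrary because the notes do not give the quantity used.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def random_mutant(wild_type, n_mutations, rng):
    """Return wild_type with exactly n_mutations substituted positions.

    Assumption: positions and replacement residues are sampled uniformly.
    """
    positions = rng.sample(range(len(wild_type)), n_mutations)
    seq = list(wild_type)
    for pos in positions:
        choices = [aa for aa in AMINO_ACIDS if aa != wild_type[pos]]
        seq[pos] = rng.choice(choices)
    return "".join(seq)


def depth_matched_percentile(candidate, wild_type, score_fn,
                             n_background=1000, seed=0):
    """Fraction of random same-depth mutants scoring at or below the candidate.

    score_fn is hypothetical: any callable mapping a sequence to a float,
    for example a PLM log-likelihood. n_background is an arbitrary default.
    """
    rng = random.Random(seed)
    depth = hamming(candidate, wild_type)
    background = [
        score_fn(random_mutant(wild_type, depth, rng))
        for _ in range(n_background)
    ]
    candidate_score = score_fn(candidate)
    return sum(s <= candidate_score for s in background) / n_background


def toy_score(seq, wild_type="MKTAYIAKQR"):
    """Toy scorer that penalizes every mutation equally (illustration only)."""
    return -float(hamming(seq, wild_type))
```

With a raw score such as `toy_score`, a double mutant always scores below a single mutant, so the two cannot be compared fairly. The percentile instead places each sequence relative to random mutants at its own depth, which is the quantity that would be compared across depths under this reading.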
Figures
Ref (1)
1. Shaw A, Spinner H, Shin J, Gurev S, Rollins N, Marks D. Removing bias in sequence models of protein fitness. openRxiv; 2023. Available from: https://doi.org/10.1101/2023.09.28.560044