Summary

Protein language models (PLMs) are biased against sequences with multiple mutations (1).

Details

The authors propose normalizing for this bias by generating a large number () of mutants with an equal number of mutations and ranking sequences within that set.
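
A rough sketch of how a normalization like the one described above could look, not the authors' implementation. Everything named here is an assumption made for illustration: `score_fn` stands in for any PLM scoring function, `random_mutant` and `depth_matched_percentile` are hypothetical helpers, mutants are drawn uniformly at random, and the default `n_background=1000` is an arbitrary placeholder since the note does not record the number used.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_mutations(wild_type, seq):
    """Number of positions where seq differs from wild_type (equal lengths assumed)."""
    return sum(a != b for a, b in zip(wild_type, seq))


def random_mutant(wild_type, n_mutations, rng):
    """Return a copy of wild_type with exactly n_mutations random substitutions."""
    positions = rng.sample(range(len(wild_type)), n_mutations)
    seq = list(wild_type)
    for pos in positions:
        # Exclude the wild-type residue so every chosen position really changes.
        choices = [aa for aa in AMINO_ACIDS if aa != wild_type[pos]]
        seq[pos] = rng.choice(choices)
    return "".join(seq)


def depth_matched_percentile(wild_type, query, score_fn, n_background=1000, seed=0):
    """Fraction of random mutants with the same mutation count that score below query.

    score_fn is a hypothetical stand-in for a PLM score (higher = fitter).
    n_background is an arbitrary illustrative value, not taken from the paper.
    """
    rng = random.Random(seed)
    k = count_mutations(wild_type, query)
    background = [
        score_fn(random_mutant(wild_type, k, rng)) for _ in range(n_background)
    ]
    query_score = score_fn(query)
    return sum(b < query_score for b in background) / n_background


if __name__ == "__main__":
    wt = "ACDEFGHIKL"
    # Toy score for demonstration only: counts tryptophans.
    toy_score = lambda s: s.count("W")
    print(depth_matched_percentile(wt, "WWDEFGHIKL", toy_score, n_background=200))
```

Because the query is only compared against sequences carrying the same number of mutations, a raw score that drops with every added mutation no longer penalizes multi-mutants by itself; what is left is how the query ranks within its own mutation depth.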

Figures

Ref (1)

1. Shaw A, Spinner H, Shin J, Gurev S, Rollins N, Marks D. Removing bias in sequence models of protein fitness. openRxiv; 2023. Available from: https://doi.org/10.1101/2023.09.28.560044