Summary
Fine-tuned protein language models don’t generalize to sequence positions absent from the fine-tuning dataset (1).
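
To make the evaluation setting concrete, here is a minimal sketch of a position-based split: every variant at a held-out position goes to the test set, so the model never sees those positions during fine-tuning. It assumes single-substitution variants written like "A42G" (wild-type residue, 1-based position, mutant residue). The function name `position_split` and its parameters are hypothetical and are not taken from ref (1), whose exact splitting procedure may differ.

```python
import random
import re


def position_split(variants, test_fraction=0.2, seed=0):
    """Split variants so that no test position appears in the training set.

    variants: list of strings such as "A42G" (assumed notation).
    Returns (train, test) lists of variant strings.
    """
    def position(variant):
        # Extract the integer position from a single-substitution string.
        match = re.fullmatch(r"[A-Z](\d+)[A-Z]", variant)
        if match is None:
            raise ValueError(f"unrecognized variant: {variant!r}")
        return int(match.group(1))

    # Shuffle the unique positions reproducibly, then hold out a fraction.
    positions = sorted({position(v) for v in variants})
    rng = random.Random(seed)
    rng.shuffle(positions)
    n_test = max(1, round(test_fraction * len(positions)))
    test_positions = set(positions[:n_test])

    # Assign each variant by its position, not by random row sampling.
    train = [v for v in variants if position(v) not in test_positions]
    test = [v for v in variants if position(v) in test_positions]
    return train, test


variants = ["A1G", "A1C", "K2R", "K2E", "L3V", "L3I", "M4T", "M4V", "P5S", "P5A"]
train, test = position_split(variants, test_fraction=0.4)
```

Unlike a random split over variants, this split leaves the training and test sets with no positions in common, which is the condition under which the generalization failure is reported.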
Figures

From ref. (1); blue dots show the position-based split.
1. Didi K, Alamdari S, Lu AX, Wittmann B, Johnston KE, Amini AP, et al. FLIP2: Expanding Protein Fitness Landscape Benchmarks for Real-World Machine Learning Applications. openRxiv; 2026. Available from: https://doi.org/10.64898/2026.02.23.707496