Summary
Larger protein language models generate more novel protein sequences from sparsely populated families (1). These sequences are also slightly more likely to express, though the effect is small. Both results were observed with ProGen3.
Figures
Ref (1)
See also
1. Bhatnagar A, Jain S, Beazer J, Curran SC, Hoffnagle AM, Ching KS, et al. Scaling Unlocks Broader Generation and Deeper Functional Understanding of Proteins. openRxiv; 2025. Available from: https://doi.org/10.1101/2025.04.15.649055