Summary

Larger PLMs are better at detecting distant homologs (1). This contrasts with the observation that larger models are generally no better at most tasks (Protein property prediction using PLMs does not benefit from scale except when inferring features of either structural or sparsely populated sequence families).

Figures

Ref (1)

1. Wu KE, Chang H, Zou J. ProteinCLIP: enhancing protein language models with natural language. openRxiv; 2024. Available from: https://doi.org/10.1101/2024.05.14.594226