Summary

Simple predictors built from residue conservation and solvent exposure perform comparably to protein language models (pLMs) on some property prediction tasks (1). Conservation was assessed using position-specific scoring matrices. However, like language models, these predictors perform worse than inverse folding on stability prediction.
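To make the two ingredients concrete, here is a minimal Python sketch of a PSSM-style conservation score combined with solvent exposure. It is an illustration under stated assumptions, not the method from (1): the log-odds form, the pseudocount, the (1 - RSA) weighting, and all function names are hypothetical choices made for this example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(column, pseudocount=1.0):
    """Amino-acid frequencies in one alignment column (gaps are ignored).

    The pseudocount is an assumption of this sketch; it keeps unseen
    residues from getting a frequency of zero.
    """
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def log_odds_ratio(freqs, wild_type, mutant):
    """Log-odds of the mutant residue minus log-odds of the wild type.

    Negative values mean the mutant is rarer than the wild type at
    this position, i.e. the substitution goes against conservation.
    """
    def log_odds(f):
        return math.log(f / (1.0 - f))

    return log_odds(freqs[mutant]) - log_odds(freqs[wild_type])


def exposure_weighted_score(lor, rsa):
    """Damp the conservation score at solvent-exposed sites.

    Assumed weighting: (1 - RSA), with RSA clipped to [0, 1].
    """
    clipped = min(max(rsa, 0.0), 1.0)
    return (1.0 - clipped) * lor


# Toy alignment column: eight sequences, mostly leucine.
column = "LLLLLLIV"
freqs = column_frequencies(column)
lor = log_odds_ratio(freqs, "L", "D")

buried = exposure_weighted_score(lor, rsa=0.05)
exposed = exposure_weighted_score(lor, rsa=0.90)
print(f"LOR={lor:.3f} buried={buried:.3f} exposed={exposed:.3f}")
```

In this toy column the L-to-D substitution gets a negative score, and the same substitution is penalised more strongly at a buried site than at an exposed one.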

Figures

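| Method | Model type | All | Stability | Binding | Expression | Activity | Fitness |
|---|---|---|---|---|---|---|---|
| −RSA | STR* | 0.356 | 0.480 | 0.281 | 0.323 | 0.330 | 0.286 |
| LOR | ALI* | 0.427 | 0.447 | 0.375 | 0.390 | 0.452 | 0.414 |
| RSALOR | STR & ALI | 0.473 | 0.551 | 0.455 | 0.428 | 0.472 | 0.419 |
| ProSST-2048 [13] | STR & pLM | 0.522 | 0.638 | 0.527 | 0.527 | 0.486 | 0.441 |
| PoET [14] | ALI & pLM | 0.470 | 0.458 | 0.440 | 0.459 | 0.495 | 0.474 |
| SaProt (650M) [19] | STR & pLM | 0.462 | 0.565 | 0.441 | 0.482 | 0.459 | 0.375 |
| VespaG [20] | pLM | 0.461 | 0.479 | 0.415 | 0.450 | 0.489 | 0.440 |
| TranceptEVE (L) [21] | ALI & pLM | 0.450 | 0.424 | 0.405 | 0.447 | 0.489 | 0.458 |
| GEMME [22] | ALI | 0.447 | 0.452 | 0.367 | 0.430 | 0.477 | 0.444 |
| EVE (ensemble) [23] | ALI | 0.431 | 0.410 | 0.382 | 0.398 | 0.466 | 0.446 |
| ESM2 (650M) [24] | pLM | 0.428 | 0.496 | 0.382 | 0.409 | 0.431 | 0.381 |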

Table from (1)

See also

1. Tsishyn M, Hermans P, Rooman M, Pucci F. Residue conservation and solvent accessibility are (almost) all you need for predicting mutational effects in proteins. Bioinformatics. 2025;41(6). Available from: https://doi.org/10.1093/bioinformatics/btaf322