Summary
Sequence-based and structure-based protein quality metrics largely do not correlate with each other (1). Within each group, however, metrics do correlate, for example ProteinMPNN with ESM-IF. Apart from MSA Transformer probabilities, none of the metrics correlates with percent sequence identity to naturally occurring sequences.
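A minimal sketch of how such a pairwise comparison of metrics could be set up, assuming a rank-based (Spearman) correlation; the note does not state which correlation measure (1) used. The metric names and scores below are made-up toy values, not data from the paper, and the helper functions are hypothetical.

```python
from itertools import combinations
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if an input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def pairwise_spearman(scores):
    """Map each unordered pair of metric names to its Spearman correlation."""
    return {
        (a, b): spearman(scores[a], scores[b])
        for a, b in combinations(scores, 2)
    }


# Toy scores for five hypothetical sequences (assumed values, for illustration).
toy_scores = {
    "metric_a": [0.1, 0.4, 0.35, 0.8, 0.9],
    "metric_b": [1.0, 2.5, 2.0, 4.0, 5.0],
    "metric_c": [3.0, 1.0, 5.0, 2.0, 4.0],
}

correlations = pairwise_spearman(toy_scores)
for pair, rho in correlations.items():
    print(pair, round(rho, 2))
```

In this toy data, metric_a and metric_b rank the sequences identically, while metric_c ranks them almost independently of the other two, which mirrors the within-group versus across-group pattern described above.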
Figures
Figure 2B from (1)
See also
- Most ML quality metrics cannot effectively predict enzyme activity after controlling for similarity to native
- Protein folding neural networks cannot predict protein stability
- pLDDT and PAE in protein folding neural networks are correlated
- Distance between averaged PLM embeddings does not correlate with structural difference
1. Johnson SR, Fu X, Viknander S, Goldin C, Monaco S, Zelezniak A, et al. Computational scoring and experimental evaluation of enzymes generated by neural networks. Nature Biotechnology. 2024;43(3):396–405. Available from: https://doi.org/10.1038/s41587-024-02214-2