Thermostability refers to a protein’s ability to remain folded at high temperatures or under harsh conditions. It is a highly desirable property for engineered proteins.
Prediction
- Phantom epistasis refers to the inclusion of unnecessary model parameters when building biophysical/statistical fitness models (Fitness prediction). Faure et al. (1) attribute this to the epistasis mechanisms reviewed by Domingo et al. (2).
- Thermodynamic reversibility (the ddG of a reverse mutation is the negative of the forward ddG) can be used to expand training sets for stability (ddG) prediction ML models. However, it has been shown to introduce biases that favor WT amino acids. Diaz et al. (3) claim to mitigate this.
- The amount of ddG data available for training at a given residue can be expanded using thermodynamic permutation, which derives additional ddG values between the mutant amino acids measured at the same position. This was used by MutComputeXGT on the Tsuboyama et al. (4) dataset. It is useful for stability prediction and improves generalization in (3).
- ddG data is skewed toward hydrophobic amino acids (e.g., from alanine scans). This has been reported to increase solvation ddG by 0.8 kcal/mol in studies cited by (3). The Tsuboyama et al. (4) data does not have this bias.
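
The two augmentations above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in (3). It assumes every measurement is a `(position, wt_aa, mut_aa, ddg)` tuple, that ddG is defined as dG(mutant) minus dG(wild type), and that all measurements at a position were taken under the same conditions. The function name `augment_ddg` and the tuple layout are hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def augment_ddg(measurements):
    """Expand WT->mutant ddG measurements using the state-function property.

    measurements: iterable of (position, wt_aa, mut_aa, ddg) tuples (assumed layout).
    Returns a dict mapping (position, from_aa, to_aa) -> ddg.
    """
    # Group the measured mutations by position and wild-type residue.
    by_position = defaultdict(dict)
    for pos, wt, mut, ddg in measurements:
        by_position[(pos, wt)][mut] = ddg

    augmented = {}
    for (pos, wt), muts in by_position.items():
        for mut, ddg in muts.items():
            augmented[(pos, wt, mut)] = ddg    # original measurement
            augmented[(pos, mut, wt)] = -ddg   # thermodynamic reversibility
        # Thermodynamic permutation: ddG(a->b) = ddG(wt->b) - ddG(wt->a).
        for a, b in permutations(muts, 2):
            augmented[(pos, a, b)] = muts[b] - muts[a]
    return augmented


if __name__ == "__main__":
    # Made-up values for illustration only.
    example = [(10, "L", "A", 1.0), (10, "L", "G", 2.5)]
    for key, value in sorted(augment_ddg(example).items()):
        print(key, value)
```

In this sketch, n measurements at one position yield n original, n reversed, and n(n-1) permuted entries. Note that the reversed entries have the WT amino acid as the target, which is where the WT bias mentioned above can enter a training set.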
1. Faure AJ, Martí-Aranda A, Hidalgo-Carcedo C, Beltran A, Schmiedel JM, Lehner B. The genetic architecture of protein stability. Nature. 2024;634(8035):995–1003. Available from: https://doi.org/10.1038/s41586-024-07966-0
2. Domingo J, Baeza-Centurion P, Lehner B. The Causes and Consequences of Genetic Interactions (Epistasis). Annual Review of Genomics and Human Genetics. 2019;20:433–60. Available from: https://doi.org/10.1146/annurev-genom-083118-014857
3. Diaz DJ, Gong C, Ouyang-Zhang J, Loy JM, Wells J, Yang D, et al. Stability Oracle: A Structure-Based Graph-Transformer for Identifying Stabilizing Mutations. bioRxiv; 2023. Available from: https://doi.org/10.1101/2023.05.15.540857
4. Tsuboyama K, Dauparas J, Chen J, Laine E, Mohseni Behbahani Y, Weinstein JJ, et al. Mega-scale experimental analysis of protein folding stability in biology and design. Nature. 2023;620(7973):434–44. Available from: https://doi.org/10.1038/s41586-023-06328-6