Protein evolution is the process, and the result, of gradual sequence changes that lead to functional and/or structural changes. See Epistasis for examples of why evolutionary trajectories are difficult to predict. This note excludes somatic hypermutation.
Notes
Paradigms and preliminaries
- Neutral theory: most observed amino acid changes are neutral (i.e., they have no effect on fitness) and are fixed by genetic drift rather than by selection. Developed by Kimura (1).
- Nearly neutral theory: slightly deleterious mutations can be retained and subsequently compensated for by advantageous mutations (which is consistent with the observation that most missense mutations are destabilizing). Developed by Ohta (2) to explain why the rate of protein evolution appeared to be independent of generation time, which is in turn inversely correlated with population size.
Figure from (2)
- The theory of punctuated equilibrium suggests that phenotypes change very little for long stretches of time, followed by abrupt rapid changes (3).
- Statistical physics approach: population size is equated with inverse temperature (such that an infinite population is analogous to absolute zero), and log-fitness with negative energy (4). At steady state, advantageous and deleterious substitutions are predicted to occur with equal frequency. This framing does not account for the uneven sampling of sequences in available data (5,6).
- Fisher’s geometric model: the overall fitness of a phenotype is quantified along multiple phenotypic dimensions; Fisher postulated that phenotypes in a population are distributed on a hypersphere centered on a local fitness maximum.
- Protein evolvability refers to the ability of a protein to 1) evolve new functions in relatively few mutations and 2) be robust to mutations that lead to loss-of-function (7). Tokuriki & Tawfik (8) describe these two properties as seemingly contradictory, but as complementary at the structural level.
- “The principle of minimal frustration suggests that naturally evolved proteins with the same structure should have similar folding rates and that modulation of thermodynamic stability should occur via unfolding rates” (quoted from (9)). This has been supported by the observation that thioredoxins fold at similar rates but unfold at rates that correlate with their thermostability values.
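As a rough illustration of the statistical physics bullet above: if log-fitness is treated as negative energy and a population-size-dependent parameter as inverse temperature, the steady-state probability of a genotype takes a Boltzmann-like form, proportional to exp(nu * log-fitness). The sketch below assumes this form only; the name `nu`, its values, and the three log-fitness values are hypothetical, and the exact dependence of `nu` on population size depends on the population model used in (4).

```python
import math

def stationary_distribution(log_fitness, nu):
    """Boltzmann-like weights proportional to exp(nu * log_fitness).

    nu plays the role of inverse temperature and is assumed to grow with
    population size; its exact form depends on the population model.
    """
    scaled = [nu * x for x in log_fitness]
    shift = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log-fitness values for three genotypes.
log_fitness = [0.00, -0.01, -0.05]
for nu in (0, 100, 10000):
    probs = stationary_distribution(log_fitness, nu)
    print(nu, [round(p, 3) for p in probs])
```

With `nu = 0` (the "infinite temperature" limit) all genotypes are equally likely, and as `nu` grows the distribution concentrates on the fittest genotype, which mirrors the zero-temperature analogy for an infinite population.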
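Fisher’s geometric model from the list above can be explored with a small Monte Carlo sketch. It assumes the common textbook setup, which is not spelled out in this note: a phenotype sits at a fixed distance from the optimum in an n-dimensional trait space, a mutation moves it by a fixed step in a uniformly random direction, and the mutation counts as beneficial if it ends up closer to the optimum. The function name and all parameter values are hypothetical.

```python
import math
import random

def fraction_beneficial(n_traits, step, distance=1.0, trials=20000, seed=0):
    """Estimate the fraction of random mutations that move closer to the optimum.

    The optimum is at the origin; the current phenotype lies at `distance`
    from it along the first trait axis.
    """
    rng = random.Random(seed)
    start = [distance] + [0.0] * (n_traits - 1)
    hits = 0
    for _ in range(trials):
        # A normalized Gaussian vector gives a uniformly random direction.
        direction = [rng.gauss(0.0, 1.0) for _ in range(n_traits)]
        norm = math.sqrt(sum(x * x for x in direction))
        mutant = [p + step * x / norm for p, x in zip(start, direction)]
        if math.sqrt(sum(x * x for x in mutant)) < distance:
            hits += 1
    return hits / trials

for step in (0.05, 0.5, 2.5):
    print(step, fraction_beneficial(n_traits=10, step=step))
```

Under these assumptions, very small steps are beneficial close to half of the time, larger steps are beneficial less often, and a step longer than twice the distance to the optimum can never be beneficial.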
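To make the minimal frustration bullet above concrete, the sketch below assumes simple two-state folding, where the unfolding free energy is RT ln(k_fold / k_unfold). Under that assumption, proteins with the same folding rate can still differ in stability purely through their unfolding rates. The rate constants are hypothetical and are not taken from (9).

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)

def unfolding_free_energy(k_fold, k_unfold, temperature=298.15):
    """Two-state stability in kJ/mol: RT * ln(k_fold / k_unfold)."""
    return R_KJ * temperature * math.log(k_fold / k_unfold)

# Hypothetical homologs sharing one folding rate (per second) but
# differing in unfolding rate.
k_fold = 100.0
for name, k_unfold in [("mesophile-like", 1e-2), ("thermophile-like", 1e-5)]:
    dg = unfolding_free_energy(k_fold, k_unfold)
    print(f"{name}: k_unfold = {k_unfold:g} 1/s, dG_unfold = {dg:.1f} kJ/mol")
```

In this toy setup, a 1000-fold slower unfolding rate at a fixed folding rate raises the stability by RT ln(1000), which is the pattern the thioredoxin observation describes.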
Observations
- Protein folds with high sequence diversity also have high functional diversity (7).
- The sequence capacity of a protein exceeds for even small proteins (35-40 AAs), but the fraction of stable states is extremely small and inversely correlated with protein size. These values were estimated using Potts models (10).
Figure from (10)
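For readers unfamiliar with Potts models, here is a minimal sketch of how such a model scores a sequence. It assumes the usual convention E(s) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j), with lower energy meaning a more favorable sequence; sign conventions vary between papers. The parameters below are random placeholders, not the fitted values from (10), and the function names are hypothetical.

```python
import itertools
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def random_potts(length, seed=0):
    """Random placeholder fields h_i(a) and couplings J_ij(a, b)."""
    rng = random.Random(seed)
    q = len(ALPHABET)
    fields = [[rng.gauss(0.0, 1.0) for _ in range(q)] for _ in range(length)]
    couplings = {
        (i, j): [[rng.gauss(0.0, 0.1) for _ in range(q)] for _ in range(q)]
        for i, j in itertools.combinations(range(length), 2)
    }
    return fields, couplings

def potts_energy(seq, fields, couplings):
    """Statistical energy of a sequence: single-site plus pairwise terms."""
    idx = [ALPHABET.index(a) for a in seq]
    energy = -sum(fields[i][a] for i, a in enumerate(idx))
    energy -= sum(J[idx[i]][idx[j]] for (i, j), J in couplings.items())
    return energy

fields, couplings = random_potts(length=5)
for seq in ("ACDEF", "WWWWW"):
    print(seq, round(potts_energy(seq, fields, couplings), 3))
```

In a fitted model, the fields and couplings would be inferred from a multiple sequence alignment, and an energy threshold could then be used to decide which sequences count as compatible with the fold.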
1. Kimura M. The Neutral Theory of Molecular Evolution. Cambridge University Press; 1985. Available from: https://books.google.com/books/about/The_Neutral_Theory_of_Molecular_Evolutio.html?id=e_HoAwAAQBAJ
2. Ohta T. Slightly Deleterious Mutant Substitutions in Evolution. Nature. 1973;246(5428):96–8. Available from: https://doi.org/10.1038/246096a0
3. Duran-Nebreda S, Bentley RA, Vidiella B, Spiridonov A, Eldredge N, O’Brien MJ, et al. On the multiscale dynamics of punctuated evolution. Trends in Ecology & Evolution. 2024;39(8):734–44. Available from: https://doi.org/10.1016/j.tree.2024.05.003
4. Sella G, Hirsh AE. The application of statistical physics to evolutionary biology. Proceedings of the National Academy of Sciences. 2005;102(27):9541–6. Available from: https://doi.org/10.1073/pnas.0501865102
5. Ding F, Steinhardt J. Protein language models are biased by unequal sequence sampling across the tree of life. openRxiv; 2024. Available from: https://doi.org/10.1101/2024.03.07.584001
6. Weinstein EN, Amin AN, Frazer J, Marks DS. Non-identifiability and the Blessings of Misspecification in Models of Molecular Fitness. openRxiv; 2022. Available from: https://doi.org/10.1101/2022.01.29.478324
7. Wagner A. Robustness and evolvability: a paradox resolved. Proceedings of the Royal Society B: Biological Sciences. 2007;275(1630):91–100. Available from: https://doi.org/10.1098/rspb.2007.1137
8. Tokuriki N, Tawfik DS. Protein Dynamism and Evolvability. Science. 2009;324(5924):203–7. Available from: https://doi.org/10.1126/science.1169375
9. Tzul FO, Vasilchuk D, Makhatadze GI. Evidence for the principle of minimal frustration in the evolution of protein folding landscapes. Proceedings of the National Academy of Sciences. 2017;114(9). Available from: https://doi.org/10.1073/pnas.1613892114
10. Tian P, Best RB. How Many Protein Sequences Fold to a Given Structure? A Coevolutionary Analysis. Biophysical Journal. 2017;113(8):1719–30. Available from: https://doi.org/10.1016/j.bpj.2017.08.039