Summary
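Potts models inferred from multiple sequence alignments (MSAs) can serve as the basis for modeling fitness landscapes (1). Because they include pairwise couplings between positions, they are proposed as a way to take epistatic effects into account. Fannjiang and Listgarten (2) give several citations showing that these models correlate with experimentally measured fitness landscapes. However, MSA Transformer appears to outperform Potts models in this regard (3,4). Position-specific scoring matrices (PSSMs) can also be used, but they score each position independently and therefore do not account for pairwise or higher-order interactions.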
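The difference between the two model classes can be illustrated with a minimal sketch. It assumes the common convention E(s) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j), with lower energy meaning a more favorable sequence; the function names, the two-letter alphabet, and all parameter values are hypothetical and chosen only for illustration, not taken from the cited papers.

```python
from itertools import combinations


def site_independent_score(seq, h):
    """PSSM-like score: a sum of per-position terms h[i][a], no couplings."""
    return sum(h[i][a] for i, a in enumerate(seq))


def potts_energy(seq, h, J):
    """Potts energy: negated sum of per-position fields and pairwise couplings."""
    fields = sum(h[i][a] for i, a in enumerate(seq))
    couplings = sum(
        J.get((i, j), {}).get((seq[i], seq[j]), 0.0)
        for i, j in combinations(range(len(seq)), 2)
    )
    return -(fields + couplings)


def epistasis(score, wt, single_1, single_2, double):
    """Double-mutant effect minus the sum of the two single-mutant effects."""
    return (
        (score(double) - score(wt))
        - (score(single_1) - score(wt))
        - (score(single_2) - score(wt))
    )


# Toy parameters (hypothetical): mutating either site to C is penalized on
# its own, but the pair (C, C) is rewarded by a coupling term.
h = [{"A": 0.0, "C": -1.0}, {"A": 0.0, "C": -1.0}]
J = {(0, 1): {("C", "C"): 3.0}}

potts_epistasis = epistasis(
    lambda s: potts_energy(s, h, J), "AA", "CA", "AC", "CC"
)
pssm_epistasis = epistasis(
    lambda s: site_independent_score(s, h), "AA", "CA", "AC", "CC"
)
```

In this toy setting the site-independent score is strictly additive, so its epistasis term is zero for any choice of h, whereas the Potts energy yields a nonzero term whenever the relevant couplings differ.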
1. Sesta L, Pagnani A, Fernandez-de Cossio-Diaz J, Uguzzoni G. Inference of annealed protein fitness landscapes with AnnealDCA. PLOS Computational Biology. 2024;20(2):e1011812. Available from: https://doi.org/10.1371/journal.pcbi.1011812
2. Fannjiang C, Listgarten J. Is novelty predictable? arXiv. 2023. Available from: https://arxiv.org/abs/2306.00872
3. Lupo U, Sgarbossa D, Bitbol A-F. Protein language models trained on multiple sequence alignments learn phylogenetic relationships. Nature Communications. 2022;13(1). Available from: https://doi.org/10.1038/s41467-022-34032-y
4. Sgarbossa D, Lupo U, Bitbol A-F. Generative power of a protein language model trained on multiple sequence alignments. eLife. 2023;12. Available from: https://doi.org/10.7554/elife.79854