Summary
Matthews et al. (1) trained a protein language model on sequences generated by ancestral sequence reconstruction and found improved prediction of enzymatic activity and epistasis. One possible explanation is that the reconstructed sequences cover regions of sequence space distinct from those occupied by extant sequences.
Figures
Ref (1)
1. Matthews DS, Spence MA, Mater AC, Nichols J, Pulsford SB, Sandhu M, et al. Leveraging ancestral sequence reconstruction for protein representation learning. Nature Machine Intelligence. 2024;6(12):1542–55. Available from: https://doi.org/10.1038/s42256-024-00935-2