Multiple sequence alignments (MSAs) are sets of sequences aligned either to a reference sequence or to each other. They are widely used in bioinformatics:

Notes

  • Including MSAs improves zero-shot prediction with protein language models (PLMs) (1)
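
To make the definition above concrete, here is a minimal sketch of an MSA as a data structure: equal-length rows in which '-' marks a gap, so that each column lines up corresponding positions across sequences. The sequences, the names `msa`, `columns`, and `consensus`, and the choice of a majority-vote consensus are all illustrative assumptions, not taken from the cited work.

```python
from collections import Counter

# Toy alignment with made-up sequences; '-' marks a gap.
msa = {
    "seq1": "MKT-AYIA",
    "seq2": "MKTQAYLA",
    "seq3": "MRT-AYIA",
}

def columns(alignment):
    """Return the alignment column by column, one string per column."""
    rows = list(alignment.values())
    # Every row of an MSA must have the same (gapped) length.
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all rows of an MSA must have the same length")
    return ["".join(col) for col in zip(*rows)]

def consensus(alignment, gap="-"):
    """Most common non-gap residue per column; gap if the column is all gaps."""
    out = []
    for col in columns(alignment):
        residues = [c for c in col if c != gap]
        out.append(Counter(residues).most_common(1)[0][0] if residues else gap)
    return "".join(out)

print(consensus(msa))
```

Real alignments are usually read from files (for example FASTA or Stockholm format) rather than typed in by hand; the dictionary here only stands in for that.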
1. Su J, Han C, Zhou Y, Shan J, Zhou X, Yuan F. SaProt: Protein Language Modeling with Structure-aware Vocabulary. openRxiv; 2023. Available from: https://doi.org/10.1101/2023.10.01.560349