Summary
Nucleotide substitutions introduced during somatic hypermutation are spatially clustered. In particular, a codon in a hypermutating region is more likely to acquire multiple nucleotide substitutions than an equivalent stretch of DNA elsewhere in the genome. Matsen et al. (1) found that a “multihit correction” applied during parametrization of their antibody language model could account for this phenomenon. They achieved this by training on out-of-frame immunoglobulin sequences, which are not expressed as functional antibodies and are therefore expected to reflect the mutation process largely free of selection.
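To make "multiple substitutions in the same codon" concrete, the following is a minimal sketch that tallies codons by the number of nucleotide differences between an aligned parent and child sequence. It is an illustration only, not the method used in (1): the function name `codon_hit_counts` and the toy sequences are hypothetical, and the sketch assumes the two sequences are already aligned, gap-free, and in the same reading frame.

```python
from collections import Counter


def codon_hit_counts(parent: str, child: str) -> Counter:
    """Tally codons by how many of their three nucleotides differ.

    Hypothetical helper for illustration; assumes aligned, gap-free
    sequences that start at the first position of a codon.
    """
    if len(parent) != len(child):
        raise ValueError("sequences must be aligned to the same length")
    usable = len(parent) - len(parent) % 3  # ignore a trailing partial codon
    counts = Counter()
    for i in range(0, usable, 3):
        # Count the positions within this codon where parent and child differ.
        hits = sum(p != c for p, c in zip(parent[i:i + 3], child[i:i + 3]))
        counts[hits] += 1
    return counts


# Toy example (made-up sequences, not real immunoglobulin data).
parent = "GCTAGCTATGGA"
child = "GCTAAATATGGC"
counts = codon_hit_counts(parent, child)
for hits in sorted(counts):
    print(f"{hits} substitution(s): {counts[hits]} codon(s)")
```

Under a model in which sites mutate independently, codons with two or three substitutions would be expected to be rare; an excess of such codons in a tally like this one is the kind of signal a multihit correction is meant to absorb.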
Figures
Ref (1)
1. Matsen FA, Dumm W, Sung K, Johnson MM, Rich DH, Starr TN, et al. Separating selection from mutation in antibody language models. eLife. 2026;15. Available from: https://doi.org/10.7554/elife.109644.3