Alternative noise schedules improve training of multimodal PLMs

Summary

Alternative noise schedules improve training of multimodal protein language models (PLMs) (1). The ESM3 authors provide no ablation data for this claim, but they state that training with the typical fixed 15% masking rate gave poor results compared with a noise schedule that varies the masking rate.
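
To make the contrast concrete, here is a minimal sketch of the two masking strategies on a toy sequence. The function names, the example sequence, and the uniform schedule are all assumptions made for illustration; this is not the schedule used for ESM3, which this note does not specify.

```python
import random

MASK = "<mask>"  # placeholder mask token, chosen for illustration


def mask_tokens(tokens, rate, rng):
    """Replace each token with MASK independently with probability `rate`."""
    return [MASK if rng.random() < rate else tok for tok in tokens]


def fixed_rate(rng):
    """Classic masked language modeling: the same 15% rate for every example."""
    return 0.15


def sampled_rate(rng):
    """Hypothetical noise schedule: draw a new rate per example.

    A uniform draw over [0, 1) is an assumption for this sketch,
    not the distribution used by the ESM3 authors.
    """
    return rng.random()


rng = random.Random(0)
seq = list("MKTAYIAKQRQISFVKSHFSRQ")  # made-up toy sequence

# Every example is corrupted at the same 15% rate.
fixed_example = mask_tokens(seq, fixed_rate(rng), rng)

# Each example is corrupted at its own rate, from nearly clean to nearly fully masked.
varied_examples = [mask_tokens(seq, sampled_rate(rng), rng) for _ in range(3)]
```

With a sampled rate, the model sees inputs ranging from lightly to heavily masked, whereas a fixed rate only ever exposes it to lightly corrupted inputs.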

1. Hayes T, Rao R, Akin H, Sofroniew NJ, Oktay D, Lin Z, et al. Simulating 500 million years of evolution with a language model. Science. 2025;387(6736):850–8. Available from: https://doi.org/10.1126/science.ads0018
