Summary
Weighting training sequences by fitness improves identification of other fit sequences (1). This was shown for linear models and two-layer MLPs trained for the design of adeno-associated virus (AAV) capsid proteins, where fitness weighting significantly improved prediction of fit sequences compared with training on an unweighted dataset.
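As a rough illustration of the idea, the sketch below fits a linear model on one-hot encoded sequences using per-sequence weights that grow with measured fitness. It is not the method from (1). The exponential weighting function, the `temperature` parameter, the ridge penalty, and all function names are assumptions chosen for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def one_hot(seqs, alphabet=AMINO_ACIDS):
    """Encode equal-length sequences as flattened one-hot vectors."""
    index = {a: i for i, a in enumerate(alphabet)}
    X = np.zeros((len(seqs), len(seqs[0]) * len(alphabet)))
    for n, seq in enumerate(seqs):
        for pos, aa in enumerate(seq):
            X[n, pos * len(alphabet) + index[aa]] = 1.0
    return X

def fitness_weights(y, temperature=1.0):
    """Assumed weighting scheme: exponential in fitness, rescaled to mean 1."""
    y = np.asarray(y, dtype=float)
    w = np.exp((y - y.max()) / temperature)  # subtract max for numerical stability
    return w * len(y) / w.sum()

def fit_weighted_ridge(X, y, w, l2=1e-3):
    """Weighted ridge regression via the normal equations, with a bias column."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ (w[:, None] * Xb) + l2 * np.eye(Xb.shape[1])
    b = Xb.T @ (w * np.asarray(y, dtype=float))
    return np.linalg.solve(A, b)

def predict(theta, X):
    """Apply the fitted coefficients (last entry is the bias)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ theta

# Toy usage with made-up sequences and fitness values (hypothetical data).
seqs = ["ACD", "ACE", "GCD", "GHE"]
fitness = np.array([0.1, 0.3, 1.2, 2.5])
X = one_hot(seqs)
theta_weighted = fit_weighted_ridge(X, fitness, fitness_weights(fitness))
theta_uniform = fit_weighted_ridge(X, fitness, np.ones(len(fitness)))
```

With uniform weights the fit reduces to ordinary ridge regression. With fitness weights, errors on high-fitness sequences cost more, so a misspecified model spends its capacity on the region of sequence space that matters for design.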
Ref (1)
1. Zhu D, Brookes DH, Busia A, Carneiro A, Fannjiang C, Popova G, et al. Optimal trade-off control in machine learning–based library design, with application to adeno-associated virus (AAV) for gene therapy. Science Advances. 2024;10(4). Available from: https://doi.org/10.1126/sciadv.adj3786