ProteinGym is a dataset that aggregates deep mutational scanning data and ΔΔG data (e.g., from (2)) for evaluating deep learning models. It was formally presented by Notin et al. (1) but had already been in use for several years.
1. Notin P, Kollasch AW, Ritter D, van Niekerk L, Paul S, Spinner H, et al. ProteinGym: Large-Scale Benchmarks for Protein Design and Fitness Prediction. openRxiv; 2023. Available from: https://doi.org/10.1101/2023.12.07.570727
2. Tsuboyama K, Dauparas J, Chen J, Laine E, Mohseni Behbahani Y, Weinstein JJ, et al. Mega-scale experimental analysis of protein folding stability in biology and design. Nature. 2023;620(7973):434–44. Available from: https://doi.org/10.1038/s41586-023-06328-6