Protein backbone design is the generation of protein backbones in three-dimensional space. This section also covers methods that generate entire protein structures in Cartesian space, although most methods decouple backbone design from sequence design, in which a sequence is chosen for a given backbone (inverse folding). As of May 2024, the state of the art is based on diffusion models.
Methods
- Chroma (1)
- RFdiffusion (2) and RFam (3)
- Hallucination using AlphaFold2 and RoseTTAFold
- Inpainting using RoseTTAFold (4)
Datasets
- Verkuil et al. (5) validate on a test set of 39 PDB entries, which they take from an earlier source that they cite. Where an ID is followed by a letter, the letter appears to denote a chain:
  - 1QYS
  - 2KL8
  - 2KPO
  - 2LN3
  - 2LTA
  - 2LVB
  - 2N2T
  - 2N2U
  - 2N3Z
  - 2N76
  - 4KY3
  - 4KYZ
  - 5CW9
  - 5KPE
  - 5KPH
  - 5L33
  - 5TPJ
  - 5TRV
  - 6CZG
  - 6CZH
  - 6CZI
  - 6CZJ
  - 6D0T
  - 6DG6
  - 6DKM A
  - 6DKM B
  - 6DLM A
  - 6DLM B
  - 6E5C
  - 6LLQ
  - 6MRR
  - 6MRS
  - 6MSP
  - 6NUK
  - 6W3F
  - 6W3W
  - 6WI5
  - 6WVS
  - 7MCD
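
To reuse this test set programmatically, the entries can be parsed into PDB ID and chain pairs. The following is a minimal sketch rather than the procedure used by Verkuil et al.: the names `Target`, `parse_entry`, and `download_url` are hypothetical, a trailing letter is assumed to be a chain ID, and the URL pattern is the standard RCSB file download address.

```python
from typing import NamedTuple, Optional

# Entries copied from the list above. A trailing letter is ASSUMED to be a chain ID.
RAW_ENTRIES = """
1QYS, 2KL8, 2KPO, 2LN3, 2LTA, 2LVB, 2N2T, 2N2U, 2N3Z, 2N76,
4KY3, 4KYZ, 5CW9, 5KPE, 5KPH, 5L33, 5TPJ, 5TRV, 6CZG, 6CZH,
6CZI, 6CZJ, 6D0T, 6DG6, 6DKM A, 6DKM B, 6DLM A, 6DLM B, 6E5C,
6LLQ, 6MRR, 6MRS, 6MSP, 6NUK, 6W3F, 6W3W, 6WI5, 6WVS, 7MCD
"""


class Target(NamedTuple):
    """One test-set entry (hypothetical helper type)."""
    pdb_id: str
    chain: Optional[str]  # None when the entry names no chain


def parse_entry(entry: str) -> Target:
    """Parse an entry such as '1QYS' or '6DKM A' into a Target."""
    parts = entry.split()
    if not 1 <= len(parts) <= 2 or len(parts[0]) != 4:
        raise ValueError(f"unrecognised entry: {entry!r}")
    chain = parts[1] if len(parts) == 2 else None
    return Target(parts[0].upper(), chain)


def download_url(target: Target) -> str:
    """Build the RCSB download URL for the whole entry (all chains)."""
    return f"https://files.rcsb.org/download/{target.pdb_id}.pdb"


TARGETS = [parse_entry(entry) for entry in RAW_ENTRIES.split(",")]

if __name__ == "__main__":
    for target in TARGETS:
        print(target.pdb_id, target.chain or "-", download_url(target))
```

Because 6DKM and 6DLM each appear once per chain, the 39 entries map to 37 distinct PDB files; selecting the named chain from a downloaded file is left to the structure parser of your choice.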
References
1. Ingraham JB, Baranov M, Costello Z, Barber KW, Wang W, Ismail A, et al. Illuminating protein space with a programmable generative model. Nature. 2023;623(7989):1070–8. Available from: https://doi.org/10.1038/s41586-023-06728-8
2. Watson JL, Juergens D, Bennett NR, Trippe BL, Yim J, Eisenach HE, et al. De novo design of protein structure and function with RFdiffusion. Nature. 2023;620(7976):1089–100. Available from: https://doi.org/10.1038/s41586-023-06415-8
3. Kim D, Woodbury SM, Ahern W, Tischer D, Kang A, Joyce E, et al. Computational design of metallohydrolases. Nature. 2025;649(8095):246–53. Available from: https://doi.org/10.1038/s41586-025-09746-w
4. Wang J, Lisanza S, Juergens D, Tischer D, Watson JL, Castro KM, et al. Scaffolding protein functional sites using deep learning. Science. 2022;377(6604):387–94. Available from: https://doi.org/10.1126/science.abn2100
5. Verkuil R, Kabeli O, Du Y, Wicky BIM, Milles LF, Dauparas J, et al. Language models generalize beyond natural proteins. bioRxiv; 2022. Available from: https://doi.org/10.1101/2022.12.21.521521