Summary
Gaujac et al. (1) combine the encoder from ProteinMPNN with the structure module from AlphaFold2 into a vector-quantized variational autoencoder (VQ-VAE). Similar to Foldseek, the VQ-VAE learns a discrete vocabulary of structural tokens, here with either ~4000 or ~64000 possible entries. Frame aligned point error (FAPE) is used as the loss function.
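To make the idea of a discrete structural vocabulary concrete, here is a minimal sketch of the quantization step in a generic VQ-VAE: each per-residue latent vector is replaced by the index of its nearest codebook entry. This is an illustration only, not the authors' implementation; the paper's quantizer may differ in detail. The function name `quantize`, the latent dimension of 8, the codebook size of 4096, and the random inputs are all assumptions chosen for the example.

```python
import numpy as np


def quantize(latents, codebook):
    """Map each latent vector to the index of its nearest codebook entry.

    latents:  (N, D) array, one continuous vector per residue (hypothetical encoder output).
    codebook: (K, D) array, one vector per structural token.
    Returns (tokens, quantized): integer token ids of shape (N,) and the
    corresponding codebook vectors of shape (N, D).
    """
    # Squared Euclidean distance between every latent and every code: shape (N, K).
    diffs = latents[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    # The token for a residue is the index of the closest code.
    tokens = np.argmin(dists, axis=1)
    return tokens, codebook[tokens]


# Assumed sizes for illustration: 4096 tokens, 8-dimensional latents, 50 residues.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(4096, 8))
latents = rng.normal(size=(50, 8))

tokens, quantized = quantize(latents, codebook)
print(tokens.shape, quantized.shape)
```

In a trained model the decoder would receive `quantized` (or embeddings looked up from `tokens`) and reconstruct coordinates from it, so a protein structure of N residues is summarized as a sequence of N integers.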
1. Gaujac B, Donà J, Copoiu L, Atkinson T, Pierrot T, Barrett TD. Learning the Language of Protein Structure. 2024. Available from: https://arxiv.org/abs/2405.15840