The Top2018 set is a dataset of protein structures solved by X-ray crystallography at better than 2.0 Å resolution, filtered to high-confidence residues and clustered at various levels of sequence homology. Entries are single chains isolated from multi-chain complexes.

Details

Full filtering criteria: the chain is protein and at least 60% complete; released on or before Dec 31, 2018; resolution < 2.0 Å; MolProbity Score < 2.0; < 3% of residues with Cβ deviations; < 2% of residues with covalent bond length outliers; < 2% of residues with covalent bond geometry outliers. Residue-level filtering additionally requires B-factor ≤ 40, real-space correlation coefficient ≥ 0.7, 2Fo-Fc map value ≥ 1.2, and no covalent geometry outliers, steric overlaps, or alternate conformations.
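The thresholds above can be read as two predicates, one per chain and one per residue. The following is a minimal sketch only: the `ChainStats` and `ResidueStats` classes, their field names, and the representation of percentages as fractions are all assumptions made for illustration, not part of the dataset's tooling. How "completeness" is measured is not specified here, so it is treated as a precomputed fraction.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ChainStats:
    """Hypothetical per-chain quality summary (field names are illustrative)."""
    is_protein: bool
    completeness: float                 # assumed to be a fraction in [0, 1]
    release_date: date
    resolution: float                   # in Å
    molprobity_score: float
    frac_cbeta_dev: float               # fraction of residues with Cβ deviations
    frac_bond_length_outliers: float    # fraction of residues
    frac_bond_geometry_outliers: float  # fraction of residues


@dataclass
class ResidueStats:
    """Hypothetical per-residue quality summary (field names are illustrative)."""
    b_factor: float
    rscc: float                         # real-space correlation coefficient
    map_value: float                    # 2Fo-Fc map value
    has_geometry_outlier: bool
    has_steric_overlap: bool
    has_altconf: bool


CUTOFF_DATE = date(2018, 12, 31)


def chain_passes(c: ChainStats) -> bool:
    """Apply the chain-level thresholds listed in the text."""
    return (
        c.is_protein
        and c.completeness >= 0.60
        and c.release_date <= CUTOFF_DATE
        and c.resolution < 2.0
        and c.molprobity_score < 2.0
        and c.frac_cbeta_dev < 0.03
        and c.frac_bond_length_outliers < 0.02
        and c.frac_bond_geometry_outliers < 0.02
    )


def residue_passes(r: ResidueStats) -> bool:
    """Apply the residue-level thresholds listed in the text."""
    return (
        r.b_factor <= 40.0
        and r.rscc >= 0.7
        and r.map_value >= 1.2
        and not (r.has_geometry_outlier or r.has_steric_overlap or r.has_altconf)
    )
```

Under these assumptions, a chain is kept when `chain_passes` returns `True`, and `residue_passes` then selects the residues within it. Note the mix of strict and non-strict comparisons, which follows the criteria as written.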

This dataset was used for side-chain rotamer prediction in (1).

See also

1. Randolph NZ, Kuhlman B. Invariant point message passing for protein side chain packing. Proteins: Structure, Function, and Bioinformatics. 2024;92(10):1220–33. Available from: https://doi.org/10.1002/prot.26705