A short and far from complete list of things to know about the PDB format (at least, what I found relevant and, at the same time, not highlighted enough).
WARNING: The PDB format is no longer updated. The current standard is the mmCIF file format, which is quite heavy but coherent.
- If a polymer is listed with the type Protein/Oligosaccharide (RCSB), then an AA is bound to another molecule such as a carbohydrate
- The unit cell (X-ray experiment) $\neq$ the biological unit. See the guide from RCSB
- Missing residues are described by the REMARK 465 record. SEQRES describes the full sequence: it includes the missing residues, so it can be compared with REMARK 465
- AA that have been modified, by the experimentalist, post-translationally or otherwise, appear in HETATM records and are described in the header by the MODRES, HET and FORMUL records
- Non-standard AA are not required to be described by a MODRES record. Nevertheless, they appear in the SEQRES record
- REMARK 610 describes non-polymer residues with missing atoms
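The comparison between SEQRES and REMARK 465 mentioned above can be sketched with the Python standard library alone. This is only a minimal sketch: it splits on whitespace instead of using the strict column layout of the format, so it assumes that chain IDs are never blank. The function names `parse_seqres` and `parse_remark_465` and the `EXAMPLE` excerpt are made up for illustration and do not come from a real entry.

```python
import re
from collections import defaultdict

# Sequence number with an optional insertion code, e.g. "12" or "12A".
_SEQNUM = re.compile(r"^(-?\d+)([A-Za-z]?)$")


def parse_seqres(lines):
    """Return {chain_id: [residue names]} collected from SEQRES records."""
    chains = defaultdict(list)
    for line in lines:
        if line.startswith("SEQRES"):
            fields = line.split()
            # fields: SEQRES, serial, chain ID, residue count, then residue names
            chains[fields[2]].extend(fields[4:])
    return dict(chains)


def parse_remark_465(lines):
    """Return [(chain_id, resname, seqnum, icode)] for the missing residues."""
    missing = []
    for line in lines:
        if not line.startswith("REMARK 465"):
            continue
        fields = line[10:].split()
        if len(fields) not in (3, 4):  # data rows: [model] resname chain seqnum
            continue
        resname, chain, seq = fields[-3:]
        match = _SEQNUM.match(seq)
        if match:  # this test also skips the "M RES C SSSEQI" column header
            missing.append((chain, resname, int(match.group(1)), match.group(2)))
    return missing


# Hypothetical excerpt, not taken from a real entry.
EXAMPLE = """\
REMARK 465 MISSING RESIDUES
REMARK 465   M RES C SSSEQI
REMARK 465     MET A     1
REMARK 465     GLY A     2
SEQRES   1 A    5  MET GLY ALA LYS SER
""".splitlines()

missing = parse_remark_465(EXAMPLE)
sequence = parse_seqres(EXAMPLE)
print(missing)
print(len(sequence["A"]) - len(missing), "of", len(sequence["A"]), "residues observed")
```

For real files a dedicated parser is probably the safer choice. The point here is only that SEQRES gives the full sequence while REMARK 465 lists the part of it that has no coordinates.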
Biopython
- API doc and general description
- Very nicely implemented library, but the documentation is far from complete
- PDB: used to read PDB files, especially the ATOM records
- SeqIO: used to read PDB headers and sequences. Used mostly for the latter, hence for bioinformatics applications
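As a rough illustration of the PDB module, the sketch below reads a tiny file with `PDBParser` and lists each residue with its hetero flag, which is how Biopython marks residues that come from HETATM records. The two-residue `TOY_PDB` fragment and the helper names `list_residues` and `demo` are hypothetical. Biopython is a third-party package, so the import is guarded here.

```python
import os
import tempfile

try:
    from Bio.PDB import PDBParser  # third-party: pip install biopython
except ImportError:
    PDBParser = None  # keeps this file importable without Biopython

# Hypothetical two-residue fragment: ALA 3 (ATOM) and MSE 4 (HETATM).
TOY_PDB = (
    "ATOM      1  CA  ALA A   3      11.104   6.134  -6.504  1.00  0.00           C\n"
    "HETATM    2  CA  MSE A   4      12.560   6.200  -6.300  1.00  0.00           C\n"
    "END\n"
)


def list_residues(path):
    """Return (resname, hetero_flag, resseq) for every residue in the file."""
    if PDBParser is None:
        raise RuntimeError("Biopython is required for this example")
    structure = PDBParser(QUIET=True).get_structure("toy", path)
    rows = []
    for residue in structure.get_residues():
        # residue.id is a tuple: (hetero flag, sequence number, insertion code)
        hetero_flag, resseq, _icode = residue.id
        rows.append((residue.get_resname(), hetero_flag, resseq))
    return rows


def demo():
    """Write the toy fragment to a temporary file and parse it."""
    with tempfile.NamedTemporaryFile("w", suffix=".pdb", delete=False) as handle:
        handle.write(TOY_PDB)
    try:
        return list_residues(handle.name)
    finally:
        os.remove(handle.name)


if __name__ == "__main__" and PDBParser is not None:
    for row in demo():
        print(row)
```

With the toy fragment, the ALA residue should come back with a blank hetero flag and the MSE residue with a flag starting with `H_`, which gives a quick way to spot modified residues among the polymer residues.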
MDAnalysis
- Used with MD trajectories for I/O and analysis. A complete library with a steep learning curve
openmm/pdbfixer
Modeller
- Very complete software for a wide range of applications, maybe too many, with API documentation that is not well structured