Analysis of macromolecular complexes data in the PDB

Disclaimer

This repository contains a Jupyter notebook that serves as supplementary material for the following publication:

Publication
Appasamy, S.D., Berrisford, J., Gaborova, R., Nair, S., Anyango, S., Grudinin, S., Deshpande, M., Armstrong, D., Pidruchna, I., Ellaway, J.I.J., Leines, G.D., Gupta, D., Harrus, D., Varadi, M. and Velankar, S. (2023) Annotating macromolecular complexes in the Protein Data Bank: improving the FAIRness of structure data. Scientific Data, 10, 853. https://doi.org/10.1038/s41597-023-02778-9

The code in this notebook reproduces the analysis presented in the publication.

Update (October 2025):
The data CSV files in the data/ directory have been updated to align with the analyses prepared for the PDBe-KB Complexes draft manuscript.
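Because the CSV files may differ from those used in the 2023 publication, it can help to check what is actually present before running the notebook. The following is a minimal sketch, not part of the notebook itself: it assumes the files sit directly under data/, are UTF-8 encoded, and have a header row. The function name `summarize_csv_files` is hypothetical and introduced only for illustration.

```python
import csv
from pathlib import Path


def summarize_csv_files(data_dir="data"):
    """Return {file name: (column names, number of data rows)} for each CSV in data_dir.

    Hypothetical helper: assumes each file's first row is a header.
    """
    summary = {}
    root = Path(data_dir)
    if not root.is_dir():
        return summary  # nothing to report if the directory is absent
    for path in sorted(root.glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])  # first row is assumed to be the header
            row_count = sum(1 for _ in reader)  # remaining rows are data rows
        summary[path.name] = (header, row_count)
    return summary


if __name__ == "__main__":
    # Run from the repository root so that "data" resolves correctly.
    for name, (columns, rows) in summarize_csv_files().items():
        print(f"{name}: {rows} rows, columns = {columns}")
```

Run from the repository root, this prints one line per CSV file with its row count and column names, which gives a quick way to see whether the columns match what the notebook expects.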

Background

Macromolecular complexes are crucial functional units in virtually all cellular processes. Understanding them at the atomic level is vital for elucidating molecular mechanisms and informs applications such as the development of new therapeutics. The Protein Data Bank (PDB) is the central repository for experimentally determined macromolecular structures. However, finding all instances that represent the same assembly in the PDB can be challenging, because current PDB annotation practices do not include the annotation of assemblies. This study highlights the importance of annotating macromolecular complexes and the need for more robust methods to identify and classify these complexes across the PDB. We propose a new approach that uses external resources such as the Complex Portal and Gene Ontology to describe assemblies accurately and place them in their biological context. We anticipate that this approach to identifying and classifying complexes will improve the usability and utility of the PDB for researchers in structural biology and drug discovery.

Repository contents:

- data/
- LICENSE
- README.md
- assemblies_analysis.ipynb
- requirements.txt
