Our knowledge of proteins is pivotal to the advance of medical science and our understanding of human biology.
Proteins are complex molecules that carry out most of the work in cells and are essential for the structure, function and regulation of the body’s tissues and organs.

The chemical properties of proteins are determined by their extremely complex 3D structures, and until recently the structures of only a very small fraction of proteins were known.
Predicting these 3D shapes requires intensive processing power, but last year the artificial intelligence company DeepMind made a huge splash in the scientific community when it shared the results of its AI system AlphaFold.
The system had predicted the 3D structures of around 350,000 proteins from their one-dimensional amino acid sequences.
Now, DeepMind has gone far beyond that achievement, announcing that it has predicted the structures of nearly 200 million individual proteins.
This represents almost all known proteins or, as DeepMind founder and CEO Demis Hassabis put it at a press conference, the structures for “the whole protein universe”.
Even with last year’s addition of the 350,000 structures predicted by AlphaFold, only around a million protein structures were previously known.
The new protein database is being shared with other researchers
DeepMind has collaborated with the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI), an intergovernmental organisation that specialises in bioinformatics.
The protein structure database they created is being freely shared with researchers working in various fields of medicine and biology.
AlphaFold protein models have already been used for a range of purposes, including developing antibodies for malaria and even creating special enzymes to break down plastics.
More than half a million researchers from nearly 200 countries have accessed the database since it was set up, producing more than 1,000 scientific papers in the process.
Other areas of research enabled by the protein structures held in the database have included the health of bees, new understanding of the process of ice formation, and lesser-known diseases such as leishmaniasis and Chagas disease.
Sameer Velankar, who heads up EMBL-EBI’s protein data bank team in Europe, said that this research illustrated the impact of access to just a million protein structures, with 200 million offering even more potential.
The protein structures predicted by the AI come from a wide range of organisms, from bacteria to plants to vertebrates, including mice, fish and humans.
Today’s news was brought to you by TD SYNNEX – the UK’s number one solutions distributor.