Google's Deepmind is getting better at protein folding – could revolutionize biology

The open database Protein Data Bank today contains the three-dimensional structures of about 180,000 proteins from a wide range of organisms, from humans and fruit flies to fungi and bacteria.

Google’s subsidiary Deepmind is expanding its database with predicted structures for an additional 350,000 proteins. These structures were predicted by the company’s machine learning algorithm Alphafold 2.

The algorithm has been trained on existing protein structures and is an improved version of Alphafold, which was released in 2018. What now has biologists around the world excited is that the new algorithm is much more accurate, reports The Verge. According to tests by Deepmind’s researchers, the models match reality in almost 90 percent of cases. The new algorithm is also 16 times faster.

Several researchers The Verge has spoken to say that the progress will have a major impact on biological research in the future. Marcelo C. Sousa at the University of Colorado, for example, says that his research team had spent ten years on a protein without succeeding in determining its structure. Alphafold did it in a quarter of an hour.

The protein structures that Alphafold 2 predicts must normally be confirmed experimentally before they can be used, for example, in the development of new drugs, but confirming a predicted structure is far less time-consuming and expensive than determining one from scratch.