Accelerating Structure Prediction of Protein Monomers and Multimers by up to 11 Times! An Open-Source Solution from Colossal-AI and BioMap
xTrimo Multimer, the latest solution from the Colossal-AI team (https://github.com/hpcaitech/ColossalAI) and BioMap for protein monomer and multimer structure prediction, has recently been open-sourced. It can predict both monomer and multimer structures while accelerating the process by up to 11 times!
The engine behind it is Colossal-AI, a powerful deep learning system that aims to make large AI model training easy and accessible to the community and industry. By integrating the large model training techniques and optimizations provided by Colossal-AI, we can significantly reduce the time and cost of both protein monomer and multimer structure prediction during model training and inference. As an important application of the Colossal-AI system in the pharmaceutical industry, xTrimo Multimer can greatly speed up model design and development for protein structure prediction, facilitating breakthroughs in large AI model applications in healthcare and bioinformatics.
Learn more about our powerful solutions here: https://github.com/hpcaitech/ColossalAI/#xTrimoMultimer
Colossal-AI is a user-friendly deep learning system that enables companies to maximize the efficiency of AI deployments while drastically reducing costs. Since being open-sourced, Colossal-AI has reached No. 1 in trending projects on GitHub and Papers With Code several times, alongside projects with as many as 10K stars. Colossal-AI also continues to make AI solutions more available to industry and is already showing tremendous potential across a variety of fields, including medicine, autonomous vehicles, cloud computing, retail, and chip production. Its most recent application is the partnership with BioMap to deliver a cost-effective solution for protein monomer and multimer structure prediction. This application can help healthcare providers and pharmaceutical companies with diagnosis and stimulate novel drug research and discovery.
Protein structure prediction is one of the most important topics in structural biology and supplements the understanding of gene translation and protein function. Unfortunately, the multi-level structure and sophisticated protein interactions make it challenging to predict the 3D structure accurately.
In recent years, the success of deep neural networks has transformed many fields. Since DeepMind’s release of AlphaFold (which accurately predicts protein structure based on amino acid sequences), biology has witnessed a boom in the use of AI for protein structure prediction.
Specifically, AlphaFold can generate end-to-end 3D structure predictions of protein monomers directly from amino acid sequences. Its use also extends beyond monomers: since the majority of proteins function as multimers, DeepMind recently released the AlphaFold-Multimer model to predict the structure of multimers.
To boost the development of AlphaFold, the Colossal-AI team released FastFold, an open-source and optimized implementation of AlphaFold, in the past few months. With FastFold, the team reduced AlphaFold’s training time from 11 days to only 67 hours and accelerated inference by up to 11.6 times. The Colossal-AI team continues its efforts to democratize large-scale AI model applications in the pharmaceutical field.
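As a quick sanity check on the training figures above, the wall-clock numbers can be converted into a speedup factor. This is only arithmetic on the two durations quoted in the text; the variable names are illustrative, and the hardware used for each run is not stated here, so the ratio should be read as a wall-clock comparison rather than a per-GPU efficiency figure.

```python
# Wall-clock training times quoted in the text.
HOURS_PER_DAY = 24
baseline_hours = 11 * HOURS_PER_DAY   # AlphaFold: 11 days
fastfold_hours = 67                   # FastFold: 67 hours

# Speedup is the ratio of the old duration to the new one.
speedup = baseline_hours / fastfold_hours
print(f"{baseline_hours} h -> {fastfold_hours} h is roughly a {speedup:.2f}x reduction")
```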
Interactions between proteins are critical to their biological functions. To address the difficulties of protein monomer and multimer structure prediction, the Colossal-AI team proposed the industry’s latest solution, xTrimo Multimer. xTrimo Multimer better reflects protein interactions, thus enhancing potential target analysis, protein structure prediction and simulation, and high-precision antibody design in drug discovery and development.
The high economic and time costs of AlphaFold’s inference have created challenges for research and development, particularly for long sequences, where computational complexity and memory consumption rise sharply. Based on the computational characteristics of the AlphaFold-Multimer model, the Colossal-AI team introduced CUDA optimizations and kernel fusion techniques into xTrimo Multimer, achieving remarkable inference performance. On a single GPU, xTrimo Multimer achieves inference speedups of 1.58–2.14 times over AlphaFold 2 and 1.14–2.23 times over OpenFold (from Columbia University).
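To give a rough sense of why long sequences are so demanding, the sketch below estimates the size of two tensors that grow with sequence length in an AlphaFold-style model: the pair representation (quadratic in length) and the attention logits of a triangle attention layer if they were materialized all at once (cubic in length). The channel count of 128, the 4 attention heads, and 4-byte floats are assumptions taken from the published AlphaFold 2 configuration, not measurements of xTrimo Multimer, and the function names are hypothetical.

```python
GIB = 1024 ** 3

def pair_repr_bytes(n_res, channels=128, bytes_per_value=4):
    """Size of an n_res x n_res x channels pair representation (assumed shape)."""
    return n_res * n_res * channels * bytes_per_value

def triangle_logits_bytes(n_res, heads=4, bytes_per_value=4):
    """Size of heads x n_res x n_res x n_res attention logits, if never chunked."""
    return heads * n_res ** 3 * bytes_per_value

for n_res in (1000, 2000, 4000):
    pair = pair_repr_bytes(n_res) / GIB
    logits = triangle_logits_bytes(n_res) / GIB
    print(f"L={n_res}: pair repr ~{pair:.2f} GiB, unchunked logits ~{logits:.2f} GiB")
```

Doubling the sequence length multiplies the first quantity by 4 and the second by 8. This is why practical implementations chunk these operations or spread them across devices instead of holding everything on one GPU.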
Additionally, xTrimo Multimer supports distributed inference for long sequences. With Dynamic Axial Parallelism, xTrimo Multimer efficiently distributes computation and part of the GPU memory load across multiple devices, thereby addressing the computational and memory challenges that long sequences pose. On multiple GPUs, with sequence lengths of 2K–3K, xTrimo Multimer achieves an 8.47x speedup over OpenFold and an 11.15x speedup over AlphaFold 2. It also shows a 4.45x acceleration compared to Uni-Fold 2.0. Furthermore, xTrimo Multimer supports inference on sequences of up to 4K in length, which OpenFold and AlphaFold 2 cannot handle due to GPU memory limits. With xTrimo Multimer, scientists can run inference on a 4K-length sequence in about 20 minutes.
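The general idea behind axial sharding can be illustrated with a toy simulation: split a 2D activation by rows across devices, run a row-wise operation locally, exchange blocks (an all-to-all) so each device holds a slice of columns, then run a column-wise operation locally. The sketch below is not xTrimo Multimer's implementation; devices are simulated with plain Python lists, normalization stands in for row-wise and column-wise attention, all function names are hypothetical, and the matrix size is assumed to be divisible by the number of devices.

```python
import math

def split(items, parts):
    """Split a list into contiguous chunks (assumes len(items) % parts == 0)."""
    size = len(items) // parts
    return [items[i:i + size] for i in range(0, len(items), size)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def row_normalize(rows):
    """Stand-in for a row-wise op: each row only needs its own values."""
    out = []
    for row in rows:
        total = sum(row)
        out.append([x / total for x in row])
    return out

def col_normalize(m):
    """Stand-in for a column-wise op: each column only needs its own values."""
    return transpose(row_normalize(transpose(m)))

def all_to_all(row_shards):
    """Simulated all-to-all: device i sends column block j of its rows to device j."""
    parts = len(row_shards)
    col_shards = []
    for j in range(parts):
        received = []
        for shard in row_shards:
            received.extend(split(row, parts)[j] for row in shard)
        col_shards.append(received)  # all rows, but only column block j
    return col_shards

def sharded_row_then_col(m, parts):
    row_shards = [row_normalize(s) for s in split(m, parts)]    # local row-wise op
    col_shards = [col_normalize(s) for s in all_to_all(row_shards)]  # local column-wise op
    # Gather: stitch the column blocks of every row back together.
    return [sum((shard[r] for shard in col_shards), []) for r in range(len(m))]

n, devices = 6, 3
matrix = [[float(i * n + j + 1) for j in range(n)] for i in range(n)]
reference = col_normalize(row_normalize(matrix))
sharded = sharded_row_then_col(matrix, devices)
matches = all(math.isclose(a, b) for ra, rb in zip(reference, sharded) for a, b in zip(ra, rb))
print("sharded result matches single-device result:", matches)
```

In this scheme each simulated device only ever holds 1/3 of the matrix, and the only communication is the block exchange between the two steps. That property is what makes axial sharding attractive when a single GPU cannot hold the activations for a long sequence.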
“The collaboration with HPC-AI Tech brings together the cutting-edge technology in large AI model training from Colossal-AI team and the biocomputing domain expertise from BioMap. The release of xTrimo Multimer is an important step towards integrating the advantages of large AI model training and inference into the construction of BioMap’s xTrimo multimodal system. ” — Le Song, Chief AI Scientist of BioMap
“Our latest solution of protein Monomer and Multimer structure prediction is an important progress of Colossal-AI to solve the industrial problems in the real world. In the future, we will continue to cooperate with BioMap more deeply in biocomputing large models, to stimulate the application and implementation of deep learning in innovative drug development. ” — Yang You, the Chairman of HPC-AI Tech
xTrimo Multimer will serve as one of the key products, alongside other industrial solutions built upon Colossal-AI, for facilitating large-scale AI modeling at companies worldwide. The Colossal-AI team will continue to explore emerging possibilities for AI model training in various fields, work to tackle problems in modern industry, and empower the future of the global AI market.
Open Source xTrimoMultimer: https://github.com/hpcaitech/ColossalAI/#xTrimoMultimer
Open Source Colossal-AI: https://github.com/hpcaitech/ColossalAI
About BioMap
BioMap is a team of world-renowned scientists with extensive expertise in disease biology, bioinformatics, machine learning/deep learning, and antibody engineering. BioMap was co-founded by Baidu’s Founder/CEO Robin Li and the former CEO of Baidu Ventures, Wei Liu. The company is committed to bringing first-in-class medicines to unmet medical needs in the areas of immuno-oncology, autoimmune diseases, fibrosis, and aging-related diseases.
About HPC-AI Tech
HPC-AI Tech is a global company which aims to help users improve the efficiency of training and deploying large AI models. The company was founded by Dr. Yang You, who received his Ph.D. in Computer Science from UC Berkeley and is currently the Presidential Young Professor at the National University of Singapore. HPC-AI Tech has developed an efficient large AI model training and inference system, Colossal-AI, that integrates advanced technologies which help users efficiently deploy large AI model training and inference at a low cost.