
Benchmarks


Colossal-AI's performance benchmarks demonstrate that our software is the fastest and most cost-efficient solution for your deep learning infrastructure needs.

Top performance results in a nutshell

  • 10x faster training time
  • 50% inference acceleration
  • 14x larger batch sizes
  • 11x lower GPU memory consumption
  • 24x larger model size on the same hardware
  • 50% longer sequence length
  • 50% saved GPU resources
  • 45% fine-tuning speedup
  • 1 GPU suffices for model development

How Colossal-AI compares with other frameworks:

  • PyTorch (a machine learning framework for Python): 120x larger model sizes
  • Microsoft DeepSpeed (a deep learning optimization library): 3x higher throughput
  • NVIDIA NeMo Megatron (a framework to build and deploy LLMs): 5x faster training

ViT model

The Vision Transformer, or ViT, is a model for image classification that employs a Transformer-like architecture over patches of the image. ViTs are being adopted in a wide range of computer vision tasks, from image classification to object detection and segmentation.

Colossal-AI vs. Megatron

  • Achieve 14x larger batch sizes with Colossal-AI
  • and 5x faster training for ViT
[Figure: Scaling ViT with GPU RAM & Throughput]

GPT-3 model

Trained on text from the internet, GPT-3 generates realistic human-like text. It has been used to create articles, poetry, stories, news reports, and dialogue from just a small amount of input text, producing large amounts of quality copy.

Colossal-AI vs. Megatron

  • You can save 50% of your GPU resources with Colossal-AI
  • and achieve a 10.7% acceleration
[Figures: Performance on GPT-3; Colossal-AI for GPT-3]

GPT-2 model

GPT-2 is an unsupervised, transformer-based deep learning language model created by OpenAI in February 2019 for the single purpose of predicting the next word(s) in a sentence. GPT-2 stands for “Generative Pre-trained Transformer 2”.

Colossal-AI vs. Megatron

  • You benefit from 11x lower GPU memory consumption with Colossal-AI
  • and superlinear scaling efficiency thanks to its tensor parallelism (see the conceptual sketch after the figures below)
[Figures: Scaling GPT-2 with TFLOPs; Scaling GPT-2 with GPU RAM]
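As a rough illustration of the idea behind tensor parallelism (not Colossal-AI's actual API), the sketch below splits a linear layer's weight matrix column-wise across two shards and checks that concatenating the partial outputs reproduces the full layer's result. In a real setup each shard would live on a different GPU.

```python
import torch

torch.manual_seed(0)

batch, d_in, d_out = 4, 8, 6
x = torch.randn(batch, d_in)
weight = torch.randn(d_in, d_out)   # full weight of a linear layer (bias omitted for brevity)

# Column-parallel split: each "device" owns half of the output features.
w_shard_0, w_shard_1 = weight.chunk(2, dim=1)

# Each shard computes its partial output independently (would run on its own GPU).
y_shard_0 = x @ w_shard_0
y_shard_1 = x @ w_shard_1

# Gathering (concatenating) the partial outputs recovers the full result.
y_parallel = torch.cat([y_shard_0, y_shard_1], dim=1)
y_full = x @ weight

assert torch.allclose(y_parallel, y_full, atol=1e-6)
print("column-parallel output matches the full linear layer")
```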

Colossal-AI vs. PyTorch

  • Scale to a 24x larger model size on the same hardware
[Figure: Scaling GPT-2 with Model Size]

Colossal-AI vs. DeepSpeed

  • Benefit from a 3x speedup on the same computing devices
[Figure: Scaling GPT-2 with Throughput]

BERT model

BERT is an open source machine learning framework for natural language processing (NLP). BERT is designed to help computers understand the meaning of ambiguous language in text by using surrounding text to establish context.

Colossal-AI vs. Megatron

  • Colossal-AI propels you to 2x faster training
  • or enables you to run 50% longer sequences
[Figures: Scaling BERT with Max Batch Size; Pipeline Parallelism; Sequence Length]
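For readers unfamiliar with the pipeline-parallelism figure above, here is a minimal single-process sketch of the underlying idea (not Colossal-AI's implementation): the model is split into two stages and the batch into micro-batches, and the concatenated stage-2 outputs match a plain forward pass. On real hardware the two stages sit on different GPUs and process different micro-batches concurrently.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A toy model split into two pipeline stages.
stage_1 = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
stage_2 = nn.Sequential(nn.Linear(32, 8))
full_model = nn.Sequential(stage_1, stage_2)

batch = torch.randn(8, 16)
micro_batches = batch.chunk(4, dim=0)   # 4 micro-batches of 2 samples each

# Pipeline schedule (sequential here; with 2 GPUs, stage_1 works on
# micro-batch i+1 while stage_2 works on micro-batch i).
outputs = []
with torch.no_grad():
    for mb in micro_batches:
        hidden = stage_1(mb)             # would run on GPU 0
        outputs.append(stage_2(hidden))  # would run on GPU 1

y_pipeline = torch.cat(outputs, dim=0)
with torch.no_grad():
    y_full = full_model(batch)

assert torch.allclose(y_pipeline, y_full, atol=1e-6)
print("micro-batched pipeline output matches a plain forward pass")
```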

GPT-3 model (inference)

Colossal-AI vs. NVIDIA FasterTransformer

  • Colossal-AI enables you to achieve 50% inference acceleration on the same hardware infrastructure
[Figures: GPT-3 Inference, Padding = 128, TP = 2; TP = 4]

OPT model

Meta AI has introduced OPT (Open Pre-trained Transformers), a family of large language models with billions of parameters. It can be used to generate creative text, solve simple math problems, and answer reading comprehension questions.

Colossal-AI vs. DeepSpeed

  • With Colossal-AI, a 45% speedup when fine-tuning OPT is possible
  • at low cost, with only a few lines of code changed (a baseline fine-tuning sketch follows the figures below)
[Figures: Performance on OPT with 1 GPU; with 8 GPUs]
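To make the fine-tuning claim concrete, below is a plain PyTorch + Hugging Face baseline for fine-tuning a small OPT checkpoint (facebook/opt-125m, chosen here only for illustration; the toy texts are made up and the `transformers` package is assumed to be installed). This is the kind of loop the benchmark accelerates; it is not Colossal-AI's own training script, which the project distributes separately.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"  # smallest OPT checkpoint, fits on a single GPU
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.train()

# Toy fine-tuning corpus (placeholder sentences).
texts = [
    "Colossal-AI scales large models across many GPUs.",
    "OPT is an open pre-trained transformer language model.",
]
batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=64)
batch = {k: v.to(device) for k, v in batch.items()}

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

for step in range(3):  # a few illustrative steps
    # For causal language modeling, the inputs double as the labels
    # (a real script would set padded label positions to -100 to exclude them from the loss).
    outputs = model(**batch, labels=batch["input_ids"])
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"step {step}: loss = {outputs.loss.item():.4f}")
```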

Don't wait, accelerate!

Speed up and scale deep learning with Colossal-AI.

Try open source