Recently, Microsoft announced the new advancements in the popular deep learning optimisation library known as DeepSpeed. This library is an important part of Microsoft’s new AI at Scale initiative to enable next-generation AI capabilities at scale.
DeepSpeed, the open-source deep learning training optimisation library was unveiled in February this year along with ZeRO (Zero Redundancy Optimiser), which is a memory optimisation technology in the library that assists large model training by improving scale, speed, cost, and usability.
The researchers at the tech giant developed this library in order to make distributed training easy, efficient, and effective. DeepSpeed now can train a language model with one trillion parameters using as few as 800 NVIDIA V100 GPUs.
DeepSpeed has combined three powerful technologies to enable training of trillion-scale models and to scale to thousands of GPUs, which include data-parallel training, model parallel training, and pipeline parallel training.
There are several intuitive features of this deep learning library. Some of them are mentioned below:
Training a large and advanced deep learning model is complex as well as challenging. It includes a number of conditions, such as model design, setting up state-of-the-art training techniques including distributed training, mixed precision, gradient accumulation, among others.
Even after putting in a lot of effort, there is no certainty that the system will perform up to the expectation or achieve the desired convergence rate. This is because large models easily run out of memory with pure data parallelism, and it is hard to utilise model parallelism in such cases. This is where DeepSpeed comes into the picture. The library not only addresses these drawbacks but also accelerates model development and training.
According to this blog post, the DeepSpeed library specifically added four new system technologies that offer extreme compute, memory, communication efficiency and powers model training with billions to trillions of parameters. The blog post mentioned that it can compute long input sequences and power on hardware systems with a single GPU, low-end clusters with very slow ethernet networks, and more.
Researchers at the tech giant have been continuing to innovate at a fast rate, pushing the boundaries of speed and scale for deep learning training. The library has enabled researchers to create the Turing Natural Language Generation (Turing-NLG), also known as one of the largest language models with 17 billion parameters and state-of-the-art accuracy.
A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box. Contact: [email protected]
Deep learning, Microsoft Corporation, Artificial intelligence, Graphics processing unit
World news – CA – Microsoft Releases Latest Version Of DeepSpeed, Its Python Library For Deep Learning Optimisation