ServiceNow open sources Fast-LLM in a bid to help enterprises train AI models 20% quicker
Training a large language model (LLM) is among the most costly and time-consuming exercises for enterprises. A new open-source framework being released today by ServiceNow could make a big difference, with the promise of training 20% faster, saving enterprises time and money.
The Fast-LLM technology has already been in development inside the company, helping ServiceNow to accelerate its own LLM training efforts. Fast-LLM helped train ServiceNow’s StarCoder 2 LLM, which the company released earlier this year. StarCoder itself is also an open source effort, one that benefits from the contributions of Hugging Face, Nvidia and others. ServiceNow also uses Fast-LLM for large, trillion-token continuous pre-training from existing models, as well as for fine-tuning jobs.
Because it is an open source technology, anyone can use Fast-LLM to help accelerate AI training, including fine-tuning operations. The intent is that it can be a drop-in replacement for an existing AI training pipeline with minimal configuration changes. The new open source project aims to differentiate against commonly used AI training frameworks, including the open-source PyTorch, with a series of innovations for data parallelism and memory management.
“When you’re dealing with compute clusters that cost hundreds of millions and training runs that cost millions of dollars, 20% can be a huge saving in terms of both dollars and time and the overall CO2 footprint,” Nicolas Chapados, VP of research at ServiceNow, told VentureBeat.
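To put that figure in perspective, here is a rough back-of-the-envelope sketch. The $5 million baseline is a hypothetical number chosen for illustration, not one ServiceNow has cited, and the article does not say whether "20% faster" means 20% less wall-clock time or 20% more throughput, so the sketch computes both readings.

```python
def cost_after_speedup(baseline_cost, speedup, interpretation="less_time"):
    """Estimate the cost of a training run after a fractional speedup.

    Assumes cost scales linearly with wall-clock time on a fixed cluster.
    """
    if interpretation == "less_time":
        # The run takes (1 - speedup) of the original time.
        return baseline_cost * (1 - speedup)
    if interpretation == "more_throughput":
        # The cluster processes (1 + speedup) times as many tokens per hour.
        return baseline_cost / (1 + speedup)
    raise ValueError(f"unknown interpretation: {interpretation}")


baseline = 5_000_000  # hypothetical cost of one training run, in dollars
for reading in ("less_time", "more_throughput"):
    new_cost = cost_after_speedup(baseline, 0.20, reading)
    print(f"{reading}: ${new_cost:,.0f} (saves ${baseline - new_cost:,.0f})")
```

Under either reading, the saving on a run of that hypothetical size is in the high hundreds of thousands of dollars or more.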
The innovations that enable Fast-LLM to accelerate AI training
The AI industry well understands the challenge of training AI more efficiently. VentureBeat Transform 2024 featured a panel that discussed that very issue, detailing options for scaling infrastructure.
The Fast-LLM approach isn’t about scaling infrastructure; it’s about optimizing the efficiency of existing training resources.
“We carefully looked at all the operations needed to train large language models, especially transformer based large language models,” Chapados explained. “We carefully optimize both the way in which the compute is distributed to the individual cores within the GPU, as well as how the memory is being used by the models themselves.”
Fast-LLM’s competitive advantage stems from two primary innovations that help to differentiate it. The first is Fast-LLM’s approach to computation ordering, which defines the order in which computations occur in an AI training run. Chapados explained that Fast-LLM uses a new technique that ServiceNow calls “Breadth-First Pipeline Parallelism.”
“This is the fundamental scientific innovation around the way that compute is scheduled, both inside a single GPU and across multiple GPUs,” said Chapados.
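Chapados did not walk through the schedule itself, so the following is only a toy illustration of what "computation ordering" can mean, not Fast-LLM's actual scheduler. It assumes a training step is split into micro-batches that each pass through several pipeline stages, and it contrasts two ways of ordering that work. The function names are hypothetical.

```python
def depth_first_order(num_micro_batches, num_stages):
    """One micro-batch runs through every stage before the next one starts."""
    return [(mb, stage)
            for mb in range(num_micro_batches)
            for stage in range(num_stages)]


def breadth_first_order(num_micro_batches, num_stages):
    """Every micro-batch clears a stage before any of them moves to the next."""
    return [(mb, stage)
            for stage in range(num_stages)
            for mb in range(num_micro_batches)]


# Each tuple is (micro_batch, stage).
print(depth_first_order(2, 2))
print(breadth_first_order(2, 2))
```

Both orderings perform the same work. What changes is when each piece runs, which in a real multi-GPU system affects how much time devices spend idle and how communication overlaps with computation.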
The second major innovation addresses memory management. In large training operations, memory becomes fragmented, broken into scattered pieces as training progresses. That fragmentation is inefficient: it prevents training clusters from making full use of the memory they have.
“We’ve been very careful in the way that we design Fast-LLM to almost completely eliminate the problem of memory fragmentation when training those large language models,” said Chapados.
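For readers unfamiliar with the problem, the toy model below shows why fragmentation wastes memory. It is a generic illustration with made-up sizes, not a description of how Fast-LLM or any GPU allocator works: a 100-unit pool has half its space free, yet a 40-unit buffer cannot be placed because the free space is split into two separate holes.

```python
def free_gaps(occupied, capacity):
    """Return (start, size) for each free gap in a pool.

    `occupied` is a list of non-overlapping (start, size) allocations.
    """
    gaps, cursor = [], 0
    for start, size in sorted(occupied):
        if start > cursor:
            gaps.append((cursor, start - cursor))
        cursor = start + size
    if cursor < capacity:
        gaps.append((cursor, capacity - cursor))
    return gaps


capacity = 100
# Four 25-unit buffers once filled the pool; the 1st and 3rd have been freed.
occupied = [(25, 25), (75, 25)]

gaps = free_gaps(occupied, capacity)
total_free = sum(size for _, size in gaps)
largest_free = max(size for _, size in gaps)

print(f"total free: {total_free}, largest contiguous block: {largest_free}")
print(f"40-unit buffer fits: {largest_free >= 40}")
```

The gap between total free memory and the largest contiguous block is the inefficiency the article describes.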
How enterprises can use Fast-LLM today to accelerate training
The Fast-LLM framework is designed to be accessible while maintaining enterprise-grade capabilities. It functions as a drop-in replacement for PyTorch environments and integrates with existing distributed training setups.
“For any model developer or any researcher, it’s just a simple configuration file that lets you specify all the architectural details that matter,” said Chapados.
Running training operations faster has multiple benefits and can allow enterprises to experiment more.
“It makes the risk of large training runs smaller,” said Chapados. “It equips users, researchers and model builders with a bit more ambition to train larger runs, because they will not be afraid that it will cost so much anymore.”
Looking forward, the expectation is that as an open source project, Fast-LLM will be able to expand faster, benefiting from external contributions. ServiceNow has already been successful with that approach with StarCoder.
“Our goal is really to be very, very transparent and responsive to the community contributions in terms of the use of this framework,” said Chapados. “We’re still getting early feedback about what people like, what they are able to do with it and our goal is really to scale this.”