AI21 Labs juices up gen AI transformers with Jamba

Ever since the groundbreaking research paper "Attention is All You Need" debuted in 2017, the concept of transformers has dominated the generative AI landscape.

Transformers, however, are not the only path forward for generative AI. A new approach from AI21 Labs dubbed "Jamba" looks to go beyond transformers. Jamba combines the Mamba model, which is based on the Structured State Space model (SSM), with a transformer architecture to create an optimized gen AI model. Jamba is an acronym that stands for Joint Attention and Mamba (Jamba) architecture, and it aims to bring the best attributes of SSM and transformers together. Jamba is being released as an open-source model under the Apache 2.0 license.
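
Because the weights are open, the model can be loaded with standard tooling. The sketch below is a minimal, hedged example using Hugging Face Transformers; the repository identifier ai21labs/Jamba-v0.1 is an assumption and should be swapped for whatever checkpoint name AI21 Labs actually publishes.

```python
# Minimal sketch of loading the open-weights release with Hugging Face Transformers.
# "ai21labs/Jamba-v0.1" is an assumed repository id; replace it with the identifier
# AI21 Labs publishes for the Apache 2.0 checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Jamba is a hybrid SSM-transformer model that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```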

To be clear, it is unlikely that Jamba will replace current transformer-based large language models (LLMs) today, but it will likely be a complement in certain areas. According to AI21 Labs, Jamba can outperform traditional transformer-based models on generative reasoning tasks as measured by benchmarks such as HellaSwag. However, it currently does not outperform transformer-based models on other critical benchmarks, such as Massive Multitask Language Understanding (MMLU) for problem-solving.

Jamba isn't just a new Jurassic take from AI21 Labs

AI21 Labs has a particular focus on gen AI for enterprise use cases. The company raised $155 million in August 2023 to support its growing efforts.

The company's enterprise tools include Wordtune, an optimized service to help enterprises generate content that matches an organization's tone and brand. AI21 Labs told VentureBeat in 2023 that it often competes against, and directly wins against, gen AI giant OpenAI for enterprise business.

To date, AI21 Labs' LLM technology has relied on the transformer architecture, just like every other LLM. Just over a year ago, the company released its Jurassic-2 LLM family, which is part of the AI21 Studio natural language processing (NLP)-as-a-service platform and is also available via APIs for enterprise integrations.

Jamba is not an evolution of Jurassic; it is something quite different as a hybrid SSM and transformer model.

Attention isn't all you need, you also need context

Transformers have dominated the gen AI landscape to date, but they still have some shortcomings. Most notable is the fact that inference generally slows as context windows grow.

As the AI21 Labs researchers note, a transformer's attention mechanism scales with sequence length and slows down throughput, as each token depends on the entire sequence that came before it. This places long-context use cases outside the scope of efficient production.
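
As a rough, back-of-the-envelope illustration of that scaling behavior (the figures below are generic placeholders, not AI21 Labs' numbers), the per-token cost of standard self-attention grows with the length of the context already processed:

```python
# Back-of-the-envelope sketch: in standard self-attention, each newly generated token
# attends over every token already in the context, so per-step work grows linearly with
# context length (and total work for a full sequence grows quadratically).
def attention_ops_per_step(context_len: int, d_model: int = 4096) -> int:
    # rough cost of scoring against the context plus the weighted sum of values
    return 2 * context_len * d_model

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7,} tokens of context -> ~{attention_ops_per_step(n):,} ops per new token")
```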

The other issue highlighted by AI21 Labs is the large memory footprint required for scaling transformers. The transformer memory footprint scales with context length, making it challenging to run long context windows or numerous parallel batches without extensive hardware resources.
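
A simple way to see why this becomes a hardware problem is to estimate the key-value cache a plain transformer holds while decoding. The sketch below uses illustrative layer counts and head sizes, not Jamba's or any specific model's configuration:

```python
# Illustrative KV-cache estimate for a plain transformer decoder: the cache stores one
# key vector and one value vector per token, per layer, so memory grows linearly with
# context length and batch size. The configuration defaults here are placeholders.
def kv_cache_bytes(context_len: int, n_layers: int = 32, n_kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_elem: int = 2, batch_size: int = 1) -> int:
    return 2 * batch_size * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem

for n in (4_096, 32_768, 262_144):
    print(f"{n:>7,} tokens -> ~{kv_cache_bytes(n) / 1e9:.1f} GB of KV cache")
```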

The context and memory resource issues are two concerns that the SSM approach looks to solve.

The Mamba SSM architecture was originally proposed by researchers at Carnegie Mellon and Princeton universities, with a lower memory requirement and a different attention mechanism to handle large context windows. However, the Mamba approach struggles to provide the same output quality as a transformer model. The Jamba hybrid SSM-Transformer approach is an attempt to combine the resource and context optimization of the SSM architecture with the strong output capabilities of a transformer.

AI21 Labs' Jamba model offers a 256K context window and can deliver 3x throughput on long contexts compared to Mixtral 8x7B. AI21 Labs also claims that Jamba is the only model in its size class that fits up to 140K of context on a single GPU.

Of note, just like Mixtral, Jamba uses a Mixture of Experts (MoE) approach. However, Jamba uses MoE as part of its hybrid SSM-Transformer architecture, which allows for an extreme level of optimization. Specifically, Jamba's MoE layers allow it to draw on just 12B of its available 52B parameters at inference, making those 12B active parameters more efficient than a Transformer-only model of equivalent size, according to AI21 Labs.
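
A generic sketch of how MoE routing keeps only a fraction of parameters active per token is shown below; it illustrates the general top-k routing idea, not AI21 Labs' actual implementation or layer sizes.

```python
# Generic Mixture-of-Experts sketch (not AI21 Labs' implementation): a router picks the
# top-k experts for each token, so only that subset of the layer's parameters does work
# at inference time, even though many more parameters exist in total.
import torch
import torch.nn as nn

class TinyMoELayer(nn.Module):
    def __init__(self, d_model: int = 64, n_experts: int = 16, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_experts))
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.router(x)                         # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # each token routed to top-k experts
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in idx[:, slot].unique().tolist():    # run only the selected experts
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

layer = TinyMoELayer()
print(layer(torch.randn(4, 64)).shape)  # only 2 of the 16 experts run for each token
```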

It's still early days for Jamba, and it is not yet part of an enterprise offering from AI21 Labs. The company plans to offer an instruct version on the AI21 Platform as a beta soon.
