Highlights

  • Key takeaways:
    • Stable LM 2 12B is a pair of powerful 12-billion-parameter language models trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch, released with both a base and an instruction-tuned variant. You can now try Stable LM 2 12B live here.
    • Both models are available for testing on Hugging Face (base & chat) and can be used both non-commercially and commercially with a Stability AI Membership (see the loading sketch after this list).
    • This release also includes an update to Stable LM 2 1.6B that improves its conversational skills in all seven of the aforementioned languages and adds tool usage and function calling.
  • Introducing the latest additions to our Stable LM 2 language model series: a 12-billion-parameter base model and an instruction-tuned variant, trained on 2 trillion tokens in seven languages: English, Spanish, German, Italian, French, Portuguese, and Dutch. This medium-sized model balances strong performance, efficiency, memory requirements, and speed, following our established Stable LM 2 1.6B framework as detailed in our previously released technical report. With this release, we’re extending our model range, offering a transparent and powerful tool for developers to innovate in AI language technology. Soon, we plan to introduce a long-context variant of these models, which will be available on Hugging Face upon release.
  • Stable LM 2 12B is designed as an efficient open model tailored for multilingual tasks, with smooth performance on widely available hardware. It can handle a variety of tasks that are typically feasible only for significantly larger models, which often require substantial computational and memory resources, such as large Mixture-of-Experts (MoE) models. Moreover, the instruction-tuned version performs strongly at tool usage and function calling, making it well suited for a variety of uses, including as a central component of retrieval-augmented generation (RAG) systems (a function-calling sketch follows this list).
  • We compare Stable LM 2 12B to other popular strong language models such as Mixtral (MoE, 13B active parameters out of 47B total), Llama 2 (13B & 70B), Qwen 1.5 (14B), Gemma (8.5B), and Mistral (7B). As shown below, the new Stable LM 2 12B offers solid performance when tested on zero-shot and few-shot tasks across the general benchmarks outlined in the Open LLM Leaderboard and (the newly corrected) MT-Bench.
  • With this new release, we extend the Stable LM 2 family of models into the 12B category, providing an open and transparent model that makes no compromise on power and accuracy. We are confident this release will enable developers and businesses to keep building the future while retaining full control over their data. Stable LM 2 12B can be used now for commercial and non-commercial purposes with a Stability AI Membership. Learn more about commercial applications by contacting us here.
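For developers who want to try the checkpoints, here is a minimal loading sketch using the Hugging Face transformers library. The repository IDs are assumptions based on Stability AI's naming convention for the earlier 1.6B release, and the generation settings are illustrative rather than recommended defaults.

```python
# Minimal sketch: loading Stable LM 2 12B with Hugging Face transformers.
# Repo IDs are assumed from Stability AI's naming convention; adjust as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-2-12b"  # assumed; "...-12b-chat" for the instruct variant

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 12B weights near ~24 GB
    device_map="auto",           # place layers on available GPU(s)/CPU automatically
)

prompt = "Multilingual language models are useful because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```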
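And since the instruction-tuned model advertises tool usage, here is a hedged sketch of function calling through the tokenizer's chat template. The tools= hook is a generic feature of recent transformers versions; the exact tool-call format the model emits is defined by its own chat template, and get_weather is a hypothetical function used purely for illustration.

```python
# Hedged sketch: function calling via the chat template of the instruct model.
# The model-specific tool-call format is defined by its chat template; the
# get_weather tool below is hypothetical and exists only for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-2-12b-chat"  # assumed repo ID for the chat variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

def get_weather(city: str) -> str:
    """
    Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    raise NotImplementedError  # placeholder; a real tool would call a weather API

messages = [{"role": "user", "content": "What's the weather in Lisbon right now?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],        # transformers converts the signature to a JSON schema
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
# If the template supports tools, the reply should be a structured call to get_weather.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```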
