Ai2 launches new language models that rival Meta’s Llama

There is a new family of AI models, one of the few that can be reproduced from scratch.

On Tuesday, Ai2, the nonprofit AI research organization founded by the late Paul Allen, released OLMo 2, the second family of models in its OLMo series. (OLMo is short for “open language model.”) While there is no shortage of “open” language models to choose from (see: Meta’s Llama), OLMo 2 meets the Open Source Initiative’s definition of open source AI, which means the tools and data used to develop it are publicly available.

The Open Source Initiative, the long-running institution that aims to define and “steward” all things open source, finalized its definition of open source AI in October. But the first OLMo models, released in February, met the criterion as well.

“OLMo 2 was developed from start to finish with open and accessible training data, open source training code, repeatable training recipes, transparent assessments, intermediate checkpoints, and more,” Ai2 wrote in a blog post. “By openly sharing our data, recipes, and results, we hope to provide the open source community with the resources needed to discover new and innovative approaches.”

There are two models in the OLMo 2 family: one with 7 billion parameters (OLMo 2 7B) and the other with 13 billion parameters (OLMo 2 13B). Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer.

Like most language models, OLMo 2 7B and 13B can perform a range of textual tasks, such as answering questions, summarizing documents, and writing code.

To train the models, Ai2 used a dataset of 5 trillion tokens. Tokens represent pieces of raw data; 1 million tokens equal about 750,000 words. The training set included websites “filtered for high quality,” academic papers, question-and-answer discussion boards, and exercise books “both synthetic and human.”
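For a sense of scale, the rule of thumb above (1 million tokens to about 750,000 words) can be applied to the full training set. The following is a minimal sketch, not anything Ai2 published: the function name `estimate_words` is made up for illustration, and the 0.75 ratio is only an approximation that varies with the tokenizer and the text.

```python
# Rule of thumb from the article: 1 million tokens is about 750,000 words.
# This ratio is an approximation; real values depend on the tokenizer and text.
WORDS_PER_TOKEN = 750_000 / 1_000_000  # 0.75


def estimate_words(num_tokens: int) -> float:
    """Return a rough word count for a given number of tokens (hypothetical helper)."""
    return num_tokens * WORDS_PER_TOKEN


# OLMo 2's reported training set size: 5 trillion tokens.
olmo2_training_tokens = 5_000_000_000_000

print(f"{estimate_words(olmo2_training_tokens):,.0f} words (approx.)")
```

By this estimate, 5 trillion tokens works out to roughly 3.75 trillion words.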

Ai2 claims the resulting models are competitive, in terms of performance, with open models like Meta’s Llama 3.1 release.

Image Credits: Ai2

“Not only did we notice a significant improvement in performance across all tasks compared to the previous OLMo model, but it’s worth noting that the OLMo 2 7B outperforms the Llama 3.1 8B,” Ai2 wrote. “[OLMo 2 represents] the best fully open language model to date.”

The OLMo 2 models and all their components can be downloaded from Ai2’s website. They are released under the Apache 2.0 license, which means they can be used commercially.

There has been some controversy recently about the safety of open models, with Chinese researchers reportedly using Llama models to develop defense tools. When I asked Ai2 engineer Dirk Groeneveld in February whether he was concerned about OLMo being abused, he told me that he believed the benefits ultimately outweigh the harms.

“Yes, it is possible that open models could be used inappropriately or for unintended purposes,” he said. “[However, this] approach also promotes technical progress that leads to more ethical models; is a prerequisite for verifiability and replicability, as this can only be achieved through access to the full stack; and reduces the increasing concentration of power, creating more equitable access.”
