Llemma: An Open Language Model For Mathematics

Oct. 17, 2023, 2 a.m.

EleutherAI Blog (blog.eleuther.ai)

Today we release Llemma: 7-billion- and 34-billion-parameter language models for mathematics. The Llemma models were initialized with Code Llama weights, then trained on the Proof-Pile II, a 55-billion-token dataset of mathematical and scientific documents. The resulting models show improved mathematical capabilities and can be adapted to various tasks through prompting or additional fine-tuning.
