RedPajama, a project to create leading open-source models, starts by reproducing LLaMA training dataset of over 1.2 trillion tokens

April 17, 2023, 5:13 p.m. | Simon Willison's Weblog (simonwillison.net)

Given the number of projects that have used LLaMA as a foundation model since its release two months ago, despite its non-commercial license, it's clear that there is a strong desire for a fully openly licensed alternative.

RedPajama is a collaboration between Together, Ontocord.ai, ETH DS3Lab, Stanford CRFM, Hazy Research, and MILA Québec AI Institute aiming to build exactly …
