RedPajama, a project to create leading open-source models, starts by reproducing the LLaMA training dataset of over 1.2 trillion tokens
Simon Willison's Weblog simonwillison.net
Given the number of projects that have used LLaMA as a foundation model since its release two months ago - despite its non-commercial license - it's clear that there is strong demand for a fully openly licensed alternative.
RedPajama is a collaboration between Together, Ontocord.ai, ETH DS3Lab, Stanford CRFM, Hazy Research, and MILA Québec AI Institute aiming to build exactly …