[N] OpenLLaMA: An Open Reproduction of LLaMA | allainews.com

May 3, 2023, 8:51 a.m. | /u/Philpax

Machine Learning www.reddit.com

https://github.com/openlm-research/open_llama

> We train our models on the RedPajama dataset released by Together, which is a reproduction of the LLaMA training dataset containing over 1.2 trillion tokens. We follow the exactly same preprocessing steps and training hyperparameters as the original LLaMA paper, including model architecture, context length, training steps, learning rate schedule, and optimizer. The only difference between our setting and the original one is the dataset used: OpenLLaMA employs the RedPajama dataset rather than the one utilized by the …
