Nov. 29, 2022, 3:19 p.m. | Together

Source: Together blog (www.together.xyz)

With a new decentralized training algorithm, we fine-tuned GPT-J (6B) on
3.53 billion tokens, resulting in GPT-JT (6B), a model that outperforms
many 100B+ parameter models on classification benchmarks.
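A minimal sketch of how GPT-JT could be used for prompted classification with the Hugging Face transformers library. The Hub repository id "togethercomputer/GPT-JT-6B-v1" and the sentiment-labeling prompt are assumptions for illustration, not details taken from the announcement.

```python
# Sketch: load GPT-JT and ask it to label a sentence via a text prompt.
# The Hub id below is an assumption; check the Hugging Face Hub for the
# checkpoint actually published by Together.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "togethercomputer/GPT-JT-6B-v1"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Classification via prompting: the model completes the "Sentiment:" field.
prompt = "Review: The movie was a delight from start to finish.\nSentiment:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=3, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:]))
```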

