Feb. 13, 2024, 5:45 a.m. | Yu Pan, Ye Yuan, Yichun Yin, Jiaxin Shi, Zenglin Xu, Ming Zhang, Lifeng Shang, Xin Jiang, Qun

cs.LG updates on arXiv.org

The rapid progress of Transformers in artificial intelligence has come at the cost of increased resource consumption and greenhouse gas emissions due to growing model sizes. Prior work suggests using pretrained small models to improve training efficiency, but this approach may not be suitable for new model structures. On the other hand, training from scratch can be slow, and progressively stacking layers often fails to achieve significant acceleration. To address these challenges, we propose a novel method called Apollo, which …
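The abstract contrasts Apollo with the common progressive layer-stacking baseline, in which a shallow model is trained first and its layers are duplicated to warm-start a deeper one. The sketch below illustrates that baseline only, not Apollo itself; it is plain PyTorch, and `TinyTransformer`, `stack_layers`, and all sizes are illustrative assumptions rather than details from the paper:

```python
# Sketch of progressive layer stacking (the baseline the abstract critiques).
# Model sizes and the duplication strategy are illustrative assumptions.
import copy
import torch
import torch.nn as nn

class TinyTransformer(nn.Module):
    """A small Transformer encoder LM used to illustrate layer stacking."""
    def __init__(self, num_layers: int, d_model: int = 64,
                 nhead: int = 4, vocab_size: int = 1000):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        base = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.layers = nn.ModuleList(
            copy.deepcopy(base) for _ in range(num_layers))
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.embed(x)
        for layer in self.layers:
            h = layer(h)
        return self.head(h)

def stack_layers(shallow: TinyTransformer, factor: int = 2) -> TinyTransformer:
    """Warm-start a deeper model by duplicating a trained shallow model.

    Layer j of the deep model is initialized from layer j % k of the
    shallow k-layer model, i.e. the whole shallow block is repeated;
    published variants differ in how the copies are interleaved.
    """
    k = len(shallow.layers)
    deep = TinyTransformer(num_layers=k * factor)
    deep.embed.load_state_dict(shallow.embed.state_dict())
    deep.head.load_state_dict(shallow.head.state_dict())
    for j, layer in enumerate(deep.layers):
        layer.load_state_dict(shallow.layers[j % k].state_dict())
    return deep

# Usage: train a 3-layer model briefly, then grow it to 6 layers and
# continue training from the duplicated weights.
shallow = TinyTransformer(num_layers=3)
# ... train `shallow` for some steps ...
deep = stack_layers(shallow, factor=2)
tokens = torch.randint(0, 1000, (2, 16))  # (batch, seq_len)
print(deep(tokens).shape)                 # torch.Size([2, 16, 1000])
```

In practice the grown model is then trained further from the duplicated weights; the abstract's point is that this duplication alone "often fails to achieve significant acceleration," which is what motivates Apollo.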
