Aug. 10, 2022, 8:23 p.m. | /u/Singularian2501

Machine Learning www.reddit.com

Paper: [https://arxiv.org/abs/2208.03306](https://arxiv.org/abs/2208.03306)

Github: [https://github.com/hadasah/btm](https://github.com/hadasah/btm) Code and Models coming soon!

Abstract:

>We present Branch-Train-Merge (BTM), a communication-efficient algorithm for embarrassingly parallel training of large language models (LLMs). We show it is possible to independently train subparts of a new class of LLMs on different subsets of the data, eliminating the massive multi-node synchronization currently required to train LLMs. BTM learns a set of independent expert LMs (ELMs), each specialized to a different textual domain, such as scientific or legal text. These …

ai expert language language models machinelearning meta meta ai training

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Research Associate (Data Science/Information Engineering/Applied Mathematics/Information Technology)

@ Nanyang Technological University | NTU Main Campus, Singapore

Associate Director of Data Science and Analytics

@ Penn State University | Penn State University Park

Student Worker- Data Scientist

@ TransUnion | Israel - Tel Aviv

Vice President - Customer Segment Analytics Data Science Lead

@ JPMorgan Chase & Co. | Bengaluru, Karnataka, India

Middle/Senior Data Engineer

@ Devexperts | Sofia, Bulgaria