Feb. 7, 2024, 5:44 a.m. | Da Yu, Sivakanth Gopi, Janardhan Kulkarni, Zinan Lin, Saurabh Naik, Tomasz Lukasz Religa, Jian Yin, H

cs.LG updates on arXiv.org

Suppose we want to train text prediction models in email clients or word processors. These models, which serve billions of predictions per hour, must preserve the privacy of user data and adhere to specific model size constraints to meet memory and inference-time requirements and to reduce inference cost. Building small, fast, and private domain-specific language models is a thriving area of research. In this work, we show that a careful pre-training on a subset of the public dataset that …
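The truncated abstract outlines a two-stage recipe: pre-train on a carefully chosen subset of public data, then fine-tune privately on user data. Below is a minimal sketch of that shape, assuming a toy next-token model, a hypothetical `domain_score` selection heuristic, and Opacus DP-SGD for the private stage; these specifics are illustrative and not taken from the paper.

```python
# Sketch only: select a domain-relevant public subset, pre-train on it,
# then fine-tune with DP-SGD (Opacus). Data and scoring are toy stand-ins.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

VOCAB, DIM, SEQ = 1000, 64, 16

def domain_score(tokens, domain_vocab):
    # Hypothetical relevance score: fraction of tokens that fall in the
    # private-domain vocabulary (a stand-in for the paper's selection step).
    return torch.isin(tokens, domain_vocab).float().mean(dim=1)

# Toy "public" corpus and a toy private-domain vocabulary.
public = torch.randint(0, VOCAB, (512, SEQ))
domain_vocab = torch.arange(0, 200)

# 1) Keep the public examples most similar to the private domain.
keep = domain_score(public, domain_vocab).topk(k=128).indices
subset = public[keep]

model = nn.Sequential(nn.Embedding(VOCAB, DIM), nn.Flatten(),
                      nn.Linear(DIM * SEQ, VOCAB))
loss_fn = nn.CrossEntropyLoss()

# 2) Non-private pre-training on the selected public subset
#    (targets are faked here for brevity).
opt = torch.optim.SGD(model.parameters(), lr=0.1)
pre_loader = DataLoader(TensorDataset(subset, torch.randint(0, VOCAB, (128,))),
                        batch_size=32)
for x, y in pre_loader:
    opt.zero_grad(); loss_fn(model(x), y).backward(); opt.step()

# 3) Private fine-tuning with DP-SGD on (toy) private data.
private = TensorDataset(torch.randint(0, VOCAB, (256, SEQ)),
                        torch.randint(0, VOCAB, (256,)))
ft_loader = DataLoader(private, batch_size=32)
opt = torch.optim.SGD(model.parameters(), lr=0.05)
model, opt, ft_loader = PrivacyEngine().make_private(
    module=model, optimizer=opt, data_loader=ft_loader,
    noise_multiplier=1.0, max_grad_norm=1.0)
for x, y in ft_loader:
    if len(x) == 0:          # Poisson sampling can yield empty batches
        continue
    opt.zero_grad(); loss_fn(model(x), y).backward(); opt.step()
```

The point of the sketch is the ordering: the model size and privacy budget are spent only in step 3, while step 1 decides which public data is worth pre-training on under those constraints.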

