Distilling step-by-step: Outperforming larger language models with less training data and smaller model sizes
Google AI Blog ai.googleblog.com
Large language models (LLMs) have enabled a new data-efficient learning paradigm in which they can solve new, unseen tasks via zero-shot or few-shot prompting. However, LLMs are challenging to deploy in real-world applications due to their sheer size. For instance, serving a single 175-billion-parameter LLM requires at least 350GB of GPU memory using specialized infrastructure, not to mention that today's state-of-the-art …
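The 350GB figure follows from simple arithmetic: 175 billion parameters stored at 2 bytes each (fp16/bf16) for the weights alone, ignoring activations and KV-cache overhead. A minimal back-of-the-envelope sketch (function name is illustrative, not from the post):

```python
def serving_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Minimum GPU memory (GB) needed just to hold the model weights,
    assuming 16-bit (2-byte) parameters and no activation/cache overhead."""
    return num_params * bytes_per_param / 1e9

# A 175B-parameter model at fp16:
print(serving_memory_gb(175e9))  # 350.0
```

Real deployments need more than this floor, since attention caches, activations, and serving overhead add to the weight footprint.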