May 8, 2023, 12:43 p.m. | MLOps.community

// Abstract
In this panel discussion, the panelists examine the cost of running large language models (LLMs) in production and potential ways to reduce it. They discuss the benefits of bringing LLMs in-house, such as latency optimization and greater control, and explore optimization methods such as structured pruning and knowledge distillation. OctoML's platform is mentioned as a tool for automatically deploying custom models and selecting the most appropriate hardware for them. Overall, the discussion provides …
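To make the two optimization methods named above concrete, here is a minimal NumPy sketch (not from the panel; the function names, temperature value, and keep fraction are illustrative assumptions): knowledge distillation trains a smaller student against a teacher's temperature-softened output distribution, and structured pruning removes whole rows (neurons) of a weight matrix rather than individual weights.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Knowledge-distillation loss sketch: KL(teacher || student) on
    temperature-softened distributions, scaled by T^2 (the usual convention
    so gradients keep a comparable magnitude across temperatures)."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student's softened predictions
    kl = np.sum(p * (np.log(p) - np.log(q)), axis=-1)
    return float(np.mean(kl) * temperature ** 2)

def prune_rows(weight, keep_fraction=0.5):
    """Structured-pruning sketch: keep only the output rows (neurons)
    with the largest L2 norms, so the layer actually shrinks."""
    norms = np.linalg.norm(weight, axis=1)
    k = max(1, int(round(len(norms) * keep_fraction)))
    keep = np.sort(np.argsort(norms)[-k:])  # indices of surviving rows, in order
    return weight[keep]
```

When student and teacher logits match, the distillation loss is zero; pruning a 4x3 weight matrix at `keep_fraction=0.5` yields a 2x3 matrix. In practice both techniques are applied inside a training framework, but the arithmetic is exactly this.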
