Benchmarking LLM performance with LangChain Auto-Evaluator // Lance Martin // LLMs in Prod Con Part 2
July 31, 2023, 5:54 p.m. | MLOps.community | www.youtube.com
Document question answering is a popular LLM use case. LangChain makes it easy to assemble LLM components (e.g., models and retrievers) into chains that support question answering. But it is not always obvious how to (1) evaluate answer quality and (2) use that evaluation to guide improved QA chain settings (e.g., chunk size, retrieved-document count) or components (e.g., model or retriever choice). We recently released an open-source, hosted app to address these limitations (see blog post here). We have …
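The settings the abstract mentions can be made concrete with a minimal pure-Python sketch of the retrieval step such an evaluator tunes. This is not the LangChain API: the `chunk`/`retrieve` helpers, the word-overlap scoring, and the toy document below are all illustrative assumptions standing in for a real splitter, embedding-based retriever, and corpus.

```python
# Hypothetical sketch of the two knobs a retrieval-QA evaluator sweeps:
# chunk size and retrieved-doc count. Word overlap stands in for a real
# embedding similarity; none of this is the LangChain API.

def chunk(text: str, chunk_size: int) -> list[str]:
    """Split text into chunks of roughly chunk_size words each."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def retrieve(question: str, chunks: list[str], k: int) -> list[str]:
    """Return the k chunks sharing the most words with the question."""
    q = set(question.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

doc = ("LangChain chains combine a retriever and a model. "
       "The retriever fetches relevant chunks. "
       "The model generates an answer from those chunks.")
chunks = chunk(doc, chunk_size=8)
top = retrieve("What does the retriever fetch?", chunks, k=1)
print(top[0])
```

An auto-evaluator grades the chain's final answers over a question set, then repeats that grading across values of `chunk_size` and `k` (and across model or retriever choices) to pick the configuration with the best scores.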