Oct. 23, 2023, 9:55 a.m. | /u/Ok_Constant_9886

Machine Learning www.reddit.com

Hey all, I'm building an open-source project that helps ML engineers evaluate LLM applications (it's like unit testing for LLMs). It works great in development, since users can just write a test_file.py the way you normally would with pytest, but as I move on to the next phase I'm thinking about how to bring evaluation to production, especially for metrics such as factual consistency where I need a ground truth. I'm hoping to get some ideas around this. …
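To make the dev-time workflow concrete, here is a minimal sketch of what a pytest-style test file for an LLM app could look like. This is not the project's actual API: `generate_answer` and `factual_consistency` are hypothetical placeholders, and the consistency score is a toy token-overlap stand-in for a real model-based (e.g. NLI) metric. The point is the shape of a reference-free check, which scores the output against the retrieval context rather than a hand-written ground truth, so the same metric could also run over sampled production traffic.

    # test_file.py -- hypothetical pytest-style LLM test (illustrative only)
    import re


    def _tokens(text: str) -> set:
        """Lowercase, punctuation-free token set."""
        return set(re.findall(r"[a-z0-9]+", text.lower()))


    def generate_answer(question: str) -> str:
        """Placeholder for the LLM application under test."""
        return "Paris is the capital of France."


    def factual_consistency(output: str, context: str) -> float:
        """Toy reference-free metric: fraction of output tokens supported by the context.
        In practice this would be an NLI- or QA-based model scoring entailment."""
        out, ctx = _tokens(output), _tokens(context)
        return len(out & ctx) / len(out) if out else 0.0


    def test_capital_question():
        # The retrieval context serves as the reference instead of a ground truth,
        # so no labeled answer is needed -- the same check works on logged traffic.
        context = "France is a country in Europe. Its capital is Paris."
        answer = generate_answer("What is the capital of France?")
        assert factual_consistency(answer, context) >= 0.5

In development this runs with `pytest test_file.py`; in production the same metric could be computed asynchronously over logged (output, context) pairs and aggregated, which sidesteps the need for a per-request ground truth.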
