Oct. 23, 2023, 9:55 a.m. | /u/Ok_Constant_9886

Machine Learning www.reddit.com

Hey all, I'm building an open-source project that helps ML engineers evaluate LLM applications (it's like unit testing for LLMs). It works great in development, since users can just write a test_file.py the way you normally would with pytest, but as I move on to the next phase I'm thinking about how to bring evaluation to production, especially for metrics such as factual consistency where I need a ground truth. I'm hoping to get some ideas around this. …
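To make the dev-time workflow concrete, here is a minimal sketch of what a pytest-style test file for an LLM app could look like. This is not the project's actual API: `generate_answer` and `factual_consistency` are hypothetical placeholders, and the consistency score is a toy token-overlap stand-in for a real model-based (e.g. NLI) metric. The point is the shape of a reference-free check, which scores the output against the retrieval context rather than a hand-written ground truth, so the same metric could also run over sampled production traffic.

    # test_file.py -- hypothetical pytest-style LLM test (illustrative only)
    import re


    def _tokens(text: str) -> set:
        """Lowercase, punctuation-free token set."""
        return set(re.findall(r"[a-z0-9]+", text.lower()))


    def generate_answer(question: str) -> str:
        """Placeholder for the LLM application under test."""
        return "Paris is the capital of France."


    def factual_consistency(output: str, context: str) -> float:
        """Toy reference-free metric: fraction of output tokens supported by the context.
        In practice this would be an NLI- or QA-based model scoring entailment."""
        out, ctx = _tokens(output), _tokens(context)
        return len(out & ctx) / len(out) if out else 0.0


    def test_capital_question():
        # The retrieval context serves as the reference instead of a ground truth,
        # so no labeled answer is needed -- the same check works on logged traffic.
        context = "France is a country in Europe. Its capital is Paris."
        answer = generate_answer("What is the capital of France?")
        assert factual_consistency(answer, context) >= 0.5

In development this runs with `pytest test_file.py`; in production the same metric could be computed asynchronously over logged (output, context) pairs and aggregated, which sidesteps the need for a per-request ground truth.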
