Sept. 28, 2023, 6:34 p.m. | /u/vatsadev

Machine Learning www.reddit.com

I've recently started working with ML and NLP, so I'm sorry if this sounds Naive.

Unlike Llama 2 or other open source, we don't have access to the model weights for GPT-4, Claude or Bard, so Benchmark Evals are being run through either APIs or the chat Interface. So how do we know that the model isn't being Boosted by custom web-searching abilities or RAG? While GPT-4 might have a turnoff option, I'm pretty sure Bard is always online, being …

apis bard benchmark benchmarks claude evals gpt gpt-4 llama llama 2 machinelearning nlp open source through

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Codec Avatars Research Engineer

@ Meta | Pittsburgh, PA