May 22, 2024, 2:27 p.m. | /u/madredditscientist

Machine Learning www.reddit.com

[**Reference: Full blog post**](https://www.kadoa.com/blog/ai-agents-hype-vs-reality)

There has been a lot of hype about the promise of autonomous agent-based LLM workflows. By now, all major LLMs are capable of interacting with external tools and functions, letting the LLM perform sequences of tasks automatically.

But reality is proving more challenging than anticipated.

The [WebArena leaderboard](https://docs.google.com/spreadsheets/d/1M801lEpBbKSNwP-vDBkC_pF7LdyGU1f_ufZb_NWNBZQ/edit#gid=0), which benchmarks LLMs agents against real-world tasks, shows that even the best-performing models have a success rate of only 35.8%.

# Challenges in Practice

After seeing many attempts …

agent agents ai agents autonomous challenges functions hype llm llms machinelearning major practice reality tasks tools workflows

Senior Data Engineer

@ Displate | Warsaw

Associate Director, Technology & Data Lead - Remote

@ Novartis | East Hanover

Product Manager, Generative AI

@ Adobe | San Jose

Associate Director – Data Architect Corporate Functions

@ Novartis | Prague

Principal Data Scientist

@ Salesforce | California - San Francisco

Senior Analyst Data Science

@ Novartis | Hyderabad (Office)