[R] Identifying the Risks of LM Agents with an LM-Emulated Sandbox - University of Toronto 2023 - Benchmark consisting of 36 high-stakes tools and 144 test cases!

Oct. 8, 2023, 11:59 p.m. | /u/Singularian2501

Machine Learning www.reddit.com

Paper: [https://arxiv.org/abs/2309.15817](https://arxiv.org/abs/2309.15817)

Github: [https://github.com/ryoungj/toolemu](https://github.com/ryoungj/toolemu)

Website: [https://toolemu.com/](https://toolemu.com/)

Abstract:

>Recent advances in Language Model (LM) agents and tool use, exemplified by applications like ChatGPT Plugins, enable a rich set of capabilities but also amplify potential risks - such as leaking private data or causing financial losses. Identifying these risks is labor-intensive, necessitating implementing the tools, manually setting up the environment for each test scenario, and finding risky cases. As tools and agents become more complex, the high cost of testing these agents …


