Feb. 6, 2024, 5:53 a.m. | Yi Dong, Ronghui Mu, Gaojie Jin, Yi Qi, Jinwei Hu, Xingyu Zhao, Jie Meng, Wenjie Ruan, Xiaowei Huang

cs.CL updates on arXiv.org

As Large Language Models (LLMs) become more integrated into our daily lives, it is crucial to identify and mitigate their risks, especially when the risks can have profound impacts on human users and societies. Guardrails, which filter the inputs or outputs of LLMs, have emerged as a core safeguarding technology. This position paper takes a deep look at current open-source solutions (Llama Guard, Nvidia NeMo, Guardrails AI), and discusses the challenges and the road towards building more complete solutions. Drawing …
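To make the filtering idea in the abstract concrete, here is a minimal sketch of a guardrail that wraps an LLM call with an input check and an output check. This is not the API of Llama Guard, NVIDIA NeMo, or Guardrails AI; `llm_generate`, `violates_policy`, and `BLOCKED_TERMS` are hypothetical placeholders for a real model call and a real policy classifier.

```python
from typing import Callable

# Hypothetical stand-in for a real policy classifier (e.g. a trained
# moderation model); a keyword list is used only to keep the sketch runnable.
BLOCKED_TERMS = {"build a bomb", "credit card dump"}


def violates_policy(text: str) -> bool:
    """Toy policy check: flag text containing any blocked phrase."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def guarded_generate(prompt: str, llm_generate: Callable[[str], str]) -> str:
    """Wrap an LLM call with an input guardrail and an output guardrail."""
    # Input guardrail: refuse before the model ever sees the prompt.
    if violates_policy(prompt):
        return "Sorry, I can't help with that request."

    response = llm_generate(prompt)

    # Output guardrail: filter the model's response before returning it.
    if violates_policy(response):
        return "The generated response was withheld by the output filter."

    return response


if __name__ == "__main__":
    # A trivial echo "model" just to make the sketch runnable end to end.
    print(guarded_generate("Summarise today's AI news.", lambda p: f"Echo: {p}"))
```

Production guardrail toolkits replace both placeholder checks with learned classifiers or rule engines, but the wrapping pattern around the model call is the same.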

Tags: cs.AI, cs.CL, guardrails, large language models (LLMs)
