Feb. 20, 2024, 5:51 a.m. | Yichen Wang, Shangbin Feng, Abe Bohan Hou, Xiao Pu, Chao Shen, Xiaoming Liu, Yulia Tsvetkov, Tianxing He

cs.CL updates on arXiv.org

arXiv:2402.11638v1 Announce Type: new
Abstract: The widespread use of large language models (LLMs) is increasing the demand for methods that detect machine-generated text to prevent misuse. The goal of our study is to stress test the detectors' robustness to malicious attacks under realistic scenarios. We comprehensively study the robustness of popular machine-generated text detectors under attacks from diverse categories: editing, paraphrasing, prompting, and co-generating. Our attacks assume limited access to the generator LLMs, and we compare the performance of detectors …
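As a concrete illustration of the "editing" attack category mentioned in the abstract, the sketch below perturbs machine-generated text with homoglyph substitutions before it reaches a detector. This is a hypothetical minimal example, not the paper's actual attack implementation; the `HOMOGLYPHS` map and `editing_attack` function are assumptions for illustration only.

```python
import random

# Hypothetical map of Latin letters to visually similar Cyrillic ones.
# Such substitutions leave text human-readable while changing the
# token stream a detector sees.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e",
              "p": "\u0440", "c": "\u0441"}

def editing_attack(text: str, rate: float = 0.2, seed: int = 0) -> str:
    """Replace a fraction of substitutable characters with homoglyphs.

    `rate` controls how aggressive the edit is; a fixed seed keeps the
    perturbation reproducible for evaluation.
    """
    rng = random.Random(seed)
    out = []
    for ch in text:
        if ch in HOMOGLYPHS and rng.random() < rate:
            out.append(HOMOGLYPHS[ch])
        else:
            out.append(ch)
    return "".join(out)
```

A robustness stress test in the spirit of the paper would then compare a detector's scores on `text` versus `editing_attack(text)`; a large score drop under such cheap edits indicates a brittle detector.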

