March 21, 2024, 4:48 a.m. | Yijian Lu, Aiwei Liu, Dianzhi Yu, Jingjing Li, Irwin King

cs.CL updates on arXiv.org arxiv.org

arXiv:2403.13485v1 Announce Type: new
Abstract: Currently, text watermarking algorithms for large language models (LLMs) can embed hidden features into LLM-generated text to facilitate subsequent detection, thus alleviating the problem of LLM misuse. Although current text watermarking algorithms perform well in most high-entropy scenarios, their performance in low-entropy scenarios still needs to be improved. In this work, we propose that the influence of token entropy should be fully considered in the watermark detection process, that is, the …
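
To make the notion of token entropy concrete, here is a minimal sketch that assumes Shannon entropy over the model's next-token distribution. The abstract excerpt does not say which entropy measure the authors use, so the function name `token_entropy` and the example distributions are purely illustrative.

```python
import math


def token_entropy(probs):
    """Shannon entropy, in bits, of a next-token probability distribution.

    Assumption: Shannon entropy stands in for whatever entropy measure
    the paper actually uses; the abstract excerpt does not specify one.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Low-entropy step: the model is nearly certain of the next token,
# so a watermark has little freedom to change the choice.
low = token_entropy([0.97, 0.01, 0.01, 0.01])

# High-entropy step: several tokens are equally plausible,
# so a watermark can bias the choice without hurting the text.
high = token_entropy([0.25, 0.25, 0.25, 0.25])

print(f"low-entropy step:  {low:.3f} bits")
print(f"high-entropy step: {high:.3f} bits")
```

A peaked distribution yields a smaller value than a flat one, which is one way to see why low-entropy text (code, for example) leaves less room for a detectable watermark.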

