Statistics Department Seminar Series: Xianyang Zhang, Professor, Department of Statistics, Texas A&M University.
"Detecting and Segmenting Watermarked Texts from Language Models"
Abstract: The rapid adoption of large language models (LLMs), such as GPT-4 and Claude 3.5, underscores the need to distinguish LLM-generated text from human-written content to mitigate the spread of misinformation, misuse in education, and LLM training data contamination. One promising approach to address this issue is the watermark technique, which embeds subtle statistical signals into LLM-generated text to enable reliable identification. In this work, we enhance watermark detection using adaptive methods that assign higher weights to tokens with smaller next-token probabilities (NTPs), where NTPs quantify the likelihood of a token appearing based on its preceding context. We rigorously analyze the Type I and Type II errors of the proposed method and demonstrate its superior detection power through numerical experiments. Because the true prompts, and thus the true NTPs, are unavailable, we introduce a prompt estimation method that identifies the most likely prompt from an instruction set to estimate NTPs. Furthermore, we develop a statistical framework for segmenting text into watermarked and non-watermarked substrings by framing it as a change point detection problem. Extensive experiments validate the proposed methods, demonstrating their effectiveness in detection, segmentation, and robustness.
https://zhangxiany-tamu.github.io/
Building: | West Hall |
---|---|
Event Type: | Workshop / Seminar |
Tags: | seminar |
Source: | Happening @ Michigan from Department of Statistics, Department of Statistics Seminar Series |