arXiv, 1d ago

LLMs Corrupt Your Documents When You Delegate

Philippe Laban, Tobias Schnabel, Jennifer Neville

Analysis

Viral velocity: low

Implementation gap: YES

Novelty: 6/10

Category: paper

Topics: agents, rag

Opportunity Brief

Create a stress-testing framework that measures how accurately LLMs preserve document content across long, complex delegated workflows. It lets businesses check whether their AI agents actually maintain document integrity.

Suggested repo: delegate-bench

"Is your AI agent silently corrupting your company documents?"

Estimated effort: 20h
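
One possible shape for the stress test described in the brief is sketched below. This is an illustrative assumption, not the paper's method: the agent is modelled as a hypothetical callable taking (document, instruction) and returning the edited document, each workflow step is an instruction paired with its inverse, and integrity is approximated with a plain text-similarity ratio from Python's difflib. The two stub agents stand in for a real LLM call.

```python
import difflib
from typing import Callable, List, Tuple

# Hypothetical agent interface: (document, instruction) -> edited document.
Agent = Callable[[str, str], str]


def similarity(a: str, b: str) -> float:
    """Text similarity in [0, 1]; 1.0 means the two versions are identical."""
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


def stress_test(
    document: str,
    round_trips: List[Tuple[str, str]],
    agent: Agent,
) -> List[float]:
    """Apply each (forward, inverse) instruction pair in sequence.

    After every round trip the document should, ideally, match the
    original again. The returned list holds the similarity to the
    original after each round trip, so drift shows up as a falling score.
    """
    scores = []
    current = document
    for forward, inverse in round_trips:
        current = agent(current, forward)
        current = agent(current, inverse)
        scores.append(similarity(document, current))
    return scores


def faithful_agent(doc: str, instruction: str) -> str:
    """Stub agent that never alters the document."""
    return doc


def lossy_agent(doc: str, instruction: str) -> str:
    """Stub agent that silently drops the last line on every call."""
    lines = doc.splitlines()
    return "\n".join(lines[:-1]) if len(lines) > 1 else doc


if __name__ == "__main__":
    doc = "\n".join(f"line {i}" for i in range(10))
    # Placeholder instruction pairs; a real harness would use task-specific ones.
    trips = [("translate to French", "translate back to English")] * 3
    print("faithful:", stress_test(doc, trips, faithful_agent))
    print("lossy:   ", stress_test(doc, trips, lossy_agent))
```

A character-level ratio is only a rough proxy. A real harness would likely need format-aware or semantic comparison, since some edits change the text legitimately without corrupting it.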