Lin Wang, Junfeng Fang, Dan Zhang, Fei Shen, Xiang Wang, Tat-Seng Chua
Create an agent auditor library that performs latent-space safety checks on long-running trajectories. It separates task execution from security verification, with the aim of catching jailbreak attempts and malicious tool use before they take effect.
Suggested repo: draft-agent
"Decoupled safety auditing for autonomous LLM agents."
Estimated effort: 60h
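
A minimal sketch of the decoupling described above, under loud assumptions: every name here (`LatentAuditor`, `run_with_audit`, `toy_embed`, the `unsafe_direction` vector, the 0.8 threshold) is hypothetical and not taken from the original work. A real auditor would read the agent model's hidden states; `toy_embed` is only a keyword-counting stand-in so the example runs with the standard library alone.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Verdict:
    step_index: int
    score: float
    flagged: bool


@dataclass
class LatentAuditor:
    """Scores each trajectory step without taking part in task execution."""
    # Hypothetical: maps a trajectory step to a latent vector.
    embed: Callable[[str], Vector]
    # Hypothetical: reference direction associated with unsafe behavior.
    unsafe_direction: Vector
    threshold: float = 0.8  # assumed cutoff, not from the source
    history: List[Verdict] = field(default_factory=list)

    def audit_step(self, step: str) -> Verdict:
        score = cosine(self.embed(step), self.unsafe_direction)
        verdict = Verdict(len(self.history), score, score >= self.threshold)
        self.history.append(verdict)  # keep the full audit trail
        return verdict


def run_with_audit(steps: Sequence[str], auditor: LatentAuditor) -> List[str]:
    """Pass steps through in order, stopping before the first flagged one."""
    executed = []
    for step in steps:
        if auditor.audit_step(step).flagged:
            break
        executed.append(step)  # a real agent would execute the step here
    return executed


def toy_embed(text: str) -> Vector:
    # Toy stand-in for hidden states: (benign signal, risky signal).
    risky = sum(text.count(w) for w in ("rm -rf", "exfiltrate"))
    return (1.0, 5.0 * risky)


auditor = LatentAuditor(embed=toy_embed, unsafe_direction=(0.0, 1.0))
steps = ["list files", "read config", "exfiltrate secrets", "write report"]
executed = run_with_audit(steps, auditor)
```

The executor loop only ever sees a boolean verdict, so the scoring logic can be swapped out (for example, for a trained probe over real activations) without touching the task-execution code.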