LLM + Safety

16.0

Build a tool that simulates reader-grounded feedback loops as a replacement for LLM-as-judge benchmarks of disinformation risk. This requires a representative dataset of how users perceive such content.

emerging, implementation gap

discussion, alignment, llm, safety

Signals (2)

OpenAI, 22h ago

Responsible and safe use of AI

arXiv, 18h ago

Beyond Surface Judgments: Human-Grounded Risk Evaluation of LLM-Generated Disinformation