Shubin Kim, Yejin Son, Junyeong Park, Keummin Ka, Seungbeen Lee, Jaeyoung Lee, Hyeju Jang, Alice Oh, Youngjae Yu
View original ↗Develop an automated evaluation suite that detects counterfactual unfairness in LLMs using humor as a probe. This can be a critical tool for safety-focused developers.
Suggested repo: fair-humor
"Test model bias using the ultimate litmus test: humor."
Estimated effort: 35h