Ankit Maloo
Release an open-source evaluation suite for 'problem recognition.' It helps organizations determine whether their internal LLM apps are actually solving problems or just blindly executing tasks.
Suggested repo: kw-eval
"Does your AI just solve tasks, or does it understand the problem?"
Estimated effort: 40h