/u/hauhau901
Developers should build an automated evaluation pipeline to stress-test these 'aggressive' uncensored multimodal models against their safety-filtered counterparts. The tool would quantify how much reasoning capability degrades when the safety guardrails are stripped from lightweight models.
Suggested repo: gemma-uncensored-bench
"Pushing Gemma 4 to the limit: How much reasoning intelligence do we lose when we kill the safety guardrails?"
Estimated effort: 20h
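
A minimal sketch of the comparison at the core of such a pipeline, assuming each model can be wrapped as a plain function from prompt to answer. Every name here (Task, accuracy, reasoning_degradation, the two stub models, the toy tasks) is hypothetical and only illustrates the idea. A real run would swap the stubs for actual model calls and the toy tasks for an established reasoning benchmark, and exact-match scoring is a simplification.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical type: any function that maps a prompt to an answer string.
Model = Callable[[str], str]


@dataclass(frozen=True)
class Task:
    prompt: str
    expected: str


def accuracy(model: Model, tasks: Sequence[Task]) -> float:
    """Fraction of tasks answered correctly (case-insensitive exact match)."""
    if not tasks:
        raise ValueError("tasks must not be empty")
    correct = sum(
        model(t.prompt).strip().lower() == t.expected.strip().lower()
        for t in tasks
    )
    return correct / len(tasks)


def reasoning_degradation(baseline: Model, variant: Model,
                          tasks: Sequence[Task]) -> float:
    """Accuracy of baseline minus accuracy of variant.

    A positive value means the variant scored worse than the baseline.
    """
    return accuracy(baseline, tasks) - accuracy(variant, tasks)


# Toy tasks standing in for a real reasoning benchmark (assumption).
TASKS = [
    Task("2 + 2 = ?", "4"),
    Task("Is 17 prime? (yes/no)", "yes"),
    Task("What comes next: 1, 2, 4, 8, ?", "16"),
    Task("Reverse the string 'abc'.", "cba"),
]

ANSWER_KEY = {t.prompt: t.expected for t in TASKS}


def filtered_stub(prompt: str) -> str:
    """Stand-in for the safety-filtered model: answers every toy task correctly."""
    return ANSWER_KEY[prompt]


def uncensored_stub(prompt: str) -> str:
    """Stand-in for the uncensored model: gets the sequence task wrong."""
    if prompt.startswith("What comes next"):
        return "12"
    return ANSWER_KEY[prompt]


if __name__ == "__main__":
    drop = reasoning_degradation(filtered_stub, uncensored_stub, TASKS)
    print(f"accuracy drop: {drop:.2f}")
```

Keeping the models behind a plain callable means the same scoring code can compare any pair of checkpoints, and the stub results say nothing about how real models behave.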