Build an evaluation framework to test model refusal alignment against 'moral' guidelines rather than just standard safety guidelines. Create a dataset of complex ethical dilemmas to benchmark refusal behavior.