Fine-Tuning + Reasoning + RL | hypedar