Han Song, Yucheng Zhou, Jianbing Shen, Yu Cheng
Build an open-source training pipeline that combines CoT with entropy-guided RL for image generation. This allows researchers to visualize how exploration and reward optimization shape the final output.
Suggested repo: entropy-diffuse
"See exactly how RL guides your image generation path."
Estimated effort: 80h
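
The card does not say what "entropy-guided" means in practice. One common reading, assumed here, is a policy-gradient objective with an added entropy bonus that rewards exploration. The sketch below shows that idea on a toy categorical policy, not an image model; every function name and the `beta` weight are hypothetical and not taken from the paper or the suggested repo.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def objective_grad(logits, action, advantage, beta):
    """Gradient w.r.t. logits of: advantage * log pi(action) + beta * H(pi).

    Assumption: a REINFORCE-style term plus an entropy bonus weighted by beta.
    """
    probs = softmax(logits)
    h = entropy(probs)
    grad = []
    for k, p in enumerate(probs):
        # d/dz_k of log pi(action) is one_hot(action)_k - p_k
        pg_term = advantage * ((1.0 if k == action else 0.0) - p)
        # d/dz_k of H(pi) is -p_k * (log p_k + H)
        ent_term = -p * (math.log(p) + h) if p > 0.0 else 0.0
        grad.append(pg_term + beta * ent_term)
    return grad


def ascent_step(logits, grad, lr):
    """One gradient-ascent update on the logits."""
    return [z + lr * g for z, g in zip(logits, grad)]
```

With `advantage = 0` and `beta > 0`, a step pushes a peaked policy toward uniform (more exploration); with `beta = 0` it reduces to a plain policy-gradient update on the sampled action. Logging `entropy(softmax(logits))` after each step is one simple way to get the kind of exploration trace the card describes.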