Aug 19, 2025 On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification
Mar 25, 2025 ReFT: Reasoning with Reinforced Fine-Tuning
Jan 02, 2025 d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning