Transformer RL library
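TRL (Transformer Reinforcement Learning) is a library for fine-tuning LLMs. It provides trainers for supervised fine-tuning (SFT), Direct Preference Optimization (DPO), Proximal Policy Optimization (PPO), Group Relative Policy Optimization (GRPO), and other modern RLHF methods. Maintained by Hugging Face.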
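
To give a rough sense of what one of these methods optimizes, the sketch below computes the DPO loss for a single preference pair using only the Python standard library. It is not TRL's implementation: the function name `dpo_loss`, its parameter names, and the default `beta=0.1` are hypothetical choices made for illustration, and the inputs are assumed to be summed log-probabilities of each completion under the policy and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair. Illustrative only, not TRL's code."""
    # How much more (or less) likely each completion is under the policy
    # than under the reference model, in log space.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin: positive when the policy favors the chosen
    # completion more strongly than the reference does.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)),
    # written in a form that avoids overflow for large |margin|.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference agree on both completions the margin is zero and the loss equals log 2. The loss falls below that value as the policy shifts probability toward the chosen completion relative to the reference, and rises above it when the shift goes the other way.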