🚀 TRL v0.29.0 introduces trl-training: an agent-native training skill.
This makes the TRL CLI a structured, agent-readable capability, allowing AI agents to reliably execute training workflows such as:
- Supervised Fine-Tuning (SFT)
- Direct Preference Optimization (DPO)
- Group Relative Policy Optimization (GRPO)
We’re excited to see what the community builds on top of this.
If you’re working on AI agents, alignment research, or scalable RL training infrastructure: give TRL v0.29.0 a try! 🤗
🚀 smolagents v1.21.0 is here!
Now with improved safety in the local Python executor: dunder calls are blocked!
⚠️ The local executor is still not fully isolated: for untrusted code, use a remote executor instead (Docker, E2B, or Wasm).
✨ Many bug fixes for more reliable code.
👉 https://github.com/huggingface/smolagents/releases/tag/v1.21.0
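
To give a feel for what "blocking dunder calls" can mean, here is a minimal sketch using only Python's standard `ast` module. It is not the smolagents implementation: the names `DunderAccessError`, `check_no_dunder`, and `is_allowed` are hypothetical, and the check assumes that rejecting double-underscore names and attributes before execution is the goal.

```python
import ast


class DunderAccessError(Exception):
    """Hypothetical error raised when code touches a double-underscore name."""


def is_dunder(name: str) -> bool:
    # "__x__" style names: double underscore on both ends, non-empty middle.
    return len(name) > 4 and name.startswith("__") and name.endswith("__")


def check_no_dunder(source: str) -> None:
    """Parse `source` and raise if it references any dunder name or attribute."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Attribute access such as obj.__class__ or obj.__subclasses__()
        if isinstance(node, ast.Attribute) and is_dunder(node.attr):
            raise DunderAccessError(f"attribute {node.attr!r} is blocked")
        # Bare names such as __import__ or __builtins__
        if isinstance(node, ast.Name) and is_dunder(node.id):
            raise DunderAccessError(f"name {node.id!r} is blocked")


def is_allowed(source: str) -> bool:
    """Return True when the static check finds no dunder usage."""
    try:
        check_no_dunder(source)
    except DunderAccessError:
        return False
    return True
```

For example, `is_allowed("print(len([1, 2, 3]))")` passes, while `is_allowed("().__class__.__bases__")` and `is_allowed("__import__('os')")` are rejected. A static check like this is easy to sidestep (for instance `getattr(obj, "__class__")` passes the dunder as a string), which illustrates why a remote executor remains the safer choice for untrusted code.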