TRL

https://github.com/huggingface/trl

Activity Feed Request to join this org

AI & ML interests

None defined yet.

Recent Activity

ShirinYamani updated a dataset 11 days ago

trl-lib/documentation-images

qgallouedec updated a Space 16 days ago

trl-lib/recommend-vllm-memory

qgallouedec updated a Space 16 days ago

trl-lib/trackio

View all activity

sergiopaniego

posted an update 6 days ago

Post

340

It's now posible to do end-2-end ML without leaving the @huggingface Hub, by combining TRL + HF jobs + Trackio!!

🐡We just released a full guide explaining the process.

Go check it out!

📖 Guide: https://huggingface.co/docs/trl/main/en/jobs_training

💡 Reminder: HF Jobs is only available for Pro, Team, or Enterprise plans. Yet another reason to upgrade

ShirinYamani

updated a dataset 11 days ago

trl-lib/documentation-images

Viewer • Updated 11 days ago • 9 • 83.3k

qgallouedec

updated 2 Spaces 16 days ago

Recommend vLLM Memory

😻

Estimate GPU memory usage for model training

Trackio

🚀

Visualize project metrics and runs

qgallouedec

published a Space 16 days ago

Trackio

🚀

Visualize project metrics and runs

qgallouedec

updated a dataset 18 days ago

trl-lib/llava-instruct-mix

Viewer • Updated 18 days ago • 228k • 487 • 1

sergiopaniego

posted an update 21 days ago

Post

2868

So you can now SFT a model with hf jobs + TRL in ONE command lol 🏎️💨

Without worrying about infrastructure since it runs entirely on HF!

docs: https://huggingface.co/docs/huggingface_hub/main/en/guides/jobs
blog: https://huggingface.co/blog/hf-cli

sergiopaniego

posted an update 22 days ago

Post

391

New Zero-Shot Object Detectors in transformers! 🥽

We’ve added LLMDet and MM GroundingDINO, plus a demo Space to compare them with others 🖼️

Play with it: ariG23498/zero-shot-od

sergiopaniego

posted an update 22 days ago

Post

356

Missed last week's OpenAI GPT OSS release?

Here are 2 quick-start recipes we developed to get you up to speed:

🏃‍♀️ How to run gpt-oss-20b on Google Colab
https://cookbook.openai.com/articles/gpt-oss/run-colab

🧑‍🔧 Fine-tuning with gpt-oss and Hugging Face Transformers
https://cookbook.openai.com/articles/gpt-oss/fine-tune-transfomers

qgallouedec

published a dataset 24 days ago

trl-lib/llava-instruct-mix

Viewer • Updated 18 days ago • 228k • 487 • 1

sergiopaniego

posted an update 26 days ago

Post

442

Latest TRL release brings major upgrades for multimodal alignment!

We dive into 3 new techniques to improve VLM post-training in our new blog:

🌋 GRPO
🎞️ GSPO
🐙 MPO
➕ vLLM integration for online training w/ transformers backend\

🐡 Blog: https://huggingface.co/blog/trl-vlm-alignment

sergiopaniego

posted an update 28 days ago

Post

2171

OpenAI's open models are out! 💃

Try: https://www.gpt-oss.com/
Learn: https://huggingface.co/blog/welcome-openai-gpt-oss

1 reply

sergiopaniego

posted an update 29 days ago

Post

3404

Want to learn how to align a Vision Language Model (VLM) for reasoning using GRPO and TRL? 🌋

🧑‍🍳 We've got you covered!!

NEW multimodal post training recipe to align a VLM using TRL in @HuggingFace 's Cookbook.

Go to the recipe 👉https://huggingface.co/learn/cookbook/fine_tuning_vlm_grpo_trl

Powered by the latest TRL v0.20 release, this recipe shows how to teach Qwen2.5-VL-3B-Instruct to reason over images 🌋

sergiopaniego

posted an update 29 days ago

Post

4500

Just included example scripts for aligning models using GSPO (including VLM example) 🙆‍♂️🙆‍♂️

GSPO is the latest RL alignment algo by @Alibaba_Qwen and it's already supported in the latest TRL v0.20 release.

Super-easy-to-get-started example scripts below, GO run them!👩‍💻👩‍💻

🧑‍🎨 Script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo.py
🦄 VLM script: https://github.com/huggingface/trl/blob/main/examples/scripts/gspo_vlm.py
🧩 More TRL examples: https://huggingface.co/docs/trl/main/en/example_overview
🧙‍♂️ GSPO paper: Group Sequence Policy Optimization (2507.18071)

sergiopaniego

posted an update about 1 month ago

Post

342

Did you miss this? 👓

🧙‍♂️vLLM + transformers integration just got upgraded with direct VLM support.

Select a VLM + model_impl=transformers and play via vLLM!

sergiopaniego

posted an update about 1 month ago

Post

2664

We just released TRL v0.20 with major multimodal upgrades!

👁️ VLM support for GRPO (highly requested by the community!)
🎞️ New GSPO trainer (from @Qwen , released last week, VLM-ready)
🐙 New MPO trainer (multimodal by design, as in the paper)

📝 Full release notes here: https://github.com/huggingface/trl/releases/tag/v0.20.0

qgallouedec

updated a Space about 1 month ago

Dataset Length Profiler

👁

Analyze dataset and recommend max_length for model training

qgallouedec

updated a model about 1 month ago

trl-lib/Qwen3-4B-LoRA

Updated Jul 28 • 1

qgallouedec

published a model about 1 month ago

trl-lib/Qwen3-4B-LoRA

Updated Jul 28 • 1

sergiopaniego

posted an update about 1 month ago

Post

1203

Yet Another New Multimodal Fine-Tuning Recipe 🥧

🧑‍🍳 In this @HuggingFace Face Cookbook notebook, we demonstrate how to align a multimodal model (VLM) using Mixed Preference Optimization (MPO) using trl.

💡 This recipe is powered by the new MPO support in trl, enabled through a recent upgrade to the DPO trainer!

We align the multimodal model using multiple optimization objectives (losses), guided by a preference dataset (chosen vs. rejected multimodal pairs).

Check it out! ➡️ https://huggingface.co/learn/cookbook/fine_tuning_vlm_mpo

2 replies

AI & ML interests

Recent Activity

Team members 9

trl-lib's activity

Recommend vLLM Memory

Trackio

Trackio

Dataset Length Profiler