AI & ML interests

Aligning LLMs to be helpful, honest, harmless, and huggy (H4)

Recent Activity

IlyasMoutawwakil posted an update about 2 hours ago
Transformers v5 just landed! 🚀
It significantly unifies and reduces modeling code across architectures, while opening the door to a whole new class of performance optimizations.

My favorite new feature? 🤔
The new dynamic weight loader + converter. Here's why 👇

Over the last few months, the core Transformers maintainers built an incredibly fast weight loader, capable of converting tensors on the fly while loading them in parallel threads. This means we're no longer constrained by how parameters are laid out inside the safetensors weight files.

In practice, this unlocks two big things:
- Much more modular modeling code. You can now clearly see how architectures build on top of each other (DeepSeek v2 → v3, Qwen v2 → v3 → MoE, etc.). This makes shared bottlenecks obvious and lets us optimize the right building blocks once, for all model families.
- Performance optimizations beyond what torch.compile can do alone. torch.compile operates on the computation graph, but it can't change parameter layouts. With the new loader, we can restructure weights at load time: fusing MoE expert projections, merging attention QKV projections, and enabling more compute-dense kernels that simply weren't possible before.
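To make the idea concrete, here is a minimal sketch of the kind of layout change this enables: merging separate Q/K/V projection weights into one fused projection at load time. This is illustrative only (the key names and helper are hypothetical), not the actual Transformers loader code.

```python
import torch

def fuse_qkv(state_dict: dict, prefix: str) -> dict:
    """Merge separate q/k/v projection weights into one fused projection,
    so a single large matmul replaces three smaller ones at inference time."""
    q = state_dict.pop(f"{prefix}.q_proj.weight")
    k = state_dict.pop(f"{prefix}.k_proj.weight")
    v = state_dict.pop(f"{prefix}.v_proj.weight")
    # Concatenate along the output dimension; the fused layer's output is
    # later split back into query/key/value activations.
    state_dict[f"{prefix}.qkv_proj.weight"] = torch.cat([q, k, v], dim=0)
    return state_dict

# Toy usage with random weights standing in for a real checkpoint
sd = {f"layer.0.{n}_proj.weight": torch.randn(64, 64) for n in ("q", "k", "v")}
sd = fuse_qkv(sd, "layer.0")
print(sd["layer.0.qkv_proj.weight"].shape)  # torch.Size([192, 64])
```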

Personally, I'm honored to have contributed in this direction, including the work on optimizing MoE implementations and making modeling code more torch-exportable, so these optimizations can be ported cleanly across runtimes.

Overall, Transformers v5 is a strong signal of where the community and industry are converging: Modularity and Performance, without sacrificing Flexibility.

Transformers v5 makes its signature from_pretrained an entrypoint where you can mix and match:
- Parallelism
- Quantization
- Custom kernels
- Flash/Paged attention
- Continuous batching
- ...
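A rough sketch of what that mix-and-match looks like in practice; the checkpoint and the exact keyword arguments below are illustrative and depend on your Transformers version and hardware:

```python
import torch
from transformers import AutoModelForCausalLM

# Illustrative only: combining precision, attention backend and device
# placement in a single from_pretrained call. See the v5 release notes for
# the full set of options (quantization configs, tensor parallelism, kernels, ...).
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B-Instruct",             # example checkpoint
    dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # requires flash-attn to be installed
    device_map="auto",
)
```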

Huge kudos to everyone involved! I highly recommend checking out:
- Release notes: github.com/huggingface/transformers/releases/tag/v5.0.0
- Blog post: huggingface.co/blog/transformer
sergiopaniego posted an update about 19 hours ago
IlyasMoutawwakil posted an update 5 days ago
After 2 months of refinement, I'm happy to announce that a lot of Transformers' modeling code is now significantly more torch-compile & export-friendly 🔥

Why it had to be done 👇
PyTorch's Dynamo compiler is increasingly becoming the default interoperability layer for ML systems. Anything that relies on torch.export or torch.compile, from model optimization to cross-framework integrations, benefits directly when models can be captured as a single dynamo-traced graph!

Transformers models are now easier to:
โš™๏ธ Compile end-to-end with torch.compile backends
๐Ÿ“ฆ Export reliably via torch.export and torch.onnx.export
๐Ÿš€ Deploy to ONNX / ONNX Runtime, Intel Corporation's OpenVINO, NVIDIA AutoDeploy (TRT-LLM), AMD's Quark, Meta's Executorch and more hardware-specific runtimes.

This work aims at unblocking entire TorchDynamo-based toolchains that rely on exporting Transformers across runtimes and accelerators.

We are doubling down on Transformers' commitment to being a first-class citizen of the PyTorch ecosystem: more exportable, more optimizable, and easier to deploy everywhere.

There are definitely some edge cases we still haven't addressed, so don't hesitate to try compiling/exporting your favorite transformers and to open issues/PRs.
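If you want to try it, a minimal smoke test might look like the following. The model ID and the cache/attention settings are illustrative; depending on your Transformers and PyTorch versions you may need to tweak them:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-0.5B"  # any small causal LM works as a smoke test
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()
model.config.use_cache = False  # keep the traced graph cache-free for export

inputs = tokenizer("Hello, world!", return_tensors="pt")

# 1) End-to-end compilation with a torch.compile backend
compiled = torch.compile(model)
with torch.no_grad():
    compiled(**inputs)

# 2) Full-graph capture with torch.export
exported = torch.export.export(
    model,
    args=(),
    kwargs={
        "input_ids": inputs["input_ids"],
        "attention_mask": inputs["attention_mask"],
    },
)
print(exported)
```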

PR in the comments! More updates coming soon!
sergiopaniego posted an update 8 days ago
FunctionGemma Tuning Lab is a new no-code tool by @google that lets you fine-tune a model directly from the browser (no coding knowledge required), using TRL behind the scenes.

blog: https://developers.googleblog.com/a-guide-to-fine-tuning-functiongemma/

try it out: google/functiongemma-tuning-lab

This example builds on a more advanced one for learning fine-tuning with SFT using TRL: https://ai.google.dev/gemma/docs/functiongemma/finetuning-with-functiongemma
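If you'd rather script it than use the no-code lab, a minimal TRL SFT sketch looks roughly like this. The model checkpoint and dataset below are placeholders; the linked guide has the actual FunctionGemma recipe:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: swap in a function-calling dataset from the guide.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="google/gemma-3-270m",  # placeholder; use the FunctionGemma checkpoint
    train_dataset=dataset,
    args=SFTConfig(output_dir="functiongemma-sft"),
)
trainer.train()
```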
sergiopaniego posted an update 11 days ago
TRL v0.27.0 is out!! 🥳

It includes GDPO, the latest variant of GRPO for multi-reward RL ✨
GDPO decouples reward normalization to avoid reward collapse and improve per-reward convergence, developed by @sliuau, @SimonX, et al.

Explore the paper: GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization (2601.05242)

Explore the full set of changes here:
https://github.com/huggingface/trl/releases/tag/v0.27.0
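GDPO-specific configuration isn't shown here, but as a rough idea of what multi-reward training looks like in TRL, here is a hedged GRPOTrainer sketch with two toy reward functions. The model and dataset are placeholders; check the release notes for the actual GDPO options:

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Placeholder prompts dataset used in the TRL docs examples
dataset = load_dataset("trl-lib/tldr", split="train")

# Two toy reward functions; GDPO's point is that each reward stream gets its
# own normalization instead of being collapsed into a single signal.
def reward_brevity(completions, **kwargs):
    return [-len(c) / 100.0 for c in completions]

def reward_keyword(completions, **kwargs):
    return [1.0 if "because" in c else 0.0 for c in completions]

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # placeholder model
    reward_funcs=[reward_brevity, reward_keyword],
    train_dataset=dataset,
    args=GRPOConfig(output_dir="multi-reward-grpo"),
)
trainer.train()
```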
sergiopaniego posted an update 14 days ago
New REPL environment available in OpenEnv! ✨
Used in the Recursive Language Models (RLM) paper by Alex Zhang.

Ready for inference & post-training using trajectories. Handles long contexts:

> Run Python code in a sandbox
> Make recursive calls to LMs
> Explore data programmatically
> Return final result

Docs: https://meta-pytorch.org/OpenEnv/environments/repl/
Inference script: https://github.com/meta-pytorch/OpenEnv/blob/main/examples/repl_oolong_simple.py
sergiopaniego posted an update 15 days ago
Recursive Language Models (RLM) is a new interface for LLMs with cool ideas by Alex Zhang!

โš ๏ธ LLMs struggle with long prompts โ†’ attention overload & lost info
๐Ÿ”„ RLMs inspect, split & call themselves on chunks, then aggregate results
โœ… Handles millions of tokens, reduces noise, improves reasoning
๐Ÿ’ก System prompt guides recursion
๐ŸŽฏ RLM trajectories can be used for RL training or distillation (OpenEnv+TRL!!)

We're adding it to OpenEnv (with Kashif Rasul): https://github.com/meta-pytorch/OpenEnv/pull/282

More resources:

> Paper: Recursive Language Models (2512.24601)
> Paper blog: https://alexzhang13.github.io/blog/2025/rlm/
> RLM repo: https://github.com/alexzhang13/rlm
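The split-recurse-aggregate idea from the bullets above is simple enough to sketch. The helper below is purely illustrative (call_llm is a hypothetical stand-in for whatever inference client you use), not the paper's implementation:

```python
def recursive_summarize(text: str, call_llm, max_chars: int = 8000) -> str:
    """Illustrative RLM-style recursion: if the input is too long, split it,
    call the model on each chunk, then aggregate the partial answers."""
    if len(text) <= max_chars:
        return call_llm(f"Summarize:\n{text}")
    chunks = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    partial = [recursive_summarize(chunk, call_llm, max_chars) for chunk in chunks]
    return call_llm("Combine these partial summaries:\n" + "\n".join(partial))
```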
sergiopaniego posted an update 19 days ago
sergiopaniego posted an update 25 days ago
The list of hands-on notebooks (some beginner-friendly!) to get started with fine-tuning using TRL keeps growing!!

• SFT
• GRPO
• Tool calling & agents
• RL environments with OpenEnv
• LLMs and VLMs
✨ Many run on FREE Colab, making it super easy to get started fast!

https://github.com/huggingface/trl/tree/main/examples/notebooks
sergiopaniego posted an update 28 days ago
sergiopaniego posted an update 29 days ago
sergiopaniego posted an update about 1 month ago
sergiopaniego posted an update about 1 month ago
The Christmas holidays are here! 🎄
Thinking about learning something new in AI?

@huggingface offers 12 FREE courses covering all the relevant topics, for every level of experience. A great challenge for the holidays (and worth saving for later 🙄)

Let's explore them!

🧠 LLM Course: large language models with HF tools
https://huggingface.co/learn/llm-course

🤖 Agents Course: build and deploy AI agents
https://huggingface.co/learn/agents-course

🎨 Diffusion Course: diffusion models with 🤗 Diffusers
https://huggingface.co/learn/diffusion-course

🔊 Audio Course: transformers for audio tasks
https://huggingface.co/learn/audio-course

🎮 Deep RL Course: deep reinforcement learning
https://huggingface.co/learn/deep-rl-course

👁️ Community Computer Vision Course: modern computer vision with HF
https://huggingface.co/learn/computer-vision-course

🦾 Robotics Course (LeRobot): learning-based robotics
https://huggingface.co/learn/robotics-course

🧩 MCP Course: Model Context Protocol explained
https://huggingface.co/learn/mcp-course

🧪 A Smol Course: post-training AI models
https://huggingface.co/learn/a-smol-course

🕹️ ML for Games Course: AI in game development
https://huggingface.co/learn/ml-for-games-course

🧊 ML for 3D Course: machine learning for 3D data
https://huggingface.co/learn/ml-for-3d-course

📘 Open-Source AI Cookbook: practical AI notebooks
https://huggingface.co/learn/cookbook

All of them can be found here: https://huggingface.co/learn
sergiopaniego posted an update about 1 month ago
Google DeepMind releases FunctionGemma, a 240M model specialized in 🔧 tool calling, built for fine-tuning

TRL has day-0 support. To celebrate, we're sharing 2 new resources:

> Colab guide to fine-tune it for 🌐 browser control with BrowserGym OpenEnv
> Standalone training script

> Colab notebook: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_functiongemma_browsergym_openenv.ipynb
> Training script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/browsergym_llm.py (the command to run it is inside the script)
> More notebooks in TRL: https://huggingface.co/docs/trl/example_overview#notebooks
sergiopaniego posted an update about 1 month ago
sergiopaniego posted an update about 2 months ago
🎄 Last talk of the year about open AI and HF today at Universidad Rey Juan Carlos for undergrad students

always a pleasure to be back at my alma mater

🎅 Slides: https://github.com/sergiopaniego/talks