Julien Delavande

jdelavande

https://delavande.fr

AI & ML interests

None yet

Recent Activity

updated a Space about 15 hours ago

jdelavande/tgi-gpt-oss20b

upvoted an article about 19 hours ago

From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels

updated a dataset 8 days ago

jdelavande/benchlab-text2video-energy-benchmark

Organizations

upvoted an article about 19 hours ago

Article

From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels

and 1 other •

15 days ago

• 48

upvoted an article 28 days ago

Article

Welcome GPT OSS, the new open-source model family from OpenAI!

and 11 others •

28 days ago

• 481

upvoted an article about 1 month ago

Article

Say hello to `hf`: a faster, friendlier Hugging Face CLI ✨

and 2 others •

Jul 25

• 80

upvoted 5 articles about 2 months ago

Article

Asynchronous Robot Inference: Decoupling Action Prediction and Execution

and 7 others •

Jul 10

• 41

Article

ScreenEnv: Deploy your full stack Desktop Agent

and 1 other •

Jul 10

• 64

Article

SmolLM3: smol, multilingual, long-context reasoner

and 22 others •

Jul 8

• 643

Article

Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders

and 1 other •

Jul 9

• 663

Article

Creating custom kernels for the AMD MI300

and 1 other •

Jul 9

• 44

upvoted 3 articles 2 months ago

Article

Training and Finetuning Sparse Embedding Models with Sentence Transformers v5

and 1 other •

Jul 1

• 113

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

and 5 others •

Sep 18, 2024

• 265

Article

Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub

and 6 others •

Jun 12

• 131

upvoted an article 3 months ago

Article

💥 Building a Vulnerable Bank MCP — Then Automating an Agent to Hack It

and 2 others •

Jun 18

• 8

upvoted a paper 3 months ago

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published Jun 2 • 130

upvoted 3 changelogs 3 months ago

Changelog

AI-generated Abstract summaries on Hugging Face Papers

May 22

• 73

Changelog

Xet is now the default storage option for new users and organizations

May 23

• 73

Changelog

Static Spaces can now have a build step

May 23

• 107

upvoted an article 3 months ago

Article

Prefill and Decode for Concurrent Requests - Optimizing LLM Performance

Apr 16

• 37

upvoted 3 articles 4 months ago

Article

Reduce, Reuse, Recycle: Why Open Source is a Win for Sustainability

and 1 other •

May 7

• 16

Article

Falcon-Edge: A series of powerful, universal, fine-tunable 1.58bit language models.

and 9 others •

May 15

• 35

Article

Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference

and 1 other •

Jan 16

• 75
