ZeroGPU Explorers

community

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

ShuhuaiRen authored a paper about 17 hours ago

Next Block Prediction: Video Generation via Semi-Autoregressive Modeling

ShuhuaiRen authored a paper about 17 hours ago

Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation

wangfuyun authored a paper 4 days ago

Unleashing Vecset Diffusion Model for Fast Shape Generation

zero-gpu-explorers's activity

Zhengyi

authored 2 papers 4 days ago

RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation

Paper • 2410.07864 • Published Oct 10, 2024 • 1

DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning

Paper • 2503.15265 • Published 5 days ago • 42

mlabonne

posted an update 7 days ago

Post

6395

✂️ AutoAbliteration

I made a Colab notebook to automatically abliterate models.

It's quite general, so you can do interesting stuff like blocking a given language in the model outputs.

💻 Colab: https://colab.research.google.com/drive/1RmLv-pCMBBsQGXQIM8yF-OdCNyoylUR1?usp=sharing

wren93

authored a paper 7 days ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published 10 days ago • 17

mlabonne

posted an update 8 days ago

Post

5943

✂️ Gemma 3 Abliterated

I noticed that Gemma 3 was much more resilient to refusal removal than other models like Qwen 2.5.

I experimented with different recipes and improved the abliteration technique I wrote about last year.

It's still experimental but the refusal rate is super low in my tests. Enjoy!

mlabonne/gemma-3-4b-it-abliterated
mlabonne/gemma-3-12b-it-abliterated
mlabonne/gemma-3-27b-it-abliterated

1 reply

eienmojiki

in zero-gpu-explorers/README 8 days ago

Inference keeps failing because of "ZeroGPU worker error GPU task aborted" and I had my daily ZeroGPU usage consumed

#156 opened 8 days ago by ManHinnn0509

BestWishYsh

authored a paper 11 days ago

CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance

Paper • 2503.10391 • Published 11 days ago • 10

lewtun

authored a paper 12 days ago

Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

Paper • 2503.07572 • Published 14 days ago • 40

lewtun

posted an update 13 days ago

Post

2074

Introducing OlympicCoder: a series of open reasoning models that can solve olympiad-level programming problems 🧑‍💻

- 7B open-r1/OlympicCoder-7B
- 32B open-r1/OlympicCoder-32B

We find that OlympicCoder models outperform Claude 3.7 Sonnet, as well as others over 100x larger 💪

Together with the models, we are releasing:

📊 CodeForces-CoTs: a new dataset of code problems from the most popular competitive coding platform, with R1 traces in C++ and Python open-r1/codeforces-cots

🏆 IOI'2024: a new benchmark of VERY hard programming problems where even frontier models struggle to match human performance open-r1/ioi

For links to the models and datasets, check out our latest progress report from Open R1: https://huggingface.co/blog/open-r1/update-3

1 reply

EvanTHU

authored a paper 13 days ago

HumanMM: Global Human Motion Recovery from Multi-shot Videos

Paper • 2503.07597 • Published 14 days ago • 2

wanghaofan

authored 4 papers 14 days ago

ZennyKenny

in zero-gpu-explorers/README 15 days ago

Multiple zeroGPU calls in same code

#155 opened 15 days ago by hen

tianbaoxiexxx

authored a paper about 1 month ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 169

alielfilali01

posted an update about 1 month ago

Post

869

🚨 Arabic LLM Evaluation 🚨

A few models joined the inceptionai/AraGen-Leaderboard ranking today.

The new MistralAI model, Saba, is quite impressive: top 10! Well done @arthurmensch and team.

Sadly, Mistral did not follow its public-weights strategy this time. We hope this changes soon and that the model is released under a permissive license.

We added other Mistral models and, apparently, we have been sleeping on mistralai/Mistral-Large-Instruct-2411!

Another impressive model that joined the ranking today is ALLaM-AI/ALLaM-7B-Instruct-preview. After a long wait, ALLaM is finally here, and it is IMPRESSIVE given its size!

ALLaM is ranked on OALL/Open-Arabic-LLM-Leaderboard as well.

vikhyatk

posted an update about 1 month ago

Post

2743

🚨 New VQA + captioning dataset! moondream/megalith-mdqa

Images from Megalith, captioned using Moondream, then transformed to short-form QA.

9M+ images, 6-10 QA pairs per image.

lewtun

posted an update about 1 month ago

Post

4976

Introducing OpenR1-Math-220k!

open-r1/OpenR1-Math-220k

The community has been busy distilling DeepSeek-R1 from inference providers, but we decided to have a go at doing it ourselves from scratch 💪

What’s new compared to existing reasoning datasets?

♾ Based on AI-MO/NuminaMath-1.5: we focus on math reasoning traces and generate answers for problems in NuminaMath 1.5, an improved version of the popular NuminaMath-CoT dataset.

🐳 800k R1 reasoning traces: We generate two answers for 400k problems using DeepSeek R1. The filtered dataset contains 220k problems with correct reasoning traces.

📀 512 H100s running locally: Instead of relying on an API, we leverage vLLM and SGLang to run generations locally on our science cluster, generating 180k reasoning traces per day.

⏳ Automated filtering: We apply Math Verify to only retain problems with at least one correct answer. We also leverage Llama3.3-70B-Instruct as a judge to retrieve more correct examples (e.g. for cases with malformed answers that can’t be verified with a rules-based parser).

📊 We match the performance of DeepSeek-Distill-Qwen-7B by finetuning Qwen-7B-Math-Instruct on our dataset.

🔎 Read our blog post for all the nitty gritty details: https://huggingface.co/blog/open-r1/update-2
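
The filtering step described above (keep a problem only if at least one generated answer is correct, and fall back to a judge when the answer cannot be parsed) can be sketched roughly as follows. This is a toy illustration, not the Open R1 pipeline: `toy_rule_check` and `toy_judge` are hypothetical stand-ins for Math Verify and the Llama judge, the `Problem` record and the "Answer:" convention are invented for the example, and calling the judge only on unparseable answers is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Problem:
    question: str
    reference: str           # ground-truth final answer
    generations: List[str]   # candidate reasoning traces, e.g. two per problem


def toy_rule_check(generation: str, reference: str) -> Optional[bool]:
    """Hypothetical rules-based verifier (NOT the Math Verify API).

    Returns True/False when an 'Answer:' marker can be parsed,
    and None when the answer is malformed and cannot be checked.
    """
    marker = "Answer:"
    if marker not in generation:
        return None
    answer = generation.rsplit(marker, 1)[1].strip()
    return answer == reference.strip()


def toy_judge(generation: str, reference: str) -> bool:
    """Hypothetical stand-in for an LLM judge: a loose substring match."""
    return reference.strip() in generation


def filter_problems(
    problems: List[Problem],
    rule_check: Callable[[str, str], Optional[bool]] = toy_rule_check,
    judge: Callable[[str, str], bool] = toy_judge,
) -> List[Problem]:
    """Keep problems with at least one correct generation, dropping wrong traces."""
    kept = []
    for p in problems:
        correct = []
        for g in p.generations:
            verdict = rule_check(g, p.reference)
            if verdict is None:  # malformed answer: defer to the judge (assumption)
                verdict = judge(g, p.reference)
            if verdict:
                correct.append(g)
        if correct:
            kept.append(Problem(p.question, p.reference, correct))
    return kept


problems = [
    Problem("2+2?", "4", ["... Answer: 4", "... Answer: 5"]),
    Problem("3*3?", "9", ["so the result is 9", "Answer: 8"]),
    Problem("1+1?", "2", ["Answer: 3", "I think 7"]),
]
kept = filter_problems(problems)
print([p.question for p in kept])
```

In this toy run the first problem passes the rules-based check, the second is recovered only by the judge fallback, and the third is dropped because neither generation is correct.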

julien-c

in zero-gpu-explorers/README about 2 months ago

Update README.md

#152 opened about 2 months ago by fdaudens


Team members 753
