Aton Mountlook

AtonMountlook

AI & ML interests

None yet

Recent Activity


Organizations

None yet

AtonMountlook's activity

reacted to Undi95's post with ๐Ÿ‘ 8 days ago
Hi there!

If you want to create your own thinking model or do a better MistralThinker, I just uploaded my entire dataset made with DeepSeek R1, along with the axolotl config. (Well, I made them public.)

Axolotl config : Undi95/MistralThinker-v1.1

The dataset : Undi95/R1-RP-ShareGPT3

You can also read everything I did in those two Discord screenshots from two days ago; I'm a little too lazy to rewrite it all, kek.

Hope you will use them!
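For anyone adapting the dataset, distilled R1 traces are usually stored as ShareGPT-style conversation records with the model's reasoning wrapped in `<think>` tags. Here is a minimal sketch of that conversion, assuming the common community conventions (the exact schema of Undi95/R1-RP-ShareGPT3 may differ):

```python
import json

def r1_to_sharegpt(prompt, reasoning, answer):
    """Wrap an R1-style (reasoning, answer) pair into one ShareGPT record.

    The <think> wrapper and the 'conversations' layout follow common
    community conventions; they are assumptions, not the confirmed
    format of the dataset mentioned above.
    """
    return {
        "conversations": [
            {"from": "human", "value": prompt},
            {"from": "gpt", "value": f"<think>\n{reasoning}\n</think>\n{answer}"},
        ]
    }

sample = r1_to_sharegpt(
    prompt="What is 2 + 2?",
    reasoning="Adding the two integers: 2 + 2 = 4.",
    answer="4",
)
print(json.dumps(sample, indent=2))
```

Records shaped this way can be pointed at axolotl's `sharegpt`-type dataset loader, which is what configs like the one above typically expect.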
reacted to cogwheelhead's post with 👍 21 days ago
My team and I have performed an in-depth investigation comparing o1 to R1 (and other reasoning models)

Link: https://toloka.ai/blog/r1-is-not-on-par-with-o1-and-the-difference-is-qualitative-not-quantitative

It started with us evaluating them on our own university-math benchmarks: U-MATH for problem-solving and μ-MATH for judging solution correctness (see the HF leaderboard: toloka/u-math-leaderboard)

tl;dr: R1 sure is amazing, but what we find is that it lags behind in novelty adaptation and reliability:
* performance drops when updating benchmarks with fresh unseen tasks (e.g. AIME 2024 -> 2025)
* R1-o1 gap widens when evaluating niche subdomains (e.g. university-specific math instead of the more common Olympiad-style contests)
* same with going into altogether unconventional domains (e.g. chess) or skills (e.g. judgment instead of problem-solving)
* R1 also runs into failure modes way more often (e.g. making illegal chess moves or falling into endless generation loops)
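That last failure mode can be screened for mechanically. Below is a toy repetition heuristic that flags degenerate generation loops by checking whether the tail of the output keeps repeating the same word sequence; it is a minimal sketch, not the methodology used in the study:

```python
def looks_looped(text, ngram=5, min_repeats=3):
    """Flag text whose final `ngram` words repeat `min_repeats` times in a row.

    A crude detector for 'endless generation loops': if the tail of the
    output is the same n-gram stuttered back-to-back, the decode has
    likely collapsed into a cycle.
    """
    words = text.split()
    if len(words) < ngram * min_repeats:
        return False
    tail = words[-ngram:]
    for r in range(1, min_repeats):
        start = len(words) - ngram * (r + 1)
        if words[start:start + ngram] != tail:
            return False
    return True

print(looks_looped("the answer is 4 so the answer is 4 so the answer is 4 so"))  # → True
print(looks_looped("R1 is strong but lags behind in reliability"))  # → False
```

Real evaluations would also vary the period length and allow near-matches, but even this check catches the most blatant loops.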

Our point here is not to bash DeepSeek: they've done exceptional work, R1 is a game-changer, and we have no intention of downplaying that. R1's release is a perfect opportunity to study where all these models differ and gain an understanding of how to move forward from here.
reacted to Undi95's post with โค๏ธ 8 months ago
Hello there,

New model released! My goal was to finetune the latest Llama-3.1-8B-Instruct, and not just a small train; I wanted to do something useful.
One of the rare models that I didn't make for RP, or with the goal of uncensoring it (but I did anyway, kek).

The model was trained on 9M Claude conversations ONLY, giving it another writing style.

Undi95/Meta-Llama-3.1-8B-Claude > OG release in fp32, it's epoch 2
Undi95/Meta-Llama-3.1-8B-Claude-bf16 > Base model resharded in bf16, waiting for quants without issues

Since it's frustrating to be censored when using a local model, orthogonal activation steering was used, trying to force the model to never refuse a prompt.

Undi95/Meta-Llama-3.1-8B-Claude-68fail-3000total > Uncensored model, refuses 68 times on 3000 toxic prompts
Undi95/Meta-Llama-3.1-8B-Claude-39fail-3000total > Uncensored model, refuses 39 times on 3000 toxic prompts

It still refuses some prompts, but the majority of them are uncensored. OAS can make a model dumber or raise the base perplexity, so I didn't aim for 0 refusals.
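For context, the core linear-algebra step behind orthogonal activation steering is projecting a learned "refusal direction" out of the model's weights or activations, leaving the result orthogonal to it. A pure-Python sketch of that projection on toy vectors (real implementations estimate the direction from contrastive prompts and apply it to the residual stream of an actual model):

```python
def project_out(v, direction):
    """Remove the component of v that lies along `direction`.

    Computes v' = v - (v . r_hat) * r_hat, where r_hat is the unit
    "refusal direction"; v' is orthogonal to r_hat. Plain lists are
    used here purely for illustration.
    """
    norm = sum(d * d for d in direction) ** 0.5
    r_hat = [d / norm for d in direction]
    dot = sum(a * b for a, b in zip(v, r_hat))
    return [a - dot * b for a, b in zip(v, r_hat)]

# Toy "refusal direction" and hidden-state vector (made-up numbers).
refusal = [1.0, 0.0, 0.0]
hidden = [3.0, 2.0, -1.0]
steered = project_out(hidden, refusal)
print(steered)  # → [0.0, 2.0, -1.0]
```

Since the steered vector keeps everything except its component along the refusal direction, the edit is small but targeted, which is also why it can nudge perplexity upward, as noted above.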

I don't make non-RP models often, so any feedback is welcome. I would like to re-use this base for some future projects if needed.
reacted to Undi95's post with 🤗 10 months ago
Hey everyone,

Just wanted to shout out a massive thank you to all 2000 of you who've followed me on Hugging Face! 🎉 It's incredible to have such an awesome crew backing me up as I dive into all these LLM experiments.

Even though not all my models turn out perfect, I've found some real gems and methods along the way 💎. It's like digging for treasure: sometimes you find nothing, but sometimes you find a pearl, and sometimes you find a new method to try.

Your support and encouragement mean the world to me, and I'm really stoked to keep experimenting and learning. If you had told me some years ago that I would have so many people following me for what I do, I wouldn't have believed it. Here's to more discoveries and adventures ahead! 🚀

Also, big thanks once again, and a huge shoutout to @IkariDev for being there through this journey and supporting me. I'm excited for our future work together and hope we will continue to make people happy! 👍

I want to thank @Gryphe too, since my early work was heavily inspired by MythoMax and its RP/ERP vibe. If I'm here today, it's probably because of you 😂

I was so close to forgetting @chargoddard and his amazing tool too! What would we do without mergekit in our lives? Thank you! 🙏

See y'all at 3k!
reacted to abhishek's post with 🔥 11 months ago
With AutoTrain, you can already finetune the latest llama3 models without writing a single line of code. Here's an example finetune of llama3 8b model: abhishek/autotrain-llama3-no-robots
reacted to Undi95's post with ๐Ÿ‘ 12 months ago
Hey, it took some time but I finally moved out and got internet back, so here I am again!
There's a lot to get updated on; I will try to reply to each of you ASAP.
See you soon!
reacted to DmitryRyumin's post with 👍 12 months ago
🚀🎭🌟 New Research Alert! 🌟🎭🚀
📄 Title: FlashFace: Human Image Personalization with High-fidelity Identity Preservation 🔍

๐Ÿ“ Description: FlashFace is a personalized photo editing tool that focuses on high-fidelity identity preservation and improved compliance through advanced encoding and integration strategies.

👥 Authors: Shilong Zhang, Lianghua Huang, @xichenhku et al.

🔗 Paper: FlashFace: Human Image Personalization with High-fidelity Identity Preservation (2403.17008)

๐ŸŒ Github Page: https://jshilong.github.io/flashface-page

📚 More Papers: more cutting-edge research presented at other conferences in the DmitryRyumin/NewEraAI-Papers collection, curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

๐Ÿ” Keywords: #FlashFace #Personalization #HighFidelityIdentity #DeepLearning #Innovation