Mariusz Kurman PRO

mkurman

AI & ML interests

AI Tech Lead | MD


Organizations

MedIT Solutions · BigScience Biomedical Datasets · SOWA Project

mkurman's activity

posted an update 19 minutes ago
Blurred-Thoughts Supervised Fine-Tuning (BT-SFT) 🤖

Can we teach a model to think completely on its own without reinforcement learning? Actually, yes.

We can do straightforward supervised fine-tuning using a relatively simple trick: blurring a part of CoT thoughts. But why is this effective?

We observed that various models differ in their thinking processes, and fine-tuning one model on another model’s thoughts (CoT) can sometimes be inefficient—often resulting in the model simply memorizing reasoning rather than learning how to actually think.

I discovered that this process can still be efficient if we clearly indicate when the model should start and stop thinking, uncover only a part of the CoT together with the expected answer, and blur (mask) the remaining part of the CoT. This approach allows the model to learn only a portion of the thought process while still arriving at the expected answer.
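Below is a minimal sketch of what such blurring could look like at the data-preparation stage, assuming a chat template with explicit thinking markers; the span boundaries, the keep ratio, and the helper name are my own illustration, not the exact BT-SFT recipe:

```python
import random
from typing import List

IGNORE_INDEX = -100  # label ignored by the cross-entropy loss in Hugging Face Transformers

def blur_cot_labels(input_ids: List[int],
                    prompt_end: int,
                    think_end: int,
                    keep_ratio: float = 0.3) -> List[int]:
    """Build SFT labels that mask the prompt, uncover one random contiguous
    slice of the chain-of-thought, blur (mask) the rest of it, and keep the
    final answer (tokens after `think_end`) fully supervised.

    prompt_end : index of the first CoT token (everything before is the prompt)
    think_end  : index of the first answer token (the CoT spans [prompt_end, think_end))
    keep_ratio : fraction of CoT tokens left visible to the loss
    """
    labels = list(input_ids)

    # 1) never train on the prompt itself
    for i in range(prompt_end):
        labels[i] = IGNORE_INDEX

    # 2) blur the CoT: keep only one random contiguous slice, mask the remainder
    cot_len = think_end - prompt_end
    if cot_len > 0:
        keep_len = max(1, int(cot_len * keep_ratio))
        keep_start = prompt_end + random.randint(0, cot_len - keep_len)
        for i in range(prompt_end, think_end):
            if not (keep_start <= i < keep_start + keep_len):
                labels[i] = IGNORE_INDEX

    # 3) the expected answer after the thinking block stays fully supervised
    return labels
```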

To demonstrate this, you can check out my experimental BT-SFT of the meditsolutions/Llama-3.2-SUN-2.5B-chat model, which was fine-tuned on 151 million tokens from the Magpie-Align/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B dataset.

Enjoy! 🚀

PS. If you were curious enough to read this, leave me a comment. It's always nice to chat with open-minded and intelligent ppl.
reacted to Abhaykoul's post with 👀 24 minutes ago
🔥 THE WAIT IS OVER... HAI-SER IS HERE! 🔥

Yo fam, this ain't just another AI drop— this is the FUTURE of emotional intelligence! 🚀

Introducing HAI-SER, powered by Structured Emotional Reasoning (SER), the next-level AI that doesn’t just understand your words—it feels you, analyzes your emotions, and helps you navigate life’s toughest moments. 💡

💥 What makes HAI-SER a game-changer?
🔹 Emotional Vibe Check – Gets the mood, energy, and what’s really going on 🎭
🔹 Mind-State Analysis – Breaks down your thoughts, beliefs, and patterns 🤯
🔹 Root Cause Deep-Dive – Unpacks the WHY behind your emotions 💡
🔹 Impact Check – Sees how it’s affecting your life and mental health 💔
🔹 Safety Check – Prioritizes your well-being and crisis management 🚨
🔹 Healing Game Plan – Custom strategies to help you bounce back 💪
🔹 Growth Potential – Turns struggles into opportunities for self-improvement 📈
🔹 How to Approach – Teaches you and others how to communicate and heal 🤝
🔹 Personalized Response – Not just generic advice—real talk, tailored to YOU 💯

No more robotic AI responses. No more surface-level advice. HAI-SER gets deep, analyzing emotions with precision and giving real, actionable support.

This ain’t just AI—this is your digital therapist, life coach, and hype squad all in one. Whether it’s mental health, career struggles, relationships, or personal growth, HAI-SER has your back.

🚀 The future of emotionally intelligent AI is HERE.
Are you ready? 🔥💯

HelpingAI/HAI-SER
reacted to singhsidhukuldeep's post with 🚀 about 2 hours ago
Groundbreaking Research Alert: Can Large Language Models Really Understand Personal Preferences?

A fascinating new study from researchers at the University of Notre Dame, Xi'an Jiaotong University, and Université de Montréal introduces PERRECBENCH, a novel benchmark for evaluating how well Large Language Models (LLMs) understand user preferences in recommendation systems.

Key Technical Insights:
- The benchmark eliminates user rating bias and item quality factors by using relative ratings and grouped ranking approaches
- Implements three distinct ranking methods: pointwise rating prediction, pairwise comparison, and listwise ranking
- Evaluates 19 state-of-the-art LLMs including Claude-3.5, GPT-4, Llama-3, Mistral, and Qwen models
- Uses Kendall's tau correlation to measure ranking accuracy (see the short example after this list)
- Incorporates BM25 retriever with configurable history items (k=4 by default)
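To make that metric concrete, here is a tiny, self-contained illustration of scoring a predicted ranking against a ground-truth ordering with Kendall's tau; the rankings are made up for the example, and only scipy.stats.kendalltau is a real API:

```python
from scipy.stats import kendalltau

# Hypothetical ground-truth preference ranks of 5 items for one user (1 = most preferred)
true_ranks = [1, 2, 3, 4, 5]
# Ranks predicted by an LLM (e.g., produced via listwise ranking of the same items)
pred_ranks = [2, 1, 3, 5, 4]

tau, p_value = kendalltau(true_ranks, pred_ranks)
print(f"Kendall's tau = {tau:.2f}")  # 0.60 here: mostly concordant, two adjacent swaps
# tau ranges from -1.0 (completely reversed order) to 1.0 (perfect agreement)
```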

Notable Findings:
- Current LLMs struggle with true personalization, achieving only moderate correlation scores
- Larger models don't always perform better - challenging conventional scaling laws
- Pairwise and listwise ranking methods outperform pointwise approaches
- Open-source models like Mistral-123B and Llama-3-405B compete well with proprietary models
- Weight merging strategy shows promise for improving personalization capabilities

The research reveals that while LLMs excel at many tasks, they still face significant challenges in understanding individual user preferences. This work opens new avenues for improving personalized recommendation systems and highlights the importance of developing better evaluation methods.

A must-read for anyone interested in LLMs, recommender systems, or personalization technology. The team has made their benchmark and code publicly available for further research.
reacted to nicolay-r's post with 👀 about 8 hours ago
🚨 If you want to quickly apply various reasoning techniques 🧠 to your dataset, then I am happy to save you time and introduce 🌌 nlp-thirdgate 🌌

https://github.com/nicolay-r/nlp-thirdgate

This is a hub of third-party providers like OpenAI, Replicate, OpenRouter, and Hugging Face 🤗 Transformers that can be used for various NLP tasks in a no-strings mode, so you decide which dependencies to install. I personally find this handy for:
📙 quick deployment of scripts in notebooks like Google Colab;
📦 empowering existing apps with machine learning;

📷 The example below demonstrates how to quickly start reasoning over rows of CSV / JSONL data.

To get started, all you have to do is download one of the providers and pass it to the script as shown in the image below.
🌟 Powered by bulk-chain: https://github.com/nicolay-r/bulk-chain
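The original post illustrates the call with an image that is not reproduced here; as a rough, provider-agnostic sketch of the same idea (row-by-row reasoning over JSONL with the provider supplied as a plain callable), the ask helper below is hypothetical and simply stands in for whichever provider you download; it is not the actual bulk-chain API:

```python
import json
from typing import Callable, Iterator

def reason_over_jsonl(path: str, prompt_template: str,
                      ask: Callable[[str], str]) -> Iterator[dict]:
    """Apply a reasoning prompt to every row of a JSONL file.
    `ask` is whatever provider you plugged in (OpenAI, Replicate, a local
    Transformers pipeline, ...) wrapped as a simple str -> str function."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            row = json.loads(line)
            row["reasoning"] = ask(prompt_template.format(**row))
            yield row

# Usage with a hypothetical provider wrapper:
# for out in reason_over_jsonl("data.jsonl", "Classify the sentiment: {text}", ask=my_provider):
#     print(out["reasoning"])
```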
reacted to csabakecskemeti's post with 🔥 1 day ago
reacted to Bils's post with 🔥 1 day ago
🚀 We're excited to share major improvements to our Janus-Pro-7B Text-to-Image Generation Space!
🎨 What's New:
1. Critical Bug Fixes
2. Enhanced Features
3. UI Improvements
4. Performance Boost
Try It Now:
Bils/DeepseekJanusPro-Image
replied to their post 1 day ago

Sure, you should start with the data and the base model and begin continual pre-training (CPT) if a fine-tuned one is terrible at the language of choice; if not, you can skip it. Then, you can merge your derivative with the base model using the technique I mentioned earlier in a recent reply. Next, jump into supervised fine-tuning with proper data and reinforcement learning. You can use the open-r1 pipeline from Hugging Face to do so (GitHub).

Of course, to train such a big model, you need at least 4x H100 GPUs to be reasonable.

replied to their post 1 day ago

Hi @joermd! It hasn't been trained; it has been merged instead using arcee-ai's mergekit. You can find it on GitHub, and the configuration I've used is on the model's README page.

posted an update 1 day ago
Ok, my 14B DeepSeek R1 merge with Qwen2.5 1M is really hot right now—it's got 2.6k downloads! It's sitting pretty as the top trending model on the third page. 🔥

Check it out if you haven't already!
mkurman/Qwen2.5-14B-DeepSeek-R1-1M
reacted to fdaudens's post with 🚀 1 day ago
🚀 The open source community is unstoppable: 4M total downloads for DeepSeek models on Hugging Face, with 3.2M coming from the 600+ models created by the community.

That's 30% more than yesterday!
reacted to singhsidhukuldeep's post with 🚀 3 days ago
While everyone is buzzing about DeepSeek AI R1's groundbreaking open-source release, ByteDance has quietly launched something remarkable: Trae, an adaptive AI IDE that's redefining the development experience, and unlike competitors like Cursor, it's completely FREE!

Trae is a sophisticated development environment built on Microsoft's VSCode foundation (with a nice skin on top), offering unlimited free access to both OpenAI's GPT-4o and Anthropic's Claude-3.5-Sonnet models.

Technical Highlights:
- Real-time AI pair programming with comprehensive codebase understanding
- Natural language commands for code generation and project-level development
- Intelligent task decomposition for automated planning and execution
- Seamless VS Code and Cursor configuration compatibility
- Multi-language support with specialized optimization for English and Chinese interfaces

Currently available for macOS (Windows version in development), Trae is distributed through ByteDance's Singapore subsidiary, Spring (SG) Pte. What sets it apart is its ability to handle mixed-language workflows and enhanced localization features that address common pain points in existing IDEs.

The AI assistant can generate code snippets, optimize logic, and even create entire projects from scratch through natural language prompts. It also features an innovative AI Chat system accessible via keyboard shortcuts for real-time coding assistance.

For developers looking to enhance their productivity without breaking the bank, Trae offers enterprise-grade AI capabilities completely free during its initial release. This move by ByteDance signals a significant shift in the AI IDE landscape, challenging established players with a robust, accessible alternative.

Try it at trae.ai
reacted to sagar007's post with ❤️ 3 days ago
🚀 Just built a Perplexity-inspired AI search assistant using Gradio, DeepSeek, and DuckDuckGo!
Ask it anything, and it’ll:

Scour the web for answers 📚

Cite sources like a pro 🔗

Even talk back with TTS (thanks, Kokoro!) 🎙️

Check it out → sagar007/DeepSeekR1_Search
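The Space itself isn't reproduced here, but a stripped-down sketch of the same pattern (a web search feeding an answer box in Gradio) could look like the following; the LLM and TTS steps are omitted, and the duckduckgo_search usage reflects my understanding of that library, so treat the exact signatures as assumptions:

```python
import gradio as gr
from duckduckgo_search import DDGS  # pip install duckduckgo-search gradio

def search_and_answer(question: str) -> str:
    """Search the web and return snippets plus cited sources. A full app would
    feed these snippets and the question to an LLM (e.g., a DeepSeek R1 model)
    and optionally speak the answer with a TTS engine such as Kokoro."""
    with DDGS() as ddgs:
        hits = list(ddgs.text(question, max_results=5))
    snippets = "\n".join(h["body"] for h in hits)
    sources = "\n".join(f"- {h['title']} ({h['href']})" for h in hits)
    return f"Web context:\n{snippets}\n\nSources:\n{sources}"

demo = gr.Interface(fn=search_and_answer, inputs="text", outputs="text",
                    title="Minimal search-assistant sketch")

if __name__ == "__main__":
    demo.launch()
```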
New activity in mkurman/Qwen2.5-14B-DeepSeek-R1-1M 4 days ago

Merge strategy (#1) · opened 4 days ago by FiditeNemini

Mergekit config (#2) · opened 4 days ago by ehartford
posted an update 4 days ago
I’ve simplified things for the AI OS community!

Check out Qwen2.5-14B-DeepSeek-R1-1M! This one's a cool blend of the latest Qwen 2.5 (14 billion parameters with a massive 1 million token context window) and the DeepSeek R1 version of the Qwen 2.5 14B base model.

Enjoy! 🚀

mkurman/Qwen2.5-14B-DeepSeek-R1-1M