Mariusz Kurman (mkurman) PRO
AI & ML interests: AI Tech Lead | MD
Recent Activity
posted an update · 24 minutes ago
Blurred-Thoughts Supervised Fine-Tuning (BT-SFT) 🤖
Can we teach a model to think entirely on its own, without reinforcement learning? Actually, yes.
We can do straightforward supervised fine-tuning with a relatively simple trick: blurring part of the chain-of-thought (CoT). But why is this effective?
We observed that models differ in their thinking processes, and fine-tuning one model on another model's CoT can be inefficient: the student often memorizes the reasoning verbatim instead of learning how to think.
I found that the process can still be efficient if we clearly mark where the model should start and stop thinking, uncover only part of the CoT together with the expected answer, and blur the rest of the CoT. The model then learns only a portion of the thought process while still having to arrive at the expected answer.
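The blurring trick can be sketched as loss masking over a random slice of the CoT span. This is my reading of the idea, not the actual BT-SFT implementation: `blur_cot_labels`, `keep_ratio`, and the index arguments are hypothetical names, and `-100` is the label value conventionally ignored by cross-entropy in Hugging Face-style SFT.

```python
import random

IGNORE_INDEX = -100  # label value skipped by typical SFT cross-entropy losses


def blur_cot_labels(input_ids, think_start, think_end, keep_ratio=0.5, seed=0):
    """Blur (mask) a random portion of the chain-of-thought span.

    input_ids   : full token sequence (prompt + CoT + answer)
    think_start : index of the first CoT token
    think_end   : index one past the last CoT token
    keep_ratio  : fraction of CoT tokens whose labels stay visible

    Returns a label list in which the blurred CoT tokens are set to
    IGNORE_INDEX, so the model is trained only on the uncovered part of
    the thoughts plus the expected answer.
    """
    rng = random.Random(seed)
    labels = list(input_ids)
    cot_positions = list(range(think_start, think_end))
    n_keep = int(len(cot_positions) * keep_ratio)
    kept = set(rng.sample(cot_positions, n_keep))
    for pos in cot_positions:
        if pos not in kept:
            labels[pos] = IGNORE_INDEX
    return labels
```

Prompt and answer tokens keep their labels untouched; only the CoT region is partially hidden, which is what lets the model learn the shape of thinking without copying another model's reasoning token for token.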
To see this in action, check out my experimental BT-SFT of the meditsolutions/Llama-3.2-SUN-2.5B-chat model, fine-tuned on 151 million tokens from the Magpie-Align/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B dataset.
Enjoy! 🚀
PS. If you were curious enough to read this, leave me a comment. It's always nice to chat with open-minded and intelligent people.
reacted to singhsidhukuldeep's post with 🚀 · about 2 hours ago
Groundbreaking Research Alert: Can Large Language Models Really Understand Personal Preferences?
A fascinating new study from researchers at University of Notre Dame, Xi'an Jiaotong University, and Université de Montréal introduces PERRECBENCH - a novel benchmark for evaluating how well Large Language Models (LLMs) understand user preferences in recommendation systems.
Key Technical Insights:
- The benchmark eliminates user rating bias and item quality factors by using relative ratings and grouped ranking approaches
- Implements three distinct ranking methods: pointwise rating prediction, pairwise comparison, and listwise ranking
- Evaluates 19 state-of-the-art LLMs including Claude-3.5, GPT-4, Llama-3, Mistral, and Qwen models
- Uses Kendall's tau correlation to measure ranking accuracy
- Incorporates a BM25 retriever with a configurable number of history items (k=4 by default)
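Kendall's tau, the ranking-accuracy metric named above, can be sketched in plain Python. This is the tau-a variant with no tie correction; the paper's actual evaluation code may differ, and `kendall_tau` is a hypothetical helper (`scipy.stats.kendalltau` is the usual tie-aware implementation).

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a between two score lists over the same items.

    Counts concordant pairs (both lists order the pair the same way)
    minus discordant pairs, normalized by the total number of pairs.
    Returns +1 for identical orderings, -1 for reversed orderings.
    """
    n = len(x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

A score near 1 means the LLM's predicted ranking of items closely matches the user's true preference ordering; PERRECBENCH's finding of only moderate correlations means current models sit well below that.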
Notable Findings:
- Current LLMs struggle with true personalization, achieving only moderate correlation scores
- Larger models don't always perform better, challenging conventional scaling laws
- Pairwise and listwise ranking methods outperform pointwise approaches
- Open-source models like Mistral-123B and Llama-3-405B compete well with proprietary models
- Weight merging strategy shows promise for improving personalization capabilities
The research reveals that while LLMs excel at many tasks, they still face significant challenges in understanding individual user preferences. This work opens new avenues for improving personalized recommendation systems and highlights the importance of developing better evaluation methods.
A must-read for anyone interested in LLMs, recommender systems, or personalization technology. The team has made their benchmark and code publicly available for further research.
mkurman's activity
- Merge strategy (2) · #1 opened 4 days ago by FiditeNemini
- Mergekit config (2) · #2 opened 4 days ago by ehartford
- a runnable script to download all the dataset to colab or kaggle notebook (1) · #1 opened 4 months ago by actualbrain
- Add generated example (1) · #3 opened about 2 months ago by mkurman
- Adding Evaluation Results · #1 opened about 2 months ago by leaderboard-pr-bot
- Adding Evaluation Results · #1 opened 2 months ago by mkurman
- Adding Evaluation Results · #1 opened 3 months ago by leaderboard-pr-bot
- evaluation (2) · #18 opened 3 months ago by ldwang
- Good one, But 😶🌫️ (1) · #1 opened 3 months ago by prithivMLmods
- Adding Evaluation Results · #5 opened 3 months ago by mkurman
- Adding Evaluation Results · #4 opened 3 months ago by mkurman
- Adding Evaluation Results · #3 opened 3 months ago by leaderboard-pr-bot
- Adding Evaluation Results · #2 opened 3 months ago by mkurman
- Adding Evaluation Results · #1 opened 3 months ago by leaderboard-pr-bot
- Sign in button not working - cannot submit model (1) · #963 opened 4 months ago by mkurman