Mariusz Kurman (mkurman) PRO
AI & ML interests: AI Tech Lead | MD
Recent Activity
posted an update · 24 minutes ago
Blurred-Thoughts Supervised Fine-Tuning (BT-SFT) 🤖
Can we teach a model to think entirely on its own, without reinforcement learning? Actually, yes.
We can do straightforward supervised fine-tuning with a relatively simple trick: blurring part of the chain-of-thought (CoT). But why is this effective?
We observed that models differ in their thinking processes, and fine-tuning one model on another model's CoT can be inefficient: the student often memorizes the reasoning verbatim instead of learning how to think.
I found that the process can still be efficient if we clearly mark where the model should start and stop thinking, uncover only part of the CoT together with the expected answer, and blur the rest of the CoT. The model then learns only a portion of the thought process while still having to arrive at the expected answer.
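The blurring trick can be sketched as loss masking over a random slice of the CoT span. This is my reading of the idea, not the actual BT-SFT implementation: `blur_cot_labels`, `keep_ratio`, and the index arguments are hypothetical names, and `-100` is the label value conventionally ignored by cross-entropy in Hugging Face-style SFT.

```python
import random

IGNORE_INDEX = -100  # label value skipped by typical SFT cross-entropy losses


def blur_cot_labels(input_ids, think_start, think_end, keep_ratio=0.5, seed=0):
    """Blur (mask) a random portion of the chain-of-thought span.

    input_ids   : full token sequence (prompt + CoT + answer)
    think_start : index of the first CoT token
    think_end   : index one past the last CoT token
    keep_ratio  : fraction of CoT tokens whose labels stay visible

    Returns a label list in which the blurred CoT tokens are set to
    IGNORE_INDEX, so the model is trained only on the uncovered part of
    the thoughts plus the expected answer.
    """
    rng = random.Random(seed)
    labels = list(input_ids)
    cot_positions = list(range(think_start, think_end))
    n_keep = int(len(cot_positions) * keep_ratio)
    kept = set(rng.sample(cot_positions, n_keep))
    for pos in cot_positions:
        if pos not in kept:
            labels[pos] = IGNORE_INDEX
    return labels
```

Prompt and answer tokens keep their labels untouched; only the CoT region is partially hidden, which is what lets the model learn the shape of thinking without copying another model's reasoning token for token.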
To see this in action, check out my experimental BT-SFT of the meditsolutions/Llama-3.2-SUN-2.5B-chat model, fine-tuned on 151 million tokens from the Magpie-Align/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B dataset.
Enjoy! 🚀
PS. If you were curious enough to read this, leave me a comment. It's always nice to chat with open-minded and intelligent people.
reacted to singhsidhukuldeep's post with 🚀 · about 2 hours ago
Groundbreaking Research Alert: Can Large Language Models Really Understand Personal Preferences?
A fascinating new study from researchers at University of Notre Dame, Xi'an Jiaotong University, and Université de Montréal introduces PERRECBENCH - a novel benchmark for evaluating how well Large Language Models (LLMs) understand user preferences in recommendation systems.
Key Technical Insights:
- The benchmark eliminates user rating bias and item quality factors by using relative ratings and grouped ranking approaches
- Implements three distinct ranking methods: pointwise rating prediction, pairwise comparison, and listwise ranking
- Evaluates 19 state-of-the-art LLMs including Claude-3.5, GPT-4, Llama-3, Mistral, and Qwen models
- Uses Kendall's tau correlation to measure ranking accuracy
- Incorporates a BM25 retriever with a configurable number of history items (k=4 by default)
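Kendall's tau, the ranking-accuracy metric named above, can be sketched in plain Python. This is the tau-a variant with no tie correction; the paper's actual evaluation code may differ, and `kendall_tau` is a hypothetical helper (`scipy.stats.kendalltau` is the usual tie-aware implementation).

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a between two score lists over the same items.

    Counts concordant pairs (both lists order the pair the same way)
    minus discordant pairs, normalized by the total number of pairs.
    Returns +1 for identical orderings, -1 for reversed orderings.
    """
    n = len(x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

A score near 1 means the LLM's predicted ranking of items closely matches the user's true preference ordering; PERRECBENCH's finding of only moderate correlations means current models sit well below that.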
Notable Findings:
- Current LLMs struggle with true personalization, achieving only moderate correlation scores
- Larger models don't always perform better, challenging conventional scaling laws
- Pairwise and listwise ranking methods outperform pointwise approaches
- Open-source models like Mistral-123B and Llama-3-405B compete well with proprietary models
- Weight merging strategy shows promise for improving personalization capabilities
The research reveals that while LLMs excel at many tasks, they still face significant challenges in understanding individual user preferences. This work opens new avenues for improving personalized recommendation systems and highlights the importance of developing better evaluation methods.
A must-read for anyone interested in LLMs, recommender systems, or personalization technology. The team has made their benchmark and code publicly available for further research.
mkurman's activity
- Merge strategy (2) · #1 opened 4 days ago by FiditeNemini
- Mergekit config (2) · #2 opened 4 days ago by ehartford
- a runnable script to download all the dataset to colab or kaggle notebook (1) · #1 opened 4 months ago by actualbrain
- Add generated example (1) · #3 opened about 2 months ago by mkurman
- Adding Evaluation Results · #1 opened about 2 months ago by leaderboard-pr-bot
- Adding Evaluation Results · #1 opened 2 months ago by mkurman
- Adding Evaluation Results · #1 opened 3 months ago by leaderboard-pr-bot
- evaluation (2) · #18 opened 3 months ago by ldwang
- Good one, But 😶🌫️ (1) · #1 opened 3 months ago by prithivMLmods
- Adding Evaluation Results · #5 opened 3 months ago by mkurman
- Adding Evaluation Results · #4 opened 3 months ago by mkurman
- Adding Evaluation Results · #3 opened 3 months ago by leaderboard-pr-bot
- Adding Evaluation Results · #2 opened 3 months ago by mkurman
- Adding Evaluation Results · #1 opened 3 months ago by leaderboard-pr-bot
- Sign in button not working - cannot submit model (1) · #963 opened 4 months ago by mkurman