BigScience Biomedical Datasets

non-profit

AI & ML interests

We aim to unify the schema across many different biomedical NLP resources.

Recent Activity

jfries  updated a dataset 16 days ago
bigbio/bc5cdr
phlobo  updated a dataset about 2 months ago
bigbio/craft

bigbio's activity

mkurman 
posted an update about 3 hours ago
Blurred-Thoughts Supervised Fine-Tuning (BT-SFT) 🤖

Can we teach a model to think completely on its own without reinforcement learning? Actually, yes.

We can do straightforward supervised fine-tuning using a relatively simple trick: blurring a part of CoT thoughts. But why is this effective?

We observed that various models differ in their thinking processes, and fine-tuning one model on another model’s thoughts (CoT) can sometimes be inefficient—often resulting in the model simply memorizing reasoning rather than learning how to actually think.

I discovered that this process can still be efficient if we clearly indicate when the model should start and stop thinking and uncover only a part of CoT and the expected answer, blurring the other part of CoT. This approach allows the model to learn only a portion of the thought process while still arriving at an expected answer.
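One way to picture the idea above is as label masking during SFT. The sketch below is only an interpretation, not the author's actual implementation: it assumes "blurring" means excluding a random subset of CoT tokens from the loss by setting their labels to -100 (the index PyTorch's cross-entropy ignores by default), while the thinking markers and the answer stay fully visible. The function name `blur_cot_labels`, the `keep_ratio` parameter, and the marker ids are hypothetical.

```python
import random

# Label value that PyTorch's cross-entropy loss skips by default.
IGNORE_INDEX = -100


def blur_cot_labels(prompt_ids, cot_ids, answer_ids,
                    think_start_id, think_end_id,
                    keep_ratio=0.5, seed=0):
    """Build (input_ids, labels) where part of the CoT is hidden from the loss.

    Assumption: "blurring" is modeled as loss masking. The model still sees
    every token as input, but is only trained on the uncovered CoT tokens,
    the start/stop thinking markers, and the expected answer.
    """
    rng = random.Random(seed)

    # The full sequence is always fed to the model unchanged.
    input_ids = [*prompt_ids, think_start_id, *cot_ids, think_end_id, *answer_ids]

    # The prompt is never a training target.
    labels = [IGNORE_INDEX] * len(prompt_ids)

    # The markers that say when to start and stop thinking are always learned.
    labels.append(think_start_id)
    # Each CoT token is kept with probability keep_ratio, otherwise blurred.
    labels.extend(
        tok if rng.random() < keep_ratio else IGNORE_INDEX
        for tok in cot_ids
    )
    labels.append(think_end_id)

    # The expected answer is always a training target.
    labels.extend(answer_ids)
    return input_ids, labels


if __name__ == "__main__":
    # Toy token ids; 98 and 99 stand in for hypothetical thinking markers.
    ids, labels = blur_cot_labels(
        prompt_ids=[1, 2],
        cot_ids=[10, 11, 12, 13],
        answer_ids=[20, 21],
        think_start_id=98,
        think_end_id=99,
    )
    print(ids)
    print(labels)
```

The resulting `labels` list has the same length as `input_ids`, so it can be passed as the labels of a standard causal-LM training step. Whether the blurred part should be random tokens, a contiguous span, or chosen some other way is not stated in the post.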

To demonstrate this, you can try my experimental BT-SFT run on the meditsolutions/Llama-3.2-SUN-2.5B-chat model, which was fine-tuned on 151 million tokens from the Magpie-Align/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B dataset.

Enjoy! 🚀

PS. If you were curious enough to read this, leave me a comment. It's always nice to chat with open-minded and intelligent ppl.
not-lain 
posted an update 1 day ago
mkurman 
posted an update 1 day ago
Ok, my 14B DeepSeek R1 merge with Qwen2.5 1M is really hot right now—it's got 2.6k downloads! It's sitting pretty as the top trending model on the third page. 🔥

Check it out if you haven't already!
mkurman/Qwen2.5-14B-DeepSeek-R1-1M
AtAndDev 
posted an update 2 days ago
everywhere i go i see his face
prithivMLmods 
posted an update 2 days ago
Deepswipe by ... Deepseek 🐬🗿

Everything is now in recovery. 📉📈
Tonic 
posted an update 3 days ago
🙋🏻‍♂️ Hey there folks,

Our team made a game during the @mistral-game-jam and we're trying to win the community award!

Try our game out and drop us a ❤️ like to vote for us!

Mistral-AI-Game-Jam/TextToSurvive

Hope you like it!
mmaguero 
posted an update 4 days ago
🚀 Multidimensional Affective Analysis for Guarani/Jopara! 🌎

This project explored affective computing for low-resource languages, focusing on emotion recognition, humor detection, and offensive language identification in Guarani and Jopara (a code-switching mix of Guarani and Spanish).

Highlights:
🧵 Corpora:
- Emotion Recognition
- Humor Detection
- Offensive Language Identification
💻 Base Models for Fine-Tuning (trained on Guarani Wiki):
- From scratch: BERT-based tiny, small, base and large models
- Continuously pre-trained models: Multilingual-BERT and BETO
📓 Baseline Notebooks:
- Fine-tuning BERT-based models
- NCRF++ models via GitHub

💡 Check the repo!
https://github.com/mmaguero/guarani-multi-affective-analysis

📖 Check out the publication here:
- https://digibug.ugr.es/handle/10481/98843
- https://link.springer.com/article/10.1007/s12559-023-10165-0

#NLP #AffectiveComputing #LowResourceLanguages #Guarani #Jopara #SentimentAnalysis #AIForAll
Delta-Vector 
posted an update 4 days ago
mkurman 
posted an update 4 days ago
I’ve simplified things for the AI OS community!

Check out Qwen-2.5-14B-DeepSeek-R1-1M! This one's a cool blend of the latest Qwen 2.5 with 14 billion parameters and has a massive 1 million token context window. It also comes with the DeepSeek R1 version of the Qwen 2.5 14B base model.

Enjoy! 🚀

mkurman/Qwen2.5-14B-DeepSeek-R1-1M
AtAndDev 
posted an update 8 days ago
Deepseek gang on fire fr fr
prithivMLmods 
posted an update 10 days ago
AtAndDev 
posted an update 11 days ago
R1 is out! And with a lot of other R1-related models...
mkurman 
posted an update 13 days ago
not-lain 
posted an update 14 days ago
we now have more than 2000 public AI models using ModelHubMixin🤗
prithivMLmods 
posted an update 14 days ago
ChemQwen-vL [ Qwen for Chem Vision ] 🧑🏻‍🔬

🧪Model : prithivMLmods/ChemQwen-vL

📝 ChemQwen-vL is a vision-language model fine-tuned from the Qwen2VL-2B Instruct model. It was trained on chemical compounds in the International Chemical Identifier (InChI) format and is optimized for chemical compound identification. Given an image of a compound, the model generates its InChI and a description. It works as a multi-modal, image-text-to-text model. It has been fine-tuned using datasets from: https://iupac.org/projects/

📒Colab Demo: https://tinyurl.com/2pn8x6u7, Collection : https://tinyurl.com/2mt5bjju

Inference with the documentation is possible with the help of the ReportLab library. https://pypi.org/project/reportlab/

🤗: @prithivMLmods
Tonic 
posted an update 15 days ago
🙋🏻‍♂️ Hey there folks,

Facebook AI just released JASCO models that make music stems.

You can try it out here: Tonic/audiocraft

Hope you like it
jfries 
in bigbio/bc5cdr 16 days ago
Tonic 
posted an update 17 days ago
🙋🏻‍♂️ Hey there folks, Open LLM Europe just released the Lucie 7B-Instruct model, a bilingual instruct model trained on open data! You can check out my unofficial demo here while we wait for the official inference API from the group: Tonic/Lucie-7B. Hope you like it 🚀
not-lain 
posted an update 19 days ago
Published a new blogpost 📖
In this blogpost I go through the transformer architecture, emphasizing how shapes propagate through each layer.
🔗 https://huggingface.co/blog/not-lain/tensor-dims
some interesting takeaways :