Ansari123 (ansari parvez)

New activity in black-forest-labs/FLUX.1-dev 15 days ago

Pktiwari · #273 opened 15 days ago by Ansari123

Mky · #272 opened 15 days ago by Ansari123

Golu (1) · #271 opened 15 days ago by Ansari123

reacted to chansung's post with 🚀 16 days ago

A simple summary of DeepSeek-R1 from DeepSeek AI

The RL stage is very important.
↳ However, RL alone is not enough to produce an AI that is truly helpful to people.
↳ So the authors applied a four-stage training pipeline (providing a good starting point, reasoning RL, SFT, and safety RL) and achieved performance comparable to o1.
↳ Simply fine-tuning other open models on data generated by R1 (distillation) yielded performance comparable to o1-mini.

Of course, this is only a brief overview and leaves out most of the detail. All models are available on Hugging Face, and the paper is available in the GitHub repository.

Model: https://huggingface.co/deepseek-ai
Paper: https://github.com/deepseek-ai/DeepSeek-R1

published a model 16 days ago

Ansari123/G

Updated 16 days ago
