dinachen
DeeLearning
AI & ML interests
None yet
Recent Activity
published a model 28 days ago
DeeLearning/Qwen2.5-Math-1.5B-Distill-114k
published a model about 1 month ago
DeeLearning/DeepSeek-R1-Distill-Qwen-1.5B-GRPO
updated a model about 1 month ago
DeeLearning/Qwen2.5-1.5B-Open-R1-Distill
Organizations
None yet
DeeLearning's activity
CheckpointingException | nvidia/Llama3-70B-SteerLM-RM NOT a distributed checkpoint of Megatron
1
#4 opened 5 months ago by DeeLearning
