True, but it seems there is nothing to evaluate right now. I assume the ultimate goal is to train a new reasoning model and then evaluate it with the same metrics used for o1 and DeepSeek-R1.
Leo
elephant-bear
Commented on the article "Open-R1: a fully open reproduction of DeepSeek-R1" 10 days ago