How to update the v1 & v2 leaderboards?
There are actually two issues here. First, I have updated my model in the original repository, but since I cannot modify the repository name, I am unable to submit to the v2 leaderboard. Second, I noticed some discrepancies between my local evaluation results and the online results, so I would like to request your help in updating the evaluation results on the v1 leaderboard. Please contact me as soon as possible. Thank you, and have a great day!
Hi
@StarscreamDeceptions
Regarding the first issue, you can submit the model through the submission page. We don't require models submitted to v2 to have been part of v1, as the leaderboard is always open to newly released models!
Could you elaborate further on the second issue? In the meantime, we suggest you start by submitting the model to v2, as v1 is now archived!
Hi, bro
@amztheory
Regarding the first issue, I have renamed the model in my repository and submitted it successfully. As for the second issue, due to my own oversight I failed to submit my model to the v1 leaderboard in time, and I realize it is now archived. Would it be possible for you to evaluate my model and add its results to the v1 leaderboard? This is crucial for my project, as I need accurate results for my trained model on the v1 benchmarks. I sincerely hope you can help me update the v1 leaderboard. Thank you again for your assistance.
Hi
@StarscreamDeceptions
Thank you for your interest in the v1 benchmarks; however, we no longer support evaluating models on them. It's worth noting that the v2 benchmarks should be more representative of your model's performance on Arabic tasks.
Judging by the date of your response, I believe you are the one who submitted AIDC-AI/Marco-LLM-AR-V2. As you might have noticed, the model failed to get evaluated because it could not be loaded. The issue seems to stem from the fact that your model.safetensors.index.json is missing the 'metadata' property.
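For reference, here is a rough sketch of one way to patch that file locally. It assumes the standard sharded-checkpoint layout, where `weight_map` maps tensor names to shard files, and that the repo is checked out on disk; adjust the path to your setup:

```python
import json, os

repo_dir = "."  # path to your local model checkout (assumption: repo is cloned here)
index_path = os.path.join(repo_dir, "model.safetensors.index.json")

with open(index_path) as f:
    index = json.load(f)

if "metadata" not in index:
    # "total_size" is the combined byte size of all unique shard files
    # referenced in "weight_map"
    shards = set(index["weight_map"].values())
    total = sum(os.path.getsize(os.path.join(repo_dir, s)) for s in shards)
    index["metadata"] = {"total_size": total}
    with open(index_path, "w") as f:
        json.dump(index, f, indent=2)
```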
Let me know once you have it fixed, and I will launch the eval again.
Hi,
@amztheory
I have fixed the issue. Thanks for your help!
Hi
@amztheory
I am wondering how you evaluate the model. Why is there such a big gap between the online and offline results?
@amztheory Hope to get your reply soon, thank you!
Hi
@StarscreamDeceptions
The scores above are for AIDC-AI/Marco-LLM-AR-V2, correct?
I will look into it! Can you confirm that you reproduced the scores with --use-chat-template enabled, since you requested the chat template upon submission?
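For reference, this is roughly what enabling the chat template changes. A minimal sketch using transformers; the question string is purely illustrative, and it assumes your tokenizer config defines a chat template:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("AIDC-AI/Marco-LLM-AR-V2")

question = "ما هي عاصمة فرنسا؟"  # illustrative prompt, not a real benchmark item

# Without the chat template, the evaluator scores the raw text as-is.
print(question)

# With the chat template, role markers and special tokens are inserted,
# which changes the tokenization and can shift scores noticeably.
print(tok.apply_chat_template(
    [{"role": "user", "content": question}],
    tokenize=False,
    add_generation_prompt=True,
))
```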
Hi
@amztheory
Thanks for your response. It seems the chat template caused this problem. Can you remove our results and let me resubmit?
Hi
@amztheory
Looking forward to your reply!
@StarscreamDeceptions
I'll relaunch the eval on your model with chat_template disabled. The current scores will be removed.
@amztheory
Thank you, and have a nice day!
@StarscreamDeceptions
The leaderboard should now reflect the updated scores for your model!
In future submissions, please ensure you select chat_template only if desired.
Keep up the good work!