Update description.

src/display/about.py (+13, -7)
@@ -99,7 +99,7 @@ My model disappeared from all the queues, what happened?
 - *A model disappearing from all the queues usually means that there has been a failure. You can check if that is the case by looking for your model [here](https://huggingface.co/datasets/Intel/ld_requests).*
 
 What causes an evaluation failure?
-- *Most of the failures we get come from problems in the submissions (
+- *Most of the failures we get come from problems in the submissions (corrupted files, config problems, wrong parameters selected for eval, ...), so we'd be grateful if you first make sure you have followed the steps in `About`. Some quantized models also fail with runtime errors that require manual checking (as with [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ)); you can post questions about these to the [Community Pages](https://huggingface.co/spaces/Intel/low_bit_open_llm_leaderboard/discussions). However, from time to time, we have failures on our side (hardware/node failures, problems with an update of our backend, connectivity problems ending up in the results not being saved, ...).*
 
 How can I report an evaluation failure?
 - *As we store the logs for all models, feel free to create an issue, **where you link to the requests file of your model** (look for it [here](https://huggingface.co/datasets/Intel/ld_requests)), so we can investigate! If the model failed due to a problem on our side, we'll relaunch it right away!*
@@ -117,6 +117,8 @@ What kind of information can I find?
 Why do models appear several times in the leaderboard?
 - *We run evaluations with user-selected precision and model commit. Sometimes, users submit specific models at different commits and at different precisions (for example, in float16 and 4bit to see how quantization affects performance). You should be able to verify this by displaying the `precision` and `model sha` columns in the display. If, however, you see models appearing several times with the same precision and commit hash, this is not normal.*
 
+Why are the llama series models marked with *?
+- *We evaluate llama.cpp models with the `lm-eval` [GGUF code](https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/models/gguf.py), and we find that the results for some tasks are abnormal even after we make some modifications, so we mark those models and will verify them further.*
 
 ---------------------------
 
@@ -157,17 +159,21 @@ If this step fails, follow the error messages to debug your model before submitting.
 Note: make sure your model is public!
 Note: if your model needs `use_remote_code=True`, we do not support this option yet but we are working on adding it, stay posted!
 
-### 2)
-
+### 2) Confirm your model weights format!
+Your model weights should be in the [safetensors](https://huggingface.co/docs/safetensors/index) format, and if your model is a llama.cpp model, its weights file should end in `Q4_0.gguf`, because that is the only variant we support currently.
 
-### 3)
+### 3) Confirm your model config!
+If your model uses one of the following quantization types: `AutoRound`, `GPTQ`, `AWQ`, `bitsandbytes`, there should be a `quantization_config` entry in your model config, like [this](https://huggingface.co/TheBloke/SOLAR-10.7B-Instruct-v1.0-GPTQ/blob/main/config.json#L28).
+
+
+### 4) Make sure your model has an open license!
 This is a leaderboard for Open LLMs, and we'd love for as many people as possible to know they can use your model 🤗
 
-###
+### 5) Fill up your model card
 When we add extra information about models to the leaderboard, it will be automatically taken from the model card
 
-###
-The compute dtype will pass to `lm-eval` for the inference. Currently, we
+### 6) Select the compute dtype
+The compute dtype will be passed to `lm-eval` for inference. Currently, we support `float16`/`bfloat16`/`float32` for `AutoRound`, `GPTQ`, `AWQ`, and `bitsandbytes`, and `int8` for `llama.cpp`. The default value is `float16`.
 
 """
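The weight-format, config, and compute-dtype requirements in the submission steps can be sketched as a local pre-submission check. This is a hypothetical helper, not part of the leaderboard's code: the directory layout it expects and its `SUPPORTED_DTYPES` table are assumptions taken only from the FAQ text above.

```python
import json
from pathlib import Path

# Hypothetical pre-submission check mirroring steps 2), 3) and 6) above.
# The repo layout and SUPPORTED_DTYPES table are assumptions drawn from
# the FAQ text, not from the leaderboard's actual implementation.
SUPPORTED_DTYPES = {
    "quantized": {"float16", "bfloat16", "float32"},  # AutoRound/GPTQ/AWQ/bitsandbytes
    "llama.cpp": {"int8"},
}

def check_local_model(model_dir, compute_dtype="float16"):
    """Return a list of problems found in a locally downloaded model repo."""
    problems = []
    root = Path(model_dir)
    gguf = list(root.glob("*.gguf"))

    if gguf:
        # llama.cpp-style repo: only the Q4_0 quantization is accepted.
        if not any(f.name.endswith("Q4_0.gguf") for f in gguf):
            problems.append("llama.cpp weights must end in Q4_0.gguf")
        if compute_dtype not in SUPPORTED_DTYPES["llama.cpp"]:
            problems.append(f"compute dtype {compute_dtype!r} is unsupported for llama.cpp")
    else:
        # transformers-style repo: safetensors weights plus a quantization_config.
        if not list(root.glob("*.safetensors")):
            problems.append("no *.safetensors weight files found")
        config_path = root / "config.json"
        if not config_path.is_file():
            problems.append("config.json is missing")
        elif "quantization_config" not in json.loads(config_path.read_text()):
            problems.append("config.json has no quantization_config entry")
        if compute_dtype not in SUPPORTED_DTYPES["quantized"]:
            problems.append(f"compute dtype {compute_dtype!r} is unsupported for quantized models")
    return problems
```

An empty list means the repo passes these surface checks; it does not guarantee the evaluation itself will succeed.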