HirCoir/MiniChat-1.5-3B-Sorah

@@ -7,115 +7,7 @@ license: apache-2.0
 library_name: transformers
 widget:
 - text: <s> [|User|] Hola  </s>[|Assistant|]
-model-index:
-- name: MiniChat-2-3B
-  results:
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: AI2 Reasoning Challenge (25-Shot)
-      type: ai2_arc
-      config: ARC-Challenge
-      split: test
-      args:
-        num_few_shot: 25
-    metrics:
-    - type: acc_norm
-      value: 44.88
-      name: normalized accuracy
-    source:
-      url: >-
-        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=GeneZC/MiniChat-2-3B
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: HellaSwag (10-Shot)
-      type: hellaswag
-      split: validation
-      args:
-        num_few_shot: 10
-    metrics:
-    - type: acc_norm
-      value: 67.69
-      name: normalized accuracy
-    source:
-      url: >-
-        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=GeneZC/MiniChat-2-3B
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: MMLU (5-Shot)
-      type: cais/mmlu
-      config: all
-      split: test
-      args:
-        num_few_shot: 5
-    metrics:
-    - type: acc
-      value: 47.59
-      name: accuracy
-    source:
-      url: >-
-        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=GeneZC/MiniChat-2-3B
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: TruthfulQA (0-shot)
-      type: truthful_qa
-      config: multiple_choice
-      split: validation
-      args:
-        num_few_shot: 0
-    metrics:
-    - type: mc2
-      value: 49.64
-    source:
-      url: >-
-        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=GeneZC/MiniChat-2-3B
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: Winogrande (5-shot)
-      type: winogrande
-      config: winogrande_xl
-      split: validation
-      args:
-        num_few_shot: 5
-    metrics:
-    - type: acc
-      value: 66.46
-      name: accuracy
-    source:
-      url: >-
-        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=GeneZC/MiniChat-2-3B
-      name: Open LLM Leaderboard
-  - task:
-      type: text-generation
-      name: Text Generation
-    dataset:
-      name: GSM8k (5-shot)
-      type: gsm8k
-      config: main
-      split: test
-      args:
-        num_few_shot: 5
-    metrics:
-    - type: acc
-      value: 32.68
-      name: accuracy
-    source:
-      url: >-
-        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=GeneZC/MiniChat-2-3B
-      name: Open LLM Leaderboard
 tags:
 - unsloth
 - Sorah
@@ -124,16 +16,6 @@ tags:
 ## MiniChat-2-3B
-📑 [arXiv](https://arxiv.org/abs/2311.07052) | 👻 [GitHub](https://github.com/GeneZC/MiniMA) | 🤗 [HuggingFace-MiniMA](https://huggingface.co/GeneZC/MiniMA-3B) | 🤗 [HuggingFace-MiniChat](https://huggingface.co/GeneZC/MiniChat-3B) | 🤖 [ModelScope-MiniMA](https://modelscope.cn/models/GeneZC/MiniMA-3B) | 🤖 [ModelScope-MiniChat](https://modelscope.cn/models/GeneZC/MiniChat-3B) | 🤗 [HuggingFace-MiniChat-1.5](https://huggingface.co/GeneZC/MiniChat-1.5-3B) | 🤗 [HuggingFace-MiniMA-2](https://huggingface.co/GeneZC/MiniMA-2-3B) | 🤗 [HuggingFace-MiniChat-2](https://huggingface.co/GeneZC/MiniChat-2-3B)
-🆕 **Updates from MiniChat-3B**:
-- better base model MiniMA-2-3B;
-- better data mixture;
-- use of [NEFTune](https://arxiv.org/abs/2310.05914);
-- use of [DPO](https://arxiv.org/abs/2305.18290).
-❗ Must comply with LICENSE of LLaMA2 since it is derived from LLaMA2.
 A language model continued from MiniMA-3B and finetuned on both instruction and preference data.
 Surpassing Vicuna-7B and approximating LLaMA-2-Chat-7B on MT-Bench.
@@ -173,26 +55,3 @@ output = tokenizer.decode(output_ids, skip_special_tokens=True).strip()
 # output: "def common_elements(arr1, arr2):\n    if len(arr1) == 0:\n        return []\n    if len(arr2) == 0:\n        return arr1\n\n    common_elements = []\n    for element in arr1:\n        if element in arr2:\n            common_elements.append(element)\n\n    return common_elements"
 # Multiturn conversation could be realized by continuously appending questions to `conv`.
 ```
-## Bibtex
-```bibtex
-@article{zhang2023law,
-    title={Towards the Law of Capacity Gap in Distilling Language Models},
-    author={Zhang, Chen and Song, Dawei and Ye, Zheyu and Gao, Yan},
-    year={2023},
-    url={https://arxiv.org/abs/2311.07052}
-}
-```
-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_GeneZC__MiniChat-2-3B)
-|             Metric              |Value|
-|---------------------------------|----:|
-|Avg.                             |51.49|
-|AI2 Reasoning Challenge (25-Shot)|44.88|
-|HellaSwag (10-Shot)              |67.69|
-|MMLU (5-Shot)                    |47.59|
-|TruthfulQA (0-shot)              |49.64|
-|Winogrande (5-shot)              |66.46|
-|GSM8k (5-shot)                   |32.68|

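The usage snippet kept by this diff notes that multi-turn conversation works by appending questions to `conv`. A minimal sketch of what that could look like at the string level is given below. The single-turn layout, including its spacing, is copied verbatim from the widget line (`<s> [|User|] Hola  </s>[|Assistant|]`). The multi-turn layout is an assumption: earlier assistant replies are taken to be closed with `</s>` before the next `[|User|]` turn. The helper name `build_prompt` is hypothetical and is not part of the model's own `conversation` module, so check it against the upstream template before relying on it.

```python
from typing import List, Tuple

# Markers taken from the widget line in the model card front matter.
BOS = "<s> "
USER = "[|User|] "
ASSISTANT = "[|Assistant|]"
EOS = "</s>"


def build_prompt(history: List[Tuple[str, str]], question: str) -> str:
    """Assemble a prompt string from past (user, assistant) turns plus a new question.

    ASSUMPTION: completed assistant replies are terminated with `</s>` and
    turns are concatenated without newlines. Only the single-turn shape is
    confirmed by the widget example.
    """
    parts = [BOS]
    for user_text, assistant_text in history:
        # Two spaces before `</s>` mirror the widget example exactly.
        parts.append(f"{USER}{user_text}  {EOS}")
        parts.append(f"{ASSISTANT} {assistant_text}{EOS}")
    # The prompt ends with an open assistant marker for the model to continue.
    parts.append(f"{USER}{question}  {EOS}{ASSISTANT}")
    return "".join(parts)


if __name__ == "__main__":
    # Single turn: reproduces the widget string.
    print(build_prompt([], "Hola"))
    # Two turns: the earlier exchange is replayed before the new question.
    print(build_prompt([("Hola", "Hi!")], "Gracias"))
```

With an empty history the function returns the widget string unchanged, which is the only case the model card itself confirms.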