---
base_model:
- CohereForAI/c4ai-command-r-plus
library_name: transformers
tags:
- mergekit
- merge
language:
- en
- fr
- de
- es
- it
- pt
- ja
- ko
- zh
- ar
pipeline_tag: text-generation
license: cc-by-nc-4.0
---
# Megac4ai-command-r-plus
[GGUF version is here.](https://huggingface.co/nitky/Megac4ai-command-r-plus-gguf)
🚨 **This model was created using a special version of mergekit that supports c4ai-command-r-plus.**
This is a 160B frankenmerge model created by interleaving layers of [CohereForAI/c4ai-command-r-plus](https://huggingface.co/CohereForAI/c4ai-command-r-plus) with itself using mergekit.
## Output comparison
### Test Case Details
Condition: temperature=0.3 (the prompt is in Japanese; an English translation is shown here)
```
<|START_OF_TURN_TOKEN|><|USER_TOKEN|>Tim: Hey, how's it going?
Kim: I was trying to get a bunch of things done, but I ended up procrastinating again.
Tim: What were you trying to do?
Kim: My university assignment. I just can't get motivated.
Tim: If you can't concentrate, you should try the Pomodoro Technique.
Kim: What's that?
Tim: You work for 25 minutes, take a 5-minute break, and repeat. Each work session is short, so you can concentrate.
Kim: Hmm, it's not really about concentration.
Tim: Then how about working on the assignment at your desk for just 5 minutes a day?
Kim: You can't get anything done in 5 minutes, can you?
Tim: The point is to work at your desk, even if only for a short time. If you stop while you still want to do more, you'll keep it up every day and become able to work for longer.
Kim: That does sound doable. Thanks!
Q: What is Kim going to try? And why is Kim going to try it?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>
```
This task is included in [elyza/ELYZA-tasks-100](https://huggingface.co/datasets/elyza/ELYZA-tasks-100).
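The prompt above wraps the user text in turn tokens. As a minimal sketch of that layout, the helper below (a hypothetical function, not part of any library) builds the same single-turn string by hand. It assumes the tokenizer adds `<BOS_TOKEN>` separately, so that token is left out.

```python
# Turn markers copied from the prompt shown in this card.
START_OF_TURN = "<|START_OF_TURN_TOKEN|>"
END_OF_TURN = "<|END_OF_TURN_TOKEN|>"
USER = "<|USER_TOKEN|>"
CHATBOT = "<|CHATBOT_TOKEN|>"


def build_single_turn_prompt(user_message: str) -> str:
    """Wrap one user message and open the chatbot turn (hypothetical helper)."""
    return (
        f"{START_OF_TURN}{USER}{user_message}{END_OF_TURN}"
        f"{START_OF_TURN}{CHATBOT}"
    )


prompt = build_single_turn_prompt("Hello, how are you?")
print(prompt)
```

In practice `tokenizer.apply_chat_template` is the safer route, since it follows whatever template ships with the model.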
### Output Example
| Model | Output (translated from Japanese) |
|-------|--------|
| CohereForAI/c4ai-command-r-plus | Kim was trying to do a university assignment, but could not get motivated and seems to have put it off. |
| nitky/Megac4ai-command-r-plus | Kim is trying to do a university assignment but cannot get motivated or concentrate on the work, so Kim asks Tim for advice. By trying the methods Tim suggests, the Pomodoro Technique and working on the assignment at the desk for just 5 minutes a day, Kim is trying to build a habit of working on assignments. |
## Test environment
This model was tested using [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main). Generation used the `min_p` and `Null preset` presets with temperature=0.3.
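For readers unfamiliar with min-p sampling, here is a rough sketch of the idea in plain Python: tokens whose probability falls below `min_p` times the top token's probability are dropped, and the rest are renormalized. The `min_p=0.05` default and the choice to apply temperature before filtering are assumptions for illustration; they are not taken from this card or from the text-generation-webui preset.

```python
import math


def min_p_filter(logits, temperature=0.3, min_p=0.05):
    """Return renormalized probabilities after min-p filtering (illustrative)."""
    # Temperature scaling, then a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep only tokens at least min_p times as likely as the top token.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize the surviving probabilities so they sum to 1.
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


probs = min_p_filter([2.0, 1.0, 0.0])
print(probs)
```

At a low temperature such as 0.3 the distribution is already sharp, so the filter tends to leave only a few candidate tokens.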
## Usage
Please install `transformers` from the source repository that includes the necessary changes for this model.
```python
# pip install 'git+https://github.com/huggingface/transformers.git'
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "nitky/megac4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format message with the command-r-plus chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
# Resulting prompt:
# <BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello, how are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
### Quantized model through bitsandbytes, 4-bit precision
```python
# pip install 'git+https://github.com/huggingface/transformers.git' bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(load_in_4bit=True)

model_id = "nitky/megac4ai-command-r-plus"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

# Format message with the command-r-plus chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
# Resulting prompt:
# <BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello, how are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
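To get a feel for why 4-bit loading matters at this size, the back-of-the-envelope sketch below estimates weight memory alone. It assumes roughly 160e9 parameters (from the "160b" figure in this card) and ignores quantization metadata, activations, and the KV cache, so real usage will be higher.

```python
# Approximate parameter count, taken from the "160b" figure in this card.
PARAMS = 160e9


def weight_gib(num_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in GiB (no overhead included)."""
    return num_params * bits_per_param / 8 / 2**30


for label, bits in [("float16", 16), ("4-bit", 4)]:
    print(f"{label}: ~{weight_gib(PARAMS, bits):.0f} GiB")
```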
## Merge Details
### Merge Method
This model was merged using the passthrough merge method.
### Models Merged
The following models were included in the merge:
* [CohereForAI/c4ai-command-r-plus](https://huggingface.co/CohereForAI/c4ai-command-r-plus)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
dtype: float16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 20]
    model: CohereForAI/c4ai-command-r-plus
- sources:
  - layer_range: [11, 31]
    model: CohereForAI/c4ai-command-r-plus
- sources:
  - layer_range: [22, 42]
    model: CohereForAI/c4ai-command-r-plus
- sources:
  - layer_range: [33, 53]
    model: CohereForAI/c4ai-command-r-plus
- sources:
  - layer_range: [44, 64]
    model: CohereForAI/c4ai-command-r-plus
```
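The slice list above can be checked with a little arithmetic. The sketch below assumes each `layer_range` is half-open (`[start, end)`, as in Python slicing) and simply stacks the ranges in order to see how many layers the merged model has and how many source layers appear twice.

```python
from collections import Counter

# Layer ranges copied from the YAML configuration; assumed half-open.
slices = [(0, 20), (11, 31), (22, 42), (33, 53), (44, 64)]

# Passthrough stacking: concatenate the source layers slice by slice.
merged_layers = [layer for start, end in slices for layer in range(start, end)]

counts = Counter(merged_layers)
duplicated = sorted(layer for layer, n in counts.items() if n > 1)

print("merged layer count:", len(merged_layers))
print("distinct source layers:", len(counts))
print("source layers used twice:", len(duplicated))
```

Under that assumption, adjacent slices overlap by 9 layers, so every source layer is used at least once and the overlapping ones are repeated.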