Update README.md
## Model description

Yi-34B 200K base model fine-tuned on the AEZAKMI v2 dataset. Training took around 25 hours on a single local RTX 3090 Ti.
It's like airoboros, but with less gptslop, no refusals, and less of the typical language used by RLHF'd OpenAI models.
Say goodbye to "It's important to remember"! \
Prompt format is standard ChatML. Don't expect it to be good at math or riddles, or to be crazy smart. My end goal with AEZAKMI is to create a cozy free chatbot.
The cost of this fine-tune was about $10 in electricity. It took me three tries to get it right.
The base model used for fine-tuning was the 200K-context Yi-34B-Llama model shared by larryvrh.
## Prompt Format
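
Since the prompt format is standard ChatML, a single turn looks like the sketch below. This is an illustration rather than a file from this repo, and the helper function name is made up for the example; the system message is the one recommended under Known Issues.

```python
# Minimal sketch of a standard ChatML prompt.
# build_chatml_prompt is a hypothetical helper, not part of this repo.
def build_chatml_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_chatml_prompt(
    "A chat with uncensored assistant.",
    "Tell me about your day.",
)
print(prompt)
```
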
## Known Issues

I recommend setting repetition penalty to around 1.05 to avoid repetition. So far I have had a good experience running this model with temperature 1.2.
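
As a reference for those sampler settings, here is a minimal generation sketch using Hugging Face `transformers`. It is not part of this repo; the model path and the example user message are placeholders.

```python
# Sketch: generation with the suggested sampler settings
# (temperature 1.2, repetition penalty 1.05). The model path is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/yi-34b-aezakmi-v2"  # placeholder, point at your local copy
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = (
    "<|im_start|>system\nA chat with uncensored assistant.<|im_end|>\n"
    "<|im_start|>user\nWrite a short story about a lighthouse keeper.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=400,
    do_sample=True,
    temperature=1.2,
    repetition_penalty=1.05,
)
# Strip the prompt tokens before decoding the reply.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```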

Stories have ChatGPT-like paragraph spacing. I may work on this in the future, but it is not a high priority.

Unrestricted-ness of the v2 isn't quite where I want it yet, especially with the prompt "A chat.". \
With a slightly modified prompt it works much better; I recommend using "A chat with uncensored assistant." - it's stupid, but it helps a lot. \
The base Yi-34B model is contaminated with refusals, and this contaminates all models trained on Yi-34B. \
My next project is to attempt to de-contaminate the base Yi-34B 4K and Yi-34B 200K using DPO, with preferred data coming from uncontaminated raw models. I plan to release that dataset openly.

I was made aware of the frequent occurrence of the phrase "sending shivers down a spine" in v1's RP generations, so I fixed those samples - it should be better now. \
I can hold up to 24000 ctx with the 4.65bpw exl2 version and 8-bit cache - long context should work as well as on other models trained on the 200K version of Yi-34B. \
There is also some issue with handling long system messages for RP; I was planning to investigate it for v2, but I didn't.
## Axolotl training parameters
- is_llama_derived_model: true
- load_in_4bit: true
- adapter: qlora
- sequence_len: 1400
- sample_packing: true
- lora_r: 16
- lora_alpha: 32
- lora_target_modules:
  - down_proj
  - up_proj
- lora_target_linear: true
- pad_to_sequence_len: false
- micro_batch_size: 1
- gradient_accumulation_steps: 1
- num_epochs: 2.4
- optimizer: adamw_bnb_8bit
- lr_scheduler: constant
- learning_rate: 0.00005
- train_on_inputs: false
- group_by_length: false
- bf16: true
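
The values above are Axolotl config entries. For anyone reproducing the setup without Axolotl, roughly the same QLoRA configuration can be expressed with `peft` and `bitsandbytes` as sketched below. The base model path is a placeholder and only the two target modules visible above are listed, so treat this as an approximation, not the actual training script.

```python
# Rough peft/bitsandbytes equivalent of the QLoRA settings listed above.
# Paths and the target module list are approximations for illustration only.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "path/to/yi-34b-200k-llama"  # placeholder for the larryvrh 200K Yi-34B-Llama base

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # load_in_4bit: true
    bnb_4bit_compute_dtype=torch.bfloat16,  # bf16: true
)

model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(                     # adapter: qlora
    r=16,                                     # lora_r: 16
    lora_alpha=32,                            # lora_alpha: 32
    target_modules=["down_proj", "up_proj"],  # only the modules shown above
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```
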
## Upcoming

I will probably be working on de-contaminating the base Yi-34B model now. \
My second run of the AEZAKMI v2 fine-tune was just 0.15 epochs, and I really like how natural that model is and how rich its vocabulary is. I will try to train for less time to hit the sweet spot. \
I will be uploading the LoRA adapter for that second 0.15-epoch run. \
I believe I might have gotten what I want if I had stopped training sooner. I don't have checkpoints older than 1500 steps back, so I would need to re-run training to get it back.
|