---
license: apache-2.0
language:
- en
pipeline_tag: conversational
tags:
- not-for-all-audiences
base_model: mistralai/Mistral-7B-Instruct-v0.2
---
|
|
|
Base: [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
|
|
|
Test model, do **not** use.
|
|
|
It employs a prompting format different from the base model's, and _not_ Alpaca. The model is not intended for public consumption yet, so no further information about the format will be given here.
|
|
|
It's unlikely that the model will produce the intended outputs without the specific format it's been trained on.
|
|
|
# Dataset

The dataset is similar to LimaRP, but more niche, with flexible training sample lengths (from 4k to 32k tokens, at least). It might or might not be released in the future.
|
|
|
# Training details
|
## Hardware

1x NVIDIA RTX 3090 24GB
|
|
|
## Software

[Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)
|
|
|
## Training hyperparameters

- load_in_4bit: true
- adapter: qlora
- sequence_len: 16384
- sample_packing: true
- pad_to_sequence_len: false
- gradient_accumulation_steps: 4
- micro_batch_size: 1
- eval_batch_size: 1
- num_epochs: 2
- optimizer: adamw_bnb_8bit
- lr_scheduler: constant
- learning_rate: 0.000085
- weight_decay: 0.05
- train_on_inputs: true
- bf16: true
- fp16: false
- tf32: true
- lora_r: 20
- lora_alpha: 16
- lora_dropout: 0.1
- lora_target_linear: true
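
For reference, the hyperparameters above correspond one-to-one to Axolotl YAML configuration keys. The sketch below assembles them into a minimal QLoRA config; it is **not** the actual file used for this run, and everything not listed above (the dataset entry, dataset type, output directory) is a placeholder or an assumption.

```yaml
# Minimal Axolotl QLoRA config sketch built from the values listed above.
# Keys marked "placeholder/assumption" were NOT documented in this card.
base_model: mistralai/Mistral-7B-Instruct-v0.2

datasets:
  - path: ./data/private_dataset.jsonl  # placeholder: the dataset is not public
    type: completion                    # assumption: the actual prompt format is undisclosed
output_dir: ./qlora-out                 # placeholder

load_in_4bit: true
adapter: qlora
lora_r: 20
lora_alpha: 16
lora_dropout: 0.1
lora_target_linear: true

sequence_len: 16384
sample_packing: true
pad_to_sequence_len: false

gradient_accumulation_steps: 4
micro_batch_size: 1
eval_batch_size: 1
num_epochs: 2
optimizer: adamw_bnb_8bit
lr_scheduler: constant
learning_rate: 0.000085
weight_decay: 0.05
train_on_inputs: true

bf16: true
fp16: false
tf32: true
```

A config along these lines would normally be launched with something like `accelerate launch -m axolotl.cli.train config.yml`, where the filename is again a placeholder.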