---
license: apache-2.0
language:
  - en
pipeline_tag: conversational
tags:
  - not-for-all-audiences
base_model: mistralai/Mistral-7B-Instruct-v0.2
---

# ShoriRP-v0.75d

Base: Mistral-7B-Instruct-v0.2

Test model, do not use.

It uses a prompting format different from the base model's (and not Alpaca). The model is not intended for public consumption yet, so no information about the format will be given here.

Without the specific format it was trained on, the model is unlikely to produce the intended outputs.

## Dataset

The dataset is similar to LimaRP, but more niche. Training sample lengths are flexible (from 4k to 32k tokens, at least). It may or may not be released in the future.

## Training details

### Hardware

1x NVIDIA RTX 3090 (24 GB)

### Software

Axolotl

### Training hyperparameters

- load_in_4bit: true
- adapter: qlora
- sequence_len: 16384
- sample_packing: true
- pad_to_sequence_len: false
- gradient_accumulation_steps: 4
- micro_batch_size: 1
- eval_batch_size: 1
- num_epochs: 2
- optimizer: adamw_bnb_8bit
- lr_scheduler: constant
- learning_rate: 0.000085
- weight_decay: 0.05
- train_on_inputs: true
- bf16: true
- fp16: false
- tf32: true
- lora_r: 20
- lora_alpha: 16
- lora_dropout: 0.1
- lora_target_linear: true
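
For reference, below is a minimal sketch of how these settings might be laid out in an Axolotl config file. The `base_model` value comes from the card metadata; the `datasets` path, dataset `type`, and `output_dir` are placeholders (the actual dataset and its format are undisclosed), not the configuration actually used for this model.

```yaml
# Sketch of an Axolotl QLoRA config using the hyperparameters listed above.
# base_model is from the card metadata; the dataset entry and output_dir
# are placeholders, since the real dataset is not public.
base_model: mistralai/Mistral-7B-Instruct-v0.2

load_in_4bit: true
adapter: qlora
lora_r: 20
lora_alpha: 16
lora_dropout: 0.1
lora_target_linear: true

datasets:
  - path: ./data/roleplay-dataset   # placeholder
    type: completion                # placeholder; actual dataset type undisclosed

sequence_len: 16384
sample_packing: true
pad_to_sequence_len: false

gradient_accumulation_steps: 4
micro_batch_size: 1
eval_batch_size: 1
num_epochs: 2
optimizer: adamw_bnb_8bit
lr_scheduler: constant
learning_rate: 0.000085
weight_decay: 0.05
train_on_inputs: true

bf16: true
fp16: false
tf32: true

output_dir: ./shorirp-qlora-out   # placeholder
```

With `micro_batch_size: 1` and `gradient_accumulation_steps: 4` on a single GPU, the effective batch size works out to 4 samples per optimizer step, though with `sample_packing` enabled each packed sample can contain several training examples.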