---
base_model: unsloth/llama-3-8b-Instruct-bnb-4bit
library_name: peft
license: llama3.1
tags:
- trl
- sft
- unsloth
- generated_from_trainer
model-index:
- name: outputs
  results: []
---
# outputs

This model is a fine-tuned version of unsloth/llama-3-8b-Instruct-bnb-4bit.
## Model description

This is a large language model fine-tuned with the Unsloth library, which focuses on memory efficiency and speed. The workflow covers data preparation, LoRA-based model configuration, training with TRL's SFTTrainer, and inference with optimized settings. Unsloth models, especially the 4-bit quantized variants, enable faster and more memory-efficient training and inference, making them suitable for a wide range of AI and ML applications.
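As an illustration of the LoRA configuration step mentioned above, here is a minimal sketch using Unsloth's `FastLanguageModel` API. The sequence length, rank, alpha, dropout, and target modules shown are assumptions for illustration, not the exact values used to train this model.

```python
from unsloth import FastLanguageModel

# Load the 4-bit quantized base model and its tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-Instruct-bnb-4bit",
    max_seq_length=2048,  # assumed sequence length
    dtype=None,           # auto-detect (bfloat16 on supported GPUs)
    load_in_4bit=True,
)

# Attach LoRA adapters; r, lora_alpha, and target_modules are illustrative choices
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,
    random_state=3407,
)
```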
## How to use

- Import Required Libraries (after installing unsloth, peft, and transformers)
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template
from peft import PeftModel, PeftConfig
```
- Load the Model and Tokenizer
```python
# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("m00bs/llama-3-8b-inst-CausalRelationship-finetune-tokenizer")

# Load the base model and attach the LoRA adapter
config = PeftConfig.from_pretrained("m00bs/outputs")
base_model = AutoModelForCausalLM.from_pretrained("unsloth/llama-3-8b-Instruct-bnb-4bit")
model = PeftModel.from_pretrained(base_model, "m00bs/outputs")

# Move model to GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
```
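Alternatively, Unsloth can load the adapter together with its 4-bit base model in a single call. This is a hedged sketch assuming the adapter repository `m00bs/outputs` resolves its base model from the PEFT config; the sequence length is an assumption.

```python
from unsloth import FastLanguageModel

# Alternative: let Unsloth resolve the base model from the adapter's PEFT config
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="m00bs/outputs",  # LoRA adapter repository
    max_seq_length=2048,         # assumed
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's optimized inference path
```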
- Prepare Inputs
```python
# Prepare the input text
input_text = """As a finance expert, answer the following question about the following market event about Market Event:
Given that China's full reopening announcement on December 26, 2022 caused an immediate jump in Chinese stock prices, What was the impact of China's full reopening announcement on December 26, 2022 on Chinese stock prices?"""

# Tokenize the input text
inputs = tokenizer(input_text, return_tensors="pt").to(device)
```
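Since the imports above pull in `get_chat_template`, the fine-tuned tokenizer likely expects the Llama-3 chat format. The following is a hedged sketch of wrapping the prompt with the tokenizer's chat template before tokenization; whether this exactly matches the training format is an assumption.

```python
# Optionally format the prompt with the tokenizer's chat template before tokenizing
messages = [{"role": "user", "content": input_text}]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(device)
```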
- Run Inference
```python
# Generate the response
outputs = model.generate(**inputs, max_new_tokens=300, use_cache=True)

# Decode the output
response = tokenizer.batch_decode(outputs, skip_special_tokens=True)
```
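`batch_decode` returns a list with one string per generated sequence (including the prompt text), so the answer can be inspected with a simple print:

```python
print(response[0])
```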
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 8
- seed: 3407
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 5
- training_steps: 60
- mixed_precision_training: Native AMP
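As a sketch of how these hyperparameters map onto TRL's `SFTTrainer` and `transformers.TrainingArguments`: the `train_dataset`, `dataset_text_field`, and `max_seq_length` below are placeholders and assumptions, while the numeric values mirror the list above.

```python
import torch
from transformers import TrainingArguments
from trl import SFTTrainer

# `model`, `tokenizer`, and `train_dataset` are assumed to be prepared beforehand;
# the dataset and its text field are placeholders, not the actual training data.
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    dataset_text_field="text",  # assumed field name
    max_seq_length=2048,        # assumed
    args=TrainingArguments(
        per_device_train_batch_size=1,   # train_batch_size
        per_device_eval_batch_size=8,    # eval_batch_size
        gradient_accumulation_steps=4,   # total effective train batch size of 4
        warmup_steps=5,
        max_steps=60,                    # training_steps
        learning_rate=2e-4,
        lr_scheduler_type="linear",
        seed=3407,
        fp16=not torch.cuda.is_bf16_supported(),  # Native AMP mixed precision
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=1,
        output_dir="outputs",
    ),
)
trainer.train()
```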
## Framework versions
- PEFT 0.12.0
- Transformers 4.43.3
- Pytorch 2.3.1+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1