Update README.md
README.md CHANGED

@@ -21,23 +21,9 @@ This model is a fine-tuned version of [unsloth/llama-3-8b-Instruct-bnb-4bit](htt
## Model description

-
-   - Imports necessary libraries and sets up the environment.
-   - Configures GPU settings and initializes the Jupyter Widgets.
-
-2. **Data Preparation**:
-   - Loads and preprocesses the dataset.
-   - Splits the data into training and validation sets.
-
-3. **Model Initialization**:
-   - Loads the pre-trained model.
-   - Configures the model for fine-tuning.
-
-4. **Training Loop**:
-   - Implements the training loop with real-time progress updates.
-   - Displays training metrics and updates the progress bar widget.

## How to use
@@ -54,14 +40,41 @@ from peft import PeftModel, PeftConfig

2. **Load the Model and Tokenizer**

3. **Prepare Inputs**

-
-
-

### Training hyperparameters
@@ -79,8 +92,6 @@ The following hyperparameters were used during training:

- training_steps: 60
- mixed_precision_training: Native AMP

-
### Training results
-

### Framework versions
## Model description

+This model is a large language model fine-tuned with the unsloth library, which focuses on memory efficiency and speed.
+It demonstrates data preparation, model configuration with LoRA, training with SFTTrainer, and inference with optimized settings.
+The unsloth models, especially the 4-bit quantized versions, enable faster and more memory-efficient training and inference, making them suitable for a wide range of AI and ML applications.
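
As an illustration of the LoRA configuration step mentioned above, the sketch below shows one common way to load the 4-bit base checkpoint and attach an adapter with unsloth. It is a generic example rather than the exact setup behind this model: the sequence length, rank, and target modules are assumed placeholder values.

```python
from unsloth import FastLanguageModel

# Load the 4-bit quantized base model and its tokenizer with unsloth
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-Instruct-bnb-4bit",
    max_seq_length=2048,   # placeholder sequence length
    load_in_4bit=True,
)

# Attach a LoRA adapter; rank, alpha, and target modules are illustrative
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```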
## How to use
2. **Load the Model and Tokenizer**

+```python
+# Load the tokenizer
+tokenizer = AutoTokenizer.from_pretrained("m00bs/llama-3-8b-inst-CausalRelationship-finetune-tokenizer")
+
+# Load the model
+config = PeftConfig.from_pretrained("m00bs/outputs")
+base_model = AutoModelForCausalLM.from_pretrained("unsloth/llama-3-8b-Instruct-bnb-4bit")
+model = PeftModel.from_pretrained(base_model, "m00bs/outputs")
+
+# Move model to GPU if available
+device = "cuda" if torch.cuda.is_available() else "cpu"
+model.to(device)
+```
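
If you prefer to let `accelerate` handle device placement instead of moving the model manually, the 4-bit checkpoint can also be loaded with `device_map="auto"`. This is an optional variant of the step above, reusing the imports from step 1 and assuming `accelerate` and `bitsandbytes` are installed:

```python
# Alternative: let accelerate place the quantized weights, then attach the adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/llama-3-8b-Instruct-bnb-4bit",
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "m00bs/outputs")
```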
3. **Prepare Inputs**

+```python
+# Prepare the input text
+input_text = """As a finance expert, answer the following question about the following market event about Market Event:
+Given that China's full reopening announcement on December 26, 2022 caused an immediate jump in Chinese stock prices, What was the impact of China's full reopening announcement on December 26, 2022 on Chinese stock prices?"""
+
+# Tokenize the input text
+inputs = tokenizer(input_text, return_tensors="pt").to(device)
+```
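
Since the base model is the Instruct variant of Llama 3, you may get cleaner answers by wrapping the question in the tokenizer's chat template rather than passing raw text. A small optional variant, assuming the saved tokenizer keeps the Llama-3 chat template and a recent `transformers` release (where `apply_chat_template` accepts `return_dict`):

```python
# Optional: format the question with the Instruct chat template
messages = [{"role": "user", "content": input_text}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(device)
```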

+4. **Run Inference**

+```python
+# Generate the response
+outputs = model.generate(**inputs, max_new_tokens=300, use_cache=True)
+
+# Decode the output
+response = tokenizer.batch_decode(outputs, skip_special_tokens=True)
+```
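
`batch_decode` returns one string per generated sequence, so for the single prompt above the answer can be inspected with, for example:

```python
# The decoded text includes the prompt followed by the generated answer
print(response[0])
```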
### Training hyperparameters

- training_steps: 60
- mixed_precision_training: Native AMP
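
For readers who want to reproduce a run with these settings, `training_steps: 60` and Native AMP mixed precision map roughly onto the `max_steps` and `fp16` fields of `TrainingArguments` when training with `SFTTrainer`, as sketched below. Everything else (dataset, batch size, learning rate) is a placeholder rather than a value from this card, and the argument layout follows the `trl` releases commonly used with unsloth (newer `trl` moves some of these onto `SFTConfig`):

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# Hypothetical training data with a pre-formatted "text" column
train_dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,                  # LoRA-wrapped model, e.g. from the sketch in the description
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    dataset_text_field="text",
    max_seq_length=2048,          # placeholder
    args=TrainingArguments(
        output_dir="outputs",
        max_steps=60,                    # training_steps: 60
        per_device_train_batch_size=2,   # placeholder
        gradient_accumulation_steps=4,   # placeholder
        learning_rate=2e-4,              # placeholder
        fp16=True,                       # Native AMP mixed precision
        logging_steps=1,
    ),
)
trainer.train()
```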
### Framework versions
|