---
license: llama3.1
language:
- en
base_model:
- meta-llama/Llama-3.1-8B
pipeline_tag: question-answering
metrics:
- rouge
---

# LLaMA 3.1-8B Fine-Tuned on ChatDoctor Dataset

## Model Overview
This model is a fine-tuned version of LLaMA 3.1-8B, trained on a curated selection of 1,122 samples from the **ChatDoctor (HealthCareMagic-100k)** dataset. It has been optimized for tasks related to medical consultations.

- **Base Model**: LLaMA 3.1-8B
- **Fine-tuning Dataset**: 1,122 samples from the ChatDoctor dataset
- **Output Format**: GGUF
- **Quantization**: Q4_0 for efficient inference

## Applications
This model is designed to assist in:
- Medical question-answering
- Providing health-related advice
- Assisting in basic diagnostic reasoning (non-clinical use)

## Datasets
- **Training Data**: ChatDoctor-HealthCareMagic-100k (a loading sketch follows the list)
  - **Training Set**: 900 samples
  - **Validation Set**: 100 samples
  - **Test Set**: 122 samples
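
The split could be reproduced along the following lines with the `datasets` library. This is a minimal sketch: the Hub repo id and the use of the first 1,122 rows are assumptions, since the card does not state how the samples were selected.

```python
from datasets import load_dataset

# Hypothetical Hub id for ChatDoctor-HealthCareMagic-100k; adjust to the actual source.
raw = load_dataset("lavita/ChatDoctor-HealthCareMagic-100k", split="train")

# Illustrative 900 / 100 / 122 split over the first 1,122 rows.
subset = raw.select(range(1_122))
train_set = subset.select(range(0, 900))
val_set = subset.select(range(900, 1_000))
test_set = subset.select(range(1_000, 1_122))
```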

## Model Details
| **Feature**                  | **Details**                |
|------------------------------|----------------------------|
| **Model Type**               | Causal Language Model      |
| **Architecture**             | LLaMA 3.1-8B              |
| **Training Data**            | ChatDoctor (1,122 samples) |
| **Quantization**             | Q4_0                      |
| **Deployment Format**        | GGUF                      |

## Training Configuration
The model was fine-tuned with the following hyperparameters (a `TrainingArguments` sketch follows the list):

- **Output Directory**: `output_model`
- **Per-Device Batch Size**: 2
- **Gradient Accumulation Steps**: 16 (Effective batch size: 32)
- **Learning Rate**: 2e-4
- **Scheduler**: Cosine Annealing
- **Optimizer**: AdamW (paged with 32-bit precision)
- **Number of Epochs**: 16
- **Evaluation Strategy**: Per epoch
- **Save Strategy**: Per epoch
- **Logging Steps**: 1
- **Mixed Precision**: FP16
- **Best Model Criteria**: `eval_loss`, with `greater_is_better=False`
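
Expressed as `transformers.TrainingArguments`, these settings would look roughly like this; it is an illustrative sketch, not the exact training script:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="output_model",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,   # effective batch size 32
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    optim="paged_adamw_32bit",
    num_train_epochs=16,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    logging_steps=1,
    fp16=True,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
```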

### LoRA Hyperparameters

The fine-tuning process also used the following LoRA (Low-Rank Adaptation) configuration; a `LoraConfig` sketch follows the list:

- **Rank (r)**: 8
- **Alpha**: 16
- **Dropout**: 0.05
- **Bias**: None
- **Task Type**: Causal Language Modeling (CAUSAL_LM)
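
The corresponding `peft.LoraConfig` would look roughly like this; `target_modules` is an assumption, as the card does not list the adapted layers:

```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],  # assumed; not stated in the card
)

# model = get_peft_model(base_model, lora_config)
```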

Validation was performed on a held-out subset of the dataset. The training and validation loss curves are shown below:

<p align="center">
  <img src="train-val-curve.png" alt="Training and Validation Loss" width="85%"/>
</p>

## Evaluation Results
| Model               | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---------------------|---------|---------|---------|
| Original Model      | 0.1726  | 0.0148  | 0.0980  |
| Fine-Tuned Model    | 0.2177  | 0.0337  | 0.1249  |
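
ROUGE scores of this kind can be computed with the `evaluate` library. The snippet below is a sketch; the exact decoding settings behind the reported numbers are not stated in the card, and the example strings are placeholders.

```python
import evaluate

# Placeholders; in practice these come from the 122-sample test set.
predictions = ["Your symptoms could indicate acid reflux; consider seeing a doctor."]
references = ["Based on your description, this may be GERD. Please consult a physician."]

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=predictions, references=references)
print(scores["rouge1"], scores["rouge2"], scores["rougeL"])
```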

## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

model_id = "Yassinj/Llama-3.1-8B_medical"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# Configure 4-bit NF4 quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load model with quantization
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```
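
Once loaded, the model can be queried as a standard causal LM. The prompt format below is an assumption; match it to the template used during fine-tuning.

```python
prompt = "Question: I have had a dry cough for two weeks. What could be the cause?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```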