---
license: llama3.1
language:
- en
base_model:
- meta-llama/Llama-3.1-8B
pipeline_tag: question-answering
metrics:
- rouge
---
# LLaMA 3.1-8B Fine-Tuned on ChatDoctor Dataset
## Model Overview
This model is a fine-tuned version of the LLaMA 3.1-8B model, trained on a curated selection of 1,122 samples from the **ChatDoctor (HealthCareMagic-100k)** dataset. It has been optimized for tasks related to medical consultation.
- **Base Model**: LLaMA 3.1-8B
- **Fine-tuning Dataset**: 1,122 samples from ChatDoctor dataset
- **Output Format**: GGUF (llama.cpp-compatible binary format)
- **Quantization**: Q4_0 for efficient inference
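Because the released weights are distributed as a Q4_0 GGUF file, they can be loaded with any llama.cpp-compatible runtime. The snippet below is a minimal sketch using `llama-cpp-python`; the GGUF filename is an assumption and should be replaced with the actual file in this repository.

```python
from llama_cpp import Llama

# Load the quantized GGUF weights (filename is an assumption -- use the actual file name)
llm = Llama(model_path="llama-3.1-8b-medical-q4_0.gguf", n_ctx=2048)

prompt = "I have had a persistent dry cough for two weeks. What could be the cause?"
output = llm(prompt, max_tokens=256, temperature=0.7)
print(output["choices"][0]["text"])
```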
## Applications
This model is designed to assist in:
- Medical question-answering
- Providing health-related advice
- Assisting in basic diagnostic reasoning (non-clinical use)
## Datasets
- **Training Data**: ChatDoctor-HealthCareMagic-100k
- **Training Set**: 900 samples
- **Validation Set**: 100 samples
- **Test Set**: 122 samples
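As a rough sketch of how such a 900 / 100 / 122 split can be produced with the `datasets` library (the dataset repository id and the shuffling seed are assumptions, not taken from the training code):

```python
from datasets import load_dataset

# Dataset id is an assumption; any mirror of ChatDoctor-HealthCareMagic-100k works
ds = load_dataset("lavita/ChatDoctor-HealthCareMagic-100k", split="train")

# Take 1,122 samples and split them into 900 / 100 / 122
subset = ds.shuffle(seed=42).select(range(1122))
train_set = subset.select(range(900))
val_set = subset.select(range(900, 1000))
test_set = subset.select(range(1000, 1122))
```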
## Model Details
| **Feature** | **Details** |
|------------------------------|----------------------------|
| **Model Type** | Causal Language Model |
| **Architecture** | LLaMA 3.1-8B |
| **Training Data** | ChatDoctor (1,122 samples) |
| **Quantization** | Q4_0 |
| **Deployment Format** | GGUF |
## Training Configuration
The model was fine-tuned with the following hyperparameters:
- **Output Directory**: `output_model`
- **Per-Device Batch Size**: 2
- **Gradient Accumulation Steps**: 16 (Effective batch size: 32)
- **Learning Rate**: 2e-4
- **Scheduler**: Cosine Annealing
- **Optimizer**: Paged AdamW (32-bit)
- **Number of Epochs**: 16
- **Evaluation Strategy**: Per epoch
- **Save Strategy**: Per epoch
- **Logging Steps**: 1
- **Mixed Precision**: FP16
- **Best Model Criteria**: `eval_loss`, with `greater_is_better=False`
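The hyperparameters above map onto a `transformers` `TrainingArguments` configuration roughly like the following. This is a sketch, not the exact training script; argument names follow recent `transformers` releases.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="output_model",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,   # effective batch size 32
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    optim="paged_adamw_32bit",
    num_train_epochs=16,
    eval_strategy="epoch",            # "evaluation_strategy" in older transformers versions
    save_strategy="epoch",
    logging_steps=1,
    fp16=True,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
```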
### LoRA Hyperparameters
The fine-tuning process also included the following LoRA (Low-Rank Adaptation) configuration:
- **Rank (r)**: 8
- **Alpha**: 16
- **Dropout**: 0.05
- **Bias**: None
- **Task Type**: Causal Language Modeling (CAUSAL_LM)
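In PEFT terms this corresponds approximately to the `LoraConfig` below. This is a sketch; the card does not list the adapted layers, so `target_modules` is an assumption.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    # target_modules is an assumption; the card does not specify the adapted layers
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# The adapters would then be attached with peft.get_peft_model(base_model, lora_config)
```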
Validation was performed on a held-out subset of the dataset. The training and validation loss curves are shown below:
<p align="center">
<img src="train-val-curve.png" alt="Training and Validation Loss" width="85%"/>
</p>
## Evaluation Results
| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---------------------|---------|---------|---------|
| Original Model | 0.1726 | 0.0148 | 0.0980 |
| Fine-Tuned Model | 0.2177 | 0.0337 | 0.1249 |
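The ROUGE scores above can be computed with the `evaluate` library. A minimal sketch, where the predictions and references are placeholders standing in for generations and gold answers from the 122-sample test split:

```python
import evaluate

rouge = evaluate.load("rouge")

# Placeholder examples -- replace with model generations and reference answers
predictions = ["You may be experiencing seasonal allergies; consider an antihistamine."]
references = ["Your symptoms are consistent with allergic rhinitis; antihistamines may help."]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```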
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "Yassinj/Llama-3.1-8B_medical"

# Load the tokenizer and use the EOS token for padding
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# Configure 4-bit NF4 quantization for memory-efficient loading
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load the model with the quantization config
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```
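Once loaded, generation follows the standard `transformers` pattern. The snippet below continues from the code above; the prompt format is an assumption, since the card does not specify the instruction template used during fine-tuning.

```python
# Example prompt; the exact instruction format used during fine-tuning is an assumption
prompt = "Patient: I have had a mild fever and sore throat for three days. What should I do?\nDoctor:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```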