Yassinj committed 4bf0f10 (verified) · Parent: ba2b748

Update README.md

Files changed (1): README.md (+56 −4)
@@ -7,8 +7,6 @@ base_model:
 pipeline_tag: question-answering
 ---
 
-# LLaMA 3.1-8B Fine-Tuned on ChatDoctor Dataset
-
 ## Model Overview
 This model is a fine-tuned version of the LLaMA 3.1-8B model, trained on a curated selection of 1,122 samples from the **ChatDoctor (HealthCareMagic-100k)** dataset. It has been optimized for tasks related to medical consultations.
 
@@ -50,8 +48,62 @@ The model was fine-tuned with the following hyperparameters:
 
 Validation was performed using a separate subset of the dataset. The final training and validation loss are as follows:
 
-![Training and Validation Loss](train-val-curve.png)
+<p align="center">
+  <img src="train-val-curve.png" alt="Training and Validation Loss" width="50%"/>
+</p>
+
+## Evaluation Results
+### Original Model
+- **ROUGE-1**: 0.1726
+- **ROUGE-2**: 0.0148
+- **ROUGE-L**: 0.0980
+
+### Fine-Tuned Model
+- **ROUGE-1**: 0.2177
+- **ROUGE-2**: 0.0337
+- **ROUGE-L**: 0.1249
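For readers unfamiliar with the metric: ROUGE-1 is an F-measure over unigram overlap between a model answer and a reference answer. As a rough illustration only (the scores above were presumably produced with a standard ROUGE implementation, which additionally applies stemming and computes ROUGE-2/ROUGE-L over bigrams and longest common subsequences), ROUGE-1 can be sketched in plain Python:

```python
from collections import Counter

def rouge1_f(reference: str, candidate: str) -> float:
    """ROUGE-1 F-measure: clipped unigram overlap between reference and candidate."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    overlap = sum((ref_counts & cand_counts).values())  # per-token min counts
    if overlap == 0:
        return 0.0
    recall = overlap / sum(ref_counts.values())
    precision = overlap / sum(cand_counts.values())
    return 2 * precision * recall / (precision + recall)

# Identical answers score 1.0; fully disjoint answers score 0.0.
print(rouge1_f("rest and fluids help a cold", "rest and fluids help a cold"))  # 1.0
```

On this scale, the jump from 0.1726 to 0.2177 ROUGE-1 means the fine-tuned model's answers share noticeably more vocabulary with the reference doctor responses.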
 
 ## Usage
 ### Loading the Model
+This model is hosted in **GGUF format** for efficient deployment. You can load and run it using **llama.cpp**.
+
+#### Steps to Use
+1. Clone and build the llama.cpp repository:
+   ```bash
+   git clone https://github.com/ggerganov/llama.cpp
+   cd llama.cpp
+   make
+   ```
+
+2. Download the model from Hugging Face:
+   ```bash
+   huggingface-cli login
+   wget https://huggingface.co/your-username/llama-3.1-8B-gguf/resolve/main/output_model.gguf
+   ```
+
+3. Run inference:
+   ```bash
+   ./main -m output_model.gguf -p "What are the symptoms of a common cold?" -t 4 -n 100
+   ```
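If you prefer to drive the CLI from a script, the invocation in step 3 can be wrapped with Python's standard `subprocess` module. This is a sketch, not part of the model card's tooling: the `./main` path and flags are taken verbatim from the step above and must match your local llama.cpp build.

```python
import os
import subprocess

def build_llama_cmd(model: str, prompt: str, threads: int = 4, n_predict: int = 100) -> list:
    """Assemble the llama.cpp command line from step 3 as an argument list."""
    return ["./main", "-m", model, "-p", prompt, "-t", str(threads), "-n", str(n_predict)]

cmd = build_llama_cmd("output_model.gguf", "What are the symptoms of a common cold?")
# Only invoke the binary when llama.cpp has actually been built in this directory.
if os.path.exists(cmd[0]):
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout)
```

Passing arguments as a list (rather than a shell string) avoids quoting issues with prompts that contain spaces or punctuation.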
+
+### Quantization Details
+The model is quantized to **Q4_0** for faster inference while maintaining reasonable accuracy, so it runs efficiently on CPUs with low memory requirements.
+
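A back-of-envelope calculation shows why Q4_0 fits on modest hardware. In llama.cpp's Q4_0 scheme, each block stores 32 weights as 4-bit values plus one fp16 scale (18 bytes per 32 weights, about 4.5 bits per weight), so the weight storage for an ~8B-parameter model is roughly (actual GGUF file size also includes non-quantized tensors and metadata):

```python
PARAMS = 8_000_000_000     # ~8B weights in LLaMA 3.1-8B
WEIGHTS_PER_BLOCK = 32     # Q4_0 block size
BYTES_PER_BLOCK = 18       # 32 x 4-bit quants (16 B) + one fp16 scale (2 B)

gib = PARAMS / WEIGHTS_PER_BLOCK * BYTES_PER_BLOCK / 2**30
print(f"~{gib:.1f} GiB for Q4_0 weights")  # ~4.2 GiB
```

Compare this with the ~15 GiB needed for the same weights in fp16, which is what makes CPU-only inference practical here.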
+## Limitations and Intended Use
+- **Not for Clinical Use**: This model is intended for educational purposes and general health advice. It should not replace professional medical consultation.
+- **Bias and Errors**: The model may exhibit biases present in the training data. Outputs should be interpreted with caution.
+
+## Acknowledgments
+- **Dataset**: ChatDoctor (HealthCareMagic-100k)
+- **Base Model**: LLaMA 3.1-8B
+- **Quantization Tools**: llama.cpp
+
+## Citation
+If you use this model, please cite:
+```bibtex
+@article{yourcitation,
+  title={Fine-tuned LLaMA 3.1-8B on ChatDoctor Dataset},
+  author={Your Name},
+  year={2025},
+  publisher={Hugging Face}
+}
+```