monkeypostulate committed on
Commit f86f6d7 · verified · 1 Parent(s): 6f2cf2f

Update README.md

  1. README.md +18 -4
README.md CHANGED
@@ -1,17 +1,31 @@
1
  ---
2
  base_model: meta-llama/Llama-3.2-1B-Instruct
3
  library_name: peft
 
 
4
  ---
5
 
6
- # Model Card for Model ID
7
 
8
- <!-- Provide a quick summary of what the model is/does. -->
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
9
 
10
 
11
 
12
- ## Model Details
13
 
14
- ### Model Description
15
 
16
  <!-- Provide a longer summary of what this model is. -->
17
 
 
1
  ---
2
  base_model: meta-llama/Llama-3.2-1B-Instruct
3
  library_name: peft
4
+ datasets:
5
+ - mlabonne/orpo-dpo-mix-40k
6
  ---
7
 
 
8
 
9
+ This model is a fine-tuned version of meta-llama/Llama-3.2-1B, trained with the ORPO (Odds Ratio Preference Optimization) trainer. Fine-tuning used a subset of the mlabonne/orpo-dpo-mix-40k dataset, with only 100 samples selected to keep training fast.
10
+
11
+ **Fine-tuning Method:** ORPO
12
+ **Dataset:** mlabonne/orpo-dpo-mix-40k
13
+
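Reviewer note: the ORPO objective named in the card can be sketched in a few lines. This is a minimal illustration of the published ORPO loss (supervised NLL on the chosen response plus a weighted odds-ratio term), not the training code used for this commit. The function names, the example log-probabilities, and the weight `lam=0.1` are assumptions made for illustration.

```python
import math


def log_odds(logp: float) -> float:
    """Return log(p / (1 - p)) given log p, for 0 < p < 1."""
    return logp - math.log1p(-math.exp(logp))


def orpo_loss(logp_chosen: float, logp_rejected: float, lam: float = 0.1) -> float:
    """Sketch of the ORPO loss for one preference pair.

    logp_chosen / logp_rejected are length-averaged log-likelihoods of the
    chosen and rejected responses under the model (hypothetical inputs).
    lam weights the odds-ratio term; 0.1 is an assumed value.
    """
    # Supervised term: negative log-likelihood of the chosen response.
    nll = -logp_chosen
    # Odds-ratio term: -log(sigmoid(log-odds ratio)) = log(1 + exp(-x)).
    ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
    l_or = math.log1p(math.exp(-ratio))
    return nll + lam * l_or


# The loss is lower when the model prefers the chosen response.
good = orpo_loss(math.log(0.8), math.log(0.2))
bad = orpo_loss(math.log(0.2), math.log(0.8))
print(f"chosen preferred: {good:.4f}, rejected preferred: {bad:.4f}")
```

Because the odds-ratio term is added to the ordinary fine-tuning loss, ORPO needs no separate reference model, which is consistent with the card's emphasis on fast training.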
14
+
15
+ **Evaluation**
16
+
17
+ The model was evaluated zero-shot on HellaSwag, with the following results:
18
+
19
+
20
+ | Tasks   |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
21
+ |---------|------:|------|-----:|--------|---|-----:|---|-----:|
22
+ |hellaswag|      1|none  |     0|acc     |↑  |0.4772|±  |0.0050|
23
+ |         |       |none  |     0|acc_norm|↑  |0.6366|±  |0.0048|
24
+
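Reviewer note: the Stderr column in the table can be sanity-checked with the standard error of a proportion. The sketch below assumes the scores were computed on the HellaSwag validation split and that it contains 10,042 examples; neither is stated in the card. The helper name is hypothetical, and the evaluation tool may use a slightly different estimator.

```python
import math

# Assumption: evaluation used the HellaSwag validation split (10,042 examples).
N_EXAMPLES = 10042


def proportion_stderr(p: float, n: int) -> float:
    """Standard error of an accuracy p measured on n independent examples."""
    return math.sqrt(p * (1.0 - p) / n)


for metric, value in [("acc", 0.4772), ("acc_norm", 0.6366)]:
    se = proportion_stderr(value, N_EXAMPLES)
    # Approximate 95% confidence interval under a normal approximation.
    low, high = value - 1.96 * se, value + 1.96 * se
    print(f"{metric}: {value:.4f} ± {se:.4f} (95% CI {low:.4f} to {high:.4f})")
```

Under these assumptions the computed standard errors round to the values reported in the table.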
25
 
26
 
27
 
 
28
 
 
29
 
30
  <!-- Provide a longer summary of what this model is. -->
31