OpenMOSE committed (verified)
Commit 122dd56 · Parent: 33fd284

Update README.md

Files changed (1): README.md (+44, -3)
README.md CHANGED
@@ -1,3 +1,44 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+ # RWKV-x070-2B9-CJE-Instruct Model Card
+
+ ## Model Overview
+ - **Model Name**: RWKV-x070-2B9-CJE-Instruct
+ - **Description**: An instruction-tuned model specialized for Japanese, Chinese, and English
+ - **Base Model**: rwkv-x070-2b9-world-v3-40%trained-20250113-ctx4k.pth
+ - **Architecture**: RWKV x070 "Goose"
+ - **Parameters**: 2.9B
+ - **Model Dimension**: 2048
+ - **Number of Layers**: 32
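+
+ A minimal loading sketch for the model described above, assuming the `rwkv` pip package (`rwkv.model.RWKV`, `rwkv.utils.PIPELINE`); the local checkpoint filename and `strategy` string are placeholders, and x070 support depends on your installed `rwkv` version:
+
+ ```python
+ from rwkv.model import RWKV
+ from rwkv.utils import PIPELINE
+
+ # Hypothetical local filename for this repo's checkpoint.
+ model = RWKV(model="RWKV-x070-2B9-CJE-Instruct.pth", strategy="cuda fp16")
+ pipeline = PIPELINE(model, "rwkv_vocab_v20230424")  # RWKV "World" tokenizer
+
+ print(pipeline.generate("Hello, how are you?", token_count=64))
+ ```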
+
+ ## Fine-tuning Details
+
+ ### Training Configuration
+ - **Trainer**: RWKV-LM-RLHF
+ - **PEFT Mode**: Hybrid training combining frozen embeddings, Bone (Block Affine Transformation), and full-parameter training
+ - **SFT Method**: SmoothingLoss SFT
+ - **Context Window**: 5120 tokens (trained with a 1024-token overlap; see the chunking sketch below)
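+
+ A minimal sketch of the overlapped windowing described above; `chunk_with_overlap` is an illustrative helper, not a function from RWKV-LM-RLHF:
+
+ ```python
+ def chunk_with_overlap(tokens, ctx_len=5120, overlap=1024):
+     """Split a token sequence into ctx_len-sized training windows,
+     each sharing `overlap` tokens with the previous window."""
+     stride = ctx_len - overlap  # 4096 fresh tokens per window
+     return [tokens[i:i + ctx_len]
+             for i in range(0, max(len(tokens) - overlap, 1), stride)]
+
+ # A 12000-token sample yields windows starting at offsets 0, 4096, and 8192.
+ windows = chunk_with_overlap(list(range(12000)))
+ print([(w[0], len(w)) for w in windows])  # [(0, 5120), (4096, 5120), (8192, 3808)]
+ ```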
+
+ ### Dataset Specifications
+ - **Size**: 800k pairs
+ - **Content**:
+   - Mixed data in Japanese, Chinese, and English
+   - Conversations
+   - Programming code
+   - Translation tasks
+   - Chain-of-Thought reasoning tasks
+
+ ### Important Note
+ - Set the end token to '\n\n\x17' (a stopping sketch follows below)
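+
+ At inference time, generation should stop once the model emits '\n\n\x17'. A minimal string-level sketch; `generate_one_token` is a placeholder for whatever decoding step your runtime provides, not an RWKV API:
+
+ ```python
+ END_TOKEN = "\n\n\x17"
+
+ def generate_until_end(generate_one_token, prompt, max_tokens=512):
+     """Decode one token at a time and stop when END_TOKEN appears."""
+     text = prompt
+     for _ in range(max_tokens):
+         text += generate_one_token(text)  # returns the next decoded token as a str
+         if text.endswith(END_TOKEN):
+             return text[:-len(END_TOKEN)]  # strip the end marker
+     return text
+ ```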
+
+ ### Limitations and Considerations
+ - This is an experimental model; inference stability is not fully guaranteed
+ - Unexpected behaviors may occur
+ - Continuous improvements are being made; feedback is welcome
+
+ ## License
+ Apache License 2.0
+
+ ## Acknowledgments
+ We thank the authors of the RWKV base model and the RWKV community for their support in developing this model.