OpenMOSE/RWKV-x070-2B9-CJE-Instruct

OpenMOSE committed on Jan 17

Commit 122dd56 · verified · 1 Parent(s): 33fd284

Update README.md

Files changed (1)

README.md +44 -3

@@ -1,3 +1,44 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+---
+# RWKV-x070-2B9-CJE-Instruct Model Card
+## Model Overview
+- **Model Name**: RWKV-x070-2B9-CJE-Instruct
+- **Description**: An instruction-tuned model specialized for Japanese, Chinese, and English
+- **Base Model**: rwkv-x070-2b9-world-v3-40%trained-20250113-ctx4k.pth
+- **Architecture**: RWKV x070 "Goose"
+- **Parameters**: 2.9B
+- **Model Dimension**: 2048
+- **Number of Layers**: 32
+## Fine-tuning Details
+### Training Configuration
+- **Trainer**: RWKV-LM-RLHF
+- **PEFT Mode**: Hybrid training that combines frozen embeddings with Bone (Block Affine Transformation) and full-parameter training
+- **SFT Method**: SmoothingLoss SFT
+- **Context Window**: 5120 (trained with 1024 token overlap)
+### Dataset Specifications
+- **Size**: 800k pairs
+- **Content**:
+  - Mixed data in Japanese, Chinese, and English
+  - Conversations
+  - Programming code
+  - Translation tasks
+  - Chain-of-Thought reasoning tasks
+### Important Note
+- Set the end token to '\n\n\x17'
+### Limitations and Considerations
+- This is an experimental model; inference stability is not fully guaranteed
+- Unexpected behaviors may occur
+- Continuous improvements are being made; feedback is welcome
+## License
+Apache License 2.0
+## Acknowledgments
+We thank the developers of the RWKV base model and the RWKV community for their support in developing this model.
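
The "Important Note" in this README names the end token `'\n\n\x17'` but does not say how to apply it. The following is a minimal sketch of one way to do so on already-decoded text, assuming the inference runtime yields string pieces. The names `truncate_at_end_token` and `stream_until_end_token` are hypothetical helpers for illustration and are not part of RWKV-LM-RLHF or any RWKV library.

```python
# End-of-response marker quoted in the README's "Important Note".
END_TOKEN = "\n\n\x17"


def truncate_at_end_token(text: str, end_token: str = END_TOKEN) -> str:
    """Return the text before the first end token, or all of it if absent."""
    head, _sep, _tail = text.partition(end_token)
    return head


def stream_until_end_token(pieces, end_token: str = END_TOKEN) -> str:
    """Accumulate decoded string pieces and stop once the end token appears.

    `pieces` is assumed to be any iterable of strings, for example the
    decoded output of a token-by-token generation loop (hypothetical).
    """
    buffer = ""
    for piece in pieces:
        buffer += piece
        # The marker may be split across pieces, so search the whole buffer.
        if end_token in buffer:
            break
    return truncate_at_end_token(buffer, end_token)
```

Searching the accumulated buffer rather than each piece handles the case where the three-character marker is split across consecutive decoded pieces. A runtime that accepts stop sequences directly would make this post-processing unnecessary.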