OpenMOSE committed on
Commit bfe5add · verified · 1 Parent(s): fb074ec

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -36,8 +36,8 @@ HRWKV7-Reka-Flash3-Preview is an experimental hybrid architecture model that com
 The model implements several key improvements over standard RWKV architectures:
 
 1. **Token Shift Removal**: Unlike traditional RWKV, the hxa079 variant removes token shifting mechanisms
-2. **GroupNorm Removal**: Eliminates GroupNorm layers for training stability
-3. **k_first Introduction**: Implements a novel k_first mechanism optimized for attention conversion
+2. **GroupNorm Removal**: Eliminates GroupNorm for training stability
+3. **k_first Introduction**: Experimentally adopted the approach of residually connecting k layers in layer 0.
 
 ### Hybrid Design Benefits
 
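
The new wording of item 3 ("residually connecting k layers in layer 0") can be read by analogy with the `v_first` value residual used in RWKV-7: the key computed in layer 0 is kept and blended into the keys of later layers. The sketch below only illustrates that reading. The helper name `mix_with_first`, the scalar `gate`, and the blending formula are assumptions made for illustration and are not taken from the hxa079 code.

```python
def mix_with_first(k, k_first, gate):
    """Blend a layer's key vector toward the layer-0 key vector.

    gate = 0.0 leaves k unchanged; gate = 1.0 returns k_first.
    Hypothetical helper: the actual model may use a learned,
    per-channel gate or a different formula entirely.
    """
    if len(k) != len(k_first):
        raise ValueError("k and k_first must have the same length")
    # Residual form: k + gate * (k_first - k)
    return [ki + (fi - ki) * gate for ki, fi in zip(k, k_first)]


# Made-up numbers, for illustration only.
k_first = [1.0, 2.0, 3.0]   # key produced in layer 0
k_layer = [0.0, 0.0, 0.0]   # key produced in some later layer
mixed = mix_with_first(k_layer, k_first, 0.5)
```

With a gate between 0 and 1, each later layer keeps part of its own key and borrows the rest from layer 0, which is the sense in which the connection is "residual" under this assumption.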