maciek-pioro committed (verified)
Commit 97518ad
1 Parent(s): 8eb627b

Update README.md

Files changed (1)
  1. README.md +23 -4
README.md CHANGED
@@ -1,18 +1,37 @@
 ---
 library_name: transformers
-tags: []
+tags:
+- MoE
+- Mixtral
+license: apache-2.0
+datasets:
+- togethercomputer/RedPajama-Data-1T
+- SpeakLeash
+language:
+- pl
+- en
 ---
 
-# Model Card for Model ID
+# Mixtral-8x7B-v0.1-pl
 
 <!-- Provide a quick summary of what the model is/does. -->
-XXX is a [Mixtral 8x7b](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) model fine-tuned using 2.2B Polish
+
+Mixtral-8x7B-v0.1-pl is a [Mixtral 8x7b](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) model fine-tuned using 2.2B Polish
 tokens selected from the [SpeakLeash](https://speakleash.org/) dataset.
 This is, to our knowledge, the first open-weights MoE model fine-tuned on Polish data.
 In order to preserve English capabilities, we include about 600M tokens from the [RedPajama dataset](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T).
 
 The training was made possible thanks to the TPU Research Cloud program. The model was trained on a TPUv3-256.
+
+Our codebase is based on [the Focused Transformer repo](https://github.com/CStanKonrad/long_llama) and [EasyLM](https://github.com/young-geng/EasyLM).
+
+## Example output
+
+**Poniższy artykuł opisuje historię Polski. Polska to kraj, który** w ciągu wieków był świadkiem wielu
+ważnych wydarzeń, które miały wpływ na jej rozwój. W artykule przedstawiono najważniejsze
+wydarzenia z historii Polski, od czasów starożytnych do współczesności. Historia Polski - starożytność Polska to kraj,
+który ma długą i bogatą historię. Pierwsze wzmianki o Polsce pochodzą z czasów starożytnych. W 966 roku Mieszko I
+przyjął chrzest, co zapoczątkowało proces ...
 
 
-
 ## Model Details
@@ -202,4 +221,4 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 
 ## Model Card Contact
 
-[More Information Needed]
+maciej.pioro at ideas-ncbr.pl
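
The updated card sets `library_name: transformers`, so the model is presumably loadable with the standard `transformers` API. The sketch below is a minimal, hypothetical usage example: the repo id `maciek-pioro/Mixtral-8x7B-v0.1-pl` is inferred from the committer name and the new card title, and is not stated anywhere in this diff.

```python
# Hypothetical usage sketch for the model described in this card.
# MODEL_ID is an assumption inferred from the card title, not from the diff.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "maciek-pioro/Mixtral-8x7B-v0.1-pl"  # assumed Hugging Face repo id


def continue_prompt(prompt: str, max_new_tokens: int = 100) -> str:
    """Load the model and continue a (Polish or English) prompt."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

The example output in the card could then be reproduced with `continue_prompt("Poniższy artykuł opisuje historię Polski. Polska to kraj, który")`. Note that a Mixtral-8x7B-class model holds roughly 47B parameters, so 16-bit inference needs on the order of 90 GB of accelerator memory; `device_map="auto"` (via `accelerate`) spreads the weights across whatever devices are available.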