Bielik-7B-Instruct-v0.1 is an instruction fine-tuned version of [Bielik-7B-v0.1](https://huggingface.co/speakleash/Bielik-7B-v0.1). The base model stands as a testament to the unique collaboration between the open-science/open-source project SpeakLeash and the High Performance Computing (HPC) center ACK Cyfronet AGH. Developed and trained on Polish text corpora, which were carefully selected and processed by the SpeakLeash team, this endeavor leverages Polish large-scale computing infrastructure, specifically the HPC center ACK Cyfronet AGH within the PLGrid environment. The creation and training of Bielik-7B-Instruct-v0.1 were supported by computational grant number PLG/2024/016951, carried out on the Helios supercomputer, which enabled the use of the cutting-edge technology and computational resources essential for large-scale machine learning. As a result, the model exhibits an exceptional ability to understand and process the Polish language, providing accurate responses and performing a variety of linguistic tasks with high precision.

[We have prepared quantized versions of the model as well as an MLX version.](#quant-and-mlx-versions)

## Model

The [SpeakLeash](https://speakleash.org/) team is working on its own set of instructions in Polish, which is continuously being expanded and refined by annotators. A portion of these instructions, manually verified and corrected, has been used for training. Moreover, due to the limited availability of high-quality instructions in Polish, publicly accessible collections of English instructions were also used - [OpenHermes-2.5](https://huggingface.co/datasets/teknium/OpenHermes-2.5) and [orca-math-word-problems-200k](https://huggingface.co/datasets/microsoft/orca-math-word-problems-200k) - which accounted for half of the instructions used in training. The instructions varied in quality, leading to a deterioration in the model's performance. To counteract this while still making use of the aforementioned datasets, several improvements were introduced:

Bielik-7B-Instruct-v0.1 was trained with an original open-source framework called [ALLaMo](https://github.com/chrisociepa/allamo), implemented by [Krzysztof Ociepa](https://www.linkedin.com/in/krzysztof-ociepa-44886550/). The framework allows users to train language models with architectures similar to LLaMA and Mistral quickly and efficiently.

### Model description:
* **Developed by:** [SpeakLeash](https://speakleash.org/)

| Precision | bfloat16 (mixed) |

### Quant and MLX versions:

We know that some people want to explore smaller models or do not have the resources to run the full model, so we have prepared quantized versions of the Bielik-7B-Instruct-v0.1 model. We are also mindful of Apple Silicon users.

Quantized versions (for non-GPU / weaker GPU setups):
- https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1-GGUF
- https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1-GPTQ
- https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1-AWQ
- https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1-EXL2
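
As an illustration (not an official snippet), the GGUF files can be run with the `llama-cpp-python` bindings; the quant filename pattern below is an assumption, so check the repository's file list for the variants actually published:

```python
# Unofficial sketch: running a GGUF quant of Bielik-7B-Instruct-v0.1
# with llama-cpp-python (pip install llama-cpp-python huggingface_hub).
# The Q4_K_M variant is an assumption - check the repo for available files.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="speakleash/Bielik-7B-Instruct-v0.1-GGUF",
    filename="*Q4_K_M.gguf",  # glob pattern matched against files in the repo
    n_ctx=4096,               # context window; lower it on constrained hardware
)

# The model expects the [INST] ... [/INST] instruction format (see below).
out = llm("[INST] Jakie mamy pory roku w Polsce? [/INST]", max_tokens=200)
print(out["choices"][0]["text"])
```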
For Apple Silicon:
- https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1-MLX
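
A similar unofficial sketch for the MLX version, assuming the standard `mlx-lm` API (`pip install mlx-lm`):

```python
# Unofficial sketch: running the MLX version on Apple Silicon with mlx-lm.
from mlx_lm import load, generate

model, tokenizer = load("speakleash/Bielik-7B-Instruct-v0.1-MLX")
response = generate(
    model,
    tokenizer,
    prompt="[INST] Napisz jedno zdanie o Krakowie. [/INST]",
    max_tokens=200,
)
print(response)
```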
### Instruction format
In order to leverage instruction fine-tuning, your prompt should be surrounded by `[INST]` and `[/INST]` tokens. The very first instruction should begin with the beginning-of-sentence (BOS) token, and the generated completion will be terminated by the end-of-sentence (EOS) token.
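
For example, assuming the repository ships a chat template implementing this format (as Mistral-style instruct models typically do), the prompt can be built with `transformers`:

```python
# Sketch: building the [INST] prompt via the tokenizer's chat template.
# Assumes the repo provides a Mistral-style template; inspect the rendered
# prompt to confirm it matches the format described above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("speakleash/Bielik-7B-Instruct-v0.1")

messages = [{"role": "user", "content": "Które jezioro jest największe w Polsce?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)
# Illustrative output: <s>[INST] Które jezioro jest największe w Polsce? [/INST]
```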