---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- tiiuae/falcon-11B
language:
- nl
---
|
|
|
# Falcon-5.5B-Dutch

Falcon-5.5B-Dutch is a merge of two layer ranges of [tiiuae/falcon-11B](https://huggingface.co/tiiuae/falcon-11B), created with [mergekit](https://github.com/cg123/mergekit).
|
|
|
## 🧩 Configuration
|
|
|
```yaml
slices:
  - sources:
      - model: tiiuae/falcon-11B
        layer_range: [0, 25]
  - sources:
      - model: tiiuae/falcon-11B
        layer_range: [56, 59]
merge_method: passthrough
dtype: bfloat16
```
|
|
|
Layer similarity was analyzed with PruneMe on 4,000 samples from the AgentWaller/dutch-oasst1 dataset, and the layer ranges to prune were chosen from this analysis to preserve performance while reducing model size.
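For context, here is a small sketch of how many decoder layers the configuration above retains, assuming mergekit's Python-style half-open `layer_range` convention (`[start, end)`); the slice boundaries are taken from the config:

```python
# Slice boundaries from the passthrough config above.
# Assumption: mergekit layer_range is half-open, [start, end), like Python slicing.
slices = [(0, 25), (56, 59)]

# Enumerate the decoder layers that survive pruning.
kept = [layer for start, end in slices for layer in range(start, end)]

print(len(kept))       # 25 + 3 = 28 layers retained
print(kept[0], kept[-1])  # first and last retained layer indices
```

If falcon-11B has 60 decoder layers, keeping 28 of them roughly halves the transformer stack, which is consistent with the ~5.5B parameter count implied by the model name.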
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/660c0a02cf274b3ab77dd6b7/PF3SzEhQRJPXyYi2KqS1A.png)
|
|
|
Note: this is a base language model that has not been optimized for conversational or chat applications. Further fine-tuning may be required to adapt it for specific use cases.