Update README.md
README.md
CHANGED
@@ -16,7 +16,7 @@ Meltemi is built on top of [Mistral 7B](https://huggingface.co/mistralai/Mistral
 
 # Model Information
 
-- Vocabulary extension of the Mistral 7B tokenizer with Greek tokens
+- Vocabulary extension of the Mistral 7B tokenizer with Greek tokens for lower costs and faster inference (**1.52** vs. 6.80 tokens/word for Greek)
 - 8192 context length
 - We extend the pretraining of Mistral 7B with added proficiency for the Greek language, by utilizing a large corpus consisting of approximately **55 billion tokens**.
 * This corpus includes 43.3 billion monolingual Greek tokens, constructed from publicly available resources. Additionally, to mitigate catastrophic forgetting and ensure that the model has bilingual capabilities, we use additional sub-corpora with monolingual English texts (10.5 billion tokens) and Greek-English parallel data (600 million tokens).
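The changed bullet quantifies the benefit of the Greek vocabulary extension as tokenizer fertility (subword tokens per word). A minimal sketch along the following lines could reproduce that kind of comparison; the model IDs and the Greek sample sentence are illustrative assumptions, not taken from this README, and the exact figures depend on the evaluation text used.

```python
# Sketch: estimate tokenizer fertility (tokens per word) on a Greek sentence
# for a base tokenizer vs. a Greek-extended one. Model IDs below are assumptions.
from transformers import AutoTokenizer

sample = "Η γρήγορη καφέ αλεπού πηδά πάνω από τον τεμπέλη σκύλο."

def tokens_per_word(model_id: str, text: str) -> float:
    # Fertility = number of subword tokens / number of whitespace-separated words.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    n_tokens = len(tokenizer.encode(text, add_special_tokens=False))
    n_words = len(text.split())
    return n_tokens / n_words

for model_id in ["mistralai/Mistral-7B-v0.1", "ilsp/Meltemi-7B-v1"]:
    print(model_id, round(tokens_per_word(model_id, sample), 2))
```

A lower tokens-per-word ratio means fewer subword tokens per Greek word, which translates into shorter sequences and therefore lower inference cost for the same text.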