---
widget:
- text: አዲስ አበባ
  example_title: Example 1
- text: በኢንግሊዝ ፕሪምየር ሊግ
  example_title: Example 2
- text: ዶናልድ ትራምፕ
  example_title: Example 3
language:
- am
metrics:
- perplexity
library_name: transformers
pipeline_tag: text-generation
base_model:
- meta-llama/Llama-3.2-1B-Instruct
---
# Llama-3.2-Amharic-1B
This model is a version of Meta's [Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) decoder transformer model that was continually pretrained on an Amharic text corpus.
- 16k new Amharic tokens were added to the Llama 3.2 tokenizer, and the embedding layer of the model was resized accordingly.
- The model was then trained on **300 million tokens** of **Amharic** text.
- This is a base model. The Amharic instruction-following version is [Llama-3.2-1B-Amharic-Instruct](https://huggingface.co/rasyosef/Llama-3.2-1B-Amharic-Instruct).
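The vocabulary-extension step above can be sketched as follows. This is a minimal illustration, not the authors' training code: it mimics what `transformers`' `resize_token_embeddings` does after new tokens are added to a tokenizer, using a plain `torch.nn.Embedding` with made-up sizes.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; the real model uses Llama 3.2's vocabulary
# size and hidden dimension.
old_vocab, new_tokens, dim = 1000, 160, 32

old_emb = nn.Embedding(old_vocab, dim)

# Grow the embedding matrix to make room for the newly added tokens.
new_emb = nn.Embedding(old_vocab + new_tokens, dim)
with torch.no_grad():
    # Existing rows are copied over; the new rows keep their random
    # initialization and are learned during continued pretraining.
    new_emb.weight[:old_vocab] = old_emb.weight

print(new_emb.weight.shape)  # torch.Size([1160, 32])
```

In `transformers`, the equivalent steps are `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`.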
### How to use
First, install the latest version of transformers:
```sh
pip install -Uq transformers
```
You can use this model directly with a pipeline for text generation:
```python
from transformers import pipeline

llama_am = pipeline(
    "text-generation",
    model="rasyosef/Llama-3.2-1B-Amharic",
    device_map="auto"
)

prompt = "በኢንግሊዝ ፕሪምየር ሊግ"  # "In the English Premier League"
llama_am(
    prompt,
    max_new_tokens=128,
    temperature=0.3,
    do_sample=True,
    top_k=8,
    top_p=0.8,
    repetition_penalty=1.05
)
```
Output:
```python
[{'generated_text': 'በኢንግሊዝ ፕሪምየር ሊግ የ2017/18 የውድድር ዘመን ላይ ተሳታፊ የሆነው ሊቨርፑል ትናንት ምሽት 3 :45 ላይ ከዌስትሀም ዩናይትድ ጋር ባደረገው ጨዋታ በ2 ለ 1 ውጤት ተሸንፏል ።'}]
```
(The generated text continues the prompt in Amharic: in the 2017/18 English Premier League season, Liverpool lost 2–1 to West Ham United in a match played last night.)