metadata
widget:
  - text: አዲስ አበባ
    example_title: Example 1
  - text: በኢንግሊዝ ፕሪምየር ሊግ
    example_title: Example 2
  - text: ዶናልድ ትራምፕ
    example_title: Example 3
language:
  - am
metrics:
  - perplexity
library_name: transformers
pipeline_tag: text-generation
base_model:
  - meta-llama/Llama-3.2-1B-Instruct

Llama-3.2-Amharic-1B

This model is a version of Meta's Llama-3.2-1B decoder-only transformer that underwent continued pretraining on an Amharic text corpus.

  • 16k new Amharic tokens were added to the Llama 3.2 tokenizer, and the embedding layer of the model was resized accordingly.
  • The model was then trained on 300 million tokens of Amharic text.
  • This is a base model. The Amharic instruction-following version is Llama-3.2-1B-Amharic-Instruct.
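The tokenizer extension and embedding resize described above can be sketched with the standard transformers calls `add_tokens` and `resize_token_embeddings`. This is a minimal illustration, not the author's actual training script: the helper names `extend_vocabulary` and `load_and_extend` are hypothetical, and the real list of 16k Amharic tokens and how it was derived are not given in this card.

```python
def extend_vocabulary(tokenizer, model, new_tokens):
    """Add tokens to a tokenizer and resize the model's embeddings to match.

    `tokenizer` and `model` are assumed to be Hugging Face objects, for
    example from AutoTokenizer and AutoModelForCausalLM.
    """
    # add_tokens skips tokens already in the vocabulary and returns how many were added
    num_added = tokenizer.add_tokens(new_tokens)
    # Grow the embedding matrix so that every token id, old or new, has a row
    model.resize_token_embeddings(len(tokenizer))
    return num_added


def load_and_extend(base_model_id, new_tokens):
    """Hypothetical convenience wrapper: load a base model and extend its vocabulary."""
    # Imported lazily so extend_vocabulary itself has no hard dependency on transformers
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    model = AutoModelForCausalLM.from_pretrained(base_model_id)
    extend_vocabulary(tokenizer, model, new_tokens)
    return tokenizer, model
```

The rows added for new tokens start out untrained, which is why a pretraining pass on Amharic text, as in the second bullet, follows the resize.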

How to use

First, install the latest version of transformers:

pip install -Uq transformers

You can use this model directly with a pipeline for text generation:

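from transformers import pipeline

# Load the model into a text-generation pipeline.
# device_map="auto" places the weights on the available device(s), for example a GPU.
llama_am = pipeline(
    "text-generation",
    model="rasyosef/Llama-3.2-1B-Amharic",
    device_map="auto"
)

prompt = "በኢንግሊዝ ፕሪምየር ሊግ"

# Sample a continuation of up to 128 new tokens
llama_am(
    prompt,
    max_new_tokens=128,
    temperature=0.3,
    do_sample=True,
    top_k=8,
    top_p=0.8,
    repetition_penalty=1.05
)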

Output:

[{'generated_text': 'በኢንግሊዝ ፕሪምየር ሊግ የ2017/18 የውድድር ዘመን ላይ ተሳታፊ የሆነው ሊቨርፑል ትናንት ምሽት 3 :45 ላይ ከዌስትሀም ዩናይትድ ጋር ባደረገው ጨዋታ በ2 ለ 1 ውጤት ተሸንፏል ።'}]
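
The pipeline returns a list with one dictionary per generated sequence, and by default `generated_text` contains the prompt followed by the continuation. A minimal sketch for pulling out only the newly generated part is shown below; the helper name `extract_continuation` is hypothetical, and `sample_outputs` is a shortened stand-in for the output above rather than a fresh model run.

```python
def extract_continuation(outputs, prompt):
    """Return the first generated text without the leading prompt."""
    text = outputs[0]["generated_text"]
    # The prompt is echoed at the start of the text by default; drop it if present
    if text.startswith(prompt):
        text = text[len(prompt):]
    return text.strip()


# Shortened stand-in for the pipeline output shown above
sample_outputs = [{"generated_text": "በኢንግሊዝ ፕሪምየር ሊግ የ2017/18 የውድድር ዘመን"}]
print(extract_continuation(sample_outputs, "በኢንግሊዝ ፕሪምየር ሊግ"))
```

Alternatively, passing `return_full_text=False` in the pipeline call should make the pipeline return only the continuation.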