🚨 PLEASE USE THE OFFICIAL QUANTIZED VERSIONS 🚨

🚨 There is no guarantee that you are using the latest improved weights when relying on third-party quantizations, as we have updated the model since release. 🚨

Llama-Krikri-8B-Instruct: An Instruction-tuned Large Language Model for the Greek language

Following the release of Meltemi-7B on 26 March 2024, we are happy to welcome Krikri to the family of ILSP open Greek LLMs. Krikri is built on top of Llama-3.1-8B, extending its capabilities for Greek through continual pretraining on a large corpus of high-quality and locally relevant Greek texts. We present Llama-Krikri-8B-Instruct, along with the base model, Llama-Krikri-8B-Base.


Model Information

Base Model

  • Vocabulary extension of the Llama-3.1 tokenizer with Greek tokens (see the tokenizer sketch after the corpus table below)
  • 128k context length (approximately 80,000 Greek words)
  • We extend the pretraining of Llama-3.1-8B to add proficiency in Greek by utilizing a large training corpus.
    • This corpus includes 56.7 billion monolingual Greek tokens, constructed from publicly available resources.
    • Additionally, to mitigate catastrophic forgetting and ensure that the model has bilingual capabilities, we use additional sub-corpora with monolingual English texts (21 billion tokens) and Greek-English parallel data (5.5 billion tokens).
    • The training corpus also contains 7.8 billion math and code tokens.
    • This corpus has been processed, filtered, and deduplicated to ensure data quality and is outlined below:
Sub-corpus    # Tokens    Percentage
Greek          56.7 B      62.3%
English        21.0 B      23.1%
Parallel        5.5 B       6.0%
Math/Code       7.8 B       8.6%
Total          91.0 B     100.0%

Chosen subsets of the 91-billion-token corpus were upsampled, resulting in a final training size of 110 billion tokens.
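
As a rough illustration of what the vocabulary extension means in practice, the sketch below counts how many tokens the extended tokenizer needs for a short Greek sentence. The example sentence and the optional comparison against the original Llama-3.1 tokenizer (a gated repository) are illustrative assumptions, and exact counts will vary by text.

# A minimal sketch: measure tokenizer "fertility" (tokens per word) on Greek text.
# The example sentence is an illustrative assumption; actual numbers depend on the text.
from transformers import AutoTokenizer

krikri_tok = AutoTokenizer.from_pretrained("ilsp/Llama-Krikri-8B-Instruct")

greek_text = "Η τεχνητή νοημοσύνη μεταμορφώνει την επεξεργασία της ελληνικής γλώσσας."
n_words = len(greek_text.split())
n_tokens = len(krikri_tok.encode(greek_text, add_special_tokens=False))
print(f"{n_tokens} tokens for {n_words} words "
      f"({n_tokens / n_words:.2f} tokens/word with the extended vocabulary)")

# Optional comparison against the original Llama-3.1 tokenizer (gated repository):
# base_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
# print(len(base_tok.encode(greek_text, add_special_tokens=False)), "tokens with the base vocabulary")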

Instruct Model

Llama-Krikri-8B-Instruct is the result of post-training Llama-Krikri-8B-Base and features:

  • Enhanced chat capabilities and instruction-following in both Greek and English.
  • Document translation from Greek to English, French, German, Italian, Portuguese, Spanish and vice versa.
  • Strong performance on generation, comprehension, and editing tasks, such as summarization, creative content creation, text modification, entity recognition, and sentiment analysis.
  • Domain-specific expertise for specialized legal, financial, medical, and scientific applications.
  • Retrieval-Augmented Generation (RAG) utilizing multiple documents with 128k context length.
  • Improved coding and agentic capabilities with correct formatting and tool use.
  • Conversion and structured data extraction (e.g., XML, JSON) in data-to-text and text-to-data settings (see the sketch after this list).
  • Analytical thinking and Chain-of-Thought (CoT) reasoning for problem-solving.
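
As a rough illustration of the text-to-data setting, the sketch below asks the model to extract a small JSON record from a Greek sentence through the OpenAI-compatible vLLM server described in the "How to use" section further down. The field names, example sentence, and the assumption that the reply contains only JSON are illustrative, not an official schema.

import json
from openai import OpenAI

# Assumes the vLLM server from the "How to use" section is running locally.
client = OpenAI(api_key="token-abc123", base_url="http://localhost:8000/v1")

# Illustrative extraction schema; not an official example from ILSP.
system_prompt = (
    "You are an information extraction system. "
    "Reply only with a JSON object containing the keys 'company', 'date', and 'amount_eur'."
)
# User prompt (Greek): "On 3 March 2024 the company Omega SA paid 1,250 euros."
user_prompt = "Στις 3 Μαρτίου 2024 η εταιρεία Ωμέγα ΑΕ κατέβαλε 1.250 ευρώ."

response = client.chat.completions.create(
    model="ilsp/Llama-Krikri-8B-Instruct",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ],
    temperature=0.0,
)
# Parsing assumes the model followed the instruction to return JSON only.
record = json.loads(response.choices[0].message.content)
print(record)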

🚨 More information on the post-training corpus and methodology coming soon. 🚨

How to use

With Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"

# Load the instruction-tuned model and its tokenizer.
model = AutoModelForCausalLM.from_pretrained("ilsp/Llama-Krikri-8B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("ilsp/Llama-Krikri-8B-Instruct")

model.to(device)

# System prompt (Greek): "You are Krikri, a highly developed Artificial Intelligence model
# for Greek, and you were trained by the ILSP of Athena Research Center."
system_prompt = "Είσαι το Κρικρί, ένα εξαιρετικά ανεπτυγμένο μοντέλο Τεχνητής Νοημοσύνης για τα ελληνικά και εκπαιδεύτηκες από το ΙΕΛ του Ε.Κ. \"Αθηνά\"."
# User prompt (Greek): "How does a kri-kri differ from a llama?"
user_prompt = "Σε τι διαφέρει ένα κρικρί από ένα λάμα;"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]
# Apply the chat template, tokenize, and generate a response.
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
input_prompt = tokenizer(prompt, return_tensors='pt').to(device)
outputs = model.generate(input_prompt['input_ids'], max_new_tokens=256, do_sample=True)

print(tokenizer.batch_decode(outputs)[0])
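
Since this repository hosts the GGUF quantizations, a minimal llama-cpp-python sketch is shown below. The quantization filename pattern, context size, and GPU offload settings are assumptions; adjust them to the files actually published in this repository.

# A minimal sketch for running a GGUF quantization with llama-cpp-python.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="ilsp/Llama-Krikri-8B-Instruct-GGUF",
    filename="*Q4_K_M.gguf",   # assumed quantization name; pick one of the files in this repo
    n_ctx=8192,                # reduce or raise depending on available memory
    n_gpu_layers=-1,           # offload all layers to the GPU if one is available
)

messages = [
    {"role": "system", "content": "Είσαι το Κρικρί, ένα εξαιρετικά ανεπτυγμένο μοντέλο Τεχνητής Νοημοσύνης για τα ελληνικά."},
    {"role": "user", "content": "Σε τι διαφέρει ένα κρικρί από ένα λάμα;"},
]
response = llm.create_chat_completion(messages=messages, max_tokens=256)
print(response["choices"][0]["message"]["content"])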

With an OpenAI-compatible server via vLLM

vllm serve ilsp/Llama-Krikri-8B-Instruct \
  --enforce-eager \
  --dtype 'bfloat16' \
  --api-key token-abc123

The model can then be queried from Python using the OpenAI client:

from openai import OpenAI

api_key = "token-abc123"
base_url = "http://localhost:8000/v1"

client = OpenAI(
    api_key=api_key,
    base_url=base_url,
)

# System prompt (Greek): "You are an advanced translation system that answers with Python lists.
# You write nothing in your answers besides the translated lists."
system_prompt = "Είσαι ένα ανεπτυγμένο μεταφραστικό σύστημα που απαντάει με λίστες Python. Δεν γράφεις τίποτα άλλο στις απαντήσεις σου πέρα από τις μεταφρασμένες λίστες."
# User prompt (Greek): "Give me the following list with every string in it translated into Greek: [...]"
user_prompt = "Δώσε μου την παρακάτω λίστα με μεταφρασμένο κάθε string της στα ελληνικά: ['Ethics of duty', 'Postmodern ethics', 'Consequentialist ethics', 'Utilitarian ethics', 'Deontological ethics', 'Virtue ethics', 'Relativist ethics']"

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]

response = client.chat.completions.create(model="ilsp/Llama-Krikri-8B-Instruct",
                                          messages=messages,
                                          temperature=0.0,
                                          top_p=0.95,
                                          max_tokens=8192,
                                          stream=False)

print(response.choices[0].message.content)
# ['Ηθική καθήκοντος', 'Μεταμοντέρνα ηθική', 'Συνεπειοκρατική ηθική', 'Ωφελιμιστική ηθική', 'Δεοντολογική ηθική', 'Ηθική αρετών', 'Σχετικιστική ηθική']
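
Because the system prompt instructs the model to reply with a bare Python list, the reply can be parsed directly. The small follow-up below assumes the `response` object from the example above and that the model followed the formatting instruction.

# Optional follow-up: turn the model's reply into an actual Python list.
import ast

translated = ast.literal_eval(response.choices[0].message.content.strip())
print(len(translated), "translated items")
print(translated[0])  # e.g. the Greek translation of 'Ethics of duty'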

Evaluation

🚨 Instruction following and chat capability evaluation benchmarks coming soon. 🚨

Acknowledgements

The ILSP team utilized Amazon's cloud computing services, which were made available via GRNET under the OCRE Cloud framework, providing Amazon Web Services for the Greek Academic and Research Community.
