Model Card for Llama-3-Open-Ko-8B
Model Details
Llama-3-Open-Ko-8B is a continued-pretrained language model based on Llama-3-8B.
The model was trained entirely on publicly available resources, comprising 60GB+ of deduplicated texts.
With the new Llama-3 tokenizer, pretraining used 17.7B+ tokens, slightly more than the same corpus yields with the Korean Llama-2-Ko tokenizer.
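Because the token count depends on the tokenizer, the difference can be checked directly. Below is a minimal sketch (not from the original card), assuming the Hub repo ids beomi/Llama-3-Open-Ko-8B for this model and beomi/llama-2-ko-7b for the Llama-2-Ko tokenizer.

from transformers import AutoTokenizer

# Hypothetical comparison: count the tokens produced for the same Korean sentence
# by the Llama-3 tokenizer used here and by the older Llama-2-Ko tokenizer.
text = "자연어 처리는 인공지능의 한 분야입니다."
for repo_id in ("beomi/Llama-3-Open-Ko-8B", "beomi/llama-2-ko-7b"):  # assumed repo ids
    tok = AutoTokenizer.from_pretrained(repo_id)
    print(repo_id, len(tok.tokenize(text)))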
Sample usage
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

model_id = "beomi/Llama-3-Open-Ko-8B"  # repo id assumed from the model name in this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

pipe = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    truncation=True,
)
def extract_response_llama3(question):
    messages = [
        {"role": "system", "content": ""},
        {"role": "user", "content": question},
    ]
    # Build the Llama-3 chat prompt from the messages.
    prompt = pipe.tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    # Stop on either the regular EOS token or Llama-3's end-of-turn token.
    terminators = [
        pipe.tokenizer.eos_token_id,
        pipe.tokenizer.convert_tokens_to_ids("<|eot_id|>")
    ]
    outputs = pipe(
        prompt,
        max_new_tokens=256,
        eos_token_id=terminators,
        do_sample=True,
        temperature=0.1,
        top_p=0.9,
        num_return_sequences=1
    )
    # The pipeline returns the prompt plus the completion; keep only the last line,
    # which holds the short answer for these sample questions.
    return outputs[0]['generated_text'].split('\n')[-1]
# "When allocating a budget, what do you call the method of ranking projects by priority and supporting them differentially?"
question = "예산을 분배할 때 사업의 우선순위를 정해서 차등 지원하는 방법을 뭐라고 하지"
response = extract_response_llama3(question)
print(response)

# "Which body enacts the law for reducing and comprehensively managing emissions of fine-dust-forming substances?"
question = "미세먼지 생성물질의 배출을 저감하고 종합적으로 관리하기 위한 법은 어디서 제정하나"
response = extract_response_llama3(question)
print(response)

# "For which kind of place was the legal basis of an air-pollution prevention policy prepared through the enactment of a special act?"
question = "어떤 장소의 대기오염을 방지하기 위한 정책의 법적 근거가 특별법의 제정으로 준비되었지"
response = extract_response_llama3(question)
print(response)
Sample Output
선택과 집중 ("selection and concentration")
환경부 (Ministry of Environment)
항만 (ports)