Llama-3-Motif-102B / README.md
metadata
license: mit

Introduction

We introduce Motif, a new family of language models from Moreh, specialized in Korean and English.
Motif-102B-Instruct is a chat model fine-tuned from the Motif-102B base model.

Training Platform

  • Motif-102B was trained on the MoAI platform; refer to the link for more information.

Quick Usage

The base model is not served directly. Instead, you can chat with Motif-102B-Instruct through our Model hub.

Details

More details will be provided in the upcoming technical report.

Release Date

2024.11.01

Benchmark Results

| Provider | Model | kmmlu_direct score |
|----------|-------|--------------------|
| Moreh | Motif-102B(pretrained) | 72.36 + |
| Moreh | Motif-102B-Instruct | 72.11 + |
| Meta | Llama3-70B-instruct | 54.5* |
| Meta | Llama3.1-70B-instruct | 52.1* |
| Meta | Llama3.1-405B-instruct | 65.8* |
| Alibaba | Qwen2-72B-instruct | 64.1* |
| OpenAI | GPT-4-0125-preview | 59.95* |
| OpenAI | GPT-4o-2024-05-13 | 64.11** |
| Google | gemini pro | 50.18* |
| LG | exaone 3.0 | 44.5* + |
| Naver | HyperCLOVA X | 53.4* + |
| Upstage | SOLAR-10.7B | 41.65* + |

* : Community report
** : Measured by Moreh
+ : Claimed to have better capability in Korean
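
To make the comparison easier to read, the scores above can be sorted and the gap to the next-best entry computed. The snippet below is a minimal sketch that only copies the numbers from the table; the variable names are illustrative and are not part of any Moreh tooling. Keep in mind that the scores come from different sources (community reports versus Moreh's own measurements), so the ranking is indicative rather than a controlled comparison.

```python
# kmmlu_direct scores copied from the table above (higher is better).
scores = {
    "Motif-102B(pretrained)": 72.36,
    "Motif-102B-Instruct": 72.11,
    "Llama3-70B-instruct": 54.5,
    "Llama3.1-70B-instruct": 52.1,
    "Llama3.1-405B-instruct": 65.8,
    "Qwen2-72B-instruct": 64.1,
    "GPT-4-0125-preview": 59.95,
    "GPT-4o-2024-05-13": 64.11,
    "gemini pro": 50.18,
    "exaone 3.0": 44.5,
    "HyperCLOVA X": 53.4,
    "SOLAR-10.7B": 41.65,
}

# Sort from highest to lowest score and print a simple leaderboard.
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for rank, (model, score) in enumerate(ranking, start=1):
    print(f"{rank:2d}. {model:<24} {score:.2f}")

# Gap between the instruct model and the best non-Motif entry.
best_other = max(s for m, s in scores.items() if not m.startswith("Motif"))
gap = round(scores["Motif-102B-Instruct"] - best_other, 2)
print(f"Motif-102B-Instruct leads the best non-Motif entry by {gap} points")
```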

How to use

We do not recommend using the base model directly!

Use with vLLM

  • Refer to this link to install vLLM.
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

# Set tensor_parallel_size to the number of GPUs you have available
model = LLM("moreh/Motif-102B", tensor_parallel_size=4)
tokenizer = AutoTokenizer.from_pretrained("moreh/Motif-102B")
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    # Korean prompt: "Explain the concept of the Big Bang theory to a kindergartner."
    {"role": "user", "content": "유치원생에게 빅뱅 이론의 개념을 설명해보세요"},
]

messages_batch = [tokenizer.apply_chat_template(conversation=messages, add_generation_prompt=True, tokenize=False)]

# vLLM does not use the Hugging Face generation_config here, so set the sampling parameters explicitly
sampling_params = SamplingParams(max_tokens=512, temperature=0, repetition_penalty=1.0, stop_token_ids=[tokenizer.eos_token_id])
responses = model.generate(messages_batch, sampling_params=sampling_params)

print(responses[0].outputs[0].text)
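
The sampling parameters above amount to plain greedy decoding: `temperature=0` selects the highest-scoring token at each step, and `repetition_penalty=1.0` leaves the scores untouched. The following is a conceptual sketch in pure Python, not vLLM's actual implementation; the helper names and the toy logits are made up for illustration, and the penalty rule shown is the commonly used one (divide positive logits, multiply negative ones).

```python
def apply_repetition_penalty(logits, seen_token_ids, penalty):
    """Lower the score of tokens that already appeared; penalty == 1.0 is a no-op."""
    adjusted = list(logits)
    for tok in set(seen_token_ids):
        # Positive logits are divided and negative ones multiplied, so both move down.
        if adjusted[tok] > 0:
            adjusted[tok] = adjusted[tok] / penalty
        else:
            adjusted[tok] = adjusted[tok] * penalty
    return adjusted


def pick_next_token(logits, temperature):
    """Treat temperature == 0 as greedy decoding (argmax over the logits)."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    raise NotImplementedError("random sampling is omitted in this sketch")


logits = [1.5, 3.0, -0.5, 2.9]  # toy scores for a 4-token vocabulary
seen = [1]                      # token 1 has already been generated

unchanged = apply_repetition_penalty(logits, seen, penalty=1.0)
print(unchanged == logits)                        # True: a penalty of 1.0 changes nothing
print(pick_next_token(unchanged, temperature=0))  # 1: the highest-scoring token

penalised = apply_repetition_penalty(logits, seen, penalty=1.2)
print(pick_next_token(penalised, temperature=0))  # 3: token 1 dropped to 2.5, below 2.9
```

With these settings the output is deterministic for a given prompt, which is convenient for reproducing the example.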

Use with transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "moreh/Motif-102B"

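# Generation settings are read from generation_config.json in the model repository.
# device_map="auto" (requires the accelerate package) spreads the weights across the
# available GPUs, since a 102B-parameter model does not fit on a single GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")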
tokenizer = AutoTokenizer.from_pretrained(model_id)
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    # Korean prompt: "Explain the concept of the Big Bang theory to a kindergartner."
    {"role": "user", "content": "유치원생에게 빅뱅 이론의 개념을 설명해보세요"},
]

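messages_batch = tokenizer.apply_chat_template(conversation=messages, add_generation_prompt=True, tokenize=False)
# Chat templates normally insert the special tokens themselves, so do not add them a second time.
input_ids = tokenizer(messages_batch, return_tensors='pt', add_special_tokens=False)['input_ids'].to(model.device)

outputs = model.generate(input_ids)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))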
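
Both examples above load a model of roughly 102 billion parameters (taken from the model name), so it helps to estimate how many GPUs the weights alone require. The sketch below is a back-of-the-envelope calculation under stated assumptions: 2 bytes per parameter (16-bit weights), 80 GB GPUs, and about 80% of each GPU usable for weights. None of these figures come from Moreh, and the function names are hypothetical.

```python
import math


def weight_memory_gb(num_params_billion, bytes_per_param=2):
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params_billion * bytes_per_param


def min_gpus(num_params_billion, gpu_memory_gb, bytes_per_param=2, usable_fraction=0.8):
    """Smallest GPU count whose usable memory covers the weights.

    usable_fraction is an assumed allowance for activations, KV cache and
    runtime overhead; tune it for your own setup.
    """
    needed = weight_memory_gb(num_params_billion, bytes_per_param)
    per_gpu = gpu_memory_gb * usable_fraction
    return math.ceil(needed / per_gpu)


print(weight_memory_gb(102))                  # 204 GB of 16-bit weights
print(min_gpus(102, 80))                      # 4 GPUs with 80 GB each
print(min_gpus(102, 80, bytes_per_param=4))   # 7 GPUs if loaded in 32-bit
```

Under these assumptions the estimate matches the `tensor_parallel_size=4` used in the vLLM example. In practice the tensor-parallel size usually also has to divide the model's attention-head count evenly, so a result such as 7 would typically be rounded up to 8.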