This is a test model built from Sungkyunkwan University industry-academia collaboration data.
It was trained on the existing 107,000 examples plus 2,000 additional everyday-conversation examples.


The model was trained with EleutherAI/polyglot-ko-5.8b as the base model.
The training parameters are as follows.

batch_size: 128
micro_batch_size: 8
num_epochs: 3
learning_rate: 3e-4
cutoff_len: 1024
lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
weight_decay: 0.1
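
A few quantities follow from the parameters above and can be worked out directly. The sketch below is only illustrative: it assumes the common convention in which batch_size is the effective batch and micro_batch_size is the per-step batch (so the difference is made up by gradient accumulation), and it assumes all 109,000 examples are used for training with no held-out split. Neither assumption is confirmed by this card, and the TrainConfig class is a hypothetical container, not part of any training script.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container; the values are copied from the list above."""
    batch_size: int = 128
    micro_batch_size: int = 8
    num_epochs: int = 3
    learning_rate: float = 3e-4
    cutoff_len: int = 1024
    lora_r: int = 8
    lora_alpha: int = 16
    lora_dropout: float = 0.05
    weight_decay: float = 0.1

    @property
    def gradient_accumulation_steps(self) -> int:
        # Assumption: batch_size is the effective batch size, reached by
        # accumulating gradients over several micro-batches.
        return self.batch_size // self.micro_batch_size

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scaling factor: alpha / r.
        return self.lora_alpha / self.lora_r

    def optimizer_steps(self, num_examples: int) -> int:
        # Optimizer updates over the whole run, rounding each epoch up.
        steps_per_epoch = math.ceil(num_examples / self.batch_size)
        return steps_per_epoch * self.num_epochs


cfg = TrainConfig()
print(cfg.gradient_accumulation_steps)  # 16
print(cfg.lora_scaling)                 # 2.0
# Assumption: 107,000 + 2,000 examples, all used for training.
print(cfg.optimizer_steps(109_000))     # 2556
```

Under these assumptions, each optimizer update accumulates 16 micro-batches, and the LoRA update is scaled by a factor of 2.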


The measured KoBEST 10-shot scores are as follows.
[Image: score]


The model's prompt template is the KULLM template.
The test code is as follows.
https://colab.research.google.com/drive/1xEHewqHnG4p3O24AuqqueMoXq1E3AlT0?usp=sharing

from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer

model_name = "jojo0217/ChatSKKU5.8B"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_8bit=True,  # set to False to disable 8-bit quantization
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# The model is already placed on devices by device_map above,
# so the pipeline reuses the loaded model and tokenizer as-is.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

def answer(message):
    # KULLM prompt: instruction header, "### 명령어:" (instruction), then "### 응답:" (response).
    prompt = f"아래는 작업을 설명하는 명령어입니다. 요청을 적절히 완료하는 응답을 작성하세요.\n\n### 명령어:\n{message}"
    ans = pipe(
        prompt + "\n\n### 응답:",
        do_sample=True,
        max_new_tokens=512,
        temperature=0.7,
        repetition_penalty=1.0,
        return_full_text=False,  # return only the generated answer, not the prompt
        eos_token_id=2,
    )
    msg = ans[0]["generated_text"]
    return msg


# Example query: "Tell me about Sungkyunkwan University"
print(answer('성균관대학교에대해 알려줘'))
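
The prompt assembled inside answer() can also be factored into a small pure function, which makes it possible to check the template without loading the model. This is a minimal sketch: the names INSTRUCTION_HEADER and build_prompt are hypothetical, and the template text is copied from the test code above rather than from the KULLM repository.

```python
# Header sentence of the prompt, copied from the test code above.
INSTRUCTION_HEADER = (
    "아래는 작업을 설명하는 명령어입니다. "
    "요청을 적절히 완료하는 응답을 작성하세요."
)


def build_prompt(message: str) -> str:
    """Return the full prompt string that answer() passes to the pipeline."""
    return f"{INSTRUCTION_HEADER}\n\n### 명령어:\n{message}\n\n### 응답:"


# Inspect the exact text the model will see.
print(build_prompt("성균관대학교에대해 알려줘"))
```

Because the prompt ends with the "### 응답:" marker, the model continues from there, and return_full_text=False keeps only that continuation.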