AtomicGPT-gemma3-27b

AtomicGPT-gemma3-27b is a bilingual (Korean–English) nuclear-domain large language model developed through continual pre-training (CPT) and instruction tuning (IT) on a curated set of nuclear engineering datasets. This model serves as an open-weight variant of the AtomicGPT architecture described in the paper, enabling reproducible research in domain-specific LLM adaptation.


Model Overview

  • Base model: Gemma3-27B-pt
  • Languages: Korean, English
  • Domain: Nuclear engineering (reactor physics, safety, materials, regulation, terminology)
  • Training stages (a minimal training sketch follows this list):
    • Continual Pre-training (CPT) on nuclear-domain corpora
    • Instruction Tuning (IT) using bilingual nuclear QA datasets
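
The two stages can be sketched with standard Hugging Face tooling. The snippet below is an illustration only: the file names (nuclear_corpus.txt, nuclear_qa.jsonl), the prompt format, and the hyperparameters are placeholders rather than the configuration used to train AtomicGPT; see the paper for the actual recipe.

# Minimal sketch of the two-stage adaptation recipe (CPT, then IT).
# Data files and hyperparameters below are illustrative placeholders.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

base = "google/gemma-3-27b-pt"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Stage 1: continual pre-training on raw nuclear-domain text with the plain causal LM objective.
cpt_corpus = load_dataset("text", data_files={"train": "nuclear_corpus.txt"})["train"]
cpt_corpus = cpt_corpus.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=2048),
    batched=True, remove_columns=["text"],
)
Trainer(
    model=model,
    args=TrainingArguments(output_dir="cpt_out", num_train_epochs=1),
    train_dataset=cpt_corpus,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()

# Stage 2: instruction tuning on bilingual nuclear QA pairs, here formatted as
# single prompt+answer strings and trained with the same causal LM loss.
def format_qa(ex):
    text = f"Question: {ex['question']}\nAnswer: {ex['answer']}"
    return tokenizer(text, truncation=True, max_length=2048)

it_data = load_dataset("json", data_files={"train": "nuclear_qa.jsonl"})["train"]
it_data = it_data.map(format_qa, remove_columns=["question", "answer"])
Trainer(
    model=model,
    args=TrainingArguments(output_dir="it_out", num_train_epochs=3),
    train_dataset=it_data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()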

Evaluation

AtomicGPT-gemma3-27b was evaluated against its base model and GPT-4 on a bilingual nuclear-domain benchmark:

Model                          MCQ (EM, max 100)   Short answer (F1, %)   Descriptive (LLM judge, 1-10)
Gemma3-27B-pt (base)           35                  22.49                  5.23
AtomicGPT-gemma3-27b (ours)    49                  33.78                  7.14
GPT-4 (OpenAI)                 48                  31.29                  7.70

See Appendix A of the AtomicGPT paper for details.
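
As a rough illustration of the first two metrics, the helpers below compute exact match (EM) for the multiple-choice questions and token-level F1 for the short answers. The normalize, exact_match, and token_f1 functions are assumptions for illustration; the actual normalization rules, prompts, and the 1-10 judge rubric for descriptive answers are described in the paper.

import re
from collections import Counter

def normalize(text: str) -> str:
    # Lowercase and collapse whitespace/basic punctuation before comparison.
    return re.sub(r"[\s\.,;:!?]+", " ", text.lower()).strip()

def exact_match(prediction: str, gold: str) -> float:
    # MCQ score: 1 if the normalized answers are identical, else 0 (averaged over the set).
    return float(normalize(prediction) == normalize(gold))

def token_f1(prediction: str, gold: str) -> float:
    # Short-answer score: harmonic mean of token-level precision and recall.
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("B", "b"))                                                  # 1.0
print(round(token_f1("slows down neutrons", "it slows neutrons down"), 2))   # 0.86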


How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "KAERI-MLP/AtomicGPT-gemma3-27b"

# Load the tokenizer and the 27B checkpoint; bfloat16 weights with
# device_map="auto" keep the memory footprint manageable (requires accelerate).
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Plain text-completion prompt; the model also accepts Korean input.
prompt = "Explain the purpose of a neutron moderator."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
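
If the uploaded tokenizer ships with a chat template (not verified here), instruction-style prompts, including Korean ones, can instead be formatted with apply_chat_template; this is a hedged sketch, and the plain prompt above works either way.

messages = [
    # "Please explain the role of a neutron moderator." (Korean)
    {"role": "user", "content": "중성자 감속재의 역할을 설명해 주세요."}
]
if tokenizer.chat_template is not None:
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(input_ids, max_new_tokens=256)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))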