---
datasets:
- PrompTart/PTT_advanced_en_ko
language:
- en
- ko
base_model:
- beomi/Llama-3-KoEn-8B-Instruct-preview
- meta-llama/Meta-Llama-3-8B
library_name: transformers
---

# Llama-3-KoEn-8B-Instruct-preview Fine-Tuned on the Parenthetical Terminology Translation (PTT) Dataset

## Model Overview

This is a **Llama-3-KoEn-8B-Instruct-preview** model fine-tuned on the [**Parenthetical Terminology Translation (PTT)**](https://arxiv.org/abs/2410.00683) dataset. [The PTT dataset](https://huggingface.co/datasets/PrompTart/PTT_advanced_en_ko) focuses on translating technical terms accurately by placing the original English term in parentheses alongside its Korean translation, enhancing clarity and precision in specialized fields. This fine-tuned model is optimized for handling technical terminology in the **Artificial Intelligence (AI)** domain.

## Example Usage

Here is how to use this fine-tuned model with the Hugging Face `transformers` library:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model and tokenizer
model_name = "PrompTartLAB/Llama3ko_8B_inst_PTT_enko"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example sentence
text = "The model was fine-tuned using knowledge distillation techniques. The training dataset was created using a collaborative multi-agent framework powered by large language models."
prompt = f"Translate input sentence to Korean \n### Input: {text} \n### Translated:"

# Tokenize the prompt and generate the translation
input_ids = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**input_ids, max_new_tokens=1024)

# Decode only the newly generated tokens (skip the prompt)
out_message = tokenizer.decode(outputs[0][len(input_ids["input_ids"][0]):], skip_special_tokens=True)
print(out_message)
# " 이 모델은 지식 증류 기법(knowledge distillation techniques)을 사용하여 미세 조정되었습니다. 훈련 데이터셋은 대형 언어 모델(large language models)로 구동되는 협력적 다중 에이전트 프레임워크(collaborative multi-agent framework)를 사용하여 생성되었습니다."
```

## Limitations

- **Out-of-Domain Accuracy**: While the model generalizes to some extent, accuracy may vary in domains that were not part of the training set.
- **Incomplete Parenthetical Annotation**: Not all technical terms are consistently annotated in parentheses; in some cases, terms may be omitted or left without the expected parenthetical English form.

## Citation

If you use this model in your research, please cite the original dataset and paper:

```bibtex
@misc{myung2024efficienttechnicaltermtranslation,
      title={Efficient Technical Term Translation: A Knowledge Distillation Approach for Parenthetical Terminology Translation},
      author={Jiyoon Myung and Jihyeon Park and Jungki Son and Kyungro Lee and Joohyung Han},
      year={2024},
      eprint={2410.00683},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2410.00683},
}
```

## Contact

For questions or feedback, please contact [aeolian83@gmail.com](mailto:aeolian83@gmail.com).