---
license: cc-by-sa-4.0
language:
- ja
tags:
- japanese input
- kana kanji conversion
---

# zenz-v1 Checkpoints

[zenz-v1](https://huggingface.co/Miwa-Keita/zenz-v1) is a language model specialized for kana-kanji conversion tasks based on the GPT-2 architecture. It is intended for use in the neural kana-kanji conversion system "Zenzai."

This repository publishes the checkpoints for zenz-v1.

* 90M parameters
* Character-level + byte-level BPE tokenizer
* High performance in kana-kanji conversion tasks using greedy decoding

## Model Details

### Model Description

The base model used is [ku-nlp/gpt2-small-japanese-char](https://huggingface.co/ku-nlp/gpt2-small-japanese-char) provided under [CC-BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/deed.ja).

This model is provided under [CC-BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/deed.ja).

- **Developed by:** Keita Miwa ([𝕏](https://twitter.com/miwa_ensan))
- **Model type:** GPT-2
- **Language(s) (NLP):** Japanese
- **License:** CC-BY-SA 4.0
- **Finetuned from model:** [ku-nlp/gpt2-small-japanese-char](https://huggingface.co/ku-nlp/gpt2-small-japanese-char)

### Model Sources

This model is intended for use with Zenzai (AzooKeyKanaKanjiConverter).

- **Repository:** https://github.com/ensan-hcl/AzooKeyKanaKanjiConverter

## Acknowledgements

The following libraries, tools, and language resources were utilized in constructing this model.

* MeCab (https://taku910.github.io/mecab/)
* ipadic-NEologd (https://github.com/neologd/mecab-ipadic-neologd)
* torch (https://pypi.org/project/torch/)
* transformers (https://pypi.org/project/transformers/)
* datasets (https://pypi.org/project/datasets/)
* jaconv (https://pypi.org/project/jaconv/)
* llama.cpp (https://github.com/ggerganov/llama.cpp)
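
## Greedy Decoding Sketch

The overview mentions that zenz-v1 is used with greedy decoding. As a rough illustration of that general technique only, the sketch below picks the highest-scoring next token at every step until an end marker appears. It is not the Zenzai implementation: the scoring table, the `<sep>` and `<eos>` markers, and the names `greedy_decode` and `toy_next_scores` are all hypothetical stand-ins, and the real prompt format and special tokens of zenz-v1 are not shown here.

```python
from typing import Callable, Dict, List

# Hypothetical toy "model": maps the last token to scores for the next token.
# A real setup would run a GPT-2 forward pass instead of a table lookup.
TOY_TABLE: Dict[str, Dict[str, float]] = {
    "<sep>": {"漢": 0.7, "感": 0.2, "<eos>": 0.1},
    "漢": {"字": 0.9, "<eos>": 0.1},
    "字": {"<eos>": 0.8, "漢": 0.2},
}


def toy_next_scores(tokens: List[str]) -> Dict[str, float]:
    """Return next-token scores given the sequence so far (toy lookup)."""
    return TOY_TABLE[tokens[-1]]


def greedy_decode(
    next_scores: Callable[[List[str]], Dict[str, float]],
    prompt: List[str],
    eos: str,
    max_new_tokens: int = 32,
) -> List[str]:
    """Greedy decoding: always append the single highest-scoring token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = next_scores(tokens)
        best = max(scores, key=scores.get)  # argmax over candidate tokens
        if best == eos:
            break
        tokens.append(best)
    # Return only the newly generated part, without the prompt.
    return tokens[len(prompt):]


# Hypothetical prompt: kana reading followed by an assumed separator token.
converted = greedy_decode(toy_next_scores, ["カ", "ン", "ジ", "<sep>"], "<eos>")
print("".join(converted))
```

Because greedy decoding keeps only one hypothesis per step, it needs a single forward pass per generated token, which is the main reason it suits latency-sensitive input methods; the trade-off is that it cannot recover from an early wrong choice the way beam search can.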