---
pipeline_tag: translation
language:
- ja
- ko
tags:
- python
- transformer
- pytorch
---
- GitHub: https://github.com/akpe12/JP-KR-ocr-translator-for-travel
- Usage
```
from transformers import (
    EncoderDecoderModel,
    PreTrainedTokenizerFast,
    BertTokenizerFast,
)

encoder_model_name = "cl-tohoku/bert-base-japanese-v2"
decoder_model_name = "skt/kogpt2-base-v2"

# Japanese source tokenizer (encoder) and Korean target tokenizer (decoder)
src_tokenizer = BertTokenizerFast.from_pretrained(encoder_model_name)
trg_tokenizer = PreTrainedTokenizerFast.from_pretrained(decoder_model_name)

model = EncoderDecoderModel.from_pretrained("figuringoutmine/translator-for-travel-jp-to-kr")
```
```
text = "豚骨ラーメン"

# Tokenize the Japanese source text
embeddings = src_tokenizer(text, return_attention_mask=False, return_token_type_ids=False, return_tensors='pt')
embeddings = {k: v for k, v in embeddings.items()}

# Generate the Korean translation and strip the leading/trailing special tokens
output = model.generate(**embeddings)[0, 1:-1]
print(trg_tokenizer.decode(output.cpu()))
```
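
The two steps above can be wrapped into a small helper for repeated use. This is only a convenience sketch; the `translate_jp_to_kr` name is illustrative and not part of the released code.

```
def translate_jp_to_kr(text: str) -> str:
    # Tokenize the Japanese input, generate with the encoder-decoder model,
    # and decode the Korean output, dropping the BOS/EOS special tokens.
    inputs = src_tokenizer(text, return_attention_mask=False, return_token_type_ids=False, return_tensors='pt')
    output = model.generate(**inputs)[0, 1:-1]
    return trg_tokenizer.decode(output.cpu())

print(translate_jp_to_kr("豚骨ラーメン"))
```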
- Quantitative evaluation on data related to traveling in Japan, using 1-gram BLEU score:

| System | BLEU (1-gram) |
| --- | --- |
| Papago | 51.9 |
| Google | 32.8 |
| **figuringoutmine/translator-for-travel-jp-to-kr** | **52.7** |
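
The evaluation script is not included in this card. Below is a minimal sketch of how a 1-gram BLEU score can be computed with NLTK; the reference and hypothesis lists are placeholders, and the tokenization is an assumption rather than the setup used for the numbers above.

```
from nltk.translate.bleu_score import corpus_bleu

# Placeholder data: substitute the travel-domain reference translations and
# the model's outputs, tokenized consistently.
references = [[["참고", "번역", "토큰"]]]  # one list of reference token lists per example
hypotheses = [["모델", "출력", "토큰"]]    # one token list per example

# weights=(1.0,) restricts BLEU to 1-gram precision
score = corpus_bleu(references, hypotheses, weights=(1.0,)) * 100
print(f"1-gram BLEU: {score:.1f}")
```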