---
language:
- amh
tags:
- Amharic
- Word Piece Tokenizer
- Tokenizer
license: cc-by-4.0
---
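```
from transformers import AutoTokenizer

# Load the Amharic WordPiece tokenizer from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("israel/AmhWordPieceTokenizer")

# encode() returns a plain list of token ids, so map the ids back to token strings
encoding = tokenizer.encode("ኮሌጁ ቢያስተምርም ወደስራ የሚመድባቸው መንግስት ነው abcs")
print(tokenizer.convert_ids_to_tokens(encoding))
```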
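To illustrate how a WordPiece tokenizer splits a word it has not seen whole, here is a minimal sketch of greedy longest-match-first matching in plain Python. It is not this tokenizer's implementation: the `wordpiece_tokenize` helper, the `toy_vocab` set, the `##` continuation prefix, and the `[UNK]` token are all assumptions made for illustration, and the toy vocabulary is unrelated to the vocabulary shipped with `israel/AmhWordPieceTokenizer`.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]", prefix="##"):
    """Greedy longest-match-first split of a single word into pieces."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = prefix + piece  # non-initial pieces carry the prefix
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            # No piece fits at this position: the whole word is unknown.
            return [unk_token]
        tokens.append(match)
        start = end
    return tokens


# Hypothetical toy vocabulary, NOT the one used by israel/AmhWordPieceTokenizer.
toy_vocab = {"መንግስት", "ወደ", "##ስራ", "ab", "##c", "##s"}

for word in ["መንግስት", "ወደስራ", "abcs", "xyz"]:
    print(word, wordpiece_tokenize(word, toy_vocab))
```

With this toy vocabulary, a word stored whole stays in one piece, a word that is only partly covered is split into an initial piece plus prefixed continuation pieces, and a word with no matching piece collapses to the unknown token. The real tokenizer may also apply normalization and pre-tokenization steps that this sketch leaves out.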