---
language: en
tags:
- tokenizer
- pytorch
- streaming
library_name: nano
pipeline_tag: token-classification
datasets:
- Salesforce/wikitext
---
# Nano Tokenizer
This tokenizer was trained with a pure-Python pipeline (no `transformers` or `tokenizers` dependency) on the `Salesforce/wikitext` dataset, streamed from the Hugging Face Hub.
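This card does not say which algorithm the training pipeline uses. Purely as an illustration, the sketch below assumes a byte-pair-encoding (BPE) style trainer that reads text from an iterator, so the corpus never has to be fully loaded in memory. The names `stream_lines`, `count_words`, `merge_pair`, and `train_bpe` are hypothetical and are not part of `nano`; the two hard-coded lines stand in for a dataset streamed from the Hub.

```python
from collections import Counter
from typing import Iterable, Iterator


def stream_lines() -> Iterator[str]:
    """Hypothetical stand-in for a dataset streamed from the Hub."""
    yield "low lower lowest"
    yield "new newer newest"


def count_words(lines: Iterable[str]) -> Counter:
    """Consume the stream once, keeping only word counts in memory."""
    words = Counter()
    for line in lines:
        for word in line.split():
            words[tuple(word)] += 1  # each word starts as a tuple of characters
    return words


def merge_pair(symbols: tuple, pair: tuple) -> tuple:
    """Replace each occurrence of `pair` in `symbols` with the fused symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(lines: Iterable[str], num_merges: int) -> list:
    """Learn up to `num_merges` merges (assumed BPE-style, not nano's actual code)."""
    words = count_words(lines)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in words.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken by the pair itself.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Apply the merge to every word before the next round.
        new_words = Counter()
        for symbols, freq in words.items():
            new_words[merge_pair(symbols, best)] += freq
        words = new_words
    return merges


merges = train_bpe(stream_lines(), num_merges=3)
print(merges)
```

In a real run, `stream_lines` would be replaced by an iterator over the streamed dataset; only the word-count table is held in memory, which is what makes a streaming source workable here.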
## Usage
```python
from transformers import PreTrainedTokenizerFast

# Load the tokenizer files from the Hugging Face Hub
tokenizer = PreTrainedTokenizerFast.from_pretrained("goabonga/wikitext-2-raw-v1")
```