ctoraman committed
Commit 79222db · 1 Parent(s): a796138

readme updated

Files changed (1):
  1. README.md +6 -1
README.md CHANGED
@@ -18,8 +18,13 @@ Model architecture is similar to bert-medium (8 layers, 8 heads, and 512 hidden
The details can be found at this paper:
https://arxiv.org/...

- The following code segment can be used for initializing the tokenizer, example max length (514) can be changed:
+ The following code can be used for model loading and tokenization; the example max length (514) can be changed:
```
+ model = AutoModel.from_pretrained([model_path])
+ # for sequence classification:
+ # model = AutoModelForSequenceClassification.from_pretrained([model_path], num_labels=[num_classes])
+
+ tokenizer = ByT5Tokenizer.from_pretrained("google/byt5-small")
tokenizer = PreTrainedTokenizerFast(tokenizer_file=[file_path])
tokenizer.mask_token = "[MASK]"
tokenizer.cls_token = "[CLS]"
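
For reference, here is a minimal, self-contained sketch of the updated snippet, assuming the Hugging Face transformers library. The concrete paths, the num_classes value, and the model_max_length assignment are illustrative placeholders, not part of the commit. Note also that in the committed snippet the ByT5Tokenizer assignment is immediately overwritten by the PreTrainedTokenizerFast line, so the sketch treats it as a commented-out character-level alternative.

```
from transformers import (
    AutoModel,
    AutoModelForSequenceClassification,
    ByT5Tokenizer,
    PreTrainedTokenizerFast,
)

model_path = "path/to/model"               # hypothetical; the README uses [model_path]
tokenizer_file = "path/to/tokenizer.json"  # hypothetical; the README uses [file_path]
num_classes = 2                            # hypothetical; the README uses [num_classes]

# Load the pretrained encoder, or the sequence-classification variant instead:
model = AutoModel.from_pretrained(model_path)
# model = AutoModelForSequenceClassification.from_pretrained(model_path, num_labels=num_classes)

# Character-level alternative shown in the README:
# tokenizer = ByT5Tokenizer.from_pretrained("google/byt5-small")
tokenizer = PreTrainedTokenizerFast(tokenizer_file=tokenizer_file)
tokenizer.mask_token = "[MASK]"
tokenizer.cls_token = "[CLS]"
tokenizer.model_max_length = 514  # the "example max length (514)"; where it is set is an assumption

# Tokenize an example and run a forward pass.
inputs = tokenizer("example text", return_tensors="pt", truncation=True)
outputs = model(**inputs)
```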