phjhk
/

hklegal-xlm-r-large-t

Inference Endpoints

Model card Files Files and versions Community

phjhk commited on Jul 29, 2022

Commit

4e4df97

·

1 Parent(s): 8386232

Update README.md

Files changed (1) hide show

README.md +100 -3

README.md CHANGED Viewed

@@ -1,3 +1,100 @@
 # Model Description
 The XLM-RoBERTa model was proposed in [Unsupervised Cross-lingual Representation Learning at Scale](https://arxiv.org/abs/1911.02116) by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov. It is based on Facebook's RoBERTa model released in 2019. It is a large multi-lingual language model, trained on 2.5TB of filtered CommonCrawl data. This model is [XLM-RoBERTa-large](https://huggingface.co/xlm-roberta-large) fine-tuned with the [conll2003](https://huggingface.co/datasets/conll2003) dataset in English.
@@ -14,9 +111,9 @@ Hong Kong Legal Information Institute [HKILL](https://www.hklii.hk/eng/) is a fr
 The model is a pretrained-finetuned language model. The model can be used for document classification, Named Entity Recognition (NER), especially on legal domain.
 ```python
->>> from transformers import pipeline
->>> tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-large-finetuned-conll03-english")
->>> model = AutoModelForTokenClassification.from_pretrained("xlm-roberta-large-finetuned-conll03-english")
 >>> classifier = pipeline("ner", model=model, tokenizer=tokenizer)
 >>> classifier("Alya told Jasmine that Andrew could pay with cash..")
 ```

+---
+language:
+- multilingual
+- af
+- am
+- ar
+- as
+- az
+- be
+- bg
+- bn
+- br
+- bs
+- ca
+- cs
+- cy
+- da
+- de
+- el
+- en
+- eo
+- es
+- et
+- eu
+- fa
+- fi
+- fr
+- fy
+- ga
+- gd
+- gl
+- gu
+- ha
+- he
+- hi
+- hr
+- hu
+- hy
+- id
+- is
+- it
+- ja
+- jv
+- ka
+- kk
+- km
+- kn
+- ko
+- ku
+- ky
+- la
+- lo
+- lt
+- lv
+- mg
+- mk
+- ml
+- mn
+- mr
+- ms
+- my
+- ne
+- nl
+- no
+- om
+- or
+- pa
+- pl
+- ps
+- pt
+- ro
+- ru
+- sa
+- sd
+- si
+- sk
+- sl
+- so
+- sq
+- sr
+- su
+- sv
+- sw
+- ta
+- te
+- th
+- tl
+- tr
+- ug
+- uk
+- ur
+- uz
+- vi
+- xh
+- yi
+- zh
+---
 # Model Description
 The XLM-RoBERTa model was proposed in [Unsupervised Cross-lingual Representation Learning at Scale](https://arxiv.org/abs/1911.02116) by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov. It is based on Facebook's RoBERTa model released in 2019. It is a large multi-lingual language model, trained on 2.5TB of filtered CommonCrawl data. This model is [XLM-RoBERTa-large](https://huggingface.co/xlm-roberta-large) fine-tuned with the [conll2003](https://huggingface.co/datasets/conll2003) dataset in English.
 The model is a pretrained-finetuned language model. The model can be used for document classification, Named Entity Recognition (NER), especially on legal domain.
 ```python
+>>> from transformers import pipeline,AutoTokenizer,AutoModelForTokenClassification
+>>> tokenizer = AutoTokenizer.from_pretrained("hklegal-xlm-r-large-t")
+>>> model = AutoModelForTokenClassification.from_pretrained("hklegal-xlm-r-large-t")
 >>> classifier = pipeline("ner", model=model, tokenizer=tokenizer)
 >>> classifier("Alya told Jasmine that Andrew could pay with cash..")
 ```