manu/lilt-camembert-dit-base-hf

Token Classification

liltrobertalike

manu committed on Apr 19, 2022

Commit 072a302 · 1 Parent(s): 3ce8ae7

Create README.md

Files changed (1)

README.md +45 -0

	@@ -0,0 +1,45 @@

+---
+language:
+- fr
+tags:
+- token-classification
+- fill-mask
+license: mit
+datasets:
+- iit-cdip
+---
+This model combines the camembert-base model with the pretrained LiLT checkpoint from the paper "LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding", and adds a visual backbone built from the pretrained checkpoint "microsoft/dit-base".
+Original repository: https://github.com/jpWang/LiLT
+To use it, you need to fork the modeling and configuration files from the original repository and load the pretrained model with the corresponding classes (LiLTRobertaLikeVisionConfig, LiLTRobertaLikeVisionForRelationExtraction, LiLTRobertaLikeVisionForTokenClassification, LiLTRobertaLikeVisionModel).
+These classes can also be registered with the transformers AutoConfig/AutoModel factories as follows:
+```python
+from transformers import AutoConfig, AutoModel, AutoModelForTokenClassification
+
+# Placeholder module: point this at your fork of the LiLT modeling/configuration files
+from path_to_custom_classes import (
+    LiLTRobertaLikeVisionConfig,
+    LiLTRobertaLikeVisionForRelationExtraction,
+    LiLTRobertaLikeVisionForTokenClassification,
+    LiLTRobertaLikeVisionModel,
+)
+
+
+def patch_transformers():
+    # Map the "liltrobertalike" model type to the custom config class
+    AutoConfig.register("liltrobertalike", LiLTRobertaLikeVisionConfig)
+    # Map the config class to the model classes the Auto* factories should build
+    AutoModel.register(LiLTRobertaLikeVisionConfig, LiLTRobertaLikeVisionModel)
+    AutoModelForTokenClassification.register(LiLTRobertaLikeVisionConfig, LiLTRobertaLikeVisionForTokenClassification)
+    # etc...
+```
+The model can then be loaded with:
+```python
+# patch_transformers() must have been executed beforehand
+from transformers import AutoModel, AutoModelForTokenClassification, AutoTokenizer
+
+tokenizer = AutoTokenizer.from_pretrained("camembert-base")
+# Base model, without a task-specific head
+model = AutoModel.from_pretrained("manu/lilt-camembert-dit-base-hf")
+# Alternatively: model with a token classification head, to be fine-tuned on a token classification task
+model = AutoModelForTokenClassification.from_pretrained("manu/lilt-camembert-dit-base-hf")
+```
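
Note on the README above: the registration step matters because the Auto* factories pick a class from the `model_type` string ("liltrobertalike" in the snippet) before any weights are loaded. The toy sketch below illustrates that lookup with plain Python only. It is not the transformers implementation: `ToyAutoRegistry`, `DummyConfig` and `DummyModel` are hypothetical names introduced purely for illustration.

```python
class ToyAutoRegistry:
    """Toy stand-in for an Auto* factory (hypothetical, not the transformers code)."""

    def __init__(self):
        self._config_by_type = {}   # model_type string -> config class
        self._model_by_config = {}  # config class -> model class

    def register_config(self, model_type, config_cls):
        # In this toy version, registering the same model_type twice is an error.
        if model_type in self._config_by_type:
            raise ValueError(f"{model_type!r} is already registered")
        self._config_by_type[model_type] = config_cls

    def register_model(self, config_cls, model_cls):
        self._model_by_config[config_cls] = model_cls

    def resolve(self, model_type):
        # Two-step lookup: model_type -> config class -> model class.
        if model_type not in self._config_by_type:
            raise KeyError(f"unknown model_type {model_type!r}; register it first")
        return self._model_by_config[self._config_by_type[model_type]]


class DummyConfig:
    """Hypothetical placeholder for a custom config class."""


class DummyModel:
    """Hypothetical placeholder for a custom model class."""


registry = ToyAutoRegistry()
registry.register_config("liltrobertalike", DummyConfig)
registry.register_model(DummyConfig, DummyModel)
resolved = registry.resolve("liltrobertalike")
```

In this sketch, resolving a type that was never registered fails with a `KeyError`, which mirrors why `patch_transformers()` has to run before `from_pretrained` is called.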