---
language: en
tags:
- exbert
license: apache-2.0
datasets:
- bookcorpus
- wikipedia
---

# VGCN-BERT (DistilBERT-based, uncased)

This is a VGCN-BERT model based on [DistilBert-base-uncased](https://huggingface.co/distilbert-base-uncased). The original paper is [VGCN-BERT](https://arxiv.org/abs/2004.05707).

### How to use

- First, prepare the WGraph symmetric adjacency matrix:

```python
import transformers as tfr
from transformers.models.vgcn_bert.modeling_graph import WordGraph

tokenizer = tfr.AutoTokenizer.from_pretrained("distilbert-base-uncased")

# 1st method: build the graph with the NPMI statistical method from a training corpus
# (rows is an iterable of raw training texts, e.g. a pandas Series such as train_valid_df["text"])
wgraph = WordGraph(rows=train_valid_df["text"], tokenizer=tokenizer)

# 2nd method: build the graph from pre-defined entity-relationship tuples with weights
entity_relations = [
    ("dog", "labrador", 0.6),
    ("cat", "garfield", 0.7),
    ("city", "montreal", 0.8),
    ("weather", "rain", 0.3),
]
wgraph = WordGraph(rows=entity_relations, tokenizer=tokenizer)
```

- Then instantiate the VGCN-BERT model with your WGraphs (multiple graphs are supported):

```python
from transformers.models.vgcn_bert.modeling_vgcn_bert import VGCNBertModel

model = VGCNBertModel.from_pretrained(
    "zhibinlu/vgcn-bert-distilbert-base-uncased",
    trust_remote_code=True,
    wgraphs=[wgraph.to_torch_sparse()],
    wgraph_id_to_tokenizer_id_maps=[wgraph.wgraph_id_to_tokenizer_id_map],
)

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors="pt")
output = model(**encoded_input)
```

## Fine-tune model

It is best to fine-tune the VGCN-BERT model for your specific downstream task.
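Since this checkpoint ships only the base encoder, one way to fine-tune is to attach your own task head on top of it. Below is a minimal sketch for binary sequence classification, assuming the model returns a standard `last_hidden_state` like DistilBERT and exposes its hidden size as `model.config.dim`; the `VGCNBertClassifier` class, labels, and hyperparameters are illustrative, not part of the released API.

```python
import torch
from torch import nn

# Illustrative classifier: a linear head over the encoder's pooled output.
class VGCNBertClassifier(nn.Module):
    def __init__(self, vgcn_bert, num_labels=2):
        super().__init__()
        self.vgcn_bert = vgcn_bert
        # Assumes the DistilBERT-style config attribute `dim` for hidden size.
        self.classifier = nn.Linear(vgcn_bert.config.dim, num_labels)

    def forward(self, input_ids, attention_mask=None):
        outputs = self.vgcn_bert(input_ids=input_ids, attention_mask=attention_mask)
        # Pool by taking the first token's hidden state, as in DistilBERT heads.
        pooled = outputs.last_hidden_state[:, 0]
        return self.classifier(pooled)

classifier = VGCNBertClassifier(model, num_labels=2)
optimizer = torch.optim.AdamW(classifier.parameters(), lr=2e-5)
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on a toy batch; replace with your own data loader.
batch = tokenizer(["great movie", "terrible movie"], return_tensors="pt", padding=True)
labels = torch.tensor([1, 0])

logits = classifier(batch["input_ids"], batch["attention_mask"])
loss = loss_fn(logits, labels)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

In a full training loop you would iterate this step over batches of your labeled corpus, ideally the same corpus used to build the NPMI WGraph, so the graph statistics match the task domain.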