zhengbao committed
Commit f6eb9f5 · 1 Parent(s): c50f20d

Create README.md

Files changed (1)
  1. README.md +51 -0
README.md ADDED
@@ -0,0 +1,51 @@
---
language: en
tags:
- omnitab
datasets:
- wikitablequestions
---

# OmniTab

OmniTab is a table-based QA model proposed in [OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering](https://arxiv.org/pdf/2207.03637.pdf). The original GitHub repository is [https://github.com/jzbjyb/OmniTab](https://github.com/jzbjyb/OmniTab).

## Description

`neulab/omnitab-large-finetuned-wtq` (based on the BART architecture) is initialized with `microsoft/tapex-large`, continually pretrained on natural and synthetic data, and fine-tuned on [WikiTableQuestions](https://huggingface.co/datasets/wikitablequestions).

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import pandas as pd

# Load the tokenizer and the fine-tuned seq2seq model from the Hub.
tokenizer = AutoTokenizer.from_pretrained("neulab/omnitab-large-finetuned-wtq")
model = AutoModelForSeq2SeqLM.from_pretrained("neulab/omnitab-large-finetuned-wtq")

# Build the table as a pandas DataFrame (one key per column).
data = {
    "year": [1896, 1900, 1904, 2004, 2008, 2012],
    "city": ["athens", "paris", "st. louis", "athens", "beijing", "london"],
}
table = pd.DataFrame.from_dict(data)

# Encode the table together with the question.
query = "In which year did beijing host the Olympic Games?"
encoding = tokenizer(table=table, query=query, return_tensors="pt")

# Generate the answer tokens.
outputs = model.generate(**encoding)

# Decode the generated tokens back into text.
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
# [' 2008']
```
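
The decoded output in the usage snippet keeps a leading space (`' 2008'`). If trimmed answer strings are more convenient, a minimal post-processing sketch could look like the following. `clean_answers` is a hypothetical helper introduced here for illustration; it is not part of `transformers` or the OmniTab repository.

```python
from typing import List


def clean_answers(decoded: List[str]) -> List[str]:
    """Strip surrounding whitespace from each decoded answer string.

    Hypothetical helper (an assumption, not an OmniTab or transformers API).
    """
    return [answer.strip() for answer in decoded]


# Applied to the decoded output shown in the usage snippet.
print(clean_answers([" 2008"]))
# ['2008']
```

In practice the argument would be the list returned by `tokenizer.batch_decode(outputs, skip_special_tokens=True)`.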

## Reference

```bibtex
@inproceedings{jiang-etal-2022-omnitab,
    title = "{O}mni{T}ab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering",
    author = "Jiang, Zhengbao and Mao, Yi and He, Pengcheng and Neubig, Graham and Chen, Weizhu",
    booktitle = "Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
    month = jul,
    year = "2022",
}
```