yuyangdong committed on
Commit
496eed3
·
verified ·
1 Parent(s): fcc571f

Update README.md

Files changed (1)
  1. README.md +95 -0
README.md CHANGED
@@ -1,3 +1,98 @@
1
  ---
2
  license: cc-by-nc-4.0
 
 
3
  ---
 
1
  ---
2
  license: cc-by-nc-4.0
3
+ language:
4
+ - en
5
  ---
6
+ # Jellyfish-8B
7
+ <!-- Provide a quick summary of what the model is/does. -->
8
+ <!--
9
+ <img src="https://i.imgur.com/d8Bl04i.png" alt="PicToModel" width="330"/>
10
+ -->
11
+ <img src="https://i.imgur.com/E1vqCIw.png" alt="PicToModel" width="330"/>
12
+
13
+
14
+ ## Model Details
15
+ Jellyfish-8B is a large language model with 8 billion parameters.
16
+ We fine-tuned the [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) model on datasets for data preprocessing tasks.
17
+ The training data include two parts:
18
+ * Jellyfish-13B training data
19
+ * GPT-4-generated reasoning data for data preprocessing tasks
20
+
21
+ <!-- Jellyfish-7B vs GPT-3.5-turbo winning rate by GPT-4 evaluation is 56.36%. -->
22
+
23
+ More details about the model can be found in the [Jellyfish paper](https://arxiv.org/abs/2312.01678).
24
+
25
+ - **Developed by:** Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada
26
+ - **Contact:** [email protected]
27
+ - **Funded by:** NEC Corporation, Osaka University
28
+ - **Language(s) (NLP):** English
29
+ - **License:** Non-Commercial Creative Commons license (CC BY-NC-4.0)
30
+ - **Finetuned from model:** [Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
31
+ ## Citation
32
+
33
+ If you find our work useful, please give us credit by citing:
34
+
35
+ ```
36
+ @article{zhang2023jellyfish,
37
+ title={Jellyfish: A Large Language Model for Data Preprocessing},
38
+ author={Zhang, Haochen and Dong, Yuyang and Xiao, Chuan and Oyamada, Masafumi},
39
+ journal={arXiv preprint arXiv:2312.01678},
40
+ year={2023}
41
+ }
42
+ ```
43
+
44
+ ## Performance on seen tasks
45
+
46
+ | Task | Type | Dataset | Non-LLM SoTA<sup>1</sup> | GPT-3.5<sup>2</sup> | GPT-4<sup>2</sup> | Jellyfish-13B| Jellyfish-7B | Jellyfish-8B |
47
+ | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
48
+ | Entity Matching | Seen | Fodors-Zagats | 100 | 100 | 100 | 100 | 100 | 97.67 |
49
+ | Entity Matching | Seen | Beer | 94.37| 96.30 | 100 | 96.77 | 96.55| 100 |
50
+ | Entity Matching | Seen | iTunes-Amazon | 97.06| 96.43 | 100 | 98.11 | 96.30| 96.30 |
51
+ | Entity Matching | Seen | DBLP-ACM | 98.99| 96.99 | 97.44 | 98.98 | 98.88| 99.44 |
52
+ | Entity Matching | Seen | DBLP-GoogleScholar | 95.60| 76.12 | 91.87 | 98.51 | 95.15| 95.26 |
53
+ | Entity Matching | Seen | Amazon-Google | 75.58| 66.53 | 74.21 | 81.34 | 80.83 | 80.23 |
54
+ | Entity Matching | Unseen | Walmart-Amazon | 86.76| 86.17 | 90.27 | 89.42 | 85.64 | 90.81 |
55
+ | Entity Matching | Unseen | Abt-Buy | 89.33 | -- | 92.77 | 89.58 | 82.38 | 93.60 |
56
+ | Data Imputation | Seen | Restaurant | 77.20| 94.19 | 97.67 | 94.19 | 88.37 | 88.37 |
57
+ | Data Imputation | Seen | Buy | 96.50| 98.46 | 100 | 100 | 96.62 | 98.46 |
58
+ | Data Imputation | Unseen | Flipkart | 68.00 | -- | 89.94 | 81.68 | 79.44| 92.34 |
59
+ | Data Imputation | Unseen | Phone | 86.70| -- | 90.79 | 87.21 | 85.00| 86.93 |
60
+ | Error Detection | Seen | Hospital | 94.40| 90.74 | 90.74 | 95.59 | 96.27 | 93.66|
61
+ | Error Detection | Seen | Adult | 99.10| 92.01 | 92.01 | 99.33 | 91.96 | 92.98|
62
+ | Error Detection | Unseen | Flights | 81.00 | -- | 83.48 | 82.52 | 66.92 |
63
+ | Error Detection | Unseen | Rayyan | 79.00| -- | 81.95 | 90.65 | 69.82 |
64
+ | Schema Matching | Seen | Synthea | 38.50| 57.14 | 66.67 | 36.36 | 44.44 | 40.00 |
65
+ | Schema Matching | Seen | MIMIC | 20.00| -- | 40.00 | 40.00 | 40.00 | 40.82|
66
+ | Schema Matching | Unseen | CMS | 50.00| -- | 19.35 | 59.29 | 13.79 | 68.49|
67
+
68
+ _For GPT-3.5 and GPT-4, we used few-shot prompting on all datasets. For Jellyfish-13B and Jellyfish-Interpreter, few-shot prompting is disabled on seen datasets and enabled on unseen datasets._
69
+ _We use accuracy as the metric for data imputation and the F1 score for the other tasks._
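The two metrics named in the note above can be sketched in plain Python. This is a minimal illustration, not the authors' evaluation script: it assumes binary labels for the F1 tasks (for example, match vs. non-match), exact-match comparison for imputed values, and scores reported on a 0-100 scale as in the table. The function names and the toy labels are hypothetical.

```python
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Percentage of predictions that exactly equal the reference value."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        return 0.0
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def f1_score(
    y_true: Sequence[Hashable],
    y_pred: Sequence[Hashable],
    positive: Hashable = 1,
) -> float:
    """F1 of the positive class, as a percentage (assumes binary labels)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    # F1 = 2 * TP / (2 * TP + FP + FN), the harmonic mean of precision and recall.
    return 100.0 * 2 * tp / (2 * tp + fp + fn)


if __name__ == "__main__":
    # Toy labels for illustration only; these are not results from the table.
    gold = [1, 0, 1, 1, 0]
    pred = [1, 0, 0, 1, 1]
    print(f"accuracy = {accuracy(gold, pred):.2f}")
    print(f"F1       = {f1_score(gold, pred):.2f}")
```

Accuracy suits data imputation because each prediction is a single value that is either right or wrong, while F1 balances precision and recall on tasks where the positive class is typically rare.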
70
+
71
+ ## Performance on unseen tasks
72
+
73
+ ### Column Type Annotation
74
+
75
+ | Dataset | RoBERTa (159 shots)<sup>1</sup> | GPT-3.5<sup>1</sup> | GPT-4 | Jellyfish-13B| Jellyfish-7B | Jellyfish-8B |
76
+ | ---- | ---- | ---- | ---- | ---- | ----|----|
77
+ | SOTAB | 79.20 | 89.47 | 91.55 | 82.00 | 80.89 |
78
+
79
+ _Few-shot is disabled for Jellyfish-13B._
80
+
81
+ 1. Results from [Column Type Annotation using ChatGPT](https://arxiv.org/abs/2306.00745)
82
+
83
+ ### Attribute Value Extraction
84
+
85
+ | Dataset | Stable Beluga 2 70B<sup>1</sup> | SOLAR 70B<sup>1</sup> | GPT-3.5<sup>1</sup> | GPT-4<sup>1</sup> | Jellyfish-13B | Jellyfish-7B| Jellyfish-8B |
86
+ | ---- | ---- | ---- | ---- | ---- | ---- | ----| ----|
87
+ | AE-110k | 52.10 | 49.20 | 61.30 | 55.50 | 58.12 | 76.85|
88
+ | OA-Mine | 50.80 | 55.20 | 62.70 | 68.90 | 55.96 | 76.04|
89
+
90
+
91
+ ## Prompt Template
92
+ ```
93
+ [INST]:
94
+
95
+ <prompt> (without the <>)
96
+
97
+ [\INST]]
98
+ ```
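As a minimal sketch, the template above can be filled in with a small helper. The function name `build_prompt` and the example task text are hypothetical, and the sketch assumes the markers are meant to be reproduced character for character as shown, including the closing marker.

```python
def build_prompt(task_prompt: str) -> str:
    """Wrap a task prompt in the template shown above.

    Assumption: the opening and closing markers are copied verbatim from
    the template. The backslash in the closing marker is escaped so that
    Python treats it as a literal character.
    """
    return "[INST]:\n\n" + task_prompt + "\n\n[\\INST]]"


if __name__ == "__main__":
    # Hypothetical entity-matching question, for illustration only.
    example = (
        "Do the two product records refer to the same item?\n"
        "Record A: [name: 'usb-c cable 1m']\n"
        "Record B: [name: 'USB-C cable, 1 meter']"
    )
    print(build_prompt(example))
```

Keeping the wrapper in one function makes it easy to apply the same framing to every record when preprocessing a table row by row.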