NECOUDBFM/Jellyfish-8B

@@ -43,39 +43,48 @@ If you find our work useful, please give us credit by citing:
 ## Performance on seen tasks
-| Task            | Type   | Dataset           | Non-LLM SoTA<sup>1</sup> | GPT-3  | GPT-3.5<sup>2</sup> | GPT-4<sup>2</sup>  | GPT-4o<sup>2</sup> | Table-GPT | Jellyfish-7B | Jellyfish-8B | Jellyfish-13B |
-|-----------------|--------|-------------------|-----------------|--------|---------|--------|--------|-----------|--------------|--------------|---------------|
-| Error Detection | Seen   | Adult             | *99.10*         | 99.10  | 92.01   | 92.01  | 83.58  | --        | 77.40        | 73.74        | **99.33**     |
-| Error Detection | Seen   | Hospital          | 94.40           | **97.80** | 90.74  | 90.74  | 44.76  | --        | 94.51        | 93.40        | *95.59*       |
-| Error Detection | Unseen | Flights           | 81.00           | --     | --     | **83.48** | 66.01  | --        | 69.15        | 66.21        | *82.52*       |
-| Error Detection | Unseen | Rayyan            | 79.00           | --     | --     | *81.95* | 68.53  | --        | 75.07        | 81.06        | **90.65**     |
-| Data Imputation | Seen   | Buy               | 96.50           | 98.50  | 98.46   | **100** | **100** | --        | 98.46        | 98.46        | **100**       |
-| Data Imputation | Seen   | Restaurant        | 77.20           | 88.40  | *94.19* | **97.67** | 90.70  | --        | 89.53        | 87.21        | 89.53         |
-| Data Imputation | Unseen | Flipkart          | 68.00           | --     | --     | **89.94** | 83.20  | --        | 87.14        | *87.48*      | 81.68         |
-| Data Imputation | Unseen | Phone             | 86.70           | --     | --     | **90.79** | 86.78  | --        | 86.52        | 85.68        | *87.21*       |
-| Schema Matching | Seen   | MIMIC-III         | 20.00           | --     | --     | 40.00   | 29.41  | --        | **53.33**    | *45.45*      | 40.00         |
-| Schema Matching | Seen   | Synthea           | 38.50           | 45.20  | *57.14* | **66.67** | 6.56   | --        | 55.56        | 47.06        | 56.00         |
-| Schema Matching | Unseen | CMS               | *50.00*         | --     | --     | 19.35   | 22.22  | --        | 42.86        | 38.10        | **59.29**     |
-| Entity Matching | Seen   | Amazon-Google     | 75.58           | 63.50  | 66.50   | 74.21  | 70.91  | 70.10     | **81.69**    | *81.42*      | 81.34         |
-| Entity Matching | Seen   | Beer              | 94.37           | **100** | 96.30  | **100** | 90.32  | 96.30     | **100.00**   | **100.00**   | 96.77         |
-| Entity Matching | Seen   | DBLP-ACM          | **98.99**       | 96.60  | 96.99   | 97.44  | 95.87  | 93.80     | 98.65        | 98.77        | *98.98*       |
-| Entity Matching | Seen   | DBLP-GoogleScholar| *95.70*         | 83.80  | 76.12   | 91.87  | 90.45  | 92.40     | 94.88        | 95.03        | **98.51**     |
-| Entity Matching | Seen   | Fodors-Zagats     | **100**         | **100** | **100** | **100** | 93.62  | **100**   | **100**      | **100**      | **100**       |
-| Entity Matching | Seen   | iTunes-Amazon     | 97.06           | *98.20*| 96.40   | **100** | 98.18  | 94.30     | 96.30        | 96.30        | 98.11         |
-| Entity Matching | Unseen | Abt-Buy           | 89.33           | --     | --     | **92.77** | 78.73  | --        | 86.06        | 88.84        | *89.58*       |
-| Entity Matching | Unseen | Walmart-Amazon    | 86.89           | 87.00  | 86.17   | **90.27** | 79.19  | 82.40     | 84.91        | 85.24        | *89.42*       |
-| Avg             |        |                   | 80.44           | -      | -      | *84.17* | 72.58  | -         | 82.74        | 81.55        | **86.02**     |
 _For GPT-3.5 and GPT-4, we used the few-shot approach on all datasets. However, for Jellyfish-13B and Jellyfish-Interpreter, the few-shot approach is disabled on seen datasets and enabled on unseen datasets._
 _Accuracy is the metric for data imputation; the F1 score is used for all other tasks._
 ## Performance on unseen tasks
 ### Column Type Annotation
-| Dataset | RoBERTa (159 shots)<sup>1</sup> | GPT-3.5<sup>1</sup> | GPT-4 | Jellyfish-13B | Jellyfish-7B | Jellyfish-8B |
-| ---- | ---- | ---- | ---- | ---- | ----|----|
-| SOTAB | 79.20 | 89.47 | 91.55 | 82.00 | 80.89 | 67.21|
 _Few-shot is disabled for Jellyfish-13B._
@@ -88,6 +97,7 @@ _Few-shot is disabled for Jellyfish-13B._
 | AE-110k | 52.10 | 49.20 | 61.30 | 55.50 | 58.12 | 76.85| 69.78|
 | OA-Mine | 50.80 | 55.20 | 62.70 | 68.90 | 55.96 | 76.04| 78.83|
 ## Prompt Template
 ```

 ## Performance on seen tasks
+| Task            | Type   | Dataset           | Non-LLM SoTA<sup>1</sup> | GPT-3.5<sup>2</sup> | GPT-4<sup>2</sup>  | GPT-4o | Table-GPT | Jellyfish-7B | Jellyfish-8B | Jellyfish-13B |
+|-----------------|--------|-------------------|-----------------|--------|--------|--------|-----------|--------------|--------------|---------------|
+| Error Detection | Seen   | Adult             | *99.10*         | 99.10  | 92.01  | 83.58  | --        | 77.40        | 73.74        | **99.33**     |
+| Error Detection | Seen   | Hospital          | 94.40           | **97.80** | 90.74  | 44.76  | --        | 94.51        | 93.40        | *95.59*       |
+| Error Detection | Unseen | Flights           | 81.00           | --     | **83.48** | 66.01  | --        | 69.15        | 66.21        | *82.52*       |
+| Error Detection | Unseen | Rayyan            | 79.00           | --     | *81.95* | 68.53  | --        | 75.07        | 81.06        | **90.65**     |
+| Data Imputation | Seen   | Buy               | 96.50           | 98.50  | **100** | **100** | --        | 98.46        | 98.46        | **100**       |
+| Data Imputation | Seen   | Restaurant        | 77.20           | 88.40  | **97.67** | 90.70  | --        | 89.53        | 87.21        | 89.53         |
+| Data Imputation | Unseen | Flipkart          | 68.00           | --     | **89.94** | 83.20  | --        | 87.14        | *87.48*      | 81.68         |
+| Data Imputation | Unseen | Phone             | 86.70           | --     | **90.79** | 86.78  | --        | 86.52        | 85.68        | *87.21*       |
+| Schema Matching | Seen   | MIMIC-III         | 20.00           | --     | 40.00   | 29.41  | --        | **53.33**    | *45.45*      | 40.00         |
+| Schema Matching | Seen   | Synthea           | 38.50           | 45.20  | **66.67** | 6.56   | --        | 55.56        | 47.06        | 56.00         |
+| Schema Matching | Unseen | CMS               | *50.00*         | --     | 19.35   | 22.22  | --        | 42.86        | 38.10        | **59.29**     |
+| Entity Matching | Seen   | Amazon-Google     | 75.58           | 63.50  | 74.21  | 70.91  | 70.10     | **81.69**    | *81.42*      | 81.34         |
+| Entity Matching | Seen   | Beer              | 94.37           | **100** | **100** | 90.32  | 96.30     | **100.00**   | **100.00**   | 96.77         |
+| Entity Matching | Seen   | DBLP-ACM          | **98.99**       | 96.60  | 97.44  | 95.87  | 93.80     | 98.65        | 98.77        | *98.98*       |
+| Entity Matching | Seen   | DBLP-GoogleScholar| *95.70*         | 83.80  | 91.87  | 90.45  | 92.40     | 94.88        | 95.03        | **98.51**     |
+| Entity Matching | Seen   | Fodors-Zagats     | **100**         | **100** | **100** | 93.62  | **100**   | **100**      | **100**      | **100**       |
+| Entity Matching | Seen   | iTunes-Amazon     | 97.06           | *98.20*| **100** | 98.18  | 94.30     | 96.30        | 96.30        | 98.11         |
+| Entity Matching | Unseen | Abt-Buy           | 89.33           | --     | **92.77** | 78.73  | --        | 86.06        | 88.84        | *89.58*       |
+| Entity Matching | Unseen | Walmart-Amazon    | 86.89           | 87.00  | **90.27** | 79.19  | 82.40     | 84.91        | 85.24        | *89.42*       |
+| Avg             |        |                   | 80.44           | -      | *84.17* | 72.58  | -         | 82.74        | 81.55        | **86.02**     |
 _For GPT-3.5 and GPT-4, we used the few-shot approach on all datasets. However, for Jellyfish-13B and Jellyfish-Interpreter, the few-shot approach is disabled on seen datasets and enabled on unseen datasets._
 _Accuracy is the metric for data imputation; the F1 score is used for all other tasks._
+1. Non-LLM SoTA methods:
+   - [Ditto](https://arxiv.org/abs/2004.00584) for Entity Matching
+   - [SMAT](https://www.researchgate.net/publication/353920530_SMAT_An_Attention-Based_Deep_Learning_Solution_to_the_Automation_of_Schema_Matching) for Schema Matching
+   - [HoloDetect](https://arxiv.org/abs/1904.02285) for Error Detection on seen datasets
+   - [RAHA](https://dl.acm.org/doi/10.1145/3299869.3324956) for Error Detection on unseen datasets
+   - [IPM](https://ieeexplore.ieee.org/document/9458712) for Data Imputation
+2. [Large Language Models as Data Preprocessors](https://arxiv.org/abs/2308.16361)
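
The note above names accuracy and the F1 score as the reported metrics. As a rough illustration only, the sketch below shows one common way to compute both on a 0-100 scale. The function names, the percentage scaling, and the choice of binary F1 on a single positive class are assumptions made for this example; the README does not state which F1 variant or evaluation code was used.

```python
def accuracy(predictions, references):
    """Percentage of predictions that exactly match their reference (assumed scoring rule)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


def f1_score(predictions, references, positive=True):
    """Binary F1 on the `positive` label, as a percentage (assumed F1 variant)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    pairs = list(zip(predictions, references))
    tp = sum(p == positive and r == positive for p, r in pairs)  # true positives
    fp = sum(p == positive and r != positive for p, r in pairs)  # false positives
    fn = sum(p != positive and r == positive for p, r in pairs)  # false negatives
    if tp == 0:
        # Precision and recall are both zero (or undefined); report 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical toy inputs, not taken from any benchmark in the tables.
imputation_acc = accuracy(["a", "b", "c", "d"], ["a", "b", "x", "d"])
matching_f1 = f1_score([True, True, False, False], [True, False, True, False])
```

Here the imputation-style example is scored by exact match, while the matching-style example is scored on the positive ("match") label only.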
 ## Performance on unseen tasks
 ### Column Type Annotation
+| Dataset           | RoBERTa (159 shots)<sup>1</sup> | GPT-3.5<sup>1</sup> | GPT-4  | GPT-4o | Jellyfish-7B | Jellyfish-8B | Jellyfish-13B |
+|--------|-----------------|--------|--------|--------|--------------|--------------|---------------|
+| SOTAB | 79.20 | 89.47 | 91.55 | 65.05 | 83 | 76.33 | 82 |
 _Few-shot is disabled for Jellyfish-13B._
 | AE-110k | 52.10 | 49.20 | 61.30 | 55.50 | 58.12 | 76.85| 69.78|
 | OA-Mine | 50.80 | 55.20 | 62.70 | 68.90 | 55.96 | 76.04| 78.83|
+1. Results from [Product Attribute Value Extraction using Large Language Models](https://arxiv.org/abs/2310.12537)
 ## Prompt Template
 ```