Marchanjo committed on
Commit 29a2544 · verified · 1 Parent(s): 135e30a

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -3,7 +3,7 @@ license: apache-2.0
  ---
  more information: [mRAT-SQL](https://github.com/C4AI/gap-text2sql).

- # mRAT-SQL-FIT - A Multilingual Translator to SQL with Database Schema Pruning to Improve Self-Attention
+ # mRAT-SQL-FIT
  Code and model from the paper [paper published in Springer-Nature - International Journal of Information Technology](https://doi.org/10.1007/s41870-023-01342-3), [here the SharedIt link](https://rdcu.be/dff19).

  ## A Multilingual Translator to SQL with Database Schema Pruning to Improve Self-Attention
@@ -11,8 +11,8 @@ Marcelo Archanjo Jose, Fabio Gagliardi Cozman

  Long sequences of text are challenging in the context of transformers, due to quadratic memory increase in the self-attention mechanism. As this issue directly affects the translation from natural language to SQL queries (as techniques usually take as input a concatenated text with the question and the database schema), we present techniques that allow long text sequences to be handled by transformers with up to 512 input tokens. We propose a training process with database schema pruning (removal of tables and columns names that are useless for the query of interest). In addition, we used a multilingual approach with the mT5-large model fine-tuned with a data-augmented Spider dataset in four languages simultaneously: English, Portuguese, Spanish, and French. Our proposed technique used the Spider dataset and increased the exact set match accuracy results from 0.718 to 0.736 in a validation dataset (Dev). Source code, evaluations, and checkpoints are available at: [mRAT-SQL](https://github.com/C4AI/gap-text2sql). .

- # mRAT-SQL+GAP - Multilingual version of the RAT-SQL+GAP
- Code and model from our BRACIS 2021:
+ # mRAT-SQL+GAP
+ Code and model from BRACIS 2021:

  ## mRAT-SQL+GAP:A Portuguese Text-to-SQL Transformer
  Marcelo Archanjo José, Fabio Gagliardi Cozman
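
Note on the abstract quoted in the diff: the schema pruning it describes (dropping table and column names that the query of interest does not need, so that the concatenated question and schema fit in 512 input tokens) can be illustrated with a minimal sketch. This is not the repository's implementation. It assumes the gold SQL query is available at training time and uses a simple name-matching heuristic; `prune_schema`, `serialize`, the example schema, and the ` | ` separator are all hypothetical names and choices made for illustration only.

```python
import re


def prune_schema(schema, sql):
    """Keep only tables and columns whose names occur in the SQL query.

    Assumption: plain identifier matching stands in for whatever
    pruning rule the paper actually uses.
    """
    # Collect lowercase identifier-like tokens from the query.
    tokens = set(re.findall(r"[a-z_][a-z0-9_]*", sql.lower()))
    pruned = {}
    for table, columns in schema.items():
        if table.lower() in tokens:
            pruned[table] = [c for c in columns if c.lower() in tokens]
    return pruned


def serialize(question, schema):
    """Concatenate the question with the schema (hypothetical format)."""
    parts = [f"{table}: {', '.join(cols)}" for table, cols in schema.items()]
    return " | ".join([question] + parts)


# Hypothetical Spider-like schema and gold query.
schema = {
    "singer": ["singer_id", "name", "country", "age"],
    "concert": ["concert_id", "concert_name", "year"],
    "stadium": ["stadium_id", "location", "capacity"],
}
sql = "SELECT name FROM singer WHERE age > 30"

pruned = prune_schema(schema, sql)
text = serialize("Which singers are older than 30?", pruned)
print(text)
```

Here the pruned input keeps only the `singer` table with the `name` and `age` columns, so the text handed to the model is much shorter than the serialization of the full schema.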