- The idea is to have one common base model with task-specific heads, rather than a separate model for every single task.
- In particular, I want to evaluate whether it is really necessary to fine-tune the base model as well, since it already contains a model of the language. Ideally, the task-specific heads could make up for skipping the fine-tuning of the base model.
- If the performance of the model is comparable, this could reduce training effort and resources.
- Either add another BERT layer per task, or just a multi-head self-attention layer.
3. Application - 10h
- A GUI that lets people enter a context (base text) and a question, and receive an answer.
- Will contain some SQuAD questions as examples.
4. Documentation - 2h
5. Presentation - 2h
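The shared-base idea in item 2 (one frozen common backbone, a lightweight head per task) can be sketched in PyTorch. This is a minimal illustration under stated assumptions, not the project's actual code: the tiny `TransformerEncoder` merely stands in for a pretrained DistilBERT, and `SharedBackboneQA` is a hypothetical name.

```python
import torch
import torch.nn as nn

class SharedBackboneQA(nn.Module):
    """Frozen shared encoder plus a task-specific QA head.

    The tiny encoder below is a stand-in for pretrained DistilBERT;
    in practice one would load the pretrained weights and freeze them,
    so that only the per-task head is trained.
    """
    def __init__(self, vocab_size=1000, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Freeze the shared base: only the head receives gradients.
        for p in list(self.embed.parameters()) + list(self.encoder.parameters()):
            p.requires_grad = False
        # Task-specific head: per-token start/end logits for extractive QA.
        self.qa_head = nn.Linear(hidden, 2)

    def forward(self, input_ids):
        hidden = self.encoder(self.embed(input_ids))
        start_logits, end_logits = self.qa_head(hidden).split(1, dim=-1)
        return start_logits.squeeze(-1), end_logits.squeeze(-1)

model = SharedBackboneQA()
ids = torch.randint(0, 1000, (2, 16))  # batch of 2 sequences, 16 tokens each
start, end = model(ids)                # each of shape (2, 16)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
# trainable contains only the qa_head parameters
```

Adding another task (e.g. classification) would mean adding another small head module next to `qa_head`, while the frozen base is shared.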
## Goal
Amount of time for each task:

* DistilBERT model: ~20h (without training time). This was very close to my estimate, because I relied heavily on the Huggingface library. Loading the data was easy, and the data is already very clean.
* QA model: ~40h (without training time). This took a lot of effort: my first approach didn't work, and I had to build a basic POC model before arriving at the final architecture.
* Application: 2h. Streamlit itself is easy to use, but I still ran into a number of issues with the application.
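The answer shown in the application has to be decoded from the QA model's output. As a sketch, assuming (as in standard SQuAD-style extractive QA) that the head produces per-token start and end logits — `extract_answer` is an illustrative helper, not the project's code:

```python
def extract_answer(tokens, start_logits, end_logits, max_len=30):
    """Pick the highest-scoring valid answer span.

    A span's score is its start logit plus its end logit; the end token
    must not precede the start, and span length is capped at max_len.
    """
    best_score, best_span = float("-inf"), (0, 0)
    for s in range(len(tokens)):
        for e in range(s, min(s + max_len, len(tokens))):
            score = start_logits[s] + end_logits[e]
            if score > best_score:
                best_score, best_span = score, (s, e)
    return " ".join(tokens[best_span[0]: best_span[1] + 1])

# Toy example: the logits point at the last token of the context.
tokens = "the eiffel tower is in paris".split()
start_logits = [0.0, 0.0, 0.0, 0.0, 0.0, 5.0]
end_logits = [0.0, 0.0, 0.0, 0.0, 0.0, 5.0]
print(extract_answer(tokens, start_logits, end_logits))  # -> paris
```

A Streamlit front end would then only need to collect the context and question (e.g. via `st.text_area` and `st.text_input`), run the model, and display the decoded span.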
## Data

- Aaron Gokaslan et al. OpenWebText Corpus. 2019. https://skylion007.github.io/OpenWebTextCorpus/: **OpenWebText**