Updated to do list
README.md
CHANGED
@@ -8,6 +8,19 @@ The main goals of this project are:
 2. Release the top performing models for further research and enhancement
 3. Release all of the preprocessing and postprocessing scripts and findings for future research.
 
+## TO DO LIST:
+- [x] Team members met and the following was discussed:
+  - Data preparation script is prepared that mixes CORD-19 and Pubmed.
+  - Agreed to finalize the training scripts by 9pm PDT 7/9/2021.
+  - Tokenizer is now trained.
+- [ ] Setup the pretraining script
+- [ ] Prepare the finetuning tasks inspired from [T5 Trivia Colab](https://colab.research.google.com/github/google-research/text-to-text-transfer-transformer/blob/master/notebooks/t5-trivia.ipynb)
+  - What datasets we want to go with?
+    - [Covid-QA](https://huggingface.co/datasets/covid_qa_deepset) (Maybe as test set?)
+    - [Trivia](https://huggingface.co/datasets/covid_qa_deepset)
+    - [CDC-QA](https://www.cdc.gov/coronavirus/2019-ncov/faq.html) (We can scrape quickly using beautiful soup or something)
+    - [More Medical Datasets](https://aclanthology.org/2020.findings-emnlp.289.pdf) (See the dataset section for inspiratio
+
 ## 1. Model
 
 We will be using T5 model.
@@ -35,4 +48,4 @@ We can make use of :
 
 ## 4. Additional Reading
 
-- [How Much Knowledge Can You Pack Into the Parameters of a Language Model?](https://arxiv.org/pdf/2002.08910.pdf)
+- [How Much Knowledge Can You Pack Into the Parameters of a Language Model?](https://arxiv.org/pdf/2002.08910.pdf)