Update README.md
README.md
CHANGED
@@ -7,31 +7,8 @@ Our pre-print can be found here: https://arxiv.org/abs/2310.19727.

 *Authors: Samuel Belkadi, Nicolo Micheletti, Lifeng Han, Warren Del-Pinto, Goran Nenadic.*

-##
-
-- Python version: 3.11.7
-- Pip version: 23.3.1
-
-Then, required packages must be installed by running the following command at the root of the repository:
-```pip install -r requirements.txt```
-
-## Usage: Generate data
-In order to generate synthetic data, you must follow the steps given below:
-
-1. First, you should create/edit the file named `medications.txt` at the root of the repository. This file may include medication names as well as the number of prescriptions to generate for each medication. Each line must provide a medication name and its number of generations with the form `name:amount` (e.g., `aspirin:5` to get five different prescriptions of aspirin).
-
-2. Then, the file `generate.py` must be called to generate the desired synthetic data. You can call this script using the command ```python3 generate.py``` where the following arguments can be added:
-- -in (--input_path): path to the input dictionary file (default: './medications.txt');
-- -out (--output_path): path to output file (default: './generations.json');
-
-- -bs (--beam_size): beam size for beam search decoding (default: 4);
-- -mspd (--max_step_prob_diff): maximal step probability difference for beam search decoding (default: 1.0);
-- -nrpl (--no_repetition_length): minimal length between repeated special characters in generations (default: 4);
-- -a (--alpha): alpha value (hyperparameter) for beam search decoding (default: 0.6);
-- -tlp (--tree_length_product): tree length product for beam search decoding (default: 3);
-
-3. Finally, the generated prescriptions will be available in the desired output file. Note that the number of desired prescriptions may exceed the maximum possible generations if you asked for too many prescriptions and saturated the decoding tree, or have over-restricted the model with your chosen hyperparameters (arguments). In this case, some prescriptions may be given as `-` to signify that they could not be generated.
-
+## Usage
+In order to generate synthetic data, you can follow the instructions given on our GitHub repository: https://github.com/SamySam0/LT3 .

 ## Evaluation results

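For readers who only see this diff, the usage text removed above describes two things that a short sketch may make concrete: the `name:amount` line format of `medications.txt`, and the command-line options of `generate.py` with their documented defaults. The Python below is only an illustration under stated assumptions. It is not the repository's code: the function names `parse_medications` and `build_arg_parser` are hypothetical, the argument types are inferred from the documented defaults, and splitting on the last colon is an assumption about how names containing `:` would be handled.

```python
import argparse


def parse_medications(lines):
    """Parse `name:amount` lines into (name, amount) pairs (hypothetical helper)."""
    requests = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        # Assumption: split on the LAST colon so a name containing ':' stays intact.
        name, sep, amount = line.rpartition(":")
        if not sep or not name.strip():
            raise ValueError(f"expected 'name:amount', got {line!r}")
        requests.append((name.strip(), int(amount)))
    return requests


def build_arg_parser():
    """Mirror the flags and defaults listed in the removed README text.

    Types are assumptions inferred from the documented default values.
    """
    parser = argparse.ArgumentParser(description="Sketch of the documented options")
    parser.add_argument("-in", "--input_path", default="./medications.txt")
    parser.add_argument("-out", "--output_path", default="./generations.json")
    parser.add_argument("-bs", "--beam_size", type=int, default=4)
    parser.add_argument("-mspd", "--max_step_prob_diff", type=float, default=1.0)
    parser.add_argument("-nrpl", "--no_repetition_length", type=int, default=4)
    parser.add_argument("-a", "--alpha", type=float, default=0.6)
    parser.add_argument("-tlp", "--tree_length_product", type=int, default=3)
    return parser


if __name__ == "__main__":
    args = build_arg_parser().parse_args()
    with open(args.input_path, encoding="utf-8") as handle:
        for name, amount in parse_medications(handle):
            print(f"{name}: {amount} prescription(s) requested")
```

Under these assumptions, a file containing the single line `aspirin:5` yields one request for five aspirin prescriptions. The actual generation step is left out because the diff does not describe it.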