README: Update as per Rizky's suggestions.
README.md
CHANGED
@@ -14,32 +14,33 @@ https://www.biorxiv.org/content/10.1101/2023.08.08.552427v1.full
 
 ## Installation
 
-First
+First ensure you have installed git-lfs (including running `git lfs install`), as described at https://www.atlassian.com/git/tutorials/git-lfs#installing-git-lfs
+
+Then clone this repository, using
 
 ```
-
-$ git clone https://huggingface.co/wwood/aerobicity
+git clone https://huggingface.co/wwood/aerobicity
 ```
 
 Then setup the conda environment:
 
 ```
-
-
-
+cd aerobicity
+mamba env create -p env -f env-apply.yml
+conda activate ./env
 ```
 
 and download the eggNOG database. We use version 2.1.3, as specified in the `env-apply.yml` conda environment file, because this is what the predictor was trained on. The eggNOG database is large, so it is not included in the repository. To download it, run:
 
 ```
-
+download_eggnog_data.py
 ```
 
 ## Usage
 To apply the predictor, run against a test genome, replacing `EGGNOG_DATA_DIR` with the path to the eggNOG data directory:
 
 ```
-
+./17_apply_to_proteome.py --protein-fasta data/RS_GCF_000515355.1_protein.faa --eggnog-data-dir EGGNOG_DATA_DIR
  --models XGBoost.model --output-predictions predictions.csv
 ```
 
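The installation steps in this change rely on `git`, `git-lfs`, `mamba` and `conda` being available on the `PATH`. As a minimal sketch of a preflight check (not part of the repository; the helper name `missing_tools` is hypothetical, and it assumes git-lfs is installed as a `git-lfs` executable), something like the following could be run before cloning:

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository: prints each named tool
# that is not found on PATH and returns non-zero if any is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Tools used by the installation steps. Assumes git-lfs is installed as a
# `git-lfs` executable on PATH, which is the usual layout.
if ! missing_tools git git-lfs mamba conda; then
    echo "Install the tools listed above before cloning." >&2
fi
```

The check only confirms that the executables exist; it does not verify that `git lfs install` has been run or that the eggNOG database has been downloaded.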