Commit cd5f23a (verified), committed by DianLiI
1 Parent(s): aa383a0

Update README.md

Files changed (1)
  1. README.md +47 -1
README.md CHANGED
@@ -1,3 +1,49 @@
  ## AIDO.RNA 650M
 
- AIDO.RNA 650M is an RNA foundation model trained on 42 million non-coding RNA sequences at single-nucleotide resolution.
+ AIDO.RNA 650M is an RNA foundation model trained on 42 million non-coding RNA sequences at single-nucleotide resolution.
+
+ ## How to Use
+ ### Build any downstream models from this backbone
+ #### Embedding
+ ```python
+ from genbio_finetune.tasks import Embed
+ model = Embed.from_config({"model.backbone": "rnafm_650m_cds"}).eval()
+ collated_batch = model.collate({"sequences": ["ACGT", "AGCT"]})
+ embedding = model(collated_batch)
+ print(embedding.shape)
+ print(embedding)
+ ```
+ #### Sequence Level Classification
+ ```python
+ import torch
+ from genbio_finetune.tasks import SequenceClassification
+ model = SequenceClassification.from_config({"model.backbone": "rnafm_650m_cds", "model.n_classes": 2}).eval()
+ collated_batch = model.collate({"sequences": ["ACGT", "AGCT"]})
+ logits = model(collated_batch)
+ print(logits)
+ print(torch.argmax(logits, dim=-1))
+ ```
+ #### Token Level Classification
+ ```python
+ import torch
+ from genbio_finetune.tasks import TokenClassification
+ model = TokenClassification.from_config({"model.backbone": "rnafm_650m_cds", "model.n_classes": 3}).eval()
+ collated_batch = model.collate({"sequences": ["ACGT", "AGCT"]})
+ logits = model(collated_batch)
+ print(logits)
+ print(torch.argmax(logits, dim=-1))
+ ```
+ #### Regression
+ ```python
+ from genbio_finetune.tasks import SequenceRegression
+ model = SequenceRegression.from_config({"model.backbone": "rnafm_650m_cds"}).eval()
+ collated_batch = model.collate({"sequences": ["ACGT", "AGCT"]})
+ logits = model(collated_batch)
+ print(logits)
+ ```
+ #### Or use our one-liner CLI to finetune or evaluate any of the above!
+ ```
+ gbft fit --model SequenceClassification --model.backbone rnafm_650m_cds --data SequenceClassification --data.path <hf_or_local_path_to_your_dataset>
+ gbft test --model SequenceClassification --model.backbone rnafm_650m_cds --data SequenceClassification --data.path <hf_or_local_path_to_your_dataset>
+ ```
+ For more information, visit: [Model Generator](https://github.com/genbio-ai/modelgenerator)
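
Reviewer note on the two classification snippets added above: both end with `torch.argmax(logits, dim=-1)`, and the only difference a reader should expect is the shape of the result. The sketch below mimics that reduction in plain Python on made-up numbers, so it runs without the model. The shapes `(batch, n_classes)` and `(batch, seq_len, n_classes)` are assumptions about what the task heads return, not something the README states, and `argmax_last_dim` is a hypothetical helper, not part of `genbio_finetune`.

```python
# Illustrative only: plain-Python stand-in for torch.argmax(logits, dim=-1).
# The numbers and shapes are made up; they are not real model outputs.

def argmax_last_dim(values):
    """Recursively reduce the innermost list to the index of its largest entry."""
    if not isinstance(values[0], list):
        # Innermost dimension: return the index of the first maximum.
        return max(range(len(values)), key=lambda i: values[i])
    return [argmax_last_dim(v) for v in values]

# Assumed sequence-level shape: (batch=2, n_classes=2)
sequence_logits = [[0.2, 1.5], [0.9, -0.3]]

# Assumed token-level shape: (batch=2, seq_len=4, n_classes=3)
token_logits = [
    [[0.1, 0.7, 0.2], [1.2, 0.0, 0.3], [0.0, 0.1, 0.9], [0.5, 0.4, 0.1]],
    [[0.3, 0.2, 0.8], [0.6, 0.9, 0.1], [1.0, 0.2, 0.2], [0.1, 0.1, 0.4]],
]

sequence_pred = argmax_last_dim(sequence_logits)  # one label per sequence
token_pred = argmax_last_dim(token_logits)        # one label per token

print(sequence_pred)
print(token_pred)
```

Under these assumptions, the sequence-level result has one class index per input sequence, while the token-level result has one class index per position in each sequence.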