Upload README.md
README.md
CHANGED
@@ -9,86 +9,87 @@ app_file: app.py
pinned: true
---

# 🧬 GeneForgeLang: Symbolic-to-Sequence & Cross-Modality Biomolecular Design Toolkit

**GeneForgeLang** is a symbolic and generative language for cross-modality biomolecular design.
It enables unified AI-powered workflows to **design, interpret, and translate DNA, RNA, and protein sequences** using a compact, human-readable grammar.

This project provides:
- **A symbolic language** spanning all biological layers (genomic, transcriptomic, proteomic)
- **Realistic sequence generation** via AI models such as ProtGPT2
- **Scientific interpretation** of symbolic phrases in natural language
- **Cross-modality transcoders** (e.g., DNA → RNA → Protein and vice versa)
- **An interactive Gradio-based UI** for easy use and integration

---

## 🚀 Key Features

| Module                     | Description |
|----------------------------|-------------|
| 🧠 Phrase → Sequence       | Generate DNA, RNA, or protein from symbolic design |
| 🔁 Transcode Phrases       | Translate GeneForgeLang phrases across modalities |
| 📖 Phrase → Description    | Generate scientific English descriptions of symbolic inputs |
| 🔄 Sequence → Phrase       | Infer functional phrases from real sequences |
| 🧬 Mutate Sequence (WIP)   | Generate variants for symbolic seeds (under development) |
| 📦 Export to FASTA (WIP)   | Save generated sequences to .fasta (to be implemented) |
| 📊 Analyze Sequence (WIP)  | Visualize amino acid composition or base content |

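
The WIP rows for FASTA export and sequence analysis describe fairly standard operations. As a rough illustration only, and not the project's implementation, the sketch below computes residue or base fractions and formats sequences as FASTA text. The function names `composition` and `to_fasta` and the 60-character line width are assumptions chosen for this example.

```python
from collections import Counter


def composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each residue (or base) in the sequence."""
    seq = sequence.upper()
    if not seq:
        return {}
    counts = Counter(seq)
    return {symbol: n / len(seq) for symbol, n in sorted(counts.items())}


def to_fasta(records: dict[str, str], width: int = 60) -> str:
    """Format {header: sequence} pairs as FASTA text with wrapped sequence lines."""
    lines = []
    for header, seq in records.items():
        lines.append(f">{header}")
        # Slice the sequence into fixed-width chunks, one per output line.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(composition("AACG"))
    print(to_fasta({"demo_sequence": "ACGT" * 20}), end="")
```

The same `composition` function works for protein sequences, since it only counts characters and makes no assumption about the alphabet.
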

---

## 🧪 Example Input Phrases

```text
~d:Prom[TATA]-Exon1-Intr1-Exon2
↓
:r:Cap5'-Ex1-Ex2-UTR3'
↓
^p:Dom(Kin)-Mot(NLS)*AcK@147
```

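
To make the grammar concrete, here is a minimal parsing sketch based only on the example above. It assumes that `~d:`, `:r:`, and `^p:` mark DNA, RNA, and protein phrases, that `-` separates elements, and that `↓` lines link one layer to the next. The names `Phrase`, `parse_phrase`, `parse_cascade`, and `dna_to_rna` are hypothetical, and the splicing rule in `dna_to_rna` is inferred from this single example rather than taken from `transcoder.py`.

```python
import re
from dataclasses import dataclass

# Sigils read off the README example; treating them as fixed
# modality markers is an assumption of this sketch.
MODALITY_PREFIXES = {"~d:": "DNA", ":r:": "RNA", "^p:": "protein"}


@dataclass
class Phrase:
    modality: str
    elements: list[str]


def parse_phrase(line: str) -> Phrase:
    """Split one phrase line into its modality and '-'-separated elements."""
    line = line.strip()
    for prefix, modality in MODALITY_PREFIXES.items():
        if line.startswith(prefix):
            return Phrase(modality, line[len(prefix):].split("-"))
    raise ValueError(f"unrecognized modality prefix in {line!r}")


def parse_cascade(text: str) -> list[Phrase]:
    """Parse a multi-line design, skipping blank lines and '↓' connector lines."""
    stripped = (line.strip() for line in text.splitlines())
    return [parse_phrase(line) for line in stripped if line and line != "↓"]


def dna_to_rna(dna: Phrase) -> Phrase:
    """Illustrative rule inferred from the example: keep exons, add cap and 3' UTR."""
    exons = [re.sub(r"^Exon", "Ex", e) for e in dna.elements if e.startswith("Exon")]
    return Phrase("RNA", ["Cap5'", *exons, "UTR3'"])


EXAMPLE = """
~d:Prom[TATA]-Exon1-Intr1-Exon2
↓
:r:Cap5'-Ex1-Ex2-UTR3'
↓
^p:Dom(Kin)-Mot(NLS)*AcK@147
"""

if __name__ == "__main__":
    for phrase in parse_cascade(EXAMPLE):
        print(phrase.modality, phrase.elements)
```

Modifiers such as `*AcK@147` stay attached to their element here, because their syntax is not specified in this README.
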

---

## ▶️ How to Use Locally

1. Clone this repo:
```bash
git clone https://github.com/Fundacion-de-Neurociencias/GeneForgeLang.git
cd GeneForgeLang
```

2. Install dependencies:
```bash
pip install -r requirements.txt
```

3. Launch the interface:
```bash
python app.py
```

4. Open [http://127.0.0.1:7860](http://127.0.0.1:7860) in your browser.

---

## 📁 File Structure

| File                            | Description |
|---------------------------------|-------------|
| `app.py`                        | Full Gradio app (4 tabs) |
| `semillas.json`                 | Phrase-to-seed dictionary |
| `generate_from_phrase.py`       | Symbolic-to-sequence generator |
| `describe_phrase.py`            | Phrase interpreter to scientific English |
| `translate_to_geneforgelang.py` | Sequence-to-symbolic phrase translation |
| `transcoder.py`                 | Modality switcher (DNA ↔ RNA ↔ Protein) |
| `requirements.txt`              | Python dependencies |
| `README.md`                     | This file |

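
`semillas.json` (Spanish for "seeds") is described only as a phrase-to-seed dictionary, so its exact schema is not documented here. Assuming it is a flat JSON object mapping phrases to short seed sequences, a lookup could be sketched as follows. The sample entries, the function names, and the fallback seed `"M"` are all placeholders for illustration.

```python
import json

# Hypothetical contents: the real semillas.json schema and entries
# are not shown in this README, so these values are placeholders.
EXAMPLE_SEEDS_JSON = """
{
  "^p:Dom(Kin)-Mot(NLS)": "MKKL",
  "^p:Dom(Kin)": "MGSS"
}
"""


def load_seeds(text: str) -> dict[str, str]:
    """Parse JSON text and check that it is a flat phrase -> seed mapping."""
    seeds = json.loads(text)
    if not isinstance(seeds, dict):
        raise ValueError("expected a JSON object mapping phrases to seeds")
    return seeds


def seed_for(phrase: str, seeds: dict[str, str], default: str = "M") -> str:
    """Return the seed for an exact phrase match, else a fallback seed."""
    return seeds.get(phrase.strip(), default)


if __name__ == "__main__":
    seeds = load_seeds(EXAMPLE_SEEDS_JSON)
    print(seed_for("^p:Dom(Kin)-Mot(NLS)", seeds))
    print(seed_for("^p:Unknown", seeds))
```

To read the actual file instead of the inline sample, pass the result of `open("semillas.json", encoding="utf-8").read()` to `load_seeds`, provided the file really has this flat shape.
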

---

## 🧠 Developed by

**Fundación de Neurociencias**

Licensed under the MIT License

> Join us in shaping the future of symbolic bio-AI. Contributions welcome!