---
title: GeneForgeLang
emoji: 🧬
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: "3.50.2"
app_file: app.py
pinned: true
---
# 🧬 GeneForgeLang: Symbolic-to-Sequence & Cross-Modality Biomolecular Design Toolkit
**GeneForgeLang** is a symbolic and generative language for cross-modality biomolecular design.
It enables unified AI-powered workflows to **design, interpret and translate DNA, RNA, and protein sequences** using a compact, human-readable grammar.
This project provides:
- **A symbolic language** spanning all biological layers (genomic, transcriptomic, proteomic)
- **Realistic sequence generation** via AI models like ProtGPT2
- **Scientific interpretation** of symbolic phrases in natural language
- **Cross-modality transcoders** (e.g., DNA → RNA → Protein and vice versa)
- **An interactive Gradio-based UI** for easy use and integration
---
## 🚀 Key Features
| Module | Description |
|----------------------------|-------------|
| 🧠 Phrase → Sequence | Generate DNA, RNA, or protein from symbolic design |
| 🔁 Transcode Phrases | Translate GeneForgeLang phrases across modalities |
| 📖 Phrase → Description | Generate scientific English descriptions of symbolic inputs |
| 🔄 Sequence → Phrase | Infer functional phrases from real sequences |
| 🧬 Mutate Sequence (WIP) | Generate variants for symbolic seeds (under development) |
| 📦 Export to FASTA (WIP) | Save generated sequences to .fasta (to be implemented) |
| 📊 Analyze Sequence (WIP) | Visualize amino acid composition or base content |
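
The two WIP rows for FASTA export and sequence analysis can be illustrated with a minimal sketch. This is not the project's implementation: the function names `to_fasta` and `composition` are hypothetical, and the 60-character line width is an assumed convention, not something the repository specifies.

```python
from collections import Counter


def to_fasta(header: str, sequence: str, width: int = 60) -> str:
    """Format one record as FASTA text: a '>' header line, then wrapped sequence lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{header}\n" + "\n".join(lines) + "\n"


def composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each base or residue in the sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {symbol: count / total for symbol, count in sorted(counts.items())}


# Hypothetical usage: write one generated sequence to disk.
# with open("design.fasta", "w") as handle:
#     handle.write(to_fasta("design_1", "ATGGCCTAA"))
```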
---
## 🧪 Example Input Phrases
```text
~d:Prom[TATA]-Exon1-Intr1-Exon2
↓
:r:Cap5'-Ex1-Ex2-UTR3'
↓
^p:Dom(Kin)-Mot(NLS)*AcK@147
```
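
To show how phrases like these might be handled in code, here is a rough sketch that splits a phrase into a modality and its elements. The grammar is inferred only from the three examples above: it assumes an optional one-character sigil, then `d`, `r`, or `p` (taken to mean DNA, RNA, and protein), then `:`, then `-`-separated elements. The name `split_phrase` is hypothetical and the project's own parser may work differently.

```python
import re

# Assumed mapping from the modality letter to a layer name.
MODALITIES = {"d": "DNA", "r": "RNA", "p": "protein"}

# Optional leading sigil (~, :, ^), one modality letter, a colon, then the body.
_PHRASE = re.compile(r"^\W?([drp]):(.+)$")


def split_phrase(phrase: str) -> tuple[str, list[str]]:
    """Split a phrase into (modality, elements) under the assumed grammar."""
    match = _PHRASE.match(phrase.strip())
    if match is None:
        raise ValueError(f"unrecognised phrase: {phrase!r}")
    tag, body = match.groups()
    # Modifiers such as '*AcK@147' are left attached to their element.
    return MODALITIES[tag], body.split("-")


print(split_phrase("~d:Prom[TATA]-Exon1-Intr1-Exon2"))
```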
---
## ▶️ How to Use Locally
1. Clone this repo:
```bash
git clone https://github.com/Fundacion-de-Neurociencias/GeneForgeLang.git
cd GeneForgeLang
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Launch the interface:
```bash
python app.py
```
4. Open [http://127.0.0.1:7860](http://127.0.0.1:7860) in your browser.
---
## 📁 File Structure
| File | Description |
|------------------------------|-------------|
| `app.py` | Full Gradio app (4 tabs) |
| `semillas.json` | Phrase-to-seed dictionary |
| `generate_from_phrase.py` | Symbolic-to-sequence generator |
| `describe_phrase.py` | Phrase interpreter to scientific English |
| `translate_to_geneforgelang.py` | Sequence-to-symbolic phrase translation |
| `transcoder.py` | Modality switcher (DNA ↔ RNA ↔ Protein) |
| `requirements.txt` | Python dependencies |
| `README.md` | This file |
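
For readers unfamiliar with the DNA → RNA → Protein direction that `transcoder.py` mirrors at the phrase level, the sketch below shows the same flow on raw sequences. It is an illustration only and does not reproduce `transcoder.py`, which operates on GeneForgeLang phrases. The names `transcribe` and `translate` are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in UCAG order (first base slowest).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Convert a coding-strand DNA sequence to mRNA by replacing T with U."""
    return dna.upper().replace("T", "U")


def translate(rna: str) -> str:
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        residue = CODON_TABLE[rna[i:i + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


rna = transcribe("ATGGCCTAA")
print(rna, translate(rna))
```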
---
## 🧠 Developed by
**Fundación de Neurociencias**
Licensed under the MIT License
> Join us in shaping the future of symbolic bio-AI. Contributions welcome!