---
title: GeneForgeLang
emoji: 🧬
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: "3.50.2"
app_file: app.py
pinned: true
---

# 🧬 GeneForgeLang: Symbolic-to-Sequence & Cross-Modality Biomolecular Design Toolkit

**GeneForgeLang** is a symbolic and generative language for cross-modality biomolecular design. It enables unified AI-powered workflows to **design, interpret, and translate DNA, RNA, and protein sequences** using a compact, human-readable grammar.

This project provides:

- **A symbolic language** spanning all biological layers (genomic, transcriptomic, proteomic)
- **Realistic sequence generation** via AI models like ProtGPT2
- **Scientific interpretation** of symbolic phrases in natural language
- **Cross-modality transcoders** (e.g., DNA → RNA → Protein and vice versa)
- **An interactive Gradio-based UI** for easy use and integration

---

## 🚀 Key Features

| Module                     | Description |
|----------------------------|-------------|
| 🧠 Phrase → Sequence       | Generate DNA, RNA, or protein from symbolic design |
| 🔁 Transcode Phrases       | Translate GeneForgeLang phrases across modalities |
| 📖 Phrase → Description    | Generate scientific English descriptions of symbolic inputs |
| 🔄 Sequence → Phrase       | Infer functional phrases from real sequences |
| 🧬 Mutate Sequence (WIP)   | Generate variants for symbolic seeds (under development) |
| 📦 Export to FASTA (WIP)   | Save generated sequences to .fasta (to be implemented) |
| 📊 Analyze Sequence (WIP)  | Visualize amino acid composition or base content |

---

## 🧪 Example Input Phrases

```text
~d:Prom[TATA]-Exon1-Intr1-Exon2
↓
:r:Cap5'-Ex1-Ex2-UTR3'
↓
^p:Dom(Kin)-Mot(NLS)*AcK@147
```

---

## ▶️ How to Use Locally

1. Clone this repo:

   ```bash
   git clone https://github.com/Fundacion-de-Neurociencias/GeneForgeLang.git
   cd GeneForgeLang
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Launch the interface:

   ```bash
   python app.py
   ```

4. Navigate to: [http://127.0.0.1:7860](http://127.0.0.1:7860)

---

## 📁 File Structure

| File                            | Description |
|---------------------------------|-------------|
| `app.py`                        | Full Gradio app (4 tabs) |
| `semillas.json`                 | Phrase-to-seed dictionary |
| `generate_from_phrase.py`       | Symbolic-to-sequence generator |
| `describe_phrase.py`            | Phrase interpreter to scientific English |
| `translate_to_geneforgelang.py` | Sequence-to-symbolic phrase translation |
| `transcoder.py`                 | Modality switcher (DNA ↔ RNA ↔ Protein) |
| `requirements.txt`              | Python dependencies |
| `README.md`                     | This file |

---

## 🧠 Developed by

**Fundación de Neurociencias**

Licensed under the MIT License

> Join us in shaping the future of symbolic bio-AI. Contributions welcome!
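
---

To make the structure of the example input phrases more concrete, here is a minimal Python sketch that splits a phrase into its leading symbol, modality, and elements. It is not taken from the repository: the function `split_phrase`, the type `ParsedPhrase`, and the grammar reading itself (one leading symbol from `~`, `:`, `^`, whose meaning this README does not define; a modality letter `d`, `r`, or `p`; a colon; then hyphen-separated elements) are assumptions inferred only from the three example phrases above. The actual parser in this project may work differently.

```python
import re
from typing import List, NamedTuple

# Assumed phrase shape, inferred from the README examples only:
#   <leading symbol><modality letter>:<element>-<element>-...
PHRASE_RE = re.compile(r"^(?P<marker>[~:^])(?P<modality>[drp]):(?P<body>.+)$")

# Assumed mapping from modality letter to layer name.
MODALITY_NAMES = {"d": "DNA", "r": "RNA", "p": "protein"}


class ParsedPhrase(NamedTuple):
    """Hypothetical container for the parts of one phrase."""
    marker: str          # leading symbol, kept verbatim (meaning not defined here)
    modality: str        # "DNA", "RNA", or "protein"
    elements: List[str]  # hyphen-separated elements, kept verbatim


def split_phrase(phrase: str) -> ParsedPhrase:
    """Split a phrase into marker, modality name, and raw elements."""
    match = PHRASE_RE.match(phrase.strip())
    if match is None:
        raise ValueError(f"Unrecognized phrase: {phrase!r}")
    # Elements are not interpreted further; modifiers such as *AcK@147
    # stay attached to the element they follow.
    elements = match["body"].split("-")
    return ParsedPhrase(match["marker"], MODALITY_NAMES[match["modality"]], elements)


if __name__ == "__main__":
    for example in (
        "~d:Prom[TATA]-Exon1-Intr1-Exon2",
        ":r:Cap5'-Ex1-Ex2-UTR3'",
        "^p:Dom(Kin)-Mot(NLS)*AcK@147",
    ):
        print(split_phrase(example))
```

Under these assumptions, `split_phrase("^p:Dom(Kin)-Mot(NLS)*AcK@147")` reports the modality `"protein"` and the elements `['Dom(Kin)', 'Mot(NLS)*AcK@147']`. The sketch deliberately leaves each element as an opaque string, since the README does not specify how brackets, parentheses, or modifiers such as `*AcK@147` should be interpreted.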