|
|
--- |
|
|
license: apache-2.0 |
|
|
base_model: |
|
|
- openai/gpt-oss-120b |
|
|
- deepseek-ai/DeepSeek-V3.1 |
|
|
tags:
- chat-interface
- gpt-oss-120b-chat-interface
- mlx
- chat-ui
- local-ai
- gpt
- python
- pyqt5
|
|
--- |
|
|
# Model Card: GPT-OSS-120B Chat Interface |
|
|
|
|
|
|
|
|
|
|
## Model Description |
|
|
|
|
|
This is a modern, feature-rich chat interface for the GPT-OSS-120B model running on the Apple MLX framework. The interface provides a user-friendly way to interact with the 120-billion-parameter open-source language model locally on Apple Silicon hardware.
|
|
|
|
|
## Model Overview |
|
|
|
|
|
- **Model Name:** GPT-OSS-120B (4-bit quantized) |
|
|
- **Framework:** Apple MLX |
|
|
- **Interface:** PyQt5-based desktop application |
|
|
- **Hardware Requirements:** Apple Silicon with sufficient RAM (recommended: M3 Ultra with 512GB RAM) |
|
|
|
|
|
## Features |
|
|
|
|
|
- 🎨 Modern, responsive UI with PyQt5
- 💬 Real-time chat interface with message history
- ⚡ Local inference on Apple Silicon
- 📝 Markdown support with syntax highlighting (see the sketch after this list)
- 💾 Conversation export functionality
- ⚙️ Adjustable generation parameters
- 🎯 Code block detection and formatting
- 📊 Performance monitoring
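
The Markdown-with-highlighting feature can be implemented with the `markdown` package listed under Usage. A minimal sketch, assuming model output is rendered to HTML for a Qt text widget; the function name and structure are illustrative, not the actual `gpt_oss_ui.py` code, and the `codehilite` extension additionally requires Pygments:

```python
# Illustrative sketch: convert model output (Markdown) to HTML suitable
# for a QTextBrowser or similar widget. Not the actual gpt_oss_ui.py code.
import markdown


def render_message(text: str) -> str:
    """Render a chat message's Markdown, including fenced code blocks."""
    return markdown.markdown(
        text,
        extensions=["fenced_code", "codehilite"],  # codehilite needs Pygments
    )


sample = "Here is `inline code` and a list:\n\n- item one\n- item two"
print(render_message(sample))
```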
|
|
|
|
|
## UI Architecture Diagram |
|
|
|
|
|
``` |
|
|
┌───────────────────────────────────────────────────────────────────────────┐
│                                MAIN WINDOW                                │
├───────────────────────────────┬───────────────────────────────────────────┤
│          LEFT PANEL           │                 CHAT AREA                 │
│                               │                                           │
│  ┌─────────────────────────┐  │   ┌───────────────────────────────────┐   │
│  │ MODEL INFO              │  │   │ CHAT MESSAGE (User)               │   │
│  │ - Model details         │  │   │ - Avatar + timestamp              │   │
│  │ - Hardware specs        │  │   │ - Formatted content               │   │
│  │ - Performance metrics   │  │   └───────────────────────────────────┘   │
│  └─────────────────────────┘  │                                           │
│                               │   ┌───────────────────────────────────┐   │
│  ┌─────────────────────────┐  │   │ CHAT MESSAGE (Assistant)          │   │
│  │ GENERATION SETTINGS     │  │   │ - Avatar + timestamp              │   │
│  │ - Max tokens control    │  │   │ - Formatted content               │   │
│  └─────────────────────────┘  │   │ - Generation time                 │   │
│                               │   └───────────────────────────────────┘   │
│  ┌─────────────────────────┐  │                    ...                    │
│  │ CONVERSATION TOOLS      │  │                                           │
│  │ - Clear conversation    │  │   ┌───────────────────────────────────┐   │
│  │ - Export chat           │  │   │ INPUT AREA                        │   │
│  └─────────────────────────┘  │   │ - Multi-line text input           │   │
│                               │   │ - Character counter               │   │
│  ┌─────────────────────────┐  │   │ - Send button                     │   │
│  │ STATUS INDICATOR        │  │   └───────────────────────────────────┘   │
│  │ - Loading/ready state   │  │                                           │
│  └─────────────────────────┘  │                                           │
└───────────────────────────────┴───────────────────────────────────────────┘
|
|
``` |
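
The same layout, reduced to a minimal PyQt5 skeleton. This is a sketch of the structure shown in the diagram, not the actual `gpt_oss_ui.py` source; the widget choices and labels are illustrative:

```python
# Sketch of the two-panel layout: a splitter with a settings/tools panel
# on the left and the message history plus input area on the right.
import sys

from PyQt5.QtCore import Qt
from PyQt5.QtWidgets import (
    QApplication, QLabel, QListWidget, QMainWindow,
    QPushButton, QSplitter, QTextEdit, QVBoxLayout, QWidget,
)


class ChatWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("GPT-OSS-120B Chat Interface")

        splitter = QSplitter(Qt.Horizontal)

        # Left panel: model info, generation settings, tools, status.
        left = QWidget()
        left_layout = QVBoxLayout(left)
        left_layout.addWidget(QLabel("MODEL INFO"))
        left_layout.addWidget(QLabel("GENERATION SETTINGS"))
        left_layout.addWidget(QPushButton("Clear conversation"))
        left_layout.addWidget(QPushButton("Export chat"))
        left_layout.addStretch()
        left_layout.addWidget(QLabel("Status: loading..."))

        # Chat area: message history, multi-line input, send button.
        chat = QWidget()
        chat_layout = QVBoxLayout(chat)
        chat_layout.addWidget(QListWidget())  # message history
        chat_layout.addWidget(QTextEdit())    # multi-line text input
        chat_layout.addWidget(QPushButton("Send"))

        splitter.addWidget(left)
        splitter.addWidget(chat)
        self.setCentralWidget(splitter)


if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = ChatWindow()
    window.show()
    sys.exit(app.exec_())
```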
|
|
|
|
|
## Development |
|
|
|
|
|
This interface was developed using PyQt5 and integrates with the MLX-LM library for efficient inference on Apple Silicon. The UI features a responsive design with: |
|
|
|
|
|
1. **Threaded Operations:** Model loading and text generation run in background threads (see the sketch after this list)
|
|
2. **Custom Widgets:** Specialized chat message widgets with formatting |
|
|
3. **Syntax Highlighting:** Code detection and highlighting in responses |
|
|
4. **Modern Styling:** Clean, professional interface with appropriate spacing and colors |
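
A sketch of the threaded-generation pattern from point 1. The class and signal names are hypothetical; the actual worker in `gpt_oss_ui.py` may be structured differently:

```python
# Hypothetical worker illustrating point 1: generation runs on a QThread
# so the UI stays responsive, and results come back via signals.
from PyQt5.QtCore import QThread, pyqtSignal


class GenerationWorker(QThread):
    finished = pyqtSignal(str)  # emitted with the generated text
    failed = pyqtSignal(str)    # emitted with an error message

    def __init__(self, model, tokenizer, prompt, max_tokens=512):
        super().__init__()
        self.model = model
        self.tokenizer = tokenizer
        self.prompt = prompt
        self.max_tokens = max_tokens

    def run(self):
        try:
            from mlx_lm import generate
            text = generate(
                self.model,
                self.tokenizer,
                prompt=self.prompt,
                max_tokens=self.max_tokens,
            )
            self.finished.emit(text)
        except Exception as exc:  # report errors back to the UI thread
            self.failed.emit(str(exc))
```

The main window would connect `finished` to a slot that appends an assistant message, and start the worker from the send button's handler.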
|
|
|
|
|
## DeepSeek Involvement |
|
|
|
|
|
The `gpt_oss_ui.py` Python script was created with assistance from DeepSeek's AI models, which helped design the architecture, implement the PyQt5 interface components, and ensure proper integration with the MLX inference backend. |
|
|
|
|
|
## Usage |
|
|
|
|
|
1. Install requirements: `pip install PyQt5 markdown mlx-lm` |
|
|
2. Run the application: `python gpt_oss_ui.py` |
|
|
3. Wait for the model to load (the first run will download the model weights)
|
|
4. Start chatting with the GPT-OSS-120B model |
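
Under the hood, the interface boils down to a few lines of MLX-LM. A minimal sketch; the 4-bit repo id below is an assumption, so substitute the MLX conversion you actually use:

```python
# Minimal MLX-LM usage behind the interface. The repo id is an
# assumption; point it at the 4-bit MLX conversion you actually run.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/gpt-oss-120b-4bit")  # assumed repo id
response = generate(
    model,
    tokenizer,
    prompt="Explain 4-bit quantization in one paragraph.",
    max_tokens=256,
)
print(response)
```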
|
|
|
|
|
## Performance |
|
|
|
|
|
On an M3 Ultra with 512GB RAM: |
|
|
- Model load time: ~2-3 minutes (first run)
|
|
- Inference speed: ~95 tokens/second |
|
|
- Memory usage: Optimized with 4-bit quantization |
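
A tokens/second figure like the one above can be reproduced with a simple timing measurement. A sketch, assuming the same (assumed) repo id as in Usage and that the tokenizer exposes the usual `encode` method:

```python
# Rough throughput measurement: time one generation and divide the
# generated token count by wall-clock time. Repo id is an assumption.
import time

from mlx_lm import load, generate

model, tokenizer = load("mlx-community/gpt-oss-120b-4bit")  # assumed repo id

start = time.perf_counter()
text = generate(model, tokenizer, prompt="Describe Apple Silicon.", max_tokens=200)
elapsed = time.perf_counter() - start

n_tokens = len(tokenizer.encode(text))
print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.1f} tokens/s")
```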
|
|
|
|
|
## Limitations |
|
|
|
|
|
- Requires significant RAM for the 120B parameter model |
|
|
- Currently only supports Apple Silicon hardware |
|
|
- Model loading can be time-consuming on first run |
|
|
|
|
|
## Ethical Considerations |
|
|
|
|
|
This interface is designed for local use, ensuring privacy as all processing happens on-device. Users should still follow responsible AI practices when using the model. |
|
|