|
|
--- |
|
|
license: apache-2.0 |
|
|
base_model: |
|
|
- openai/gpt-oss-120b |
|
|
- deepseek-ai/DeepSeek-V3.1 |
|
|
tags:
- chat-interface
- gpt-oss-120b-chat-interface
- mlx
- chat-ui
- local-ai
- gpt
- python
- pyqt5
|
|
--- |
|
|
# Model Card: GPT-OSS-120B Chat Interface |
|
|
|
|
|
|
|
|
|
|
## Model Description |
|
|
|
|
|
This is a modern, feature-rich chat interface for the GPT-OSS-120B model running on the Apple MLX framework. The interface provides a user-friendly way to interact with the 120-billion-parameter open-source language model locally on Apple Silicon hardware.
|
|
|
|
|
## Model Overview |
|
|
|
|
|
- **Model Name:** GPT-OSS-120B (4-bit quantized) |
|
|
- **Framework:** Apple MLX |
|
|
- **Interface:** PyQt5-based desktop application |
|
|
- **Hardware Requirements:** Apple Silicon with sufficient RAM (recommended: M3 Ultra with 512GB RAM) |
|
|
|
|
|
## Features |
|
|
|
|
|
- 🎨 Modern, responsive UI with PyQt5
- 💬 Real-time chat interface with message history
- ⚡ Local inference on Apple Silicon
- 📝 Markdown support with syntax highlighting (see the sketch after this list)
- 💾 Conversation export functionality
- ⚙️ Adjustable generation parameters
- 🎯 Code block detection and formatting
- 📊 Performance monitoring
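
The Markdown-with-highlighting feature can be implemented with the `markdown` package listed under Usage. A minimal sketch, assuming model output is rendered to HTML for a Qt text widget; the function name and structure are illustrative, not the actual `gpt_oss_ui.py` code, and the `codehilite` extension additionally requires Pygments:

```python
# Illustrative sketch: convert model output (Markdown) to HTML suitable
# for a QTextBrowser or similar widget. Not the actual gpt_oss_ui.py code.
import markdown


def render_message(text: str) -> str:
    """Render a chat message's Markdown, including fenced code blocks."""
    return markdown.markdown(
        text,
        extensions=["fenced_code", "codehilite"],  # codehilite needs Pygments
    )


sample = "Here is `inline code` and a list:\n\n- item one\n- item two"
print(render_message(sample))
```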
|
|
|
|
|
## UI Architecture Diagram |
|
|
|
|
|
``` |
|
|
┌───────────────────────────────────────────────────────────────────────────┐
│                                MAIN WINDOW                                │
├───────────────────────────────┬───────────────────────────────────────────┤
│          LEFT PANEL           │                 CHAT AREA                 │
│                               │                                           │
│  ┌─────────────────────────┐  │   ┌───────────────────────────────────┐   │
│  │ MODEL INFO              │  │   │ CHAT MESSAGE (User)               │   │
│  │ - Model details         │  │   │ - Avatar + timestamp              │   │
│  │ - Hardware specs        │  │   │ - Formatted content               │   │
│  │ - Performance metrics   │  │   └───────────────────────────────────┘   │
│  └─────────────────────────┘  │                                           │
│                               │   ┌───────────────────────────────────┐   │
│  ┌─────────────────────────┐  │   │ CHAT MESSAGE (Assistant)          │   │
│  │ GENERATION SETTINGS     │  │   │ - Avatar + timestamp              │   │
│  │ - Max tokens control    │  │   │ - Formatted content               │   │
│  └─────────────────────────┘  │   │ - Generation time                 │   │
│                               │   └───────────────────────────────────┘   │
│  ┌─────────────────────────┐  │                    ...                    │
│  │ CONVERSATION TOOLS      │  │                                           │
│  │ - Clear conversation    │  │   ┌───────────────────────────────────┐   │
│  │ - Export chat           │  │   │ INPUT AREA                        │   │
│  └─────────────────────────┘  │   │ - Multi-line text input           │   │
│                               │   │ - Character counter               │   │
│  ┌─────────────────────────┐  │   │ - Send button                     │   │
│  │ STATUS INDICATOR        │  │   └───────────────────────────────────┘   │
│  │ - Loading/ready state   │  │                                           │
│  └─────────────────────────┘  │                                           │
└───────────────────────────────┴───────────────────────────────────────────┘
|
|
``` |
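
The same layout, reduced to a minimal PyQt5 skeleton. This is a sketch of the structure shown in the diagram, not the actual `gpt_oss_ui.py` source; the widget choices and labels are illustrative:

```python
# Sketch of the two-panel layout: a splitter with a settings/tools panel
# on the left and the message history plus input area on the right.
import sys

from PyQt5.QtCore import Qt
from PyQt5.QtWidgets import (
    QApplication, QLabel, QListWidget, QMainWindow,
    QPushButton, QSplitter, QTextEdit, QVBoxLayout, QWidget,
)


class ChatWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("GPT-OSS-120B Chat Interface")

        splitter = QSplitter(Qt.Horizontal)

        # Left panel: model info, generation settings, tools, status.
        left = QWidget()
        left_layout = QVBoxLayout(left)
        left_layout.addWidget(QLabel("MODEL INFO"))
        left_layout.addWidget(QLabel("GENERATION SETTINGS"))
        left_layout.addWidget(QPushButton("Clear conversation"))
        left_layout.addWidget(QPushButton("Export chat"))
        left_layout.addStretch()
        left_layout.addWidget(QLabel("Status: loading..."))

        # Chat area: message history, multi-line input, send button.
        chat = QWidget()
        chat_layout = QVBoxLayout(chat)
        chat_layout.addWidget(QListWidget())  # message history
        chat_layout.addWidget(QTextEdit())    # multi-line text input
        chat_layout.addWidget(QPushButton("Send"))

        splitter.addWidget(left)
        splitter.addWidget(chat)
        self.setCentralWidget(splitter)


if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = ChatWindow()
    window.show()
    sys.exit(app.exec_())
```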
|
|
|
|
|
## Development |
|
|
|
|
|
This interface was developed using PyQt5 and integrates with the MLX-LM library for efficient inference on Apple Silicon. The UI features a responsive design with: |
|
|
|
|
|
1. **Threaded Operations:** Model loading and text generation run in background threads (see the sketch after this list)
|
|
2. **Custom Widgets:** Specialized chat message widgets with formatting |
|
|
3. **Syntax Highlighting:** Code detection and highlighting in responses |
|
|
4. **Modern Styling:** Clean, professional interface with appropriate spacing and colors |
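
A sketch of the threaded-generation pattern from point 1. The class and signal names are hypothetical; the actual worker in `gpt_oss_ui.py` may be structured differently:

```python
# Hypothetical worker illustrating point 1: generation runs on a QThread
# so the UI stays responsive, and results come back via signals.
from PyQt5.QtCore import QThread, pyqtSignal


class GenerationWorker(QThread):
    finished = pyqtSignal(str)  # emitted with the generated text
    failed = pyqtSignal(str)    # emitted with an error message

    def __init__(self, model, tokenizer, prompt, max_tokens=512):
        super().__init__()
        self.model = model
        self.tokenizer = tokenizer
        self.prompt = prompt
        self.max_tokens = max_tokens

    def run(self):
        try:
            from mlx_lm import generate
            text = generate(
                self.model,
                self.tokenizer,
                prompt=self.prompt,
                max_tokens=self.max_tokens,
            )
            self.finished.emit(text)
        except Exception as exc:  # report errors back to the UI thread
            self.failed.emit(str(exc))
```

The main window would connect `finished` to a slot that appends an assistant message, and start the worker from the send button's handler.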
|
|
|
|
|
## DeepSeek Involvement |
|
|
|
|
|
The `gpt_oss_ui.py` Python script was created with assistance from DeepSeek's AI models, which helped design the architecture, implement the PyQt5 interface components, and ensure proper integration with the MLX inference backend. |
|
|
|
|
|
## Usage |
|
|
|
|
|
1. Install requirements: `pip install PyQt5 markdown mlx-lm` |
|
|
2. Run the application: `python gpt_oss_ui.py` |
|
|
3. Wait for the model to load (the first run will download the model weights)
|
|
4. Start chatting with the GPT-OSS-120B model |
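
Under the hood, the interface boils down to a few lines of MLX-LM. A minimal sketch; the 4-bit repo id below is an assumption, so substitute the MLX conversion you actually use:

```python
# Minimal MLX-LM usage behind the interface. The repo id is an
# assumption; point it at the 4-bit MLX conversion you actually run.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/gpt-oss-120b-4bit")  # assumed repo id
response = generate(
    model,
    tokenizer,
    prompt="Explain 4-bit quantization in one paragraph.",
    max_tokens=256,
)
print(response)
```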
|
|
|
|
|
## Performance |
|
|
|
|
|
On an M3 Ultra with 512GB RAM: |
|
|
- Model load time: ~2-3 minutes (first run)
|
|
- Inference speed: ~95 tokens/second |
|
|
- Memory usage: Optimized with 4-bit quantization |
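
A tokens/second figure like the one above can be reproduced with a simple timing measurement. A sketch, assuming the same (assumed) repo id as in Usage and that the tokenizer exposes the usual `encode` method:

```python
# Rough throughput measurement: time one generation and divide the
# generated token count by wall-clock time. Repo id is an assumption.
import time

from mlx_lm import load, generate

model, tokenizer = load("mlx-community/gpt-oss-120b-4bit")  # assumed repo id

start = time.perf_counter()
text = generate(model, tokenizer, prompt="Describe Apple Silicon.", max_tokens=200)
elapsed = time.perf_counter() - start

n_tokens = len(tokenizer.encode(text))
print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.1f} tokens/s")
```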
|
|
|
|
|
## Limitations |
|
|
|
|
|
- Requires significant RAM for the 120B parameter model |
|
|
- Currently only supports Apple Silicon hardware |
|
|
- Model loading can be time-consuming on first run |
|
|
|
|
|
## Ethical Considerations |
|
|
|
|
|
This interface is designed for local use, ensuring privacy as all processing happens on-device. Users should still follow responsible AI practices when using the model. |
|
|