---
title: Vision Llm Agent
emoji: π
colorFrom: blue
colorTo: blue
sdk: docker
pinned: false
license: gpl-3.0
---
# Vision LLM Agent - Object Detection with AI Assistant
A multi-model object detection and image classification demo with LLM-based AI assistant for answering questions about detected objects. This project uses YOLOv8, DETR, and ViT models for vision tasks, and TinyLlama for natural language processing. The application includes a secure login system to protect access to the AI features.
## Project Architecture
This project follows a phased development approach:
### Phase 0: PoC with Gradio (Original)
- Simple Gradio interface with multiple object detection models
- Uses Hugging Face's free tier for model hosting
- Easy to deploy to Hugging Face Spaces
### Phase 1: Service Separation (Implemented)
- Backend: Flask API with model inference endpoints
- REST API endpoints for model inference
- JSON responses with detection results and performance metrics
### Phase 2: UI Upgrade (Implemented)
- Modern React frontend with Material-UI components
- Improved user experience with responsive design
- Separate frontend and backend architecture
### Phase 3: CI/CD & Testing (Planned)
- GitHub Actions for automated testing and deployment
- Comprehensive test suite with pytest and ESLint
- Automatic rebuilds on Hugging Face Spaces
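A workflow along these lines could drive the planned CI: one job for the Python backend tests, one for linting the React frontend. This is an illustrative sketch only; the file name, job names, and tool versions are assumptions, not part of the current repository.

```yaml
# .github/workflows/ci.yml -- illustrative sketch for Phase 3
name: CI
on: [push, pull_request]

jobs:
  backend-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt pytest
      - run: pytest

  frontend-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
        working-directory: frontend
      - run: npx eslint src
        working-directory: frontend
```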
## How to Run
### Option 1: Original Gradio App
1. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Run the Gradio app:

   ```bash
   python app.py
   ```

3. Open your browser and go to the URL shown in the terminal (typically http://127.0.0.1:7860).
### Option 2: React Frontend with Flask Backend
1. Install backend dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Start the Flask backend server:

   ```bash
   python api.py
   ```

3. In a separate terminal, navigate to the frontend directory and install frontend dependencies:

   ```bash
   cd frontend
   npm install
   ```

4. Start the React development server:

   ```bash
   npm start
   ```

5. Open your browser and go to http://localhost:3000.
## Models Used
- YOLOv8: Fast and accurate object detection
- DETR: DEtection TRansformer for object detection
- ViT: Vision Transformer for image classification
- TinyLlama: For natural language processing and question answering about detected objects
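One straightforward way for a backend to route a request to the right model is a dispatch table keyed by model name. The sketch below uses standard-library Python with stub functions standing in for the real YOLOv8/DETR/ViT inference calls; all names and the response shape are illustrative, not the project's actual API.

```python
# Illustrative dispatch-table sketch: maps a model key (as it might appear
# in an endpoint path) to an inference callable. The stubs below stand in
# for real YOLOv8 / DETR / ViT calls.
def detect_yolo(image_bytes: bytes) -> dict:
    """Stub for YOLOv8 object detection."""
    return {"model": "yolov8", "task": "detection", "detections": []}

def detect_detr(image_bytes: bytes) -> dict:
    """Stub for DETR object detection."""
    return {"model": "detr", "task": "detection", "detections": []}

def classify_vit(image_bytes: bytes) -> dict:
    """Stub for ViT image classification."""
    return {"model": "vit", "task": "classification", "labels": []}

MODEL_DISPATCH = {
    "yolo": detect_yolo,
    "detr": detect_detr,
    "vit": classify_vit,
}

def run_inference(model_key: str, image_bytes: bytes) -> dict:
    """Look up the handler for a model key and run it on the image."""
    try:
        handler = MODEL_DISPATCH[model_key]
    except KeyError:
        raise ValueError(f"unknown model: {model_key!r}")
    return handler(image_bytes)
```

A dispatch table like this keeps the per-model endpoints thin: each route only has to pick a key and return the handler's JSON.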
## Authentication
The application includes a secure login system to protect access to all features:
Default Credentials:
- Username: admin / Password: admin123
- Username: user / Password: user123
Login Process:
- All routes and API endpoints are protected with Flask-Login
- Users must authenticate before accessing any features
- Session management handles login state persistence
Security Features:
- Password protection for all API endpoints and UI pages
- Session-based authentication with secure cookies
- Configurable secret key via environment variables
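Flask signs its session cookies with the app's secret key (via the itsdangerous library under the hood). The standard-library sketch below shows the underlying idea of a signed, tamper-evident session token; the key name, payload, and token format are simplified illustrations, not Flask's actual wire format.

```python
import base64
import hashlib
import hmac
import json
import os

# Secret key, configurable via environment variable as in the app.
SECRET_KEY = os.environ.get("SECRET_KEY", "dev-only-secret")

def sign_session(payload: dict, key: str = SECRET_KEY) -> str:
    """Serialize a session payload and append an HMAC-SHA256 signature."""
    body = base64.urlsafe_b64encode(json.dumps(payload).encode())
    sig = hmac.new(key.encode(), body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{sig}"

def verify_session(token: str, key: str = SECRET_KEY):
    """Return the payload if the signature is valid, otherwise None."""
    try:
        body, sig = token.rsplit(".", 1)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(key.encode(), body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered or wrong key
    return json.loads(base64.urlsafe_b64decode(body))
```

Because the signature depends on the secret key, a client can read its own cookie but cannot forge or alter one without the server rejecting it.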
## API Endpoints
The Flask backend provides the following API endpoints (all require authentication):
- `GET /api/status` - Check the status of the API and available models
- `POST /api/detect/yolo` - Detect objects using YOLOv8
- `POST /api/detect/detr` - Detect objects using DETR
- `POST /api/classify/vit` - Classify images using ViT
- `POST /api/analyze` - Analyze images with the LLM assistant
- `POST /api/similar-images` - Find similar images in the vector database
- `POST /api/add-to-collection` - Add images to the vector database
- `POST /api/add-detected-objects` - Add detected objects to the vector database
- `POST /api/search-similar-objects` - Search for similar objects in the vector database
All POST endpoints accept form data with an 'image' field containing the image file.
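For reference, this is roughly what a `multipart/form-data` body with an `image` file field looks like on the wire. The standard-library sketch below builds one by hand; in practice a client library such as `requests` does this for you, and the endpoint URL in the comment is a local-development assumption.

```python
import uuid

def build_multipart(field: str, filename: str, data: bytes):
    """Build a multipart/form-data body containing one file field.

    Returns (body, content_type). The field name ("image" for this API)
    and filename go into the Content-Disposition header of the part.
    """
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + data + f"\r\n--{boundary}--\r\n".encode()
    return body, f"multipart/form-data; boundary={boundary}"

# Usage sketch (not executed here): pass body and content_type as the data
# and Content-Type header of a urllib.request.Request to an endpoint such
# as http://localhost:3000/api/detect/yolo (URL is an assumption).
```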
## Deployment
### Gradio App
The Gradio app is designed to be easily deployed to Hugging Face Spaces:
- Create a new Space on Hugging Face
- Select Gradio as the SDK
- Push this repository to the Space's git repository
- The app will automatically deploy
### React + Flask App
For the React + Flask version, you'll need to:
1. Build the React frontend:

   ```bash
   cd frontend
   npm run build
   ```

2. Serve the static files from a web server or cloud hosting service.
3. Deploy the Flask backend to a server that supports Python.
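Since the Space metadata declares `sdk: docker`, both steps can also be packaged into a single image with a multi-stage build. The Dockerfile below is an illustrative sketch only: the paths, Node/Python versions, and the assumption that `api.py` serves the built frontend on port 7860 (the port Hugging Face Spaces expects by default) may need adjusting for this repository.

```dockerfile
# Illustrative multi-stage Dockerfile sketch (versions and paths are assumptions)

# Stage 1: build the React frontend
FROM node:20 AS frontend
WORKDIR /app/frontend
COPY frontend/ .
RUN npm ci && npm run build

# Stage 2: Python runtime with the Flask backend
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
COPY --from=frontend /app/frontend/build ./frontend/build

EXPOSE 7860
CMD ["python", "api.py"]
```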