| <p style="font-size:70px; font-weight:bold; text-align:center;"> | |
| Image Data Extractor | |
| </p> | |
| <hr> | |
| # Overview: | |
| The **Image Data Extractor** is a Python-based tool that extracts text from images of visiting cards with **PaddleOCR** and structures it into key fields: name, designation, contact number, address, and company name. The **Mistral 7B model** performs the structuring; if it is unavailable, the system falls back to the **GLiNER urchade/gliner_medium-v2.1** model. | |
| Both the **Mistral 7B** and **GLiNER urchade/gliner_medium-v2.1** models are used under the **Apache 2.0 license**. | |
| --- | |
| # Installation Guide: | |
| 1. **Create and Activate a Virtual Environment** | |
| ```bash | |
| python -m venv venv | |
| source venv/bin/activate # For Linux/Mac | |
| # or | |
| venv\Scripts\activate # For Windows | |
| ``` | |
| 2. **Install Required Libraries** | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| 3. **Set up Hugging Face Token** | |
| - Add your Hugging Face token to the `.env` file before starting the application: | |
| ```bash | |
| HF_TOKEN=<your_huggingface_token> | |
| ``` | |
| 4. **Run the Application** | |
| - With Docker: | |
| ```bash | |
| docker-compose up --build | |
| ``` | |
| - Without Docker: | |
| ```bash | |
| python app.py | |
| ``` | |
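How the application reads `HF_TOKEN` is not shown in this README. As a minimal sketch, assuming the token is stored as a plain `KEY=VALUE` line, it could be loaded with the standard library alone. The helpers `load_env_file` and `get_hf_token` are hypothetical names, and the project may well use a package such as `python-dotenv` instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_hf_token(path=".env"):
    """Prefer an exported HF_TOKEN, then fall back to the .env file."""
    token = os.environ.get("HF_TOKEN") or load_env_file(path).get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it to .env or export it")
    return token
```

Failing early with a clear message when the token is missing tends to be easier to debug than a later authentication error from the model call.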
| --- | |
| # File Structure Overview: | |
| ``` | |
| ImageDataExtractor/ | |
| │ | |
| ├── app.py                  # Main Flask app | |
| ├── requirements.txt        # Dependencies | |
| ├── Dockerfile              # Docker container setup | |
| ├── docker-compose.yml      # Docker Compose setup | |
| │ | |
| ├── utility/ | |
| │   └── utils.py            # PaddleOCR integration, Image preprocessing and Mistral model processing | |
| │ | |
| ├── template/ | |
| │   ├── index.html          # UI for image uploads | |
| │   └── result.html         # Display extracted results | |
| │ | |
| ├── Backup/ | |
| │   ├── modules/            # Base classes for data processing models | |
| │   │   ├── base.py | |
| │   │   ├── data_proc.py | |
| │   │   ├── evaluator.py | |
| │   │   ├── layers.py | |
| │   │   ├── run_evaluation.py | |
| │   │   ├── span_rep.py | |
| │   │   └── token_rep.py | |
| │   ├── backup.py           # Backup handling Gliner Model integration and backup logic | |
| │   ├── model.py | |
| │   ├── save_load.py | |
| │   └── train.py | |
| │ | |
| └── .env                    # Environment variables (includes Hugging Face token) | |
| ``` | |
| --- | |
| # Program Overview: | |
| ### PaddleOCR Integration (utility/utils.py): | |
| - **Text Extraction**: **PaddleOCR** extracts text from visiting-card images in PNG, JPG, or JPEG format. | |
| - **Preprocessing**: Basic image preprocessing is applied before OCR to improve text recognition. | |
| ### Mistral 7B Integration (utility/utils.py): | |
| - **Data Structuring**: After text extraction, the **Mistral 7B model** processes the extracted data, structuring it into fields such as name, designation, contact number, address, and company name. | |
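The README does not say how the model's reply is turned into fields. One common approach, assumed here purely for illustration, is to prompt the model for JSON and then pull the JSON object out of the reply text. The function name `parse_model_reply` and the field keys are hypothetical.

```python
import json

# Field keys are assumptions that mirror the fields listed in this README.
FIELDS = ("name", "designation", "contact_number", "address", "company_name")


def parse_model_reply(reply):
    """Extract the JSON object from a model reply and keep only the known fields."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model reply")
    data = json.loads(reply[start:end + 1])
    # Fields the model did not return are reported as None.
    return {field: data.get(field) for field in FIELDS}
```

Slicing between the first `{` and the last `}` tolerates replies that wrap the JSON in extra prose, though it will still fail on malformed JSON, which is a reasonable point to hand over to a fallback.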
| ### Fallback Mechanism (Backup/backup.py): | |
| - **GLiNER urchade/gliner_medium-v2.1 Model**: If the Mistral model is unavailable, the system uses the **GLiNER urchade/gliner_medium-v2.1 model** to perform the same structuring task, so the service keeps running. | |
| - **Error Handling**: Catches failures when the Mistral model is unavailable and routes the request to the fallback model. | |
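The fallback behaviour described above can be sketched as a small wrapper that tries the primary model and switches to the backup when the primary raises. This is an illustration of the pattern, not the code in `Backup/backup.py`; `extract_fields`, `primary`, and `fallback` are hypothetical names, and both callables are assumed to take the OCR text and return a dict of fields.

```python
import logging

logger = logging.getLogger(__name__)


def extract_fields(text, primary, fallback):
    """Run `primary` on the OCR text; if it raises, run `fallback` instead."""
    try:
        return primary(text)
    except Exception as exc:
        # Broad on purpose: any failure of the primary model triggers the fallback.
        logger.warning("Primary model failed (%s); using fallback", exc)
        return fallback(text)
```

Passing the two models in as callables keeps the wrapper independent of either library, so each path can be tested with simple stub functions.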
| ### Web Interface (app.py): | |
| - **Flask API**: Provides endpoints for image uploads and displays the results in a structured manner. | |
| - **HTML Interface**: A frontend for users to upload images of visiting cards and view the parsed results. | |
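Since only PNG, JPG, and JPEG inputs are mentioned, an upload handler would typically reject other file types before running OCR. A minimal sketch of such a check, using a hypothetical helper `is_allowed_image` that is not taken from `app.py`:

```python
from pathlib import Path

# Extensions taken from the input formats listed in this README.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def is_allowed_image(filename):
    """Return True when the filename ends in a supported image extension."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS
```

The check is case-insensitive, so `card.PNG` is accepted. It only inspects the name, so it does not prove the file really is an image.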
| --- | |
| # Tree Map of the Program: | |
| ``` | |
| app.py | |
| ├── Handles Flask API and web interface | |
| ├── Manages file upload | |
| ├── Extracts text with PaddleOCR | |
| ├── Processes text with Mistral 7B | |
| └── Displays structured results | |
| utility/utils.py | |
| ├── PaddleOCR for text extraction | |
| └── Mistral 7B for data structuring | |
| Backup/backup.py | |
| ├── GLiNER urchade/gliner_medium-v2.1 as fallback | |
| └── Backup and error handling | |
| ``` | |
| --- | |
| # Licensing: | |
| - **Mistral 7B model** is used under the [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0). | |
| - **GLiNER urchade/gliner_medium-v2.1 model** is used under the [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0). | |
| --- | |
| # Main Task: | |
| The primary objective is to extract and structure data from visiting cards. The system identifies and organizes: | |
| - **Name** | |
| - **Designation** | |
| - **Phone Number** | |
| - **Address** | |
| - **Company Name** | |
| --- | |
| # References: | |
| - [PaddleOCR Documentation](https://github.com/PaddlePaddle/PaddleOCR) | |
| - [Mistral 7B Documentation](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3/blob/main/README.md) | |
| - [GLiNER urchade/gliner_medium-v2.1 Documentation](https://huggingface.co/urchade/gliner_medium-v2.1/blob/main/README.md) | |
| - [Flask Documentation](https://flask.palletsprojects.com/) | |
| - [Docker Documentation](https://docs.docker.com/) | |
| - [Virtual Environments in Python](https://docs.python.org/3/tutorial/venv.html) | |
| --- |