Spaces:

Anshika-0909
/

Visual-Product-Matcher


VesperAI committed 6 days ago

Commit

8348919

1 Parent(s): 15deac4

addede a Production Branch

Files changed (17)

.gitignore +7 -0
Dockerfile +13 -0
LICENSE +373 -0
README.md +339 -8
app.py +328 -0
folder_manager.py +145 -0
image_database.py +303 -0
image_indexer.py +384 -0
image_search.py +272 -0
pyproject.toml +6 -0
qdrant_singleton.py +148 -0
requirements-test.txt +3 -0
requirements.txt +17 -0
static/image.png +0 -0
static/js/script.js +546 -0
templates/index.html +530 -0
tests/test_qdrant_singleton.py +120 -0

.gitignore ADDED

	@@ -0,0 +1,7 @@

+data
+qdrant_data
+config
+__pycache__
+*.db
+.venv
+.env

Dockerfile ADDED

	@@ -0,0 +1,13 @@

+FROM python:3.9-slim
+WORKDIR /code
+COPY ./requirements.txt /code/requirements.txt
+RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
+COPY . /code
+EXPOSE 7860
+CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]

LICENSE ADDED

	@@ -0,0 +1,373 @@

+Mozilla Public License Version 2.0
+==================================
+1. Definitions
+--------------
+1.1. "Contributor"
+    means each individual or legal entity that creates, contributes to
+    the creation of, or owns Covered Software.
+1.2. "Contributor Version"
+    means the combination of the Contributions of others (if any) used
+    by a Contributor and that particular Contributor's Contribution.
+1.3. "Contribution"
+    means Covered Software of a particular Contributor.
+1.4. "Covered Software"
+    means Source Code Form to which the initial Contributor has attached
+    the notice in Exhibit A, the Executable Form of such Source Code
+    Form, and Modifications of such Source Code Form, in each case
+    including portions thereof.
+1.5. "Incompatible With Secondary Licenses"
+    means
+    (a) that the initial Contributor has attached the notice described
+        in Exhibit B to the Covered Software; or
+    (b) that the Covered Software was made available under the terms of
+        version 1.1 or earlier of the License, but not also under the
+        terms of a Secondary License.
+1.6. "Executable Form"
+    means any form of the work other than Source Code Form.
+1.7. "Larger Work"
+    means a work that combines Covered Software with other material, in
+    a separate file or files, that is not Covered Software.
+1.8. "License"
+    means this document.
+1.9. "Licensable"
+    means having the right to grant, to the maximum extent possible,
+    whether at the time of the initial grant or subsequently, any and
+    all of the rights conveyed by this License.
+1.10. "Modifications"
+    means any of the following:
+    (a) any file in Source Code Form that results from an addition to,
+        deletion from, or modification of the contents of Covered
+        Software; or
+    (b) any new file in Source Code Form that contains any Covered
+        Software.
+1.11. "Patent Claims" of a Contributor
+    means any patent claim(s), including without limitation, method,
+    process, and apparatus claims, in any patent Licensable by such
+    Contributor that would be infringed, but for the grant of the
+    License, by the making, using, selling, offering for sale, having
+    made, import, or transfer of either its Contributions or its
+    Contributor Version.
+1.12. "Secondary License"
+    means either the GNU General Public License, Version 2.0, the GNU
+    Lesser General Public License, Version 2.1, the GNU Affero General
+    Public License, Version 3.0, or any later versions of those
+    licenses.
+1.13. "Source Code Form"
+    means the form of the work preferred for making modifications.
+1.14. "You" (or "Your")
+    means an individual or a legal entity exercising rights under this
+    License. For legal entities, "You" includes any entity that
+    controls, is controlled by, or is under common control with You. For
+    purposes of this definition, "control" means (a) the power, direct
+    or indirect, to cause the direction or management of such entity,
+    whether by contract or otherwise, or (b) ownership of more than
+    fifty percent (50%) of the outstanding shares or beneficial
+    ownership of such entity.
+2. License Grants and Conditions
+--------------------------------
+2.1. Grants
+Each Contributor hereby grants You a world-wide, royalty-free,
+non-exclusive license:
+(a) under intellectual property rights (other than patent or trademark)
+    Licensable by such Contributor to use, reproduce, make available,
+    modify, display, perform, distribute, and otherwise exploit its
+    Contributions, either on an unmodified basis, with Modifications, or
+    as part of a Larger Work; and
+(b) under Patent Claims of such Contributor to make, use, sell, offer
+    for sale, have made, import, and otherwise transfer either its
+    Contributions or its Contributor Version.
+2.2. Effective Date
+The licenses granted in Section 2.1 with respect to any Contribution
+become effective for each Contribution on the date the Contributor first
+distributes such Contribution.
+2.3. Limitations on Grant Scope
+The licenses granted in this Section 2 are the only rights granted under
+this License. No additional rights or licenses will be implied from the
+distribution or licensing of Covered Software under this License.
+Notwithstanding Section 2.1(b) above, no patent license is granted by a
+Contributor:
+(a) for any code that a Contributor has removed from Covered Software;
+    or
+(b) for infringements caused by: (i) Your and any other third party's
+    modifications of Covered Software, or (ii) the combination of its
+    Contributions with other software (except as part of its Contributor
+    Version); or
+(c) under Patent Claims infringed by Covered Software in the absence of
+    its Contributions.
+This License does not grant any rights in the trademarks, service marks,
+or logos of any Contributor (except as may be necessary to comply with
+the notice requirements in Section 3.4).
+2.4. Subsequent Licenses
+No Contributor makes additional grants as a result of Your choice to
+distribute the Covered Software under a subsequent version of this
+License (see Section 10.2) or under the terms of a Secondary License (if
+permitted under the terms of Section 3.3).
+2.5. Representation
+Each Contributor represents that the Contributor believes its
+Contributions are its original creation(s) or it has sufficient rights
+to grant the rights to its Contributions conveyed by this License.
+2.6. Fair Use
+This License is not intended to limit any rights You have under
+applicable copyright doctrines of fair use, fair dealing, or other
+equivalents.
+2.7. Conditions
+Sections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted
+in Section 2.1.
+3. Responsibilities
+-------------------
+3.1. Distribution of Source Form
+All distribution of Covered Software in Source Code Form, including any
+Modifications that You create or to which You contribute, must be under
+the terms of this License. You must inform recipients that the Source
+Code Form of the Covered Software is governed by the terms of this
+License, and how they can obtain a copy of this License. You may not
+attempt to alter or restrict the recipients' rights in the Source Code
+Form.
+3.2. Distribution of Executable Form
+If You distribute Covered Software in Executable Form then:
+(a) such Covered Software must also be made available in Source Code
+    Form, as described in Section 3.1, and You must inform recipients of
+    the Executable Form how they can obtain a copy of such Source Code
+    Form by reasonable means in a timely manner, at a charge no more
+    than the cost of distribution to the recipient; and
+(b) You may distribute such Executable Form under the terms of this
+    License, or sublicense it under different terms, provided that the
+    license for the Executable Form does not attempt to limit or alter
+    the recipients' rights in the Source Code Form under this License.
+3.3. Distribution of a Larger Work
+You may create and distribute a Larger Work under terms of Your choice,
+provided that You also comply with the requirements of this License for
+the Covered Software. If the Larger Work is a combination of Covered
+Software with a work governed by one or more Secondary Licenses, and the
+Covered Software is not Incompatible With Secondary Licenses, this
+License permits You to additionally distribute such Covered Software
+under the terms of such Secondary License(s), so that the recipient of
+the Larger Work may, at their option, further distribute the Covered
+Software under the terms of either this License or such Secondary
+License(s).
+3.4. Notices
+You may not remove or alter the substance of any license notices
+(including copyright notices, patent notices, disclaimers of warranty,
+or limitations of liability) contained within the Source Code Form of
+the Covered Software, except that You may alter any license notices to
+the extent required to remedy known factual inaccuracies.
+3.5. Application of Additional Terms
+You may choose to offer, and to charge a fee for, warranty, support,
+indemnity or liability obligations to one or more recipients of Covered
+Software. However, You may do so only on Your own behalf, and not on
+behalf of any Contributor. You must make it absolutely clear that any
+such warranty, support, indemnity, or liability obligation is offered by
+You alone, and You hereby agree to indemnify every Contributor for any
+liability incurred by such Contributor as a result of warranty, support,
+indemnity or liability terms You offer. You may include additional
+disclaimers of warranty and limitations of liability specific to any
+jurisdiction.
+4. Inability to Comply Due to Statute or Regulation
+---------------------------------------------------
+If it is impossible for You to comply with any of the terms of this
+License with respect to some or all of the Covered Software due to
+statute, judicial order, or regulation then You must: (a) comply with
+the terms of this License to the maximum extent possible; and (b)
+describe the limitations and the code they affect. Such description must
+be placed in a text file included with all distributions of the Covered
+Software under this License. Except to the extent prohibited by statute
+or regulation, such description must be sufficiently detailed for a
+recipient of ordinary skill to be able to understand it.
+5. Termination
+--------------
+5.1. The rights granted under this License will terminate automatically
+if You fail to comply with any of its terms. However, if You become
+compliant, then the rights granted under this License from a particular
+Contributor are reinstated (a) provisionally, unless and until such
+Contributor explicitly and finally terminates Your grants, and (b) on an
+ongoing basis, if such Contributor fails to notify You of the
+non-compliance by some reasonable means prior to 60 days after You have
+come back into compliance. Moreover, Your grants from a particular
+Contributor are reinstated on an ongoing basis if such Contributor
+notifies You of the non-compliance by some reasonable means, this is the
+first time You have received notice of non-compliance with this License
+from such Contributor, and You become compliant prior to 30 days after
+Your receipt of the notice.
+5.2. If You initiate litigation against any entity by asserting a patent
+infringement claim (excluding declaratory judgment actions,
+counter-claims, and cross-claims) alleging that a Contributor Version
+directly or indirectly infringes any patent, then the rights granted to
+You by any and all Contributors for the Covered Software under Section
+2.1 of this License shall terminate.
+5.3. In the event of termination under Sections 5.1 or 5.2 above, all
+end user license agreements (excluding distributors and resellers) which
+have been validly granted by You or Your distributors under this License
+prior to termination shall survive termination.
+************************************************************************
+*                                                                      *
+*  6. Disclaimer of Warranty                                           *
+*  -------------------------                                           *
+*                                                                      *
+*  Covered Software is provided under this License on an "as is"       *
+*  basis, without warranty of any kind, either expressed, implied, or  *
+*  statutory, including, without limitation, warranties that the       *
+*  Covered Software is free of defects, merchantable, fit for a        *
+*  particular purpose or non-infringing. The entire risk as to the     *
+*  quality and performance of the Covered Software is with You.        *
+*  Should any Covered Software prove defective in any respect, You     *
+*  (not any Contributor) assume the cost of any necessary servicing,   *
+*  repair, or correction. This disclaimer of warranty constitutes an   *
+*  essential part of this License. No use of any Covered Software is   *
+*  authorized under this License except under this disclaimer.         *
+*                                                                      *
+************************************************************************
+************************************************************************
+*                                                                      *
+*  7. Limitation of Liability                                          *
+*  --------------------------                                          *
+*                                                                      *
+*  Under no circumstances and under no legal theory, whether tort      *
+*  (including negligence), contract, or otherwise, shall any           *
+*  Contributor, or anyone who distributes Covered Software as          *
+*  permitted above, be liable to You for any direct, indirect,         *
+*  special, incidental, or consequential damages of any character      *
+*  including, without limitation, damages for lost profits, loss of    *
+*  goodwill, work stoppage, computer failure or malfunction, or any    *
+*  and all other commercial damages or losses, even if such party      *
+*  shall have been informed of the possibility of such damages. This   *
+*  limitation of liability shall not apply to liability for death or   *
+*  personal injury resulting from such party's negligence to the       *
+*  extent applicable law prohibits such limitation. Some               *
+*  jurisdictions do not allow the exclusion or limitation of           *
+*  incidental or consequential damages, so this exclusion and          *
+*  limitation may not apply to You.                                    *
+*                                                                      *
+************************************************************************
+8. Litigation
+-------------
+Any litigation relating to this License may be brought only in the
+courts of a jurisdiction where the defendant maintains its principal
+place of business and such litigation shall be governed by laws of that
+jurisdiction, without reference to its conflict-of-law provisions.
+Nothing in this Section shall prevent a party's ability to bring
+cross-claims or counter-claims.
+9. Miscellaneous
+----------------
+This License represents the complete agreement concerning the subject
+matter hereof. If any provision of this License is held to be
+unenforceable, such provision shall be reformed only to the extent
+necessary to make it enforceable. Any law or regulation which provides
+that the language of a contract shall be construed against the drafter
+shall not be used to construe this License against a Contributor.
+10. Versions of the License
+---------------------------
+10.1. New Versions
+Mozilla Foundation is the license steward. Except as provided in Section
+10.3, no one other than the license steward has the right to modify or
+publish new versions of this License. Each version will be given a
+distinguishing version number.
+10.2. Effect of New Versions
+You may distribute the Covered Software under the terms of the version
+of the License under which You originally received the Covered Software,
+or under the terms of any subsequent version published by the license
+steward.
+10.3. Modified Versions
+If you create software not governed by this License, and you want to
+create a new license for such software, you may create and use a
+modified version of this License if you rename the license and remove
+any references to the name of the license steward (except to note that
+such modified license differs from this License).
+10.4. Distributing Source Code Form that is Incompatible With Secondary
+Licenses
+If You choose to distribute Source Code Form that is Incompatible With
+Secondary Licenses under the terms of this version of the License, the
+notice described in Exhibit B of this License must be attached.
+Exhibit A - Source Code Form License Notice
+-------------------------------------------
+  This Source Code Form is subject to the terms of the Mozilla Public
+  License, v. 2.0. If a copy of the MPL was not distributed with this
+  file, You can obtain one at http://mozilla.org/MPL/2.0/.
+If it is not possible or desirable to put the notice in a particular
+file, then You may include the notice in a location (such as a LICENSE
+file in a relevant directory) where a recipient would be likely to look
+for such a notice.
+You may add additional accurate notices of copyright ownership.
+Exhibit B - "Incompatible With Secondary Licenses" Notice
+---------------------------------------------------------
+  This Source Code Form is "Incompatible With Secondary Licenses", as
+  defined by the Mozilla Public License, v. 2.0.

README.md CHANGED

@@ -1,10 +1,341 @@
 ---
-title: Visual Product Matcher
-emoji: 🐠
-colorFrom: pink
-colorTo: purple
-sdk: docker
-pinned: false
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# Visual Product Search 🔍
+A visual search engine for product discovery. It combines CLIP (Contrastive Language-Image Pre-Training) with the Qdrant vector database to enable semantic search across image collections, which suits e-commerce, inventory management, and content discovery.
+## 🌟 Key Features
+- 🎯 **Multi-Modal Search**: Search using text descriptions, uploaded images, or image URLs
+- 🖼️ **Smart Indexing**: Automatically indexes and monitors image folders with real-time updates
+- 🔍 **Semantic Understanding**: Uses OpenAI's CLIP model for deep image-text comprehension
+- **Similarity Scoring**: Provides percentage-based similarity scores for accurate results
+- ⚡ **Real-time Processing**: WebSocket-powered live progress updates during indexing
+- 🎨 **Modern UI**: Clean, responsive interface with advanced search capabilities
+- 🌐 **URL Support**: Direct image search from web URLs
+- 📱 **Mobile Responsive**: Works seamlessly across all devices
+## 🧠 Technical Approach & Solution
+### Problem Statement
+Traditional image search relies on metadata and filenames, which often fail to capture the actual visual content. Users struggle to find specific products or images without knowing exact file names or having perfect tagging systems.
+### Our Solution Architecture
+#### 1. **Multi-Modal Embedding Generation**
+```
+Text Query → CLIP Text Encoder → 512D Vector
+Image Input → CLIP Vision Encoder → 512D Vector
+URL Image → Download → CLIP Vision Encoder → 512D Vector
+```
+#### 2. **Vector Similarity Search**
+- **Database**: Qdrant cloud vector database for scalable similarity search
+- **Indexing**: Real-time folder monitoring with automatic embedding generation
+- **Storage**: Hybrid approach - embeddings in Qdrant, metadata in SQLite
+#### 3. **Semantic Matching Pipeline**
+```
+User Input → Feature Extraction → Vector Search → Similarity Ranking → Results
+```
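The "Vector Search → Similarity Ranking" stages of the pipeline above can be illustrated with a small, dependency-free sketch. It is not the project's implementation: the README states that Qdrant performs the search, and neither the distance metric nor the percentage mapping is shown in this commit. Cosine similarity, the clamp-and-scale percentage, the function names, and the three-dimensional toy vectors (standing in for 512D CLIP embeddings) are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k entries as (name, percentage), best match first.

    The percentage is an assumed mapping: negative cosine values are
    clamped to 0 and the result is scaled to the 0-100 range.
    """
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(name, round(max(score, 0.0) * 100, 2)) for name, score in scored[:top_k]]


# Toy 3D vectors standing in for 512D CLIP embeddings (hypothetical data).
toy_index = {
    "red_car.jpg": [0.9, 0.1, 0.0],
    "blue_dress.jpg": [0.1, 0.9, 0.1],
    "kitchen.jpg": [0.0, 0.2, 0.9],
}
results = rank_by_similarity([1.0, 0.0, 0.0], toy_index, top_k=2)
```

A brute-force scan like this is linear in the collection size; a vector database is used precisely to avoid that cost on large collections.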
+### Architecture Components
+#### Backend (FastAPI)
+- **Image Processing**: PIL + CLIP for feature extraction
+- **Vector Operations**: Qdrant client for similarity search
+- **File Management**: Automatic folder monitoring and indexing
+- **API Endpoints**: RESTful APIs for all search operations
+#### Frontend (Modern Web UI)
+- **Framework**: Vanilla JavaScript with Bootstrap 5
+- **Styling**: Custom CSS with modern design principles
+- **Real-time Updates**: WebSocket connections for live progress
+- **Responsive Design**: Mobile-first approach
+#### Database Layer
+- **Vector Storage**: Qdrant cloud for embeddings and similarity search
+- **Metadata Storage**: SQLite for image metadata and file information
+- **Caching**: Thumbnail generation and caching for performance
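The hybrid storage described above (embeddings in Qdrant, metadata in SQLite) implies a lookup step that joins vector-search hits with their metadata rows. The sketch below shows one way that step might look, using only the standard library. The table layout, column names, and function names are hypothetical; the actual schema lives in `image_database.py`, whose contents are not shown here.

```python
import sqlite3
from typing import Dict, List, Tuple


def create_metadata_store() -> sqlite3.Connection:
    """Create an in-memory SQLite database with a hypothetical images table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images ("
        "id TEXT PRIMARY KEY, filename TEXT NOT NULL, folder TEXT NOT NULL)"
    )
    return conn


def attach_metadata(
    conn: sqlite3.Connection, hits: List[Tuple[str, float]]
) -> List[Dict[str, object]]:
    """Combine (image_id, score) hits from a vector search with metadata rows.

    Hits whose id has no metadata row are skipped, which covers the case
    where the two stores have drifted out of sync.
    """
    enriched = []
    for image_id, score in hits:
        row = conn.execute(
            "SELECT filename, folder FROM images WHERE id = ?", (image_id,)
        ).fetchone()
        if row is None:
            continue
        enriched.append(
            {"id": image_id, "filename": row[0], "folder": row[1], "similarity": score}
        )
    return enriched


conn = create_metadata_store()
conn.execute("INSERT INTO images VALUES (?, ?, ?)", ("img-1", "red_car.jpg", "cars"))
# "img-missing" simulates a vector hit without a matching metadata row.
enriched = attach_metadata(conn, [("img-1", 97.5), ("img-missing", 80.0)])
```

Keeping only ids and vectors in the vector store, and everything else in SQLite, means metadata can change without re-embedding images, at the cost of one extra lookup per hit.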
+## 🚀 Quick Start
+### Prerequisites
+- Python 3.8+
+- CUDA-compatible GPU (optional, recommended for performance)
+- Qdrant Cloud account (free tier available)
+### Installation
+1. **Clone the repository**:
+```bash
+git clone https://github.com/itsfuad/SnapSeek
+cd SnapSeek
+```
+2. **Create virtual environment**:
+```bash
+python -m venv venv
+source venv/bin/activate  # Windows: venv\Scripts\activate
+```
+3. **Install dependencies**:
+```bash
+pip install -r requirements.txt
+```
+4. **Configure environment**:
+Create a `.env` file:
+```env
+QDRANT_API_KEY=your_qdrant_api_key
+QDRANT_URL=your_qdrant_cluster_url
+```
+5. **Launch the application**:
+```bash
+python app.py
+```
+6. **Access the interface**:
+Open http://localhost:8000 in your browser
+## 🎯 Usage Guide
+### 1. **Index Your Images**
+- Click "Add Folder" to select image directories
+- Watch real-time indexing progress
+- Images are automatically monitored for changes
+### 2. **Search Methods**
+#### Text Search
+```
+"red sports car"
+"woman wearing blue dress"
+"modern kitchen design"
+```
+#### Image Upload Search
+- Click the image icon
+- Upload a reference image
+- Get visually similar results
+#### URL Search
+- Click the link icon
+- Paste any image URL
+- Find similar images in your collection
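The text and URL searches described above correspond to the `GET /search/text` and `GET /search/url` routes added in `app.py` in this commit, which take `query` or `url` plus an optional `folder` parameter. A minimal sketch of building those request URLs follows. The helper names are hypothetical, and the base URL is an assumption taken from the README's `http://localhost:8000`; the Dockerfile in this commit serves on port 7860 instead, so adjust it to match the deployment.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: local development address from the README.
BASE_URL = "http://localhost:8000"


def text_search_url(
    query: str, folder: Optional[str] = None, base_url: str = BASE_URL
) -> str:
    """Build the URL for GET /search/text, omitting folder when not given."""
    params = {"query": query}
    if folder is not None:
        params["folder"] = folder
    return f"{base_url}/search/text?{urlencode(params)}"


def url_search_url(
    image_url: str, folder: Optional[str] = None, base_url: str = BASE_URL
) -> str:
    """Build the URL for GET /search/url; the image URL is percent-encoded."""
    params = {"url": image_url}
    if folder is not None:
        params["folder"] = folder
    return f"{base_url}/search/url?{urlencode(params)}"


text_url = text_search_url("red sports car")
image_search = url_search_url("https://example.com/a.png", folder="cars")
```

The resulting strings can be passed to any HTTP client. Encoding the parameters with `urlencode` matters most for the URL search, where the image address itself contains `:` and `/` characters.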
+### 3. **Results & Insights**
+- Similarity percentages for each match
+- High-resolution image previews
+- Metadata and file information
+## 🏭 Production Deployment
+### Recommended Platforms
+#### 1. **Railway (Recommended)**
+- **Why**: Best for AI/ML applications with generous free tier
+- **Resources**: 512MB RAM, 1GB storage
+- **Benefits**: No sleep mode, automatic GitHub deployments
+```dockerfile
+# Dockerfile
+FROM python:3.9-slim
+WORKDIR /app
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+COPY . .
+EXPOSE 8000
+CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
+```
+#### 2. **Render**
+- **Resources**: 512MB RAM, 1GB storage
+- **Benefits**: Free SSL, auto-deploy, no cold starts
+#### 3. **Fly.io**
+- **Resources**: 256MB RAM, 3GB storage volume
+- **Benefits**: Global edge deployment, persistent volumes
+### Environment Variables for Production
+```env
+QDRANT_API_KEY=your_production_key
+QDRANT_URL=your_production_cluster
+PORT=8000
+DATA_DIR=/app/data
+```
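One way the four variables listed above might be read and validated at startup is sketched below. This is an illustration only: the `Settings` class, `load_settings` function, and the fallback values for `PORT` and `DATA_DIR` are assumptions, not code from the repository. It also reads the process environment directly; how the project loads its `.env` file is not shown in this commit.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    qdrant_url: str
    qdrant_api_key: str
    port: int
    data_dir: Path


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment mapping, failing fast on missing keys."""
    required = ("QDRANT_URL", "QDRANT_API_KEY")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    return Settings(
        qdrant_url=env["QDRANT_URL"],
        qdrant_api_key=env["QDRANT_API_KEY"],
        # Assumed defaults for the optional variables.
        port=int(env.get("PORT", "8000")),
        data_dir=Path(env.get("DATA_DIR", "data")),
    )


# Passing an explicit mapping keeps the example independent of the real environment.
example = load_settings(
    {"QDRANT_URL": "https://example.invalid", "QDRANT_API_KEY": "placeholder"}
)
```

Failing at startup with a clear message is usually easier to diagnose than a connection error raised later by the first search request.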
+## 🛠️ Development & Testing
+### Project Structure
+```
+SnapSeek/
+├── app.py                 # FastAPI application
+├── image_indexer.py       # Image processing and indexing
+├── image_search.py        # Search logic and CLIP integration
+├── image_database.py      # Database operations
+├── folder_manager.py      # Folder monitoring and management
+├── qdrant_singleton.py    # Qdrant client management
+├── requirements.txt       # Dependencies
+├── .env                   # Environment configuration
+├── templates/
+│   └── index.html        # Main UI template
+├── static/
+│   ├── js/
+│   │   └── script.js     # Frontend JavaScript
+│   └── image.png         # Application icon
+├── config/
+│   └── folders.json      # Folder configuration
+└── tests/
+    └── test_*.py         # Test files
+```
+### Running Tests
+```bash
+pip install -r requirements-test.txt
+pytest tests/ -v
+```
+### Development Setup
+```bash
+# Install development dependencies
+pip install -r requirements-test.txt
+# Run with auto-reload
+uvicorn app:app --reload --host 0.0.0.0 --port 8000
+```
+## 🔧 Performance Optimization
+### Model Selection
+```python
+# Finer 16x16 patches: typically more accurate, but slower per image
+MODEL_NAME = "openai/clip-vit-base-patch16"
+# Coarser 32x32 patches: faster inference, usually somewhat less accurate
+MODEL_NAME = "openai/clip-vit-base-patch32"
+```
+### Hardware Recommendations
+- **CPU**: 4+ cores for concurrent processing
+- **RAM**: 8GB+ for model loading and image processing
+- **Storage**: SSD recommended for faster I/O
+- **GPU**: Optional, CUDA-compatible for faster inference
+### Scaling Considerations
+- **Batch Processing**: Process multiple images simultaneously
+- **Caching**: Implement Redis for frequent queries
+- **Load Balancing**: Use multiple instances for high traffic
+- **Database Sharding**: Split collections by categories
+## 🐛 Troubleshooting
+### Common Issues
+#### 1. **Model Loading Errors**
+```bash
+# Clear cache and reinstall
+pip uninstall torch torchvision transformers
+pip install torch torchvision transformers --no-cache-dir
+```
+#### 2. **Qdrant Connection Issues**
+- Verify API key and URL in `.env`
+- Check network connectivity
+- Ensure Qdrant cluster is active
+#### 3. **Memory Issues**
+- Reduce batch size in processing
+- Use CPU-only mode: `device="cpu"`
+- Close unused applications
+#### 4. **Slow Performance**
+- Enable GPU acceleration
+- Optimize image sizes
+- Implement result caching
+### Performance Monitoring
+```python
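+# Add logging for performance tracking
+import time
+import logging
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger(__name__)
+# Time search operations (await is only valid inside an async function)
+async def timed_text_search(searcher, query):
+    """Run a text search and log how long it took."""
+    start_time = time.perf_counter()
+    results = await searcher.search_by_text(query)
+    logger.info(f"Search completed in {time.perf_counter() - start_time:.2f}s")
+    return results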
+```
+## 🤝 Contributing
+1. Fork the repository
+2. Create a feature branch: `git checkout -b feature-name`
+3. Make your changes and add tests
+4. Run tests: `pytest tests/`
+5. Commit changes: `git commit -m "Add feature"`
+6. Push to branch: `git push origin feature-name`
+7. Create a Pull Request
+### Code Standards
+- Follow PEP 8 style guidelines
+- Add docstrings to all functions
+- Include type hints where appropriate
+- Write tests for new features
+## 📊 Use Cases & Applications
+### E-commerce
+- Product recommendation systems
+- Visual search for online stores
+- Inventory management
+- Duplicate product detection
+### Content Management
+- Digital asset organization
+- Stock photo searching
+- Brand consistency checking
+- Content moderation
+### Research & Education
+- Academic image databases
+- Scientific data analysis
+- Historical archive searches
+- Educational content discovery
+## 🔮 Future Enhancements
+- [ ] **Multi-language Support**: Extend text search to multiple languages
+- [ ] **Advanced Filters**: Add size, color, and metadata filters
+- [ ] **Batch Operations**: Upload and search multiple images at once
+- [ ] **API Integration**: RESTful API for external applications
+- [ ] **Machine Learning**: Custom fine-tuned models for specific domains
+- [ ] **Analytics Dashboard**: Search metrics and usage statistics
+- [ ] **Mobile App**: Native mobile applications
+- [ ] **Cloud Storage**: Integration with AWS S3, Google Drive, etc.
+## 📄 License
+This project is licensed under the Mozilla Public License 2.0 - see the [LICENSE](LICENSE) file for details.
+## 🙏 Acknowledgments
+- **OpenAI**: For the CLIP model and research
+- **Qdrant**: For the excellent vector database
+- **FastAPI**: For the modern web framework
+- **Transformers**: For the model implementation
+- **Bootstrap**: For the UI components
+## 📞 Support & Contact
+- **Issues**: [GitHub Issues](https://github.com/itsfuad/SnapSeek/issues)
+- **Discussions**: [GitHub Discussions](https://github.com/itsfuad/SnapSeek/discussions)
+- **Documentation**: [Wiki](https://github.com/itsfuad/SnapSeek/wiki)
 ---
+**Made with ❤️ by [itsfuad](https://github.com/itsfuad)**
+*Revolutionizing visual search with AI technology*

app.py ADDED

	@@ -0,0 +1,328 @@

+import os
+from pathlib import Path
+from typing import List, Optional
+import io
+from contextlib import asynccontextmanager
+from fastapi import FastAPI, File, UploadFile, Request, WebSocket, WebSocketDisconnect, HTTPException, BackgroundTasks
+from fastapi.responses import HTMLResponse, FileResponse, StreamingResponse
+from fastapi.staticfiles import StaticFiles
+from fastapi.templating import Jinja2Templates
+from PIL import Image
+from image_indexer import ImageIndexer
+from image_search import ImageSearch
+from image_database import ImageDatabase
+# Initialize image indexer, searcher, and database
+indexer = ImageIndexer()
+searcher = ImageSearch()
+image_db = ImageDatabase()
+image_extensions = [".jpg", ".jpeg", ".png", ".gif"]
+@asynccontextmanager
+async def lifespan(_: FastAPI):
+    """Initialize the image indexer"""
+    yield
+app = FastAPI(title="Visual Product Search", lifespan=lifespan)
+# Setup templates and static files
+templates = Jinja2Templates(directory="templates")
+app.mount("/static", StaticFiles(directory="static"), name="static")
+@app.get("/", response_class=HTMLResponse)
+async def home(request: Request):
+    """Render the home page"""
+    folders = indexer.folder_manager.get_all_folders()
+    return templates.TemplateResponse(
+        "index.html",
+        {
+            "request": request,
+            "initial_status": {
+                "status": indexer.status.value,
+                "current_file": indexer.current_file,
+                "total_files": indexer.total_files,
+                "processed_files": indexer.processed_files,
+                "progress_percentage": round((indexer.processed_files / indexer.total_files * 100) if indexer.total_files > 0 else 0, 2)
+            },
+            "folders": folders
+        }
+    )
+@app.post("/folders")
+async def add_folder(folder_path: str, background_tasks: BackgroundTasks):
+    """Add a new folder to index"""
+    try:
+        # Add folder to manager first (this creates the collection)
+        folder_info = indexer.folder_manager.add_folder(folder_path)
+        # Start indexing in the background
+        background_tasks.add_task(indexer.index_folder, folder_path)
+        return folder_info
+    except Exception as e:
+        raise HTTPException(status_code=400, detail=str(e)) from e
+@app.delete("/folders/{folder_path:path}")
+async def remove_folder(folder_path: str):
+    """Remove a folder from indexing"""
+    try:
+        await indexer.remove_folder(folder_path)
+        return {"status": "success"}
+    except Exception as e:
+        raise HTTPException(status_code=400, detail=str(e)) from e
+@app.get("/folders")
+async def list_folders():
+    """List all indexed folders"""
+    return indexer.folder_manager.get_all_folders()
+@app.get("/search/text")
+async def search_by_text(query: str, folder: Optional[str] = None) -> List[dict]:
+    """Search images by text query, optionally filtered by folder"""
+    results = await searcher.search_by_text(query, folder)
+    return results
+@app.post("/search/image")
+async def search_by_image(
+    file: UploadFile = File(...),
+    folder: Optional[str] = None
+) -> List[dict]:
+    """Search images by uploading a similar image, optionally filtered by folder"""
+    contents = await file.read()
+    image = Image.open(io.BytesIO(contents))
+    results = await searcher.search_by_image(image, folder)
+    return results
+@app.get("/search/url")
+async def search_by_url(
+    url: str,
+    folder: Optional[str] = None
+) -> List[dict]:
+    """Search images by providing a URL to a similar image, optionally filtered by folder"""
+    results = await searcher.search_by_url(url, folder)
+    return results
+@app.get("/images")
+async def list_images(folder: Optional[str] = None) -> List[dict]:
+    """List all indexed images, optionally filtered by folder"""
+    return await indexer.get_all_images(folder)
+@app.websocket("/ws")
+async def websocket_endpoint(websocket: WebSocket):
+    """WebSocket endpoint for real-time indexing status updates"""
+    await indexer.add_websocket_connection(websocket)
+    try:
+        while True:
+            await websocket.receive_text()
+    except WebSocketDisconnect:
+        await indexer.remove_websocket_connection(websocket)
+@app.get("/image/{image_id}")
+async def serve_image(image_id: str):
+    """Serve an image from the database by ID"""
+    try:
+        image_data = image_db.get_image(image_id)
+        if not image_data:
+            raise HTTPException(status_code=404, detail="Image not found")
+        return StreamingResponse(
+            io.BytesIO(image_data["image_data"]),
+            media_type="image/jpeg",  # store_image() re-encodes every image as JPEG before saving
+            headers={
+                "Cache-Control": "max-age=86400",  # Cache for 24 hours
+                "Content-Disposition": f"inline; filename=\"{image_data['filename']}\""
+            }
+        )
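+    except HTTPException:
+        raise  # Keep the 404 raised above instead of rewrapping it as a 500
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e)) from e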
+@app.get("/thumbnail/{image_id}")
+async def serve_thumbnail_by_id(image_id: str):
+    """Serve a thumbnail from the database by ID"""
+    try:
+        thumbnail_data = image_db.get_thumbnail(image_id)
+        if not thumbnail_data:
+            raise HTTPException(status_code=404, detail="Thumbnail not found")
+        return StreamingResponse(
+            io.BytesIO(thumbnail_data),
+            media_type="image/jpeg",
+            headers={"Cache-Control": "max-age=86400"}  # Cache for 24 hours
+        )
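+    except HTTPException:
+        raise  # Keep the 404 raised above instead of rewrapping it as a 500
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e)) from e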
+@app.get("/stats")
+async def get_database_stats():
+    """Get database statistics"""
+    try:
+        return image_db.get_database_stats()
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+@app.get("/debug/collections")
+async def debug_collections():
+    """Debug endpoint to check collections and folders"""
+    try:
+        # Get Qdrant client and collections
+        qdrant_client = indexer.qdrant
+        collections = qdrant_client.get_collections().collections
+        # Get folder manager status
+        folders = indexer.folder_manager.get_all_folders()
+        return {
+            "qdrant_collections": [col.name for col in collections],
+            "folder_manager_folders": folders,
+            "collections_count": len(collections),
+            "folders_count": len(folders)
+        }
+    except Exception as e:
+        return {"error": str(e)}
+# Keep the old endpoints for backward compatibility but mark as deprecated
+@app.get("/thumbnail/{folder_path:path}/{file_path:path}")
+async def serve_thumbnail(folder_path: str, file_path: str):
+    """Serve resized image thumbnails (DEPRECATED - use /thumbnail/{image_id} instead)"""
+    try:
+        # Get folder info to verify it's an indexed folder
+        folder_info = indexer.folder_manager.get_folder_info(folder_path)
+        if not folder_info:
+            raise HTTPException(status_code=404, detail="Folder not found")
+        # Construct full file path
+        full_path = Path(folder_path) / file_path
+        if not full_path.exists():
+            raise HTTPException(status_code=404, detail="File not found")
+        # Only serve image files
+        if full_path.suffix.lower() not in image_extensions:
+            raise HTTPException(status_code=400, detail="Invalid file type")
+        # Open image, resize, and convert to JPEG
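+        img = Image.open(full_path)
+        img.thumbnail((200, 200))  # Resize maintaining aspect ratio
+        # JPEG cannot store alpha or palette data (RGBA PNGs, GIFs), so normalise the mode first
+        if img.mode not in ("RGB", "L"):
+            img = img.convert("RGB")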
+        # Save to a byte stream
+        img_byte_arr = io.BytesIO()
+        img.save(img_byte_arr, format="JPEG")
+        img_byte_arr.seek(0)
+        return StreamingResponse(img_byte_arr, media_type="image/jpeg", headers={"Cache-Control": "max-age=3600"})  # Cache for 1 hour
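+    except HTTPException:
+        raise  # Keep the 404/400 raised above instead of rewrapping it as a 500
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e)) from e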
+@app.get("/files/{folder_path:path}/{file_path:path}")
+async def serve_file(folder_path: str, file_path: str):
+    """Serve files from indexed folders (DEPRECATED - use /image/{image_id} instead)"""
+    try:
+        # Get folder info to verify it's an indexed folder
+        folder_info = indexer.folder_manager.get_folder_info(folder_path)
+        if not folder_info:
+            raise HTTPException(status_code=404, detail="Folder not found")
+        # Construct full file path
+        full_path = Path(folder_path) / file_path
+        if not full_path.exists():
+            raise HTTPException(status_code=404, detail="File not found")
+        # Only serve image files
+        if full_path.suffix.lower() not in image_extensions:
+            raise HTTPException(status_code=400, detail="Invalid file type")
+        return FileResponse(full_path)
+    except HTTPException:
+        raise  # Keep the 404/400 raised above instead of rewrapping it as a 500
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e)) from e
+def get_windows_drives():
+    """Get available drives on Windows"""
+    from ctypes import windll
+    drives = []
+    bitmask = windll.kernel32.GetLogicalDrives()
+    for letter in range(65, 91):  # A-Z
+        if bitmask & (1 << (letter - 65)):
+            drives.append(chr(letter) + ":\\")
+    return drives
+def get_directory_item(item):
+    """Get directory item info"""
+    try:
+        is_dir = item.is_dir()
+        if is_dir or item.suffix.lower() in image_extensions:
+            return {
+                "name": item.name,
+                "path": str(item.absolute()),
+                "type": "directory" if is_dir else "file",
+                "size": item.stat().st_size if not is_dir else None
+            }
+    except Exception:
+        pass
+    return None
+def get_directory_contents(path: str):
+    """Get contents of a directory"""
+    try:
+        path_obj = Path(path)
+        if not path_obj.exists():
+            return {"error": "Path does not exist"}
+        parent = str(path_obj.parent) if path_obj.parent != path_obj else None
+        contents = [
+            item for item in (get_directory_item(i) for i in path_obj.iterdir())
+            if item is not None
+        ]
+        return {
+            "current_path": str(path_obj.absolute()),
+            "parent_path": parent,
+            "contents": sorted(contents, key=lambda x: (x["type"] != "directory", x["name"].lower()))
+        }
+    except Exception as e:
+        return {"error": str(e)}
+@app.get("/browse")
+async def browse_folders():
+    """Browse system folders"""
+    if os.name == "nt":  # Windows
+        return {"drives": get_windows_drives()}
+    return get_directory_contents("/")  # Unix-like
+@app.get("/browse/{path:path}")
+async def browse_path(path: str):
+    """Browse a specific path"""
+    try:
+        path_obj = Path(path)
+        if not path_obj.exists():
+            raise HTTPException(status_code=404, detail="Path not found")
+        # Get parent directory for navigation
+        parent = str(path_obj.parent) if path_obj.parent != path_obj else None
+        # List directories and files
+        contents = []
+        for item in path_obj.iterdir():
+            try:
+                is_dir = item.is_dir()
+                if is_dir or item.suffix.lower() in image_extensions:
+                    contents.append({
+                        "name": item.name,
+                        "path": str(item.absolute()),
+                        "type": "directory" if is_dir else "file",
+                        "size": item.stat().st_size if not is_dir else None
+                    })
+            except Exception:
+                continue  # Skip items we can't access
+        return {
+            "current_path": str(path_obj.absolute()),
+            "parent_path": parent,
+            "contents": sorted(contents, key=lambda x: (x["type"] != "directory", x["name"].lower()))
+        }
+    except HTTPException:
+        raise  # Keep the 404 raised above instead of rewrapping it as a 500
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e)) from e
+if __name__ == "__main__":
+    import uvicorn
+    uvicorn.run("app:app", host="0.0.0.0", port=8000, reload=False)
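
Several endpoints in app.py raise a 404 inside a `try` block that also has a broad `except Exception` handler. The following is a minimal sketch of how handler order affects the status code a client sees. It assumes a hypothetical `HTTPError` class standing in for `fastapi.HTTPException` so that it runs without FastAPI installed; `lookup` and `status_of` are illustrative names and not part of this commit.

```python
# Hypothetical stand-in for fastapi.HTTPException (assumption, not the real class).
class HTTPError(Exception):
    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def lookup(store: dict, key: str) -> bytes:
    """Return store[key] as bytes; missing keys map to 404, other failures to 500."""
    try:
        if key not in store:
            raise HTTPError(404, "Image not found")
        return bytes(store[key])  # raises TypeError for values that are not bytes-like
    except HTTPError:
        # Deliberate HTTP errors pass through untouched, so the 404 survives
        raise
    except Exception as e:
        # Anything unexpected is reported as a server error
        raise HTTPError(500, str(e)) from e


def status_of(store: dict, key: str) -> int:
    """Report the status code a caller would observe for this lookup."""
    try:
        lookup(store, key)
        return 200
    except HTTPError as e:
        return e.status_code
```

Without the `except HTTPError: raise` clause, the broad handler would catch the 404 and report it as a 500.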

folder_manager.py ADDED

	@@ -0,0 +1,145 @@

+from pathlib import Path
+from typing import List, Dict, Optional
+import json
+import time
+from qdrant_singleton import QdrantClientSingleton
+class FolderManager:
+    def __init__(self):
+        # Ensure config directory exists
+        self.config_dir = Path("config")
+        self.config_dir.mkdir(exist_ok=True)
+        # Ensure folders.json exists
+        self.config_file = self.config_dir / "folders.json"
+        if not self.config_file.exists():
+            self._create_default_config()
+        self.folders: Dict[str, Dict] = self._load_folders()
+    def _create_default_config(self):
+        """Create default configuration file if it doesn't exist"""
+        default_config = {}
+        with open(self.config_file, 'w') as f:
+            json.dump(default_config, f, indent=2)
+        print(f"Created default configuration file at {self.config_file}")
+    def _load_folders(self) -> Dict[str, Dict]:
+        """Load folder configurations from JSON file"""
+        if self.config_file.exists():
+            with open(self.config_file, 'r') as f:
+                return json.load(f)
+        return {}
+    def _save_folders(self):
+        """Save folder configurations to JSON file"""
+        # Ensure config directory exists before saving
+        self.config_dir.mkdir(exist_ok=True)
+        # Write config
+        with open(self.config_file, 'w') as f:
+            json.dump(self.folders, f, indent=2)
+    def add_folder(self, folder_path: str) -> Dict:
+        """Add a new folder to index"""
+        folder_path = str(Path(folder_path).absolute())
+        print(f"Adding folder: {folder_path}")
+        # Check if this folder or any parent/child is already being indexed
+        for existing_path in self.folders:
+            existing = Path(existing_path)
+            new_path = Path(folder_path)
+            # If the new path is already indexed
+            if existing == new_path:
+                print(f"Folder already indexed: {folder_path}")
+                return self.folders[existing_path]
+            # If the new path is a parent of an existing path, use the same collection
+            if existing.is_relative_to(new_path):
+                print(f"Using existing collection for parent path: {folder_path}")
+                return self.folders[existing_path]
+            # If the new path is a child of an existing path, use the parent's collection
+            if new_path.is_relative_to(existing):
+                print(f"Using parent's collection for: {folder_path}")
+                return self.folders[existing_path]
+        # If it's a completely new path, create a new entry
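+        # Pick the first unused index; len(self.folders) can collide with an
+        # existing collection name once a folder has been removed
+        used_names = {info["collection_name"] for info in self.folders.values()}
+        index = 0
+        while f"images_{index}" in used_names:
+            index += 1
+        collection_name = f"images_{index}"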
+        print(f"Creating new collection {collection_name} for folder: {folder_path}")
+        folder_info = {
+            "path": folder_path,
+            "collection_name": collection_name,
+            "added_at": int(time.time()),
+            "last_indexed": None
+        }
+        # Initialize new collection in Qdrant
+        try:
+            QdrantClientSingleton.initialize_collection(collection_name)
+            print(f"Successfully initialized collection: {collection_name}")
+        except Exception as e:
+            print(f"Error initializing collection {collection_name}: {e}")
+            raise  # Bare raise keeps the original traceback
+        # Save to config
+        self.folders[folder_path] = folder_info
+        self._save_folders()
+        print(f"Successfully added folder {folder_path} with collection {collection_name}")
+        return folder_info
+    def remove_folder(self, folder_path: str):
+        """Remove a folder from indexing"""
+        folder_path = str(Path(folder_path).absolute())
+        if folder_path in self.folders:
+            # Delete the collection
+            collection_name = self.folders[folder_path]["collection_name"]
+            client = QdrantClientSingleton.get_instance()
+            try:
+                client.delete_collection(collection_name=collection_name)
+            except Exception as e:
+                print(f"Error deleting collection: {e}")
+            # Remove from config
+            del self.folders[folder_path]
+            self._save_folders()
+    def get_folder_info(self, folder_path: str) -> Optional[Dict]:
+        """Get information about an indexed folder"""
+        folder_path = str(Path(folder_path).absolute())
+        return self.folders.get(folder_path)
+    def get_all_folders(self) -> List[Dict]:
+        """Get all indexed folders"""
+        return [
+            {
+                "path": path,
+                **info,
+                "is_valid": Path(path).exists()  # Check if folder still exists
+            }
+            for path, info in self.folders.items()
+        ]
+    def update_last_indexed(self, folder_path: str):
+        """Update the last indexed timestamp for a folder"""
+        folder_path = str(Path(folder_path).absolute())
+        if folder_path in self.folders:
+            self.folders[folder_path]["last_indexed"] = int(time.time())
+            self._save_folders()
+    def get_collection_for_path(self, folder_path: str) -> Optional[str]:
+        """Get the collection name for a given path"""
+        folder_path = Path(folder_path).absolute()
+        print(f"Looking for collection for path: {folder_path}")
+        # Check each indexed folder to find the appropriate collection
+        for path, info in self.folders.items():
+            if folder_path == Path(path) or folder_path.is_relative_to(Path(path)):
+                print(f"Found collection {info['collection_name']} for path {folder_path}")
+                return info["collection_name"]
+        print(f"No collection found for path {folder_path}")
+        return None
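
`get_collection_for_path` maps a path to a collection by checking whether it sits under an indexed folder. Here is a small sketch of that containment check, assuming Python 3.9+ (for `is_relative_to`) and using `PurePosixPath` so the result does not depend on the host OS. The `collection_for` helper and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional


def collection_for(path: str, folders: Dict[str, str]) -> Optional[str]:
    """Return the collection of the first indexed root that contains `path`."""
    target = PurePosixPath(path)
    for root, collection in folders.items():
        # is_relative_to() compares whole path components and is also True
        # when the two paths are equal, so no separate equality check is needed
        if target.is_relative_to(PurePosixPath(root)):
            return collection
    return None


# Hypothetical configuration: one indexed root mapped to one collection
folders = {"/data/photos": "images_0"}
nested = collection_for("/data/photos/2023/a.jpg", folders)
same = collection_for("/data/photos", folders)
sibling = collection_for("/data/photos2", folders)
```

Comparing components rather than string prefixes is what keeps a sibling such as `/data/photos2` from matching `/data/photos`.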

image_database.py ADDED

	@@ -0,0 +1,303 @@

+import sqlite3
+import base64
+import uuid
+from pathlib import Path
+from typing import Optional, List, Dict, Tuple
+from PIL import Image
+import io
+import hashlib
+class ImageDatabase:
+    """SQLite database for storing images and metadata"""
+    def __init__(self, db_path: str = "images.db"):
+        self.db_path = db_path
+        self.init_database()
+    def init_database(self):
+        """Initialize the database with required tables"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+        # Create images table
+        cursor.execute('''
+            CREATE TABLE IF NOT EXISTS images (
+                id TEXT PRIMARY KEY,
+                file_hash TEXT UNIQUE NOT NULL,
+                original_path TEXT NOT NULL,
+                filename TEXT NOT NULL,
+                file_extension TEXT NOT NULL,
+                file_size INTEGER NOT NULL,
+                width INTEGER NOT NULL,
+                height INTEGER NOT NULL,
+                image_data BLOB NOT NULL,
+                thumbnail_data BLOB,
+                root_folder TEXT NOT NULL,
+                relative_path TEXT NOT NULL,
+                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+                updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
+            )
+        ''')
+        # Create indexes for better performance
+        cursor.execute('CREATE INDEX IF NOT EXISTS idx_file_hash ON images(file_hash)')
+        cursor.execute('CREATE INDEX IF NOT EXISTS idx_root_folder ON images(root_folder)')
+        cursor.execute('CREATE INDEX IF NOT EXISTS idx_relative_path ON images(relative_path)')
+        cursor.execute('CREATE INDEX IF NOT EXISTS idx_filename ON images(filename)')
+        conn.commit()
+        conn.close()
+    def _calculate_file_hash(self, image_data: bytes) -> str:
+        """Calculate SHA-256 hash of image data"""
+        return hashlib.sha256(image_data).hexdigest()
+    def _create_thumbnail(self, image: Image.Image, size: Tuple[int, int] = (200, 200)) -> bytes:
+        """Create a thumbnail of the image"""
+        # Create a copy to avoid modifying original
+        thumbnail = image.copy()
+        thumbnail.thumbnail(size, Image.Resampling.LANCZOS)
+        # Convert to bytes
+        img_byte_arr = io.BytesIO()
+        # Save as JPEG for thumbnails to reduce size
+        if thumbnail.mode in ('RGBA', 'LA', 'P'):
+            thumbnail = thumbnail.convert('RGB')
+        thumbnail.save(img_byte_arr, format='JPEG', quality=85, optimize=True)
+        return img_byte_arr.getvalue()
+    def store_image(self, image_path: Path, root_folder: Path) -> Optional[str]:
+        """
+        Store an image in the database
+        Returns the image ID if successful, None if failed
+        """
+        try:
+            # Load the image
+            with Image.open(image_path) as image:
+                # Convert to RGB if needed
+                if image.mode in ('RGBA', 'LA', 'P'):
+                    image = image.convert('RGB')
+                # Get image data as bytes
+                img_byte_arr = io.BytesIO()
+                image.save(img_byte_arr, format='JPEG', quality=95, optimize=True)
+                image_data = img_byte_arr.getvalue()
+                # Calculate file hash
+                file_hash = self._calculate_file_hash(image_data)
+                # Create thumbnail
+                thumbnail_data = self._create_thumbnail(image)
+                # Calculate relative path
+                relative_path = str(image_path.relative_to(root_folder))
+                # Prepare metadata
+                image_id = str(uuid.uuid4())
+                filename = image_path.name
+                file_extension = image_path.suffix.lower()
+                file_size = len(image_data)
+                width, height = image.size
+                conn = sqlite3.connect(self.db_path)
+                cursor = conn.cursor()
+                # Check if image already exists (by hash)
+                cursor.execute('SELECT id FROM images WHERE file_hash = ?', (file_hash,))
+                existing = cursor.fetchone()
+                if existing:
+                    print(f"Image already exists in database: {filename}")
+                    conn.close()
+                    return existing[0]
+                # Insert new image
+                cursor.execute('''
+                    INSERT INTO images (
+                        id, file_hash, original_path, filename, file_extension,
+                        file_size, width, height, image_data, thumbnail_data,
+                        root_folder, relative_path
+                    ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+                ''', (
+                    image_id, file_hash, str(image_path.absolute()), filename,
+                    file_extension, file_size, width, height, image_data,
+                    thumbnail_data, str(root_folder.absolute()), relative_path
+                ))
+                conn.commit()
+                conn.close()
+                print(f"Stored image in database: {filename} (ID: {image_id})")
+                return image_id
+        except Exception as e:
+            print(f"Error storing image {image_path}: {e}")
+            return None
+    def get_image(self, image_id: str) -> Optional[Dict]:
+        """Get an image by ID"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+        cursor.execute('''
+            SELECT id, filename, file_extension, file_size, width, height,
+                   image_data, root_folder, relative_path, created_at
+            FROM images WHERE id = ?
+        ''', (image_id,))
+        result = cursor.fetchone()
+        conn.close()
+        if result:
+            return {
+                'id': result[0],
+                'filename': result[1],
+                'file_extension': result[2],
+                'file_size': result[3],
+                'width': result[4],
+                'height': result[5],
+                'image_data': result[6],
+                'root_folder': result[7],
+                'relative_path': result[8],
+                'created_at': result[9]
+            }
+        return None
+    def get_thumbnail(self, image_id: str) -> Optional[bytes]:
+        """Get thumbnail data for an image"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+        cursor.execute('SELECT thumbnail_data FROM images WHERE id = ?', (image_id,))
+        result = cursor.fetchone()
+        conn.close()
+        return result[0] if result else None
+    def get_images_by_folder(self, root_folder: str) -> List[Dict]:
+        """Get all images from a specific folder"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+        cursor.execute('''
+            SELECT id, filename, file_extension, file_size, width, height,
+                   root_folder, relative_path, created_at
+            FROM images WHERE root_folder = ?
+            ORDER BY created_at DESC
+        ''', (root_folder,))
+        results = cursor.fetchall()
+        conn.close()
+        return [
+            {
+                'id': row[0],
+                'filename': row[1],
+                'file_extension': row[2],
+                'file_size': row[3],
+                'width': row[4],
+                'height': row[5],
+                'root_folder': row[6],
+                'relative_path': row[7],
+                'created_at': row[8]
+            }
+            for row in results
+        ]
+    def get_all_images(self) -> List[Dict]:
+        """Get all images from the database"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+        cursor.execute('''
+            SELECT id, filename, file_extension, file_size, width, height,
+                   root_folder, relative_path, created_at
+            FROM images
+            ORDER BY created_at DESC
+        ''')
+        results = cursor.fetchall()
+        conn.close()
+        return [
+            {
+                'id': row[0],
+                'filename': row[1],
+                'file_extension': row[2],
+                'file_size': row[3],
+                'width': row[4],
+                'height': row[5],
+                'root_folder': row[6],
+                'relative_path': row[7],
+                'created_at': row[8]
+            }
+            for row in results
+        ]
+    def delete_image(self, image_id: str) -> bool:
+        """Delete an image from the database"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+        cursor.execute('DELETE FROM images WHERE id = ?', (image_id,))
+        deleted = cursor.rowcount > 0
+        conn.commit()
+        conn.close()
+        return deleted
+    def delete_images_by_folder(self, root_folder: str) -> int:
+        """Delete all images from a specific folder"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+        cursor.execute('DELETE FROM images WHERE root_folder = ?', (root_folder,))
+        deleted_count = cursor.rowcount
+        conn.commit()
+        conn.close()
+        return deleted_count
+    def image_exists_by_path(self, relative_path: str, root_folder: str) -> Optional[str]:
+        """Check if an image exists by its path, return image ID if exists"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+        cursor.execute('''
+            SELECT id FROM images
+            WHERE relative_path = ? AND root_folder = ?
+        ''', (relative_path, root_folder))
+        result = cursor.fetchone()
+        conn.close()
+        return result[0] if result else None
+    def get_database_stats(self) -> Dict:
+        """Get database statistics"""
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+        # Total images
+        cursor.execute('SELECT COUNT(*) FROM images')
+        total_images = cursor.fetchone()[0]
+        # Total size
+        cursor.execute('SELECT SUM(file_size) FROM images')
+        total_size = cursor.fetchone()[0] or 0
+        # Images by folder
+        cursor.execute('SELECT root_folder, COUNT(*) FROM images GROUP BY root_folder')
+        folders = cursor.fetchall()
+        conn.close()
+        return {
+            'total_images': total_images,
+            'total_size_bytes': total_size,
+            'total_size_mb': round(total_size / (1024 * 1024), 2),
+            'folders': {folder: count for folder, count in folders}
+        }
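
`store_image` deduplicates on a SHA-256 hash of the bytes it stores, which in that method are the re-encoded JPEG bytes rather than the original file contents. The following is a stripped-down sketch of the same check-then-insert idea against an in-memory SQLite database. The `blobs` table and the `store_blob` helper are hypothetical and far simpler than the real `images` schema.

```python
import hashlib
import sqlite3
import uuid


def store_blob(conn: sqlite3.Connection, data: bytes) -> str:
    """Insert `data` unless identical bytes are already stored; return the row ID."""
    file_hash = hashlib.sha256(data).hexdigest()
    row = conn.execute(
        "SELECT id FROM blobs WHERE file_hash = ?", (file_hash,)
    ).fetchone()
    if row:
        return row[0]  # Same content already present: reuse the existing ID
    blob_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO blobs (id, file_hash, data) VALUES (?, ?, ?)",
        (blob_id, file_hash, data),
    )
    conn.commit()
    return blob_id


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blobs ("
    "id TEXT PRIMARY KEY, file_hash TEXT UNIQUE NOT NULL, data BLOB NOT NULL)"
)
first = store_blob(conn, b"same bytes")
second = store_blob(conn, b"same bytes")
other = store_blob(conn, b"different bytes")
row_count = conn.execute("SELECT COUNT(*) FROM blobs").fetchone()[0]
```

One consequence of hashing content rather than paths is that two files with identical bytes in different folders share a single row.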

image_indexer.py ADDED

	@@ -0,0 +1,384 @@

+from pathlib import Path
+from typing import List, Dict, Set, Optional
+import torch
+from PIL import Image
+import numpy as np
+from transformers import CLIPProcessor, CLIPModel
+from watchdog.observers import Observer
+from watchdog.events import FileSystemEventHandler
+import asyncio
+from concurrent.futures import ThreadPoolExecutor
+import threading
+from qdrant_client.http.models import PointStruct
+import uuid
+from qdrant_singleton import QdrantClientSingleton, CURRENT_SCHEMA_VERSION
+from fastapi import WebSocket
+from enum import Enum
+import qdrant_client
+import time
+from folder_manager import FolderManager
+from image_database import ImageDatabase
+class IndexingStatus(Enum):
+    IDLE = "idle"
+    INDEXING = "indexing"
+    MONITORING = "monitoring"
+class ImageIndexer:
+    def __init__(self):
+        # Initialize folder manager and image database
+        self.folder_manager = FolderManager()
+        self.image_db = ImageDatabase()
+        # Initialize status tracking
+        self.status = IndexingStatus.IDLE
+        self.current_file: Optional[str] = None
+        self.total_files = 0
+        self.processed_files = 0
+        self.websocket_connections: Set[WebSocket] = set()
+        # Thread synchronization
+        self.collection_initialized = threading.Event()
+        self.model_initialized = threading.Event()
+        # Initialize Qdrant client
+        self.qdrant = QdrantClientSingleton.get_instance()
+        # Thread pool for background processing
+        self.executor = ThreadPoolExecutor(max_workers=4)
+        # Cache of indexed paths per collection
+        self.indexed_paths: Dict[str, Set[str]] = {}
+        # Model initialization flags
+        self.model = None
+        self.processor = None
+        self.device = None
+        # Start model initialization in a separate thread
+        threading.Thread(target=self._initialize_model_thread, daemon=True).start()
+    def _load_indexed_paths(self, collection_name: str):
+        """Load the set of already indexed paths from a collection"""
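+        try:
+            # One scroll call returns at most `limit` points, so follow the returned offset
+            paths = set()
+            offset = None
+            while True:
+                points, offset = self.qdrant.scroll(
+                    collection_name=collection_name,
+                    limit=10000,
+                    offset=offset,
+                    with_payload=True,
+                    with_vectors=False
+                )
+                paths.update(point.payload["path"] for point in points)
+                if offset is None:
+                    break
+            self.indexed_paths[collection_name] = paths
+        except Exception as e:
+            print(f"Error loading indexed paths for collection {collection_name}: {e}")
+            self.indexed_paths[collection_name] = set()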
+    async def broadcast_status(self):
+        """Broadcast current status to all connected WebSocket clients"""
+        status_data = {
+            "status": self.status.value,
+            "current_file": self.current_file,
+            "total_files": self.total_files,
+            "processed_files": self.processed_files,
+            "progress_percentage": round((self.processed_files / self.total_files * 100) if self.total_files > 0 else 0, 2)
+        }
+        # Iterate over a snapshot: removing from a set while iterating it raises RuntimeError
+        for connection in list(self.websocket_connections):
+            try:
+                await connection.send_json(status_data)
+            except Exception as e:
+                print(f"Error broadcasting to WebSocket: {e}")
+                self.websocket_connections.discard(connection)
+    async def add_websocket_connection(self, websocket: WebSocket):
+        """Add a new WebSocket connection"""
+        await websocket.accept()
+        self.websocket_connections.add(websocket)
+        await self.broadcast_status()
+    async def remove_websocket_connection(self, websocket: WebSocket):
+        """Remove a WebSocket connection"""
+        self.websocket_connections.discard(websocket)  # discard() tolerates connections already dropped
+    async def add_folder(self, folder_path: str) -> Dict:
+        """Add a new folder to index"""
+        folder_info = self.folder_manager.add_folder(folder_path)
+        # Start indexing the new folder
+        await self.index_folder(folder_path)
+        return folder_info
+    async def remove_folder(self, folder_path: str):
+        """Remove a folder from indexing"""
+        # First remove from the folder manager
+        self.folder_manager.remove_folder(folder_path)
+        # Clean up SQLite database
+        folder_abs_path = str(Path(folder_path).absolute())
+        deleted_count = self.image_db.delete_images_by_folder(folder_abs_path)
+        print(f"Deleted {deleted_count} images from database for folder: {folder_path}")
+    async def index_folder(self, folder_path: str):
+        """Index all images in a specific folder"""
+        if not self.model_initialized.is_set() or not self.model or not self.processor:
+            print("Model not initialized. Skipping indexing.")
+            self.status = IndexingStatus.IDLE
+            await self.broadcast_status()
+            return
+        folder_path = Path(folder_path)
+        if not folder_path.exists():
+            print(f"Folder not found: {folder_path}")
+            return
+        collection_name = self.folder_manager.get_collection_for_path(folder_path)
+        if not collection_name:
+            print(f"No collection found for folder: {folder_path}")
+            return
+        # Wait for model initialization before starting indexing
+        while not self.model_initialized.is_set():
+            print("Waiting for model initialization...")
+            await asyncio.sleep(0.1)
+        print(f"Starting to index folder: {folder_path}")
+        self.status = IndexingStatus.INDEXING
+        self.processed_files = 0
+        self.current_file = None
+        await self.broadcast_status()  # Broadcast initial status
+        # Load indexed paths for this collection if not already loaded
+        if collection_name not in self.indexed_paths:
+            self._load_indexed_paths(collection_name)
+        # Use rglob for recursive directory scanning
+        image_files = [f for f in folder_path.rglob("*") if f.suffix.lower() in {".jpg", ".jpeg", ".png", ".gif"}]
+        self.total_files = len(image_files)
+        print(f"Found {self.total_files} images to index")
+        await self.broadcast_status()  # Broadcast after finding total files
+        try:
+            for i, image_file in enumerate(image_files, 1):
+                relative_path = str(image_file.relative_to(folder_path))
+                self.current_file = str(image_file)
+                self.processed_files = i - 1  # Update before processing
+                await self.broadcast_status()  # Broadcast before processing each file
+                if relative_path not in self.indexed_paths[collection_name]:
+                    print(f"Indexing image {i}/{self.total_files}: {image_file.name}")
+                    await self.index_image(image_file, folder_path)
+                else:
+                    print(f"Skipping already indexed image {i}/{self.total_files}: {image_file.name}")
+                self.processed_files = i  # Update after processing
+                await self.broadcast_status()  # Broadcast after processing each file
+                # Small delay to allow other tasks to run
+                await asyncio.sleep(0)
+        except Exception as e:
+            print(f"Error during indexing: {e}")
+            import traceback
+            traceback.print_exc()
+        finally:
+            # Update last indexed timestamp
+            self.folder_manager.update_last_indexed(str(folder_path))
+            # Reset status
+            self.status = IndexingStatus.MONITORING
+            self.current_file = None
+            await self.broadcast_status()  # Final status broadcast
+            print("Finished indexing folder")
+    async def index_image(self, image_path: Path, root_folder: Path):
+        """Index a single image"""
+        if not self.model_initialized.is_set() or not self.model or not self.processor:
+            print("Model not initialized. Skipping indexing image.")
+            return
+        try:
+            # Wait for model initialization
+            while not self.model_initialized.is_set():
+                await asyncio.sleep(0.1)
+            # Get the collection for this path
+            collection_name = self.folder_manager.get_collection_for_path(str(root_folder))
+            if not collection_name:
+                print(f"No collection found for image: {image_path}")
+                return
+            # Convert to relative path from root folder
+            try:
+                relative_path = str(image_path.relative_to(root_folder))
+            except ValueError:
+                print(f"Image {image_path} is not under root folder {root_folder}")
+                return
+            print(f"Indexing image: {relative_path}")
+            self.current_file = str(image_path)
+            await self.broadcast_status()
+            # Check if image already exists in database
+            existing_image_id = self.image_db.image_exists_by_path(relative_path, str(root_folder.absolute()))
+            if existing_image_id:
+                # Check if it exists in Qdrant with current schema version
+                existing_points = self.qdrant.scroll(
+                    collection_name=collection_name,
+                    scroll_filter=qdrant_client.http.models.Filter(
+                        must=[
+                            qdrant_client.http.models.FieldCondition(
+                                key="image_id",
+                                match={"value": existing_image_id}
+                            ),
+                            qdrant_client.http.models.FieldCondition(
+                                key="schema_version",
+                                match={"value": CURRENT_SCHEMA_VERSION}
+                            )
+                        ]
+                    ),
+                    limit=1
+                )[0]
+                if existing_points:
+                    print(f"Skipping {relative_path} - already indexed with current schema version")
+                    return
+            # Store image in SQLite database first
+            image_id = self.image_db.store_image(image_path, root_folder)
+            if not image_id:
+                print(f"Failed to store image in database: {relative_path}")
+                return
+            # Load and preprocess image for embedding
+            image = Image.open(image_path).convert("RGB")
+            inputs = self.processor(images=image, return_tensors="pt").to(self.device)
+            # Generate image embedding
+            with torch.no_grad():
+                image_features = self.model.get_image_features(pixel_values=inputs["pixel_values"].to(self.model.dtype))  # cast to the model dtype (float16 on CUDA)
+                # Normalize the features
+                image_features = image_features / image_features.norm(dim=-1, keepdim=True)
+            embedding = image_features.cpu().numpy().flatten()
+            # Verify embedding is valid
+            if np.isnan(embedding).any() or np.isinf(embedding).any():
+                print(f"Warning: Invalid embedding generated for {relative_path}")
+                return
+            # Delete any old versions from Qdrant if they exist
+            self.qdrant.delete(
+                collection_name=collection_name,
+                points_selector=qdrant_client.http.models.FilterSelector(
+                    filter=qdrant_client.http.models.Filter(
+                        must=[
+                            qdrant_client.http.models.FieldCondition(
+                                key="path",
+                                match={"value": relative_path}
+                            )
+                        ]
+                    )
+                )
+            )
+            # Store in Qdrant with image ID reference and minimal metadata
+            point_id = str(uuid.uuid4())
+            self.qdrant.upsert(
+                collection_name=collection_name,
+                points=[
+                    PointStruct(
+                        id=point_id,
+                        vector=embedding.tolist(),
+                        payload={
+                            "image_id": image_id,  # Reference to SQLite database
+                            "path": relative_path,  # Relative path from root folder
+                            "root_folder": str(root_folder.absolute()),  # Store root folder path
+                            "schema_version": CURRENT_SCHEMA_VERSION,
+                            "indexed_at": int(time.time())
+                        }
+                    )
+                ]
+            )
+            # Update indexed paths cache
+            if collection_name not in self.indexed_paths:
+                self.indexed_paths[collection_name] = set()
+            self.indexed_paths[collection_name].add(relative_path)
+            print(f"Stored embedding in Qdrant for {relative_path} (Image ID: {image_id})")
+        except Exception as e:
+            print(f"Error indexing image {image_path}: {e}")
+            import traceback
+            traceback.print_exc()
+        finally:
+            # Don't reset current_file here as it's managed by index_folder
+            await self.broadcast_status()
+    def _initialize_model_thread(self):
+        """Initialize model in a separate thread"""
+        try:
+            self.device = "cuda" if torch.cuda.is_available() else "cpu"
+            print(f"Using device: {self.device}")
+            # Load model and processor with proper device handling
+            self.processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch16")
+            # Load model directly to the target device to avoid meta tensor issues
+            if self.device == "cuda":
+                self.model = CLIPModel.from_pretrained(
+                    "openai/clip-vit-base-patch16",
+                    torch_dtype=torch.float16,
+                    device_map="auto"
+                )
+            else:
+                # For CPU, use device_map to avoid meta tensor issues
+                self.model = CLIPModel.from_pretrained(
+                    "openai/clip-vit-base-patch16",
+                    device_map="cpu"
+                )
+            self.model_initialized.set()
+            print("Model initialization complete")
+        except Exception as e:
+            print(f"Error initializing model: {e}")
+            self.status = IndexingStatus.IDLE
+            asyncio.run(self.broadcast_status())
+    async def get_all_images(self, folder_path: Optional[str] = None) -> List[Dict]:
+        """Get all indexed images, optionally filtered by folder"""
+        try:
+            if folder_path:
+                # Get images from specific folder
+                results = self.image_db.get_images_by_folder(str(Path(folder_path).absolute()))
+            else:
+                # Get images from all folders
+                results = self.image_db.get_all_images()
+            # Convert to API format
+            api_results = []
+            for image_data in results:
+                api_results.append({
+                    "id": image_data["id"],
+                    "path": image_data["relative_path"],
+                    "filename": image_data["filename"],
+                    "root_folder": image_data["root_folder"],
+                    "file_size": image_data["file_size"],
+                    "width": image_data["width"],
+                    "height": image_data["height"],
+                    "created_at": image_data["created_at"]
+                })
+            return api_results
+        except Exception as e:
+            print(f"Error getting images: {e}")
+            import traceback
+            traceback.print_exc()
+            return []
+class ImageEventHandler(FileSystemEventHandler):
+    def __init__(self, indexer: ImageIndexer, root_folder: Path):
+        self.indexer = indexer
+        self.root_folder = root_folder
+    def on_created(self, event):
+        if not event.is_directory:
+            asyncio.create_task(self.indexer.index_image(Path(event.src_path), self.root_folder))

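Note on `ImageEventHandler.on_created` above: watchdog delivers events on its observer thread, where no asyncio event loop is running, so `asyncio.create_task` raises `RuntimeError` there. The following is a minimal sketch of one way to hand the coroutine to the application's loop instead, assuming the loop can be captured when the handler is constructed. `LoopAwareHandler`, `fake_index_image`, and the `loop` parameter are hypothetical names used for illustration; they are not part of this commit.

```python
import asyncio
import threading
from pathlib import Path


class LoopAwareHandler:
    """Hypothetical stand-in for ImageEventHandler that remembers the app's event loop."""

    def __init__(self, index_coro, root_folder: Path, loop: asyncio.AbstractEventLoop):
        self.index_coro = index_coro    # e.g. indexer.index_image
        self.root_folder = root_folder
        self.loop = loop                # loop that owns the indexer's coroutines

    def on_created(self, src_path: str, is_directory: bool = False):
        if is_directory:
            return None
        # Safe to call from any thread: schedules the coroutine on self.loop
        return asyncio.run_coroutine_threadsafe(
            self.index_coro(Path(src_path), self.root_folder), self.loop
        )


async def demo() -> list:
    indexed = []

    async def fake_index_image(image_path: Path, root_folder: Path):
        indexed.append(image_path.name)

    loop = asyncio.get_running_loop()
    handler = LoopAwareHandler(fake_index_image, Path("/photos"), loop)

    # Simulate the observer thread delivering a "file created" event
    futures = []
    worker = threading.Thread(
        target=lambda: futures.append(handler.on_created("/photos/cat.jpg"))
    )
    worker.start()
    # Join in the default executor so the event loop stays free to run the task
    await loop.run_in_executor(None, worker.join)
    await asyncio.wrap_future(futures[0])
    return indexed


result = asyncio.run(demo())
```

In the real handler the `event.src_path` and `event.is_directory` attributes would replace the two plain arguments used here, and filtering on the image suffix before scheduling would avoid indexing attempts on non-image files.
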
image_search.py ADDED Viewed

	@@ -0,0 +1,272 @@

+import torch
+from PIL import Image
+from typing import List, Dict, Optional
+from transformers import CLIPProcessor, CLIPModel
+from qdrant_singleton import QdrantClientSingleton
+from folder_manager import FolderManager
+from image_database import ImageDatabase
+import httpx
+import io
+class ImageSearch:
+    def __init__(self):
+        self.device = "cuda" if torch.cuda.is_available() else "cpu"
+        print(f"Using device: {self.device}")
+        # Load model and processor with proper device handling
+        self.processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch16")
+        # Load model directly to the target device to avoid meta tensor issues
+        if self.device == "cuda":
+            self.model = CLIPModel.from_pretrained(
+                "openai/clip-vit-base-patch16",
+                torch_dtype=torch.float16,
+                device_map="auto"
+            )
+        else:
+            # For CPU, use device_map to avoid meta tensor issues
+            self.model = CLIPModel.from_pretrained(
+                "openai/clip-vit-base-patch16",
+                device_map="cpu"
+            )
+        # Initialize Qdrant client, folder manager and image database
+        self.qdrant = QdrantClientSingleton.get_instance()
+        self.folder_manager = FolderManager()
+        self.image_db = ImageDatabase()
+    def calculate_similarity_percentage(self, score: float) -> float:
+        """Convert cosine similarity score to percentage"""
+        # Qdrant returns cosine similarity scores between -1 and 1
+        # We want to convert this to a percentage between 0 and 100
+        # First normalize to 0-1 range, then convert to percentage
+        normalized = (score + 1) / 2
+        return normalized * 100
+    def filter_results(self, search_results: list, threshold: float = 60) -> List[Dict]:
+        """Filter and format search results"""
+        results = []
+        for scored_point in search_results:
+            # Convert cosine similarity to percentage
+            similarity = self.calculate_similarity_percentage(scored_point.score)
+            # Only include results at or above the threshold (default: 60% similarity)
+            if similarity >= threshold:
+                # Get image data from SQLite database
+                image_id = scored_point.payload.get("image_id")
+                if image_id:
+                    image_data = self.image_db.get_image(image_id)
+                    if image_data:
+                        results.append({
+                            "id": image_id,
+                            "path": scored_point.payload["path"],
+                            "filename": image_data["filename"],
+                            "root_folder": scored_point.payload["root_folder"],
+                            "similarity": round(similarity, 1),
+                            "file_size": image_data["file_size"],
+                            "width": image_data["width"],
+                            "height": image_data["height"]
+                        })
+        return results
+    async def search_by_text(self, query: str, folder_path: Optional[str] = None, k: int = 10) -> List[Dict]:
+        """Search images by text query"""
+        try:
+            print(f"\nSearching for text: '{query}'")
+            # Get collections to search
+            collections_to_search = []
+            if folder_path:
+                # Search in specific folder's collection
+                collection_name = self.folder_manager.get_collection_for_path(folder_path)
+                if collection_name:
+                    collections_to_search.append(collection_name)
+                    print(f"Searching in specific folder collection: {collection_name}")
+            else:
+                # Search in all collections
+                folders = self.folder_manager.get_all_folders()
+                print(f"Found {len(folders)} folders")
+                for folder in folders:
+                    print(f"Folder: {folder['path']}, Valid: {folder['is_valid']}, Collection: {folder.get('collection_name', 'None')}")
+                # Include all collections regardless of folder validity since images are in SQLite
+                collections_to_search.extend(folder["collection_name"] for folder in folders if folder.get("collection_name"))
+            print(f"Collections to search: {collections_to_search}")
+            if not collections_to_search:
+                print("No collections available to search")
+                return []
+            # Generate text embedding
+            inputs = self.processor(text=[query], return_tensors="pt", padding=True, truncation=True).to(self.device)
+            with torch.no_grad():
+                text_features = self.model.get_text_features(**inputs)
+                text_features = text_features / text_features.norm(dim=-1, keepdim=True)
+            text_embedding = text_features.cpu().numpy().flatten()
+            # Search in all relevant collections
+            all_results = []
+            for collection_name in collections_to_search:
+                try:
+                    # Get more results from each collection when searching multiple collections
+                    collection_limit = k * 3 if len(collections_to_search) > 1 else k
+                    search_result = self.qdrant.search(
+                        collection_name=collection_name,
+                        query_vector=text_embedding.tolist(),
+                        limit=collection_limit,  # Get more results from each collection
+                        offset=0,  # Explicitly set offset
+                        score_threshold=0.2  # Corresponds to 60% similarity after normalization
+                    )
+                    # Filter and format results
+                    results = self.filter_results(search_result) # Threshold is now default 60 in filter_results
+                    all_results.extend(results)
+                    print(f"Found {len(results)} matches in collection {collection_name}")
+                except Exception as e:
+                    print(f"Error searching collection {collection_name}: {e}")
+                    continue
+            # Sort all results by similarity
+            all_results.sort(key=lambda x: x["similarity"], reverse=True)
+            # Take top k results
+            final_results = all_results[:k]
+            print(f"Found {len(final_results)} total relevant matches across {len(collections_to_search)} collections")
+            return final_results
+        except Exception as e:
+            print(f"Error in text search: {e}")
+            import traceback
+            traceback.print_exc()
+            return []
+    async def search_by_image(self, image: Image.Image, folder_path: Optional[str] = None, k: int = 10) -> List[Dict]:
+        """Search images by similarity to uploaded image"""
+        try:
+            print("\nSearching by image...")
+            # Get collections to search
+            collections_to_search = []
+            if folder_path:
+                # Search in specific folder's collection
+                collection_name = self.folder_manager.get_collection_for_path(folder_path)
+                if collection_name:
+                    collections_to_search.append(collection_name)
+                    print(f"Searching in specific folder collection: {collection_name}")
+            else:
+                # Search in all collections
+                folders = self.folder_manager.get_all_folders()
+                print(f"Found {len(folders)} folders")
+                for folder in folders:
+                    print(f"Folder: {folder['path']}, Valid: {folder['is_valid']}, Collection: {folder.get('collection_name', 'None')}")
+                # Include all collections regardless of folder validity since images are in SQLite
+                collections_to_search.extend(folder["collection_name"] for folder in folders if folder.get("collection_name"))
+            print(f"Collections to search: {collections_to_search}")
+            if not collections_to_search:
+                print("No collections available to search")
+                return []
+            # Generate image embedding
+            inputs = self.processor(images=image, return_tensors="pt").to(self.device)
+            with torch.no_grad():
+                image_features = self.model.get_image_features(pixel_values=inputs["pixel_values"].to(self.model.dtype))  # cast to the model dtype (float16 on CUDA)
+                image_features = image_features / image_features.norm(dim=-1, keepdim=True)
+            image_embedding = image_features.cpu().numpy().flatten()
+            # Search in all relevant collections
+            all_results = []
+            for collection_name in collections_to_search:
+                try:
+                    # Get more results from each collection when searching multiple collections
+                    collection_limit = k * 3 if len(collections_to_search) > 1 else k
+                    search_result = self.qdrant.search(
+                        collection_name=collection_name,
+                        query_vector=image_embedding.tolist(),
+                        limit=collection_limit,  # Get more results from each collection
+                        offset=0,  # Explicitly set offset
+                        score_threshold=0.2  # Corresponds to 60% similarity after normalization
+                    )
+                    # Filter and format results
+                    results = self.filter_results(search_result) # Threshold is now default 60 in filter_results
+                    all_results.extend(results)
+                    print(f"Found {len(results)} matches in collection {collection_name}")
+                except Exception as e:
+                    print(f"Error searching collection {collection_name}: {e}")
+                    continue
+            # Sort all results by similarity
+            all_results.sort(key=lambda x: x["similarity"], reverse=True)
+            # Take top k results
+            final_results = all_results[:k]
+            print(f"Found {len(final_results)} total relevant matches across {len(collections_to_search)} collections")
+            return final_results
+        except Exception as e:
+            print(f"Error in image search: {e}")
+            import traceback
+            traceback.print_exc()
+            return []
+    async def download_image_from_url(self, url: str) -> Optional[Image.Image]:
+        """Download and return an image from a URL"""
+        try:
+            print(f"Downloading image from URL: {url}")
+            # Use httpx for async HTTP requests
+            async with httpx.AsyncClient(timeout=30.0) as client:
+                response = await client.get(url)
+                response.raise_for_status()
+                # Check if the response is an image
+                content_type = response.headers.get('content-type', '')
+                if not content_type.startswith('image/'):
+                    raise ValueError(f"URL does not point to an image. Content-Type: {content_type}")
+                # Load image from response content
+                image_bytes = io.BytesIO(response.content)
+                image = Image.open(image_bytes)
+                # Convert to RGB if necessary (for consistency with CLIP)
+                if image.mode != 'RGB':
+                    image = image.convert('RGB')
+                print(f"Successfully downloaded image: {image.size}")
+                return image
+        except httpx.TimeoutException:
+            print(f"Timeout while downloading image from URL: {url}")
+            return None
+        except httpx.HTTPStatusError as e:
+            print(f"HTTP error {e.response.status_code} while downloading image from URL: {url}")
+            return None
+        except Exception as e:
+            print(f"Error downloading image from URL {url}: {e}")
+            return None
+    async def search_by_url(self, url: str, folder_path: Optional[str] = None, k: int = 10) -> List[Dict]:
+        """Search images by downloading and comparing an image from a URL"""
+        try:
+            print(f"\nSearching by image URL: {url}")
+            # Download the image from URL
+            image = await self.download_image_from_url(url)
+            if image is None:
+                return []
+            # Use the existing search_by_image method
+            return await self.search_by_image(image, folder_path, k)
+        except Exception as e:
+            print(f"Error in URL search: {e}")
+            import traceback
+            traceback.print_exc()
+            return []

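Note on the thresholds in `image_search.py` above: the Qdrant `score_threshold=0.2` and the `threshold=60` default in `filter_results` describe the same cut-off in two different units, so changing one without the other silently changes which results survive. The sketch below, under the assumption that a single constant should drive both, derives the raw score from the percentage and also shows the cross-collection merge step in isolation. `MIN_SIMILARITY_PERCENT`, `SCORE_THRESHOLD`, `percentage_to_similarity`, and `merge_top_k` are hypothetical names, not part of this commit.

```python
from itertools import chain


def similarity_to_percentage(score: float) -> float:
    """Map a cosine similarity in [-1, 1] to a percentage in [0, 100]."""
    return (score + 1) / 2 * 100


def percentage_to_similarity(percentage: float) -> float:
    """Inverse mapping: a percentage in [0, 100] back to a cosine similarity."""
    return percentage / 100 * 2 - 1


# Hypothetical single source of truth for the cut-off
MIN_SIMILARITY_PERCENT = 60.0
SCORE_THRESHOLD = percentage_to_similarity(MIN_SIMILARITY_PERCENT)  # approximately 0.2


def merge_top_k(per_collection_results, k):
    """Flatten per-collection result lists and keep the k most similar entries."""
    merged = chain.from_iterable(per_collection_results)
    return sorted(merged, key=lambda r: r["similarity"], reverse=True)[:k]


hits = merge_top_k(
    [
        [{"id": "a", "similarity": 71.5}, {"id": "b", "similarity": 64.0}],
        [{"id": "c", "similarity": 88.2}],
    ],
    k=2,
)
```

With this arrangement, `SCORE_THRESHOLD` would be passed as `score_threshold` and `MIN_SIMILARITY_PERCENT` as the `threshold` argument, so the two filters cannot drift apart.
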
pyproject.toml ADDED Viewed

	@@ -0,0 +1,6 @@

+[tool.pytest.ini_options]
+pythonpath = "."
+testpaths = ["tests"]
+python_files = ["test_*.py"]
+asyncio_mode = "strict"
+asyncio_default_fixture_loop_scope = "function"

qdrant_singleton.py ADDED Viewed

	@@ -0,0 +1,148 @@

+from qdrant_client import QdrantClient
+from qdrant_client.http import models
+from pathlib import Path
+import os
+from dotenv import load_dotenv
+# Load environment variables
+load_dotenv()
+CURRENT_SCHEMA_VERSION = "1.2"  # Increment this when schema changes
+VECTOR_SIZE = 512  # CLIP embedding size
+class QdrantClientSingleton:
+    _instance = None
+    @classmethod
+    def get_instance(cls):
+        if cls._instance is None:
+            # Check if we have cloud credentials
+            qdrant_url = os.getenv('QDRANT_URL')
+            qdrant_api_key = os.getenv('QDRANT_API_KEY')
+            print(f"QDRANT_URL: {qdrant_url}")
+            print(f"QDRANT_API_KEY: {'set' if qdrant_api_key else 'None'}")  # never log any part of the key
+            if qdrant_url and qdrant_api_key:
+                print(f"Initializing Qdrant Cloud client: {qdrant_url}")
+                try:
+                    cls._instance = QdrantClient(
+                        url=qdrant_url,
+                        api_key=qdrant_api_key,
+                    )
+                    print("Successfully connected to Qdrant Cloud")
+                except Exception as e:
+                    print(f"Failed to connect to Qdrant Cloud: {e}")
+                    print("Falling back to local storage")
+                    storage_path = Path("qdrant_data").absolute()
+                    storage_path.mkdir(exist_ok=True)
+                    cls._instance = QdrantClient(path=str(storage_path))
+            else:
+                # Fallback to local storage
+                print("Cloud credentials not found, using local Qdrant storage")
+                storage_path = Path("qdrant_data").absolute()
+                storage_path.mkdir(exist_ok=True)
+                cls._instance = QdrantClient(path=str(storage_path))
+            # Print collections for debugging
+            try:
+                collections = cls._instance.get_collections().collections
+                print(f"Available collections: {[col.name for col in collections]}")
+            except Exception as e:
+                print(f"Error getting collections: {e}")
+        return cls._instance
+    @classmethod
+    def initialize_collection(cls, collection_name: str):
+        client = cls.get_instance()
+        # Check if collection exists
+        collections = client.get_collections().collections
+        exists = any(collection.name == collection_name for collection in collections)
+        if not exists:
+            # Create new collection with current schema version
+            cls._create_collection(client, collection_name)
+        else:
+            # Check schema version and update if necessary
+            cls._check_and_update_schema(client, collection_name)
+    @classmethod
+    def _create_collection(cls, client: QdrantClient, collection_name: str):
+        """Create a new collection with the current schema version"""
+        # First create the collection with basic config
+        client.create_collection(
+            collection_name=collection_name,
+            vectors_config=models.VectorParams(
+                size=VECTOR_SIZE,
+                distance=models.Distance.COSINE
+            ),
+            on_disk_payload=True,  # Store payloads on disk (vectors are configured separately)
+            optimizers_config=models.OptimizersConfigDiff(
+                indexing_threshold=0  # NOTE: Qdrant documents 0 as disabling vector indexing, not as "index immediately"
+            )
+        )
+        # Then create payload indexes for efficient searching
+        client.create_payload_index(
+            collection_name=collection_name,
+            field_name="image_id",
+            field_schema=models.PayloadSchemaType.KEYWORD
+        )
+        client.create_payload_index(
+            collection_name=collection_name,
+            field_name="path",
+            field_schema=models.PayloadSchemaType.KEYWORD
+        )
+        client.create_payload_index(
+            collection_name=collection_name,
+            field_name="root_folder",
+            field_schema=models.PayloadSchemaType.KEYWORD
+        )
+        client.create_payload_index(
+            collection_name=collection_name,
+            field_name="schema_version",
+            field_schema=models.PayloadSchemaType.KEYWORD
+        )
+        client.create_payload_index(
+            collection_name=collection_name,
+            field_name="indexed_at",
+            field_schema=models.PayloadSchemaType.INTEGER
+        )
+        print(f"Created collection {collection_name} with schema version {CURRENT_SCHEMA_VERSION}")
+    @classmethod
+    def _check_and_update_schema(cls, client: QdrantClient, collection_name: str):
+        """Check collection schema version and update if necessary"""
+        try:
+            # Get a sample point to check schema version
+            sample = client.scroll(
+                collection_name=collection_name,
+                limit=1,
+                with_payload=True
+            )[0]
+            if not sample:
+                print(f"Collection {collection_name} is empty")
+                return
+            # Check schema version of existing data
+            point_version = sample[0].payload.get("schema_version", "0.0")
+            if point_version != CURRENT_SCHEMA_VERSION:
+                print(f"Schema version mismatch: {point_version} != {CURRENT_SCHEMA_VERSION}")
+                print(f"Collection {collection_name} needs to be recreated")
+                # Recreate collection with new schema
+                client.delete_collection(collection_name=collection_name)
+                cls._create_collection(client, collection_name)
+            else:
+                print(f"Collection {collection_name} schema is up to date (version {CURRENT_SCHEMA_VERSION})")
+        except Exception as e:
+            print(f"Error checking schema: {e}")
+            cls._create_collection(client, collection_name)

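Note on `QdrantClientSingleton.get_instance` above: the `_instance is None` check is not guarded by a lock, so two threads calling it at the same moment for the first time could each build a client. Whether that can happen here depends on call sites this diff does not fully show (the indexer does start a model-loading thread). A minimal sketch of double-checked locking, assuming a thread-safe variant is wanted; `LazySingleton`, `make_client`, and the `factory` parameter are hypothetical and stand in for the real client construction.

```python
import threading


class LazySingleton:
    """Hypothetical thread-safe variant of the get_instance pattern."""

    _instance = None
    _lock = threading.Lock()

    @classmethod
    def get_instance(cls, factory):
        if cls._instance is None:              # fast path, no lock taken
            with cls._lock:
                if cls._instance is None:      # re-check while holding the lock
                    cls._instance = factory()
        return cls._instance


calls = []


def make_client():
    """Stand-in for constructing the real client; records each construction."""
    calls.append(1)
    return object()


results = []
threads = [
    threading.Thread(target=lambda: results.append(LazySingleton.get_instance(make_client)))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

However many threads race on the first call, the factory runs once and every caller receives the same object.
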
requirements-test.txt ADDED Viewed

	@@ -0,0 +1,3 @@

+pytest==7.4.4
+pytest-asyncio==0.23.5
+requests==2.31.0

requirements.txt ADDED Viewed

	@@ -0,0 +1,17 @@

+fastapi
+uvicorn
+torch
+torchvision
+transformers
+Pillow
+python-multipart
+watchdog
+numpy
+qdrant-client
+aiofiles
+jinja2
+uvicorn[standard]
+websockets
+python-dotenv
+httpx
+accelerate

static/image.png ADDED Viewed

static/js/script.js ADDED Viewed

	@@ -0,0 +1,546 @@

+console.log('script.js loaded');
+let currentPath = null;
+let folderModal = null;
+let selectedFolder = null;
+let ws = null;
+// Initialize WebSocket connection
+function connectWebSocket() {
+    ws = new WebSocket(`${window.location.protocol === 'https:' ? 'wss' : 'ws'}://${window.location.host}/ws`);
+    ws.onopen = function () {
+        console.log('WebSocket connected');
+    };
+    ws.onmessage = function (event) {
+        const status = JSON.parse(event.data);
+        updateIndexingStatus(status);
+    };
+    ws.onclose = function () {
+        console.log('WebSocket disconnected, attempting to reconnect...');
+        setTimeout(connectWebSocket, 1000);
+    };
+    ws.onerror = function (error) {
+        console.error('WebSocket error:', error);
+    };
+}
+// Update indexing progress
+function updateIndexingStatus(status) {
+    const statusDiv = document.getElementById('indexingStatus');
+    const progressBar = statusDiv.querySelector('.progress-bar');
+    const details = document.getElementById('indexingDetails');
+    if (status.status === 'idle') {
+        // Fade out the status div
+        statusDiv.style.opacity = '0';
+        setTimeout(() => {
+            statusDiv.style.display = 'none';
+            statusDiv.style.opacity = '1';
+        }, 500);
+        return;
+    }
+    // Show and update the status
+    statusDiv.style.display = 'block';
+    statusDiv.style.opacity = '1';
+    // Calculate progress percentage
+    const percentage = status.total_files > 0
+        ? Math.round((status.processed_files / status.total_files) * 100)
+        : 0;
+    progressBar.style.width = `${percentage}%`;
+    progressBar.setAttribute('aria-valuenow', percentage);
+    // Update status text
+    let statusText = `Status: ${status.status}`;
+    if (status.current_file) {
+        statusText += ` | Current file: ${status.current_file}`;
+    }
+    if (status.total_files > 0) {
+        statusText += ` | Progress: ${status.processed_files}/${status.total_files} (${percentage}%)`;
+    }
+    details.textContent = statusText;
+}
+// IntersectionObserver for lazy loading images
+let imageObserver = null;
+function observeLazyLoadImages() {
+    const lazyLoadImages = document.querySelectorAll('img.lazy-load');
+    if (imageObserver) {
+        // Disconnect previous observer if any
+        imageObserver.disconnect();
+    }
+    imageObserver = new IntersectionObserver((entries, observer) => {
+        entries.forEach(entry => {
+            if (entry.isIntersecting) {
+                const img = entry.target;
+                const fullSrc = img.dataset.src;
+                if (fullSrc) {
+                    img.src = fullSrc;
+                    img.removeAttribute('data-src'); // Remove data-src to prevent re-processing
+                    img.classList.remove('lazy-load'); // Remove class to prevent re-observing
+                }
+                observer.unobserve(img); // Stop observing the image once loaded
+            }
+        });
+    }, {
+        rootMargin: '0px 0px 200px 0px' // Load images 200px before they enter viewport
+    });
+    lazyLoadImages.forEach(img => {
+        imageObserver.observe(img);
+    });
+}
+// Initialize folder browser
+async function initFolderBrowser() {
+    folderModal = new bootstrap.Modal(document.getElementById('folderBrowserModal'));
+    await loadFolderContents();
+    await loadIndexedFolders();
+}
+// Open folder browser modal
+function openFolderBrowser() {
+    selectedFolder = null;
+    folderModal.show();
+    loadFolderContents();
+}
+function showDrives(breadcrumb, browser, data) {
+    // Windows drives
+    breadcrumb.innerHTML = '<li class="breadcrumb-item active">Drives</li>';
+    data.drives.forEach(drive => {
+        const escapedDrive = drive.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
+        browser.innerHTML += `
+                    <div class="folder-item" onclick="loadFolderContents('${escapedDrive}')">
+                        <i class="bi bi-hdd"></i>${drive}
+                    </div>
+                `;
+    });
+}
+function showFolderContents(breadcrumb, browser, data) {
+    // Folder contents
+    currentPath = data.current_path;
+    // Update breadcrumb
+    const pathParts = currentPath.split(/[\\/]/);
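+    // Keep the leading slash of absolute POSIX paths, which split() would otherwise drop
+    let currentBreadcrumb = currentPath.startsWith('/') ? '/' : '';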
+    pathParts.forEach((part, index) => {
+        if (part) {
+            // Check if the path contains backslashes to detect Windows
+            const isWindows = currentPath.includes('\\');
+            currentBreadcrumb += part + (isWindows ? '\\' : '/');
+            const isLast = index === pathParts.length - 1;
+            const escapedPath = currentBreadcrumb.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
+            breadcrumb.innerHTML += `
+                                    <li class="breadcrumb-item ${isLast ? 'active' : ''}">
+                                        ${isLast ? part : `<a href="#" onclick="loadFolderContents('${escapedPath}')">${part}</a>`}
+                                    </li>
+                                `;
+        }
+    });
+    // Add parent directory
+    if (data.parent_path) {
+        addParentDirectory(browser, data);
+    }
+    // Add folders and files
+    addFolderContents(browser, data);
+}
+function addParentDirectory(browser, data) {
+    const escapedParentPath = data.parent_path.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
+    browser.innerHTML += `
+                            <div class="folder-item" onclick="loadFolderContents('${escapedParentPath}')">
+                                <i class="bi bi-arrow-up"></i>..
+                            </div>
+                        `;
+}
+function addFolderContents(browser, data) {
+    data.contents.forEach(item => {
+        const icon = item.type === 'directory' ? 'bi-folder' : 'bi-image';
+        const escapedPath = item.path.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
+        browser.innerHTML += `
+                                <div class="folder-item" onclick="${item.type === 'directory' ? `loadFolderContents('${escapedPath}')` : ''}" ondblclick="${item.type === 'directory' ? `selectFolder('${escapedPath}')` : ''}">
+                                    <i class="bi ${icon}"></i>${item.name}
+                                </div>
+                            `;
+    });
+}
+// Load folder contents
+async function loadFolderContents(path = null) {
+    try {
+        const url = path ? `/browse/${encodeURIComponent(path)}` : '/browse';
+        const response = await fetch(url);
+        const data = await response.json();
+        const browser = document.getElementById('folderBrowser');
+        const breadcrumb = document.getElementById('folderBreadcrumb');
+        browser.innerHTML = '';
+        breadcrumb.innerHTML = '';
+        if (data.drives) {
+            showDrives(breadcrumb, browser, data);
+        } else {
+            showFolderContents(breadcrumb, browser, data);
+        }
+    } catch (error) {
+        console.error('Error loading folder contents:', error);
+    }
+}
+// Select folder for indexing
+function selectFolder(path) {
+    selectedFolder = path;
+    addSelectedFolder();
+}
+// Add selected folder
+async function addSelectedFolder() {
+    folderModal.hide();
+    if (!selectedFolder && currentPath) {
+        selectedFolder = currentPath;
+    }
+    if (selectedFolder) {
+        try {
+            const encodedPath = encodeURIComponent(selectedFolder);
+            const response = await fetch(`/folders?folder_path=${encodedPath}`, {
+                method: 'POST'
+            });
+            if (response.ok) {
+                await loadIndexedFolders();
+                selectedFolder = null;
+            } else {
+                const error = await response.json();
+                alert(`Error adding folder: ${error.detail || error.message || JSON.stringify(error)}`);
+            }
+        } catch (error) {
+            console.error('Error adding folder:', error);
+            alert('Error adding folder. Please try again.');
+        }
+    }
+}
+// Load indexed folders
+async function loadIndexedFolders() {
+    try {
+        const response = await fetch('/folders');
+        const folders = await response.json();
+        const folderList = document.getElementById('folderList');
+        folderList.innerHTML = '';
+        if (folders.length === 0) {
+            folderList.innerHTML = `
+                <div class="text-center p-4 text-muted">
+                    <i class="bi bi-folder-x fs-2 d-block mb-2"></i>
+                    <small>No folders indexed yet</small>
+                </div>
+            `;
+            return;
+        }
+        folders.forEach(folder => {
+            const escapedPath = folder.path.replace(/\\/g, '\\\\').replace(/'/g, "\\'");
+            const folderCard = document.createElement('div');
+            folderCard.className = `folder-item-card ${!folder.is_valid ? 'invalid' : ''}`;
+            folderCard.innerHTML = `
+                <div class="d-flex justify-content-between align-items-start p-3">
+                    <div class="flex-grow-1 me-2">
+                        <div class="d-flex align-items-center mb-1">
+                            <i class="bi bi-folder-fill me-2 ${folder.is_valid ? 'text-primary' : 'text-danger'}"></i>
+                            <span class="fw-semibold ${!folder.is_valid ? 'text-danger' : 'text-dark'}" style="font-size: 0.9rem;">
+                                ${folder.path.split(/[\\/]/).pop()}
+                            </span>
+                        </div>
+                        <div class="text-muted small" style="word-break: break-all; line-height: 1.3;">
+                            ${folder.path}
+                        </div>
+                        ${!folder.is_valid ? '<small class="text-danger"><i class="bi bi-exclamation-triangle me-1"></i>Path not accessible</small>' : ''}
+                    </div>
+                    <button class="btn btn-outline-danger btn-sm" onclick="removeFolder('${escapedPath}')" title="Remove folder">
+                        <i class="bi bi-trash"></i>
+                    </button>
+                </div>
+            `;
+            folderList.appendChild(folderCard);
+        });
+        // Load images from all folders
+        await loadImages();
+    } catch (error) {
+        console.error('Error loading folders:', error);
+    }
+}
+// Remove folder
+async function removeFolder(path) {
+    if (confirm('Are you sure you want to remove this folder?')) {
+        try {
+            const encodedPath = encodeURIComponent(path).replace(/%5C/g, '\\');
+            const response = await fetch(`/folders/${encodedPath}`, {
+                method: 'DELETE'
+            });
+            if (response.ok) {
+                await loadIndexedFolders();
+            } else {
+                const error = await response.text();
+                alert(`Error removing folder: ${error}`);
+            }
+        } catch (error) {
+            console.error('Error removing folder:', error);
+            alert('Error removing folder. Please try again.');
+        }
+    }
+}
+// Load images
+async function loadImages(folder = null) {
+    try {
+        const url = folder ? `/images?folder=${encodeURIComponent(folder)}` : '/images';
+        const response = await fetch(url);
+        const images = await response.json();
+        const imageGrid = document.getElementById('imageGrid');
+        imageGrid.innerHTML = '';
+        if (images.length === 0) {
+            imageGrid.innerHTML = `
+                <div class="col-12">
+                    <div class="text-center p-5">
+                        <i class="bi bi-images fs-1 text-muted d-block mb-3"></i>
+                        <h5 class="text-muted mb-2">No images found</h5>
+                        <p class="text-muted">Add some folders to start indexing your images</p>
+                    </div>
+                </div>
+            `;
+            return;
+        }
+        images.forEach(image => {
+            const card = document.createElement('div');
+            card.className = 'image-card';
+            card.innerHTML = `
+                <div class="image-wrapper">
+                    <img class="lazy-load"
+                         src="/thumbnail/${image.id}"
+                         data-src="/image/${image.id}"
+                         alt="${image.filename || image.path}"
+                         loading="lazy">
+                </div>
+                <div class="image-info">
+                    <span class="filename" title="${image.filename || image.path}">${image.filename || image.path}</span>
+                    <span class="file-size">${formatFileSize(image.file_size)}</span>
+                </div>
+            `;
+            imageGrid.appendChild(card);
+        });
+        observeLazyLoadImages(); // Initialize IntersectionObserver for new images
+    } catch (error) {
+        console.error('Error loading images:', error);
+        const imageGrid = document.getElementById('imageGrid');
+        imageGrid.innerHTML = '<div class="col-12"><div class="error text-center p-4">Error loading images. Please try again.</div></div>';
+    }
+}
+// Utility function to format file sizes
+function formatFileSize(bytes) {
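+    // Treat missing, non-numeric, zero, or negative sizes as empty
+    if (!Number.isFinite(bytes) || bytes <= 0) return '0 Bytes';
+    const k = 1024;
+    const sizes = ['Bytes', 'KB', 'MB', 'GB', 'TB'];
+    // Clamp the unit index so out-of-range values never index past the array
+    const i = Math.max(0, Math.min(Math.floor(Math.log(bytes) / Math.log(k)), sizes.length - 1));
+    return parseFloat((bytes / Math.pow(k, i)).toFixed(2)) + ' ' + sizes[i];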
+}
+// Get current folder path
+function getCurrentPath() {
+    // Return the current path if we're in a folder, otherwise null
+    return currentPath;
+}
+// Search images
+async function searchImages(event) {
+    event.preventDefault();
+    const query = document.getElementById('searchInput').value;
+    if (!query) return;
+    try {
+        // Text search is sent without a folder filter
+        const searchUrl = `/search/text?query=${encodeURIComponent(query)}`;
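+        const response = await fetch(searchUrl);
+        // Route HTTP errors to the catch block instead of rendering an error payload
+        if (!response.ok) throw new Error(`Search failed with status ${response.status}`);
+        const results = await response.json();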
+        displaySearchResults(results);
+    } catch (error) {
+        console.error('Error searching images:', error);
+        const imageGrid = document.getElementById('imageGrid');
+        imageGrid.innerHTML = `
+            <div class="col-12">
+                <div class="error text-center p-5">
+                    <i class="bi bi-exclamation-triangle fs-1 text-danger d-block mb-3"></i>
+                    <h5 class="text-danger mb-2">Search Error</h5>
+                    <p class="text-muted">An error occurred while searching. Please try again.</p>
+                </div>
+            </div>
+        `;
+    }
+}
+// Search by image
+async function searchByImage(event) {
+    const file = event.target.files[0];
+    if (!file) return;
+    const formData = new FormData();
+    formData.append('file', file);
+    try {
+        const searchUrl = '/search/image';
+        const response = await fetch(searchUrl, {
+            method: 'POST',
+            body: formData
+        });
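+        // Route HTTP errors to the catch block instead of rendering an error payload
+        if (!response.ok) throw new Error(`Image search failed with status ${response.status}`);
+        const results = await response.json();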
+        displaySearchResults(results);
+        // Reset file input
+        event.target.value = '';
+    } catch (error) {
+        console.error('Error searching by image:', error);
+        const imageGrid = document.getElementById('imageGrid');
+        imageGrid.innerHTML = `
+            <div class="col-12">
+                <div class="error text-center p-5">
+                    <i class="bi bi-exclamation-triangle fs-1 text-danger d-block mb-3"></i>
+                    <h5 class="text-danger mb-2">Image Search Error</h5>
+                    <p class="text-muted">An error occurred while processing your image. Please try again.</p>
+                </div>
+            </div>
+        `;
+    }
+}
+// Search by URL
+async function searchByUrl(event) {
+    event.preventDefault();
+    const url = document.getElementById('urlInput').value;
+    if (!url) return;
+    try {
+        // Show loading state
+        const imageGrid = document.getElementById('imageGrid');
+        imageGrid.innerHTML = `
+            <div class="col-12">
+                <div class="loading text-center p-5">
+                    <div class="spinner-border text-primary mb-3" role="status">
+                        <span class="visually-hidden">Loading...</span>
+                    </div>
+                    <h5 class="text-primary mb-2">Downloading and analyzing image...</h5>
+                    <p class="text-muted">This may take a few moments</p>
+                </div>
+            </div>
+        `;
+        const searchUrl = `/search/url?url=${encodeURIComponent(url)}`;
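+        const response = await fetch(searchUrl);
+        // Route HTTP errors to the catch block instead of rendering an error payload
+        if (!response.ok) throw new Error(`URL search failed with status ${response.status}`);
+        const results = await response.json();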
+        displaySearchResults(results);
+        // Clear URL input and hide form
+        document.getElementById('urlInput').value = '';
+        toggleUrlSearch();
+    } catch (error) {
+        console.error('Error searching by URL:', error);
+        const imageGrid = document.getElementById('imageGrid');
+        imageGrid.innerHTML = `
+            <div class="col-12">
+                <div class="error text-center p-5">
+                    <i class="bi bi-exclamation-triangle fs-1 text-danger d-block mb-3"></i>
+                    <h5 class="text-danger mb-2">Error processing URL</h5>
+                    <p class="text-muted">Please check the URL and try again. Make sure it points to a valid image.</p>
+                </div>
+            </div>
+        `;
+    }
+}
+// Display search results (common function for all search types)
+function displaySearchResults(results) {
+    const imageGrid = document.getElementById('imageGrid');
+    imageGrid.innerHTML = '';
+    if (results.length === 0) {
+        imageGrid.innerHTML = `
+            <div class="col-12">
+                <div class="no-results text-center p-5">
+                    <i class="bi bi-search fs-1 text-muted d-block mb-3"></i>
+                    <h5 class="text-muted mb-2">No similar images found</h5>
+                    <p class="text-muted">Try adjusting your search terms or uploading a different image</p>
+                </div>
+            </div>
+        `;
+        return;
+    }
+    results.forEach(result => {
+        const card = document.createElement('div');
+        card.className = 'image-card';
+        card.innerHTML = `
+            <div class="image-wrapper">
+                <img class="lazy-load"
+                     src="/thumbnail/${result.id}"
+                     data-src="/image/${result.id}"
+                     alt="${result.filename || result.path}"
+                     loading="lazy">
+                <div class="similarity-score">${result.similarity}%</div>
+            </div>
+            <div class="image-info">
+                <span class="filename" title="${result.filename || result.path}">${result.filename || result.path}</span>
+                <span class="file-size">${formatFileSize(result.file_size)}</span>
+            </div>
+        `;
+        imageGrid.appendChild(card);
+    });
+    observeLazyLoadImages(); // Initialize IntersectionObserver for new images
+}
+// Toggle URL search form visibility
+function toggleUrlSearch() {
+    const urlForm = document.getElementById('urlSearchForm');
+    const isVisible = urlForm.style.display !== 'none';
+    if (isVisible) {
+        urlForm.style.display = 'none';
+        document.getElementById('urlInput').value = '';
+    } else {
+        urlForm.style.display = 'flex';
+        document.getElementById('urlInput').focus();
+    }
+}
+// Initialize
+document.addEventListener('DOMContentLoaded', () => {
+    connectWebSocket();
+    initFolderBrowser();
+});
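
Note on the templates in `static/js/script.js`: several functions interpolate server-supplied values such as `image.filename`, `folder.path`, and `item.name` directly into `innerHTML` strings. A name containing `<`, `"`, or `&` could break the markup or inject HTML. The sketch below shows one possible way to escape such values first. `escapeHtml` is a hypothetical helper, not part of this commit, and it only covers text and attribute contexts; values placed inside inline `onclick` handlers would still need separate handling (for example, attaching listeners with `addEventListener` instead).

```javascript
// Hypothetical helper (not part of the commit): escape a value before
// interpolating it into an innerHTML template or a quoted attribute.
function escapeHtml(value) {
    const replacements = {
        '&': '&amp;',
        '<': '&lt;',
        '>': '&gt;',
        '"': '&quot;',
        "'": '&#39;'
    };
    // String() lets numbers and other non-string values pass through safely
    return String(value).replace(/[&<>"']/g, ch => replacements[ch]);
}

// Example: a filename containing quotes and markup
const unsafeName = 'shoe "red" <b>.png';
const safeName = escapeHtml(unsafeName);
console.log(`<span class="filename" title="${safeName}">${safeName}</span>`);
```

In `loadImages` and `displaySearchResults`, the same call could wrap `image.filename || image.path` wherever it appears in the `alt`, `title`, and text positions.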

templates/index.html ADDED Viewed

	@@ -0,0 +1,530 @@

+<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Visual Product Search</title>
+    <link rel="icon" href="/static/image.png" type="image/png">
+    <link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet">
+    <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/[email protected]/font/bootstrap-icons.css">
+    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&display=swap" rel="stylesheet">
+    <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/js/bootstrap.bundle.min.js"></script>
+    <script src="/static/js/script.js"></script>
+    <style>
+        :root {
+            --primary-color: #6366f1;
+            --primary-dark: #4f46e5;
+            --primary-light: #8b5cf6;
+            --secondary-color: #f8fafc;
+            --accent-color: #06b6d4;
+            --text-primary: #1e293b;
+            --text-secondary: #64748b;
+            --border-color: #e2e8f0;
+            --success-color: #10b981;
+            --warning-color: #f59e0b;
+            --danger-color: #ef4444;
+            --gradient-bg: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+            --card-shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.1), 0 2px 4px -1px rgba(0, 0, 0, 0.06);
+            --card-shadow-hover: 0 10px 15px -3px rgba(0, 0, 0, 0.1), 0 4px 6px -2px rgba(0, 0, 0, 0.05);
+        }
+        * {
+            font-family: 'Inter', sans-serif;
+        }
+        body {
+            background: linear-gradient(135deg, #f8fafc 0%, #e2e8f0 100%);
+            min-height: 100vh;
+        }
+        .navbar {
+            background: var(--gradient-bg) !important;
+            backdrop-filter: blur(10px);
+            border-bottom: 1px solid rgba(255, 255, 255, 0.1);
+            padding: 1rem 0;
+        }
+        .navbar-brand {
+            font-weight: 700;
+            font-size: 1.5rem;
+            color: white !important;
+            display: flex;
+            align-items: center;
+            gap: 0.5rem;
+        }
+        .brand-icon {
+            width: 32px;
+            height: 32px;
+            background: rgba(255, 255, 255, 0.2);
+            border-radius: 8px;
+            display: flex;
+            align-items: center;
+            justify-content: center;
+        }
+        .search-container {
+            max-width: 600px;
+            margin: 0 auto;
+        }
+        .search-form {
+            background: rgba(255, 255, 255, 0.95);
+            backdrop-filter: blur(10px);
+            border-radius: 20px;
+            padding: 8px;
+            box-shadow: var(--card-shadow);
+            border: 1px solid rgba(255, 255, 255, 0.2);
+        }
+        .search-input {
+            border: none;
+            background: transparent;
+            padding: 12px 20px;
+            font-weight: 500;
+        }
+        .search-input:focus {
+            outline: none;
+            box-shadow: none;
+        }
+        .search-btn {
+            border-radius: 16px;
+            padding: 12px 24px;
+            font-weight: 600;
+            background: var(--primary-color);
+            border: none;
+            transition: all 0.3s ease;
+        }
+        .search-btn:hover {
+            background: var(--primary-dark);
+            transform: translateY(-1px);
+        }
+        .action-btn {
+            border-radius: 16px;
+            padding: 12px 16px;
+            border: 1px solid rgba(255, 255, 255, 0.5);
+            background: rgba(255, 255, 255, 0.9);
+            color: var(--primary-color);
+            font-weight: 600;
+            transition: all 0.3s ease;
+            min-width: 48px;
+            display: flex;
+            align-items: center;
+            justify-content: center;
+        }
+        .action-btn:hover {
+            background: rgba(255, 255, 255, 1);
+            color: var(--primary-dark);
+            transform: translateY(-1px);
+            border-color: rgba(255, 255, 255, 0.7);
+            box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
+        }
+        .action-btn i {
+            font-size: 1.1rem;
+        }
+        .url-search-form {
+            background: rgba(255, 255, 255, 0.95);
+            backdrop-filter: blur(10px);
+            border-radius: 20px;
+            padding: 8px;
+            box-shadow: var(--card-shadow);
+            border: 1px solid rgba(255, 255, 255, 0.2);
+        }
+        .main-container {
+            padding-top: 2rem;
+        }
+        .sidebar {
+            background: white;
+            border-radius: 20px;
+            padding: 1.5rem;
+            box-shadow: var(--card-shadow);
+            border: 1px solid var(--border-color);
+            height: fit-content;
+            position: sticky;
+            top: 2rem;
+        }
+        .sidebar-title {
+            font-weight: 600;
+            color: var(--text-primary);
+            margin-bottom: 1rem;
+            display: flex;
+            align-items: center;
+            gap: 0.5rem;
+        }
+        .folder-list {
+            border: none;
+        }
+        .folder-item-card {
+            border: 1px solid var(--border-color);
+            border-radius: 12px;
+            margin-bottom: 8px;
+            transition: all 0.3s ease;
+            background: white;
+        }
+        .folder-item-card:hover {
+            transform: translateY(-2px);
+            box-shadow: var(--card-shadow-hover);
+            border-color: var(--primary-color);
+        }
+        .folder-item-card.invalid {
+            border-color: var(--danger-color);
+            background: #fef2f2;
+        }
+        .add-folder-btn {
+            background: var(--primary-color);
+            color: white;
+            border: none;
+            border-radius: 12px;
+            padding: 12px 20px;
+            font-weight: 600;
+            width: 100%;
+            margin-bottom: 1rem;
+            transition: all 0.3s ease;
+        }
+        .add-folder-btn:hover {
+            background: var(--primary-dark);
+            transform: translateY(-1px);
+            color: white;
+        }
+        .content-area {
+            background: white;
+            border-radius: 20px;
+            padding: 2rem;
+            box-shadow: var(--card-shadow);
+            border: 1px solid var(--border-color);
+            min-height: 70vh;
+        }
+        .image-grid {
+            display: grid;
+            grid-template-columns: repeat(auto-fill, minmax(280px, 1fr));
+            gap: 1.5rem;
+            padding: 1rem 0;
+        }
+        .image-card {
+            position: relative;
+            background: white;
+            border-radius: 16px;
+            overflow: hidden;
+            box-shadow: var(--card-shadow);
+            border: 1px solid var(--border-color);
+            transition: all 0.3s ease;
+        }
+        .image-card:hover {
+            transform: translateY(-4px);
+            box-shadow: var(--card-shadow-hover);
+        }
+        .image-wrapper {
+            aspect-ratio: 1;
+            overflow: hidden;
+            position: relative;
+        }
+        .image-card img {
+            width: 100%;
+            height: 100%;
+            object-fit: cover;
+            transition: transform 0.3s ease;
+        }
+        .image-card:hover img {
+            transform: scale(1.05);
+        }
+        .similarity-score {
+            position: absolute;
+            top: 12px;
+            right: 12px;
+            background: var(--primary-color);
+            color: white;
+            padding: 6px 12px;
+            border-radius: 20px;
+            font-size: 0.875rem;
+            font-weight: 600;
+            backdrop-filter: blur(10px);
+        }
+        .image-info {
+            padding: 1rem;
+            background: white;
+        }
+        .filename {
+            display: block;
+            font-weight: 600;
+            color: var(--text-primary);
+            margin-bottom: 0.5rem;
+            overflow: hidden;
+            text-overflow: ellipsis;
+            white-space: nowrap;
+        }
+        .file-size {
+            font-size: 0.875rem;
+            color: var(--text-secondary);
+            font-weight: 500;
+        }
+        .status-card {
+            background: white;
+            border-radius: 16px;
+            padding: 1.5rem;
+            box-shadow: var(--card-shadow);
+            border: 1px solid var(--border-color);
+            margin-bottom: 1.5rem;
+        }
+        .progress {
+            height: 8px;
+            border-radius: 20px;
+            background: #f1f5f9;
+            overflow: hidden;
+        }
+        .progress-bar {
+            background: var(--gradient-bg);
+            border-radius: 20px;
+            transition: width 0.3s ease;
+        }
+        .no-results, .error, .loading {
+            text-align: center;
+            padding: 3rem;
+            color: var(--text-secondary);
+            font-weight: 500;
+        }
+        .error {
+            color: var(--danger-color);
+        }
+        .loading {
+            color: var(--primary-color);
+        }
+        .folder-browser {
+            max-height: 400px;
+            overflow-y: auto;
+            border-radius: 12px;
+            border: 1px solid var(--border-color);
+        }
+        .folder-item {
+            cursor: pointer;
+            padding: 12px 16px;
+            border-radius: 8px;
+            margin: 4px;
+            transition: all 0.2s ease;
+            display: flex;
+            align-items: center;
+            gap: 12px;
+        }
+        .folder-item:hover {
+            background: var(--secondary-color);
+            color: var(--primary-color);
+        }
+        .modal-content {
+            border-radius: 20px;
+            border: none;
+            box-shadow: 0 20px 25px -5px rgba(0, 0, 0, 0.1), 0 10px 10px -5px rgba(0, 0, 0, 0.04);
+        }
+        .modal-header {
+            border-bottom: 1px solid var(--border-color);
+            padding: 1.5rem;
+        }
+        .breadcrumb {
+            background: var(--secondary-color);
+            border-radius: 12px;
+            padding: 12px 16px;
+        }
+        .btn-primary {
+            background: var(--primary-color);
+            border: none;
+            border-radius: 12px;
+            padding: 10px 20px;
+            font-weight: 600;
+        }
+        .btn-primary:hover {
+            background: var(--primary-dark);
+        }
+        .btn-secondary {
+            background: var(--text-secondary);
+            border: none;
+            border-radius: 12px;
+            padding: 10px 20px;
+            font-weight: 600;
+        }
+        .btn-outline-danger {
+            border-color: var(--danger-color);
+            color: var(--danger-color);
+            border-radius: 8px;
+            font-weight: 600;
+        }
+        .btn-outline-danger:hover {
+            background: var(--danger-color);
+            border-color: var(--danger-color);
+        }
+        @media (max-width: 768px) {
+            .search-container {
+                padding: 0 1rem;
+            }
+            .main-container {
+                padding-top: 1rem;
+            }
+            .image-grid {
+                grid-template-columns: repeat(auto-fill, minmax(200px, 1fr));
+                gap: 1rem;
+            }
+            .sidebar {
+                margin-bottom: 1.5rem;
+            }
+        }
+    </style>
+</head>
+<body>
+    <nav class="navbar navbar-expand-lg" aria-label="Main navigation">
+        <div class="container-fluid px-4">
+            <a class="navbar-brand" href="#">
+                <div class="brand-icon">
+                    <i class="bi bi-search"></i>
+                </div>
+                Visual Product Search
+            </a>
+            <div class="search-container">
+                <form class="search-form d-flex" onsubmit="searchImages(event)">
+                    <input class="form-control search-input" type="search" id="searchInput" placeholder="Search products and images...">
+                    <button class="btn btn-primary search-btn" type="submit">
+                        <i class="bi bi-search me-1"></i>Search
+                    </button>
+                    <label class="btn action-btn ms-2" for="imageUpload" title="Search by image">
+                        <i class="bi bi-image"></i>
+                    </label>
+                    <input type="file" id="imageUpload" style="display: none" accept="image/*" onchange="searchByImage(event)">
+                    <button class="btn action-btn ms-2" type="button" onclick="toggleUrlSearch()" title="Search by URL">
+                        <i class="bi bi-link-45deg"></i>
+                    </button>
+                </form>
+                <!-- URL Search Form (initially hidden) -->
+                <form class="url-search-form mt-3" id="urlSearchForm" style="display: none;" onsubmit="searchByUrl(event)">
+                    <input class="form-control search-input" type="url" id="urlInput" placeholder="Enter image URL..." required>
+                    <button class="btn btn-primary search-btn" type="submit">
+                        <i class="bi bi-link me-1"></i>Search URL
+                    </button>
+                    <button class="btn action-btn ms-2" type="button" onclick="toggleUrlSearch()">
+                        <i class="bi bi-x"></i>
+                    </button>
+                </form>
+            </div>
+        </div>
+    </nav>
+    <!-- Indexing Progress -->
+    <div class="container main-container" id="indexingStatus" style="display: none;">
+        <div class="status-card">
+            <h6 class="mb-3">
+                <i class="bi bi-gear-fill me-2 text-primary"></i>
+                Indexing Progress
+            </h6>
+            <div class="progress mb-3">
+                <div class="progress-bar progress-bar-striped progress-bar-animated" role="progressbar" aria-valuenow="0" aria-valuemin="0" aria-valuemax="100" style="width: 0%"></div>
+            </div>
+            <p class="mb-0 text-muted" id="indexingDetails"></p>
+        </div>
+    </div>
+    <!-- Main Content -->
+    <div class="container main-container">
+        <div class="row g-4">
+            <div class="col-lg-3">
+                <div class="sidebar">
+                    <button class="add-folder-btn" onclick="openFolderBrowser()">
+                        <i class="bi bi-folder-plus me-2"></i>Add Folder
+                    </button>
+                    <div class="sidebar-title">
+                        <i class="bi bi-folder2-open"></i>
+                        Indexed Folders
+                    </div>
+                    <div id="folderList">
+                        <!-- Folders will be listed here -->
+                    </div>
+                </div>
+            </div>
+            <div class="col-lg-9">
+                <div class="content-area">
+                    <div class="image-grid" id="imageGrid">
+                        <!-- Images will be displayed here -->
+                    </div>
+                </div>
+            </div>
+        </div>
+    </div>
+    <!-- Folder Browser Modal -->
+    <div class="modal fade" id="folderBrowserModal" tabindex="-1">
+        <div class="modal-dialog modal-lg">
+            <div class="modal-content">
+                <div class="modal-header">
+                    <h5 class="modal-title">
+                        <i class="bi bi-folder2-open me-2 text-primary"></i>
+                        Choose Folder to Index
+                    </h5>
+                    <button type="button" class="btn-close" data-bs-dismiss="modal" aria-label="Close"></button>
+                </div>
+                <div class="modal-body">
+                    <nav aria-label="breadcrumb">
+                        <ol class="breadcrumb" id="folderBreadcrumb">
+                            <li class="breadcrumb-item active">Root</li>
+                        </ol>
+                    </nav>
+                    <div class="folder-browser" id="folderBrowser">
+                        <!-- Folder contents will be displayed here -->
+                    </div>
+                </div>
+                <div class="modal-footer">
+                    <button type="button" class="btn btn-secondary" data-bs-dismiss="modal">Cancel</button>
+                    <button type="button" class="btn btn-primary" onclick="addSelectedFolder()">
+                        <i class="bi bi-plus-circle me-1"></i>Add Folder
+                    </button>
+                </div>
+            </div>
+        </div>
+    </div>
+</body>
+</html>

tests/test_qdrant_singleton.py ADDED

	@@ -0,0 +1,120 @@

+import pytest
+import uuid
+from pathlib import Path
+import shutil
+from qdrant_singleton import QdrantClientSingleton, CURRENT_SCHEMA_VERSION
+from qdrant_client.http import models
+@pytest.fixture(autouse=True)
+def setup_teardown():
+    """Setup and teardown for each test"""
+    # Store original state
+    original_path = QdrantClientSingleton._storage_path
+    original_instance = QdrantClientSingleton._instance
+    # Create temporary storage
+    temp_path = Path("test_qdrant_data")
+    QdrantClientSingleton._storage_path = temp_path
+    QdrantClientSingleton._instance = None
+    yield
+    # Cleanup
+    if QdrantClientSingleton._instance:
+        QdrantClientSingleton._instance.close()
+    # Restore original state
+    QdrantClientSingleton._instance = original_instance
+    QdrantClientSingleton._storage_path = original_path
+    # Remove test directory if it exists
+    if temp_path.exists():
+        shutil.rmtree(temp_path)
+def test_singleton_pattern():
+    """Test that get_instance returns the same instance"""
+    instance1 = QdrantClientSingleton.get_instance()
+    instance2 = QdrantClientSingleton.get_instance()
+    assert instance1 is instance2
+def test_storage_path_creation():
+    """Test that storage path is created if it doesn't exist"""
+    assert not QdrantClientSingleton._storage_path.exists()
+    QdrantClientSingleton.get_instance()
+    assert QdrantClientSingleton._storage_path.exists()
+def test_collection_creation():
+    """Test collection creation"""
+    client = QdrantClientSingleton.get_instance()
+    collection_name = "test_collection"
+    # Create collection
+    QdrantClientSingleton.initialize_collection(collection_name)
+    # Check collection exists
+    collections = client.get_collections().collections
+    collection_names = [collection.name for collection in collections]
+    assert collection_name in collection_names
+def test_schema_version_check():
+    """Test that a point stored with the current schema version can be read back unchanged"""
+    client = QdrantClientSingleton.get_instance()
+    collection_name = "test_schema_collection"
+    # Create collection
+    QdrantClientSingleton.initialize_collection(collection_name)
+    # Add a point with current schema version
+    point_id = str(uuid.uuid4())
+    client.upsert(
+        collection_name=collection_name,
+        points=[
+            models.PointStruct(
+                id=point_id,
+                vector=[0.0] * 512,  # VECTOR_SIZE
+                payload={
+                    "path": "test.jpg",
+                    "absolute_path": "/test/test.jpg",
+                    "schema_version": CURRENT_SCHEMA_VERSION,
+                    "indexed_at": 123456789
+                }
+            )
+        ]
+    )
+    # Verify point was added; scroll() returns a (points, next_page_offset) tuple
+    points, _next_offset = client.scroll(
+        collection_name=collection_name,
+        limit=1
+    )
+    assert len(points) == 1
+    assert points[0].id == point_id
+    assert points[0].payload["schema_version"] == CURRENT_SCHEMA_VERSION
+def test_payload_indexes():
+    """Test the vector configuration of a newly created collection.
+
+    Note: despite the name, payload indexes are not inspected here;
+    only the vector size and distance metric are asserted.
+    """
+    client = QdrantClientSingleton.get_instance()
+    collection_name = "test_indexes"
+    # Create collection
+    QdrantClientSingleton.initialize_collection(collection_name)
+    # Get collection info
+    collection_info = client.get_collection(collection_name)
+    # Check that the collection has the expected vector size and distance metric
+    assert collection_info.config.params.vectors.size == 512
+    assert collection_info.config.params.vectors.distance == models.Distance.COSINE
+def test_empty_collection_schema_check():
+    """Test schema check behavior with empty collection"""
+    client = QdrantClientSingleton.get_instance()
+    collection_name = "test_empty_collection"
+    # Create collection
+    QdrantClientSingleton.initialize_collection(collection_name)
+    # Verify collection exists
+    collections = client.get_collections().collections
+    collection_names = [collection.name for collection in collections]
+    assert collection_name in collection_names
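
The `setup_teardown` fixture in this test file saves the singleton's class-level state, points it at a temporary storage path, and restores everything afterwards. The following is a minimal sketch of that save-and-restore pattern, not part of the commit. It assumes a hypothetical stand-in class `FakeSingleton` (not the repository's `QdrantClientSingleton`) and a hypothetical helper `isolated_singleton`, and it swaps the fixed `test_qdrant_data` directory for `tempfile.mkdtemp()` so that a directory left behind by an aborted run cannot affect the next one.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


class FakeSingleton:
    """Hypothetical stand-in for a class-level singleton; not the repo's class."""
    _storage_path = Path("real_data")
    _instance = None

    @classmethod
    def get_instance(cls):
        # Lazily create the storage directory and the single shared instance.
        if cls._instance is None:
            cls._storage_path.mkdir(parents=True, exist_ok=True)
            cls._instance = object()
        return cls._instance


@contextmanager
def isolated_singleton(cls):
    """Point cls at a fresh temporary path, then restore its original state."""
    original_path, original_instance = cls._storage_path, cls._instance
    temp_root = Path(tempfile.mkdtemp())
    cls._storage_path = temp_root / "store"  # intentionally not created yet
    cls._instance = None
    try:
        yield cls
    finally:
        # Restore even if the body raised, so later code sees the original state.
        cls._instance = original_instance
        cls._storage_path = original_path
        shutil.rmtree(temp_root, ignore_errors=True)


# Usage: the same checks as test_singleton_pattern and test_storage_path_creation.
with isolated_singleton(FakeSingleton) as cls:
    assert not cls._storage_path.exists()
    first = cls.get_instance()
    assert cls._storage_path.exists()
    assert first is cls.get_instance()
```

In a pytest suite the same body could sit in a `yield` fixture, as the file above does; the explicit `try`/`finally` is only needed here because a plain context manager does not otherwise run its cleanup when the body raises.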