nomic-embeddings

Patryk Ptasiński committed on Jul 14

Commit

b366822

1 Parent(s): cc86f1b

Add 15+ embedding models with dropdown selector and comprehensive API support

Files changed (2)

CLAUDE.md +13 -8
app.py +114 -24

CLAUDE.md CHANGED

@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 ## Project Overview
-This is a Hugging Face Spaces application that provides text embeddings using the Nomic AI model (nomic-embed-text-v1.5). It runs on CPU and provides both a web interface and API endpoints for generating text embeddings.
 ## Key Commands
@@ -29,11 +29,12 @@ huggingface-cli login
 ## Architecture
 The application consists of a single `app.py` file with:
-- **Model Initialization**: SentenceTransformer with `device='cpu'` (line 10)
-- **FastAPI App**: Direct HTTP endpoint at `/embed` (lines 13, 21-46)
-- **Embedding Function**: Simple wrapper that calls model.encode() (lines 16-17)
-- **Gradio Interface**: UI components and API endpoint configuration (lines 49-122)
-- **Dual Server**: FastAPI mounted with Gradio using uvicorn (lines 126-129)
 ## Important Configuration Details
@@ -48,16 +49,20 @@ Two options for API access:
 1. **Direct FastAPI endpoint** (no queue):
 ```bash
 curl -X POST https://ipepe-nomic-embeddings.hf.space/embed \
   -H "Content-Type: application/json" \
-  -d '{"text": "your text"}'
 ```
 2. **Gradio client** (handles queue automatically):
 ```python
 from gradio_client import Client
 client = Client("ipepe/nomic-embeddings")
-result = client.predict("text to embed", api_name="/predict")
 ```
 ## Deployment Notes

 ## Project Overview
+This is a Hugging Face Spaces application that provides text embeddings using 15+ state-of-the-art embedding models including Nomic, BGE, Snowflake Arctic, IBM Granite, and sentence-transformers models. It runs on CPU and provides both a web interface and API endpoints for generating text embeddings with model selection.
 ## Key Commands
 ## Architecture
 The application consists of a single `app.py` file with:
+- **Model Configuration**: Dictionary of 15+ embedding models with trust_remote_code settings (lines 10-26)
+- **Model Caching**: Dynamic model loading with caching to avoid reloading (lines 32-42)
+- **FastAPI App**: Direct HTTP endpoints at `/embed` and `/models` (lines 44, 57-102)
+- **Embedding Function**: Multi-model wrapper that calls model.encode() (lines 49-53)
+- **Gradio Interface**: UI with model dropdown selector and API endpoint (lines 106-135)
+- **Dual Server**: FastAPI mounted with Gradio using uvicorn (lines 214-219)
 ## Important Configuration Details
 1. **Direct FastAPI endpoint** (no queue):
 ```bash
+# List models
+curl https://ipepe-nomic-embeddings.hf.space/models
+# Generate embedding with specific model
 curl -X POST https://ipepe-nomic-embeddings.hf.space/embed \
   -H "Content-Type: application/json" \
+  -d '{"text": "your text", "model": "mixedbread-ai/mxbai-embed-large-v1"}'
 ```
 2. **Gradio client** (handles queue automatically):
 ```python
 from gradio_client import Client
 client = Client("ipepe/nomic-embeddings")
+result = client.predict("text to embed", "model-name", api_name="/predict")
 ```
 ## Deployment Notes

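The "Model Caching" entry in the Architecture list of CLAUDE.md describes loading a model on first use and reusing it afterwards. A minimal sketch of that pattern in isolation might look like the following. `ModelCache` and `fake_loader` are hypothetical names used only for illustration, and the lock is an assumption added here for concurrent requests; it is not part of this commit. In `app.py` the loader role is played by the `SentenceTransformer(...)` call inside `load_model`.

```python
import threading
from typing import Any, Callable, Dict


class ModelCache:
    """Lazily load models by name and keep each one for reuse (hypothetical helper)."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader
        self._models: Dict[str, Any] = {}
        self._lock = threading.Lock()  # assumption: not present in the commit

    def get(self, name: str) -> Any:
        # Hold the lock so concurrent requests for a new model load it only once.
        with self._lock:
            if name not in self._models:
                self._models[name] = self._loader(name)
            return self._models[name]


# Stand-in loader so the sketch runs without downloading any weights.
load_calls = []


def fake_loader(name: str) -> str:
    load_calls.append(name)
    return f"model:{name}"


cache = ModelCache(fake_loader)
first = cache.get("BAAI/bge-small-en-v1.5")
second = cache.get("BAAI/bge-small-en-v1.5")  # served from the cache, no second load
```

Note that a cache like this never evicts entries, so every model requested at least once stays in memory for the life of the process.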
app.py CHANGED

@@ -6,14 +6,52 @@ from fastapi import FastAPI
 from fastapi.responses import JSONResponse
 from sentence_transformers import SentenceTransformer
-# Initialize model
-model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True, device='cpu')
 # Create FastAPI app
 fastapi_app = FastAPI()
-def embed(document: str):
     return model.encode(document)
@@ -23,20 +61,28 @@ async def embed_text(data: Dict[str, Any]):
     """Direct API endpoint for text embedding without queue"""
     try:
         text = data.get("text", "")
         if not text:
             return JSONResponse(
                 status_code=400,
                 content={"error": "No text provided"}
             )
         # Generate embedding
-        embedding = model.encode(text)
         return JSONResponse(
             content={
                 "embedding": embedding.tolist(),
                 "dim": len(embedding),
-                "model": "nomic-embed-text-v1.5"
             }
         )
     except Exception as e:
@@ -46,9 +92,28 @@ async def embed_text(data: Dict[str, Any]):
         )
-with gr.Blocks(title="Nomic Text Embeddings") as app:
-    gr.Markdown("# Nomic Text Embeddings v1.5")
-    gr.Markdown("Generate embeddings for your text using the nomic-embed-text-v1.5 model.")
     # Create an input text box
     text_input = gr.Textbox(label="Enter text to embed", placeholder="Type or paste your text here...")
@@ -60,27 +125,38 @@ with gr.Blocks(title="Nomic Text Embeddings") as app:
     submit_btn = gr.Button("Generate Embedding", variant="primary")
     # Handle both button click and text submission
-    submit_btn.click(embed, inputs=text_input, outputs=output, api_name="predict")
-    text_input.submit(embed, inputs=text_input, outputs=output)
     # Add API usage guide
     gr.Markdown("## API Usage")
     gr.Markdown("""
     You can use this API in two ways: via the direct FastAPI endpoint or through Gradio clients.
     ### Direct API Endpoint (No Queue!)
     ```bash
     curl -X POST https://ipepe-nomic-embeddings.hf.space/embed \
       -H "Content-Type: application/json" \
       -d '{"text": "Your text to embed goes here"}'
     ```
     Response format:
     ```json
     {
       "embedding": [0.123, -0.456, ...],
-      "dim": 768,
-      "model": "nomic-embed-text-v1.5"
     }
     ```
@@ -88,9 +164,17 @@ with gr.Blocks(title="Nomic Text Embeddings") as app:
     ```python
     import requests
     response = requests.post(
         "https://ipepe-nomic-embeddings.hf.space/embed",
-        json={"text": "Your text to embed goes here"}
     )
     result = response.json()
     embedding = result["embedding"]
@@ -103,22 +187,28 @@ with gr.Blocks(title="Nomic Text Embeddings") as app:
     client = Client("ipepe/nomic-embeddings")
     result = client.predict(
         "Your text to embed goes here",
         api_name="/predict"
     )
     print(result)  # Returns the embedding array
     ```
-    ### JavaScript/Node.js Example
-    ```javascript
-    // Direct API
-    const response = await fetch('https://ipepe-nomic-embeddings.hf.space/embed', {
-      method: 'POST',
-      headers: { 'Content-Type': 'application/json' },
-      body: JSON.stringify({ text: 'Your text to embed goes here' })
-    });
-    const result = await response.json();
-    console.log(result.embedding);
-    ```
     """)
 if __name__ == '__main__':

 from fastapi.responses import JSONResponse
 from sentence_transformers import SentenceTransformer
+# Available models
+MODELS = {
+    "nomic-ai/nomic-embed-text-v1.5": {"trust_remote_code": True},
+    "nomic-ai/nomic-embed-text-v1": {"trust_remote_code": True},
+    "mixedbread-ai/mxbai-embed-large-v1": {"trust_remote_code": False},
+    "BAAI/bge-m3": {"trust_remote_code": False},
+    "sentence-transformers/all-MiniLM-L6-v2": {"trust_remote_code": False},
+    "sentence-transformers/all-mpnet-base-v2": {"trust_remote_code": False},
+    "Snowflake/snowflake-arctic-embed-m": {"trust_remote_code": False},
+    "Snowflake/snowflake-arctic-embed-l": {"trust_remote_code": False},
+    "Snowflake/snowflake-arctic-embed-m-v2.0": {"trust_remote_code": False},
+    "BAAI/bge-large-en-v1.5": {"trust_remote_code": False},
+    "BAAI/bge-base-en-v1.5": {"trust_remote_code": False},
+    "BAAI/bge-small-en-v1.5": {"trust_remote_code": False},
+    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2": {"trust_remote_code": False},
+    "ibm-granite/granite-embedding-30m-english": {"trust_remote_code": False},
+    "ibm-granite/granite-embedding-278m-multilingual": {"trust_remote_code": False},
+}
+# Model cache
+loaded_models = {}
+current_model_name = "nomic-ai/nomic-embed-text-v1.5"
+# Initialize default model
+def load_model(model_name: str):
+    global loaded_models
+    if model_name not in loaded_models:
+        config = MODELS.get(model_name, {})
+        loaded_models[model_name] = SentenceTransformer(
+            model_name,
+            trust_remote_code=config.get("trust_remote_code", False),
+            device='cpu'
+        )
+    return loaded_models[model_name]
+# Load default model
+model = load_model(current_model_name)
 # Create FastAPI app
 fastapi_app = FastAPI()
+def embed(document: str, model_name: str = None):
+    if model_name and model_name in MODELS:
+        selected_model = load_model(model_name)
+        return selected_model.encode(document)
     return model.encode(document)
     """Direct API endpoint for text embedding without queue"""
     try:
         text = data.get("text", "")
+        model_name = data.get("model", current_model_name)
         if not text:
             return JSONResponse(
                 status_code=400,
                 content={"error": "No text provided"}
             )
+        if model_name not in MODELS:
+            return JSONResponse(
+                status_code=400,
+                content={"error": f"Model '{model_name}' not supported. Available models: {list(MODELS.keys())}"}
+            )
         # Generate embedding
+        embedding = embed(text, model_name)
         return JSONResponse(
             content={
                 "embedding": embedding.tolist(),
                 "dim": len(embedding),
+                "model": model_name
             }
         )
     except Exception as e:
         )
+@fastapi_app.get("/models")
+async def list_models():
+    """List available embedding models"""
+    return JSONResponse(
+        content={
+            "models": list(MODELS.keys()),
+            "default": current_model_name
+        }
+    )
+with gr.Blocks(title="Multi-Model Text Embeddings") as app:
+    gr.Markdown("# Multi-Model Text Embeddings")
+    gr.Markdown("Generate embeddings for your text using 15+ state-of-the-art embedding models from Nomic, BGE, Snowflake, IBM Granite, and more.")
+    # Model selector dropdown
+    model_dropdown = gr.Dropdown(
+        choices=list(MODELS.keys()),
+        value=current_model_name,
+        label="Select Embedding Model",
+        info="Choose the embedding model to use"
+    )
     # Create an input text box
     text_input = gr.Textbox(label="Enter text to embed", placeholder="Type or paste your text here...")
     submit_btn = gr.Button("Generate Embedding", variant="primary")
     # Handle both button click and text submission
+    submit_btn.click(embed, inputs=[text_input, model_dropdown], outputs=output, api_name="predict")
+    text_input.submit(embed, inputs=[text_input, model_dropdown], outputs=output)
     # Add API usage guide
     gr.Markdown("## API Usage")
     gr.Markdown("""
     You can use this API in two ways: via the direct FastAPI endpoint or through Gradio clients.
+    ### List Available Models
+    ```bash
+    curl https://ipepe-nomic-embeddings.hf.space/models
+    ```
     ### Direct API Endpoint (No Queue!)
     ```bash
+    # Default model (nomic-ai/nomic-embed-text-v1.5)
     curl -X POST https://ipepe-nomic-embeddings.hf.space/embed \
       -H "Content-Type: application/json" \
       -d '{"text": "Your text to embed goes here"}'
+    # With specific model
+    curl -X POST https://ipepe-nomic-embeddings.hf.space/embed \
+      -H "Content-Type: application/json" \
+      -d '{"text": "Your text to embed goes here", "model": "sentence-transformers/all-MiniLM-L6-v2"}'
     ```
     Response format:
     ```json
     {
       "embedding": [0.123, -0.456, ...],
+      "dim": 384,
+      "model": "sentence-transformers/all-MiniLM-L6-v2"
     }
     ```
     ```python
     import requests
+    # List available models
+    models = requests.get("https://ipepe-nomic-embeddings.hf.space/models").json()
+    print(models["models"])
+    # Generate embedding with specific model
     response = requests.post(
         "https://ipepe-nomic-embeddings.hf.space/embed",
+        json={
+            "text": "Your text to embed goes here",
+            "model": "BAAI/bge-small-en-v1.5"
+        }
     )
     result = response.json()
     embedding = result["embedding"]
     client = Client("ipepe/nomic-embeddings")
     result = client.predict(
         "Your text to embed goes here",
+        "nomic-ai/nomic-embed-text-v1.5",  # model selection
         api_name="/predict"
     )
     print(result)  # Returns the embedding array
     ```
+    ### Available Models
+    - `nomic-ai/nomic-embed-text-v1.5` (default) - High-performing open embedding model with large token context
+    - `nomic-ai/nomic-embed-text-v1` - Previous version of Nomic embedding model
+    - `mixedbread-ai/mxbai-embed-large-v1` - State-of-the-art large embedding model from mixedbread.ai
+    - `BAAI/bge-m3` - Multi-functional, multi-lingual, multi-granularity embedding model
+    - `sentence-transformers/all-MiniLM-L6-v2` - Fast, small embedding model for general use
+    - `sentence-transformers/all-mpnet-base-v2` - Balanced performance embedding model
+    - `Snowflake/snowflake-arctic-embed-m` - Medium-sized Arctic embedding model
+    - `Snowflake/snowflake-arctic-embed-l` - Large Arctic embedding model
+    - `Snowflake/snowflake-arctic-embed-m-v2.0` - Latest Arctic embedding with multilingual support
+    - `BAAI/bge-large-en-v1.5` - Large BGE embedding model for English
+    - `BAAI/bge-base-en-v1.5` - Base BGE embedding model for English
+    - `BAAI/bge-small-en-v1.5` - Small BGE embedding model for English
+    - `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` - Multilingual paraphrase model
+    - `ibm-granite/granite-embedding-30m-english` - IBM Granite 30M English embedding model
+    - `ibm-granite/granite-embedding-278m-multilingual` - IBM Granite 278M multilingual embedding model
     """)
 if __name__ == '__main__':