nomic-embeddings

Patryk Ptasiński committed on Jul 14

Commit cc86f1b

1 Parent(s): 06197a9

Add FastAPI endpoint for direct HTTP access without queue

Files changed (3)

CLAUDE.md +15 -7
app.py +77 -24
requirements.txt +2 -0

CLAUDE.md CHANGED

@@ -29,10 +29,11 @@ huggingface-cli login
 ## Architecture
 The application consists of a single `app.py` file with:
-- **Model Initialization**: SentenceTransformer with `device='cpu'` (line 6)
-- **Embedding Function**: Simple wrapper that calls model.encode() (lines 9-10)
-- **Gradio Interface**: UI components and API endpoint configuration (lines 13-81)
-- **API Endpoint**: Named "predict" for programmatic access (line 27)
 ## Important Configuration Details
@@ -43,15 +44,22 @@ The application consists of a single `app.py` file with:
 ## API Usage
-Use Gradio client libraries for API access:
 ```python
 from gradio_client import Client
 client = Client("ipepe/nomic-embeddings")
 result = client.predict("text to embed", api_name="/predict")
 ```
-Direct HTTP requires implementing Gradio's queue protocol (join queue, SSE listening, session management).
 ## Deployment Notes
 - Deployed on Hugging Face Spaces at https://huggingface.co/spaces/ipepe/nomic-embeddings

 ## Architecture
 The application consists of a single `app.py` file with:
+- **Model Initialization**: SentenceTransformer with `device='cpu'` (line 10)
+- **FastAPI App**: Direct HTTP endpoint at `/embed` (lines 13, 21-46)
+- **Embedding Function**: Simple wrapper that calls model.encode() (lines 16-17)
+- **Gradio Interface**: UI components and API endpoint configuration (lines 49-122)
+- **Dual Server**: FastAPI mounted with Gradio using uvicorn (lines 126-129)
 ## Important Configuration Details
 ## API Usage
+Two options for API access:
+1. **Direct FastAPI endpoint** (no queue):
+```bash
+curl -X POST https://ipepe-nomic-embeddings.hf.space/embed \
+  -H "Content-Type: application/json" \
+  -d '{"text": "your text"}'
+```
+2. **Gradio client** (handles queue automatically):
 ```python
 from gradio_client import Client
 client = Client("ipepe/nomic-embeddings")
 result = client.predict("text to embed", api_name="/predict")
 ```
 ## Deployment Notes
 - Deployed on Hugging Face Spaces at https://huggingface.co/spaces/ipepe/nomic-embeddings
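
The direct endpoint documented in this file takes a JSON body with a `text` field, and the handler added to `app.py` in this commit returns `embedding`, `dim`, and `model`. A minimal standard-library sketch of a client for that contract could look like the following. The helper names `build_embed_request`, `parse_embed_response`, and `embed_remote` are hypothetical and not part of the Space, and the check that `dim` matches the vector length is an assumption about what a caller might want to verify.

```python
import json
import urllib.error
import urllib.request
from typing import List

# URL taken from the curl example in CLAUDE.md.
EMBED_URL = "https://ipepe-nomic-embeddings.hf.space/embed"


def build_embed_request(text: str, url: str = EMBED_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request that /embed expects."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embed_response(raw: bytes) -> List[float]:
    """Decode a response body and return the embedding vector."""
    payload = json.loads(raw)
    if "error" in payload:
        # The handler reports failures as {"error": "..."}.
        raise ValueError(payload["error"])
    embedding = payload["embedding"]
    # Assumed sanity check: the reported size should match the vector.
    if payload.get("dim") != len(embedding):
        raise ValueError("dim does not match embedding length")
    return embedding


def embed_remote(text: str, timeout: float = 30.0) -> List[float]:
    """Send the request; requires network access to the Space."""
    try:
        with urllib.request.urlopen(build_embed_request(text), timeout=timeout) as resp:
            raw = resp.read()
    except urllib.error.HTTPError as err:
        # urlopen raises on 400/500; the body still carries the error JSON.
        raw = err.read()
    return parse_embed_response(raw)
```

Only `embed_remote("your text")` touches the network; the other two helpers can be exercised offline against sample payloads.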

app.py CHANGED

@@ -1,15 +1,51 @@
-from typing import List
 import gradio as gr
 from sentence_transformers import SentenceTransformer
 model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True, device='cpu')
 def embed(document: str):
     return model.encode(document)
 with gr.Blocks(title="Nomic Text Embeddings") as app:
     gr.Markdown("# Nomic Text Embeddings v1.5")
     gr.Markdown("Generate embeddings for your text using the nomic-embed-text-v1.5 model.")
@@ -30,18 +66,37 @@ with gr.Blocks(title="Nomic Text Embeddings") as app:
     # Add API usage guide
     gr.Markdown("## API Usage")
     gr.Markdown("""
-    You can use this API programmatically. Hugging Face Spaces requires using their client libraries which handle queuing automatically.
-    ### Quick Command-Line Usage
     ```bash
-    # Install gradio client
-    pip install gradio_client
-    # Generate embedding with one command
-    python -c "from gradio_client import Client; print(Client('ipepe/nomic-embeddings').predict('Your text here', api_name='/predict'))"
     ```
-    ### Python Example (Recommended)
     ```python
     from gradio_client import Client
@@ -55,23 +110,21 @@ with gr.Blocks(title="Nomic Text Embeddings") as app:
     ### JavaScript/Node.js Example
     ```javascript
-    import { client } from "@gradio/client";
-    const app = await client("ipepe/nomic-embeddings");
-    const result = await app.predict("/predict", ["Your text to embed goes here"]);
-    console.log(result.data);
     ```
-    ### Direct HTTP (Advanced)
-    Direct HTTP requests require implementing the Gradio queue protocol:
-    1. POST to `/queue/join` to join queue
-    2. Listen to `/queue/data` via SSE for results
-    3. Handle session management
-    For direct HTTP, we recommend using the official Gradio clients above which handle this automatically.
-    The response will contain the embedding array as a list of floats.
     """)
 if __name__ == '__main__':
-    app.launch(server_name="0.0.0.0", show_error=True, server_port=7860)

+from typing import List, Dict, Any
+import json
 import gradio as gr
+from fastapi import FastAPI
+from fastapi.responses import JSONResponse
 from sentence_transformers import SentenceTransformer
+# Initialize model
 model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True, device='cpu')
+# Create FastAPI app
+fastapi_app = FastAPI()
 def embed(document: str):
     return model.encode(document)
+# FastAPI endpoints
+@fastapi_app.post("/embed")
+async def embed_text(data: Dict[str, Any]):
+    """Direct API endpoint for text embedding without queue"""
+    try:
+        text = data.get("text", "")
+        if not text:
+            return JSONResponse(
+                status_code=400,
+                content={"error": "No text provided"}
+            )
+        # Generate embedding
+        embedding = model.encode(text)
+        return JSONResponse(
+            content={
+                "embedding": embedding.tolist(),
+                "dim": len(embedding),
+                "model": "nomic-embed-text-v1.5"
+            }
+        )
+    except Exception as e:
+        return JSONResponse(
+            status_code=500,
+            content={"error": str(e)}
+        )
 with gr.Blocks(title="Nomic Text Embeddings") as app:
     gr.Markdown("# Nomic Text Embeddings v1.5")
     gr.Markdown("Generate embeddings for your text using the nomic-embed-text-v1.5 model.")
     # Add API usage guide
     gr.Markdown("## API Usage")
     gr.Markdown("""
+    You can use this API in two ways: via the direct FastAPI endpoint or through Gradio clients.
+    ### Direct API Endpoint (No Queue!)
     ```bash
+    curl -X POST https://ipepe-nomic-embeddings.hf.space/embed \
+      -H "Content-Type: application/json" \
+      -d '{"text": "Your text to embed goes here"}'
+    ```
+    Response format:
+    ```json
+    {
+      "embedding": [0.123, -0.456, ...],
+      "dim": 768,
+      "model": "nomic-embed-text-v1.5"
+    }
+    ```
+    ### Python Example (Direct API)
+    ```python
+    import requests
+    response = requests.post(
+        "https://ipepe-nomic-embeddings.hf.space/embed",
+        json={"text": "Your text to embed goes here"}
+    )
+    result = response.json()
+    embedding = result["embedding"]
     ```
+    ### Python Example (Gradio Client)
     ```python
     from gradio_client import Client
     ### JavaScript/Node.js Example
     ```javascript
+    // Direct API
+    const response = await fetch('https://ipepe-nomic-embeddings.hf.space/embed', {
+      method: 'POST',
+      headers: { 'Content-Type': 'application/json' },
+      body: JSON.stringify({ text: 'Your text to embed goes here' })
+    });
+    const result = await response.json();
+    console.log(result.embedding);
     ```
     """)
 if __name__ == '__main__':
+    # Mount FastAPI app to Gradio
+    app = gr.mount_gradio_app(fastapi_app, app, path="/")
+    # Run with Uvicorn (Gradio uses this internally)
+    import uvicorn
+    uvicorn.run(app, host="0.0.0.0", port=7860)
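
One detail of the new `/embed` handler may be worth a second look: it reads `text` with `data.get("text", "")` and rejects only falsy values, so a JSON list of strings would reach `model.encode`. For list input `encode` returns one row per item, which means `len(embedding)` would then report the number of inputs rather than the vector size. The sketch below shows one way the input check could be tightened. It is a standalone illustration, not the committed code: `validate_payload` is a hypothetical helper, and rejecting non-string or whitespace-only input with a 400 is an assumed policy.

```python
from typing import Any, Dict, Optional, Tuple


def validate_payload(
    data: Dict[str, Any],
) -> Tuple[Optional[str], Optional[Dict[str, str]]]:
    """Return (text, None) for usable input, or (None, error_body) otherwise."""
    text = data.get("text")
    # Assumed policy: only a single string is accepted, never a list or number.
    if not isinstance(text, str):
        return None, {"error": "Field 'text' must be a string"}
    # Whitespace-only input is treated like missing input.
    if not text.strip():
        return None, {"error": "No text provided"}
    return text, None
```

Inside the handler this would be called as `text, error = validate_payload(data)`, returning `JSONResponse(status_code=400, content=error)` whenever `error` is not `None`.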

requirements.txt CHANGED

@@ -1,4 +1,6 @@
 sentence_transformers==3.0.1
 einops==0.7.0
 torch>=2.0.0
 --extra-index-url https://download.pytorch.org/whl/cpu

 sentence_transformers==3.0.1
 einops==0.7.0
 torch>=2.0.0
+fastapi
+uvicorn
 --extra-index-url https://download.pytorch.org/whl/cpu