Spaces:
Running
Running
Update README.md
Browse files
README.md
CHANGED
@@ -1,10 +1,57 @@
|
|
1 |
-
|
2 |
-
|
3 |
-
|
4 |
-
|
5 |
-
|
6 |
-
|
7 |
-
|
8 |
-
|
9 |
-
|
10 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
# Self-Hosted Qwen2.5-0.5B API on Hugging Face
|
2 |
+
|
3 |
+
This Hugging Face Space provides a self-hosted, OpenAI-compatible API for the `Qwen/Qwen2.5-0.5B-Instruct` model. It is designed to run on free CPU hardware and allows you to connect various tools and applications (like IDE extensions) that support custom API endpoints.
|
4 |
+
|
5 |
+
## β¨ Key Features
|
6 |
+
|
7 |
+
* **Free Hosting**: Runs on the free "CPU basic" hardware provided by Hugging Face Spaces.
|
8 |
+
* **OpenAI-Compatible**: Exposes `/models` and `/chat/completions` endpoints that mimic the OpenAI API structure, making it compatible with a wide range of clients.
|
9 |
+
* **Streaming Support**: The API streams responses back, which is required by many modern clients for a real-time, "typing" effect.
|
10 |
+
* **Lightweight & Fast**: Uses the `Qwen2.5-0.5B-Instruct` model, which is extremely small and optimized for fast responses on a CPU.
|
11 |
+
|
12 |
+
## π How to Use
|
13 |
+
|
14 |
+
To connect your application or client to this API, you need to configure it with the following settings.
|
15 |
+
|
16 |
+
### 1. Get the API Base URL
|
17 |
+
|
18 |
+
The Base URL is the main URL of this Hugging Face Space.
|
19 |
+
|
20 |
+
**`https://enzgamers-smallagent.hf.space`**
|
21 |
+
|
22 |
+
**Important Note:** Do **not** add `/chat/completions` or anything else to the end of the Base URL. Your client application will add the correct path automatically.
|
23 |
+
|
24 |
+
### 2. Configure Your Client
|
25 |
+
|
26 |
+
In your application's settings (e.g., a VS Code extension like Cline, a web UI, etc.), find the API configuration section and enter the following details:
|
27 |
+
|
28 |
+
* **API Provider / Type**: Select **`OpenAI-Compatible`** or a similar option.
|
29 |
+
* **Base URL**:
|
30 |
+
```
|
31 |
+
https://enzgamers-smallagent.hf.space
|
32 |
+
```
|
33 |
+
* **API Key**: You can enter **any value**. This API does not require authentication, but your client's UI might require the field to be filled. Examples: `123456`, `hf_space`, `not_needed`.
|
34 |
+
* **Model ID**: (Optional, but recommended)
|
35 |
+
```
|
36 |
+
Qwen/Qwen2.5-0.5B-Instruct
|
37 |
+
```
|
38 |
+
|
39 |
+
### Example Configuration Summary
|
40 |
+
|
41 |
+
API Provider: OpenAI-Compatible
|
42 |
+
Base URL: https://enzgamers-smallagent.hf.space
|
43 |
+
API Key: any_value
|
44 |
+
Model ID: Qwen/Qwen2.5-0.5B-Instruct
|
45 |
+
|
46 |
+
|
47 |
+
After saving these settings, your application should be able to communicate with this model.
|
48 |
+
|
49 |
+
## π οΈ Technical Details
|
50 |
+
|
51 |
+
* **Model**: `Qwen/Qwen2.5-0.5B-Instruct`
|
52 |
+
* **Framework**: The API is built with [FastAPI](https://fastapi.tiangolo.com/).
|
53 |
+
* **Server**: The application is served by [Uvicorn](https://www.uvicorn.org/) running inside a Docker container.
|
54 |
+
|
55 |
+
## π Disclaimer
|
56 |
+
|
57 |
+
This Space is provided for educational and personal use. It runs on shared, free hardware and is not intended for production-level traffic or performance-critical applications.
|