# Fine-Tuned LLM API

A FastAPI-based API service for the fine-tuned model `ManojINaik/Strength_weakness`. The model is loaded with 4-bit quantization for memory-efficient text generation.

## API Endpoints

### GET /

Health check endpoint that confirms the API is running.
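
A quick check that the service is reachable (the Space URL below is a placeholder):

```python
import requests

# Placeholder URL; replace with your deployed Space address.
BASE_URL = "https://your-space-name.hf.space"

resp = requests.get(f"{BASE_URL}/")
print(resp.status_code, resp.text)  # exact payload depends on the app's root handler
```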

### POST /generate/

Generate text based on a prompt with optional parameters.

#### Request Body

```jsonc
{
    "prompt": "What are the strengths of Python?",
    "history": [],  // Optional: list of previous conversation messages
    "system_prompt": "You are a very powerful AI assistant.",  // Optional
    "max_length": 200,  // Optional: maximum length of the generated text
    "temperature": 0.7  // Optional: controls randomness (0.0 to 1.0)
}
```

#### Response

```json
{
    "response": "Generated text response..."
}
```
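
The documented request and response bodies roughly correspond to Pydantic models such as the following. The field names and defaults come from the JSON above; the class names and the `dict` shape of history entries are illustrative assumptions, not the Space's actual code.

```python
from typing import List, Optional
from pydantic import BaseModel, Field

class GenerateRequest(BaseModel):
    prompt: str
    history: List[dict] = Field(default_factory=list)  # previous conversation messages
    system_prompt: Optional[str] = "You are a very powerful AI assistant."
    max_length: int = 200      # maximum length of the generated text
    temperature: float = 0.7   # controls randomness (0.0 to 1.0)

class GenerateResponse(BaseModel):
    response: str
```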

## Model Details

- **Base Model:** `ManojINaik/Strength_weakness`
- **Quantization:** 4-bit quantization using bitsandbytes (see the loading sketch below)
- **Device:** Automatically uses GPU if available, falls back to CPU
- **Memory Efficient:** Uses device mapping for optimal resource utilization
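
A minimal loading sketch consistent with the details above, assuming a standard transformers + bitsandbytes setup. The `nf4` quant type and float16 compute dtype are illustrative choices, not necessarily what this Space uses:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "ManojINaik/Strength_weakness"

# 4-bit quantization via bitsandbytes to keep memory usage low.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on GPU when available, otherwise CPU
)
```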

## Technical Details

- **Framework:** FastAPI
- **Python Version:** 3.9+
- **Key Dependencies:**
  - transformers
  - torch
  - bitsandbytes
  - accelerate
  - peft

## Example Usage

```python
import requests

# Replace with your own Space URL; note the trailing slash to match the /generate/ route.
url = "https://your-space-name.hf.space/generate/"
payload = {
    "prompt": "What are the strengths of Python?",
    "temperature": 0.7,
    "max_length": 200
}

response = requests.post(url, json=payload)
print(response.json()["response"])
```
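
Building on the snippet above, a follow-up turn can pass the earlier exchange back via `history` and override the default `system_prompt`. The role/content message shape used here is an assumption; adjust it to whatever format the server expects:

```python
follow_up = {
    "prompt": "And what are its main weaknesses?",
    "system_prompt": "You are a concise technical reviewer.",
    # Assumed message format for history entries.
    "history": [
        {"role": "user", "content": "What are the strengths of Python?"},
        {"role": "assistant", "content": response.json()["response"]},
    ],
    "max_length": 200,
    "temperature": 0.7,
}

print(requests.post(url, json=follow_up).json()["response"])
```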