GGUF Timeout Fix - Progress Update
Completed Steps:
- Increased GGUF timeout: Changed from 120s to 300s for Hugging Face Spaces
- Configurable timeout: Added GGUF_GENERATION_TIMEOUT environment variable support
- Better error handling: Enhanced timeout and fallback mechanisms in routes.py
- Fallback pipeline: Added robust fallback when GGUF model fails to load or times out
Changes Made:
model_loader_gguf.py:
- Updated _generate_with_timeout() to use a 300s default on Spaces and 120s locally
- Made the timeout configurable via the GGUF_GENERATION_TIMEOUT environment variable
- Updated generate() to use the configurable timeout
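The timeout selection described above could look roughly like the sketch below. It is not the actual code in model_loader_gguf.py: the helper name resolve_generation_timeout() is hypothetical, and detecting Spaces through the SPACE_ID environment variable is an assumption about how the environment is identified. Only GGUF_GENERATION_TIMEOUT and the 300s/120s defaults come from this update.

```python
import os

# Defaults taken from this update: 300s on Spaces, 120s locally.
SPACES_DEFAULT_TIMEOUT = 300.0
LOCAL_DEFAULT_TIMEOUT = 120.0


def resolve_generation_timeout(env=None):
    """Return the generation timeout in seconds (hypothetical helper).

    Assumption: a non-empty SPACE_ID variable means the code is running
    on Hugging Face Spaces.
    """
    env = os.environ if env is None else env
    default = SPACES_DEFAULT_TIMEOUT if env.get("SPACE_ID") else LOCAL_DEFAULT_TIMEOUT

    raw = env.get("GGUF_GENERATION_TIMEOUT")
    if raw is None or not raw.strip():
        return default
    try:
        value = float(raw)
    except ValueError:
        # Unparseable override: keep the environment's default.
        return default
    # Reject zero, negative, and NaN values.
    return value if value > 0 else default
```

Passing the environment as a mapping keeps the helper easy to check without touching os.environ, for example resolve_generation_timeout({"GGUF_GENERATION_TIMEOUT": "45"}).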
routes.py:
- Added fallback pipeline usage when GGUF times out
- Added better logging for timeout errors
- Added fallback for GGUF model loading failures
- Improved error messages and response handling
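As a rough illustration of the routes.py behavior listed above, the sketch below runs generation under a timeout and falls back to a template-based summary when it times out or raises. All names here (generate_with_fallback, template_summary, the logger name) are hypothetical, and the real route handlers may be structured differently.

```python
import concurrent.futures
import logging

logger = logging.getLogger("gguf_fallback_sketch")


def template_summary(prompt):
    """Hypothetical stand-in for the template-based fallback summary."""
    return f"Summary unavailable from the model. Input preview: {prompt[:80]}"


def generate_with_fallback(generate_fn, prompt, timeout_s, fallback_fn):
    """Return (text, source), where source is "gguf" or "fallback"."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(generate_fn, prompt)
    try:
        return future.result(timeout=timeout_s), "gguf"
    except concurrent.futures.TimeoutError:
        logger.warning("GGUF generation exceeded %.1fs; using fallback", timeout_s)
        return fallback_fn(prompt), "fallback"
    except Exception:
        # Covers load or generation failures raised by generate_fn.
        logger.exception("GGUF generation failed; using fallback")
        return fallback_fn(prompt), "fallback"
    finally:
        # Do not block the request on a worker that is still running.
        executor.shutdown(wait=False)
```

One limitation of this approach: Python cannot cancel a thread that is already running, so a timed-out generation keeps consuming CPU in the background until it finishes. A subprocess-based worker would be needed if the generation must actually be stopped.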
Next Steps:
- Test the changes with the GGUF model
- Verify timeout is sufficient for Phi-3 model
- Test fallback mechanisms
- Add progress logging for generation
Configuration:
- Default timeout: 300s (Spaces) / 120s (local)
- Environment variable: GGUF_GENERATION_TIMEOUT
- Fallback: Template-based summary when GGUF fails