metadata
title: PaperIndex
emoji: 📚
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: latest
app_file: app.py
pinned: false

Paper Index - AI Paper Evaluation System

A system that fetches each day's AI research papers from Hugging Face and evaluates them with Claude Sonnet across 12 dimensions.

Features

  • Daily Paper Crawling: Automatically fetches papers from Hugging Face daily
  • AI Evaluation: Uses Claude Sonnet to evaluate papers across multiple dimensions
  • Interactive Dashboard: Beautiful web interface for browsing and evaluating papers
  • Database Storage: Persistent storage of papers and evaluations
  • Smart Navigation: Intelligent date navigation with fallback mechanisms

Hugging Face Spaces Deployment

This application is configured for deployment on Hugging Face Spaces.

Configuration

  • Port: 7860 (Hugging Face Spaces standard)
  • Health Check: /api/health endpoint
  • Docker: Optimized Dockerfile for containerized deployment

Deployment Steps

  1. Fork/Clone this repository to your Hugging Face account
  2. Create a new Space on Hugging Face
  3. Select Docker as the SDK
  4. Set Environment Variables:
    • ANTHROPIC_API_KEY: Your Anthropic API key for Claude access
  5. Deploy: The Space will automatically build and deploy

Environment Variables

ANTHROPIC_API_KEY=your_api_key_here
PORT=7860  # Optional, defaults to 7860
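
A minimal sketch of how these variables might be read at startup. The `Settings` class and `load_settings` helper are hypothetical names, not taken from the repository; only `ANTHROPIC_API_KEY`, `PORT`, and the 7860 default come from this README.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two variables documented above."""
    anthropic_api_key: str
    port: int


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not api_key:
        # Fail early: paper evaluation cannot run without the Claude API key.
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    # PORT is optional and falls back to 7860, as documented above.
    port = int(env.get("PORT", "7860"))
    return Settings(anthropic_api_key=api_key, port=port)
```

Calling `load_settings()` with no arguments reads the real process environment; passing a plain dict makes the helper easy to exercise in tests.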

Local Development

Prerequisites

  • Python 3.9+
  • Anthropic API key

Installation

  1. Clone the repository:

    git clone <repository-url>
    cd paperindex
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Set environment variables:

    export ANTHROPIC_API_KEY=your_api_key_here
    
  4. Run the application:

    python app.py
    
  5. Access the application in a browser on port 7860, or on the port set in PORT.

API Endpoints

Core Endpoints

  • GET /api/daily - Get daily papers with smart navigation
  • GET /api/paper/{paper_id} - Get paper details
  • GET /api/eval/{paper_id} - Get paper evaluation
  • GET /api/health - Health check endpoint
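
The core endpoints above could be called from a small client such as the sketch below. It assumes a local instance on the default port and assumes the endpoints return JSON; `BASE_URL`, `endpoint_url`, `paper_url`, and `get_json` are illustrative names, not part of the project.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the app is running locally on the default port.
BASE_URL = "http://localhost:7860"


def endpoint_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an API path with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def paper_url(paper_id: str) -> str:
    """URL for GET /api/paper/{paper_id}; the ID is percent-encoded."""
    return endpoint_url("/api/paper/" + quote(paper_id, safe=""))


def get_json(url: str, timeout: float = 10.0):
    """Send a GET request and decode the body, assuming it is JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `get_json(endpoint_url("/api/health"))` would fetch the health check and `get_json(paper_url(some_id))` the details of one paper.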

Evaluation Endpoints

  • POST /api/papers/evaluate/{arxiv_id} - Start paper evaluation
  • GET /api/papers/evaluate/{arxiv_id}/status - Get evaluation status
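
Because evaluation is started with one request and checked with another, a client will typically poll. The sketch below shows one way to do that under stated assumptions: the status response is taken to be a JSON object with a `status` field, and `"completed"` and `"failed"` are hypothetical terminal values (this README only documents `not_started` as the schema default). All function names are illustrative.

```python
import json
import time
from urllib.request import Request, urlopen

# Assumption: the app is running locally on the default port.
BASE_URL = "http://localhost:7860"


def http_json(method: str, url: str, timeout: float = 10.0):
    """Send a request with an empty body for POST and decode a JSON reply."""
    data = b"" if method == "POST" else None
    request = Request(url, data=data, method=method)
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def evaluate_and_wait(arxiv_id, call=http_json, interval=2.0, max_polls=30,
                      done=("completed", "failed")):
    """Start an evaluation, then poll its status until it reaches a final state."""
    base = f"{BASE_URL}/api/papers/evaluate/{arxiv_id}"
    call("POST", base)  # POST /api/papers/evaluate/{arxiv_id}
    for _ in range(max_polls):
        status = call("GET", base + "/status")
        # Hypothetical response shape: {"status": "..."}.
        if status.get("status") in done:
            return status
        time.sleep(interval)
    raise TimeoutError(f"evaluation of {arxiv_id} did not finish in time")
```

The `call` parameter exists so the HTTP layer can be replaced with a stub when testing the polling logic without a running server.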

Cache Management

  • GET /api/cache/status - Get cache statistics
  • POST /api/cache/clear - Clear all cached data
  • POST /api/cache/refresh/{date} - Refresh cache for specific date

Architecture

Frontend

  • HTML/CSS/JavaScript: Modern, responsive interface
  • Real-time Updates: Dynamic content loading
  • Theme Support: Light/dark mode toggle

Backend

  • FastAPI: High-performance web framework
  • SQLite: Lightweight database for paper storage
  • Async Processing: Background evaluation tasks
  • Caching: Intelligent caching system for performance

AI Integration

  • Claude Sonnet: Advanced paper evaluation
  • Multi-dimensional Analysis: Comprehensive evaluation criteria
  • Structured Output: JSON-based evaluation results

Database Schema

Papers Table

CREATE TABLE papers (
    arxiv_id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    authors TEXT NOT NULL,
    abstract TEXT,
    categories TEXT,
    published_date TEXT,
    evaluation_content TEXT,
    evaluation_score REAL,
    overall_score REAL,
    evaluation_tags TEXT,
    evaluation_status TEXT DEFAULT 'not_started',
    is_evaluated BOOLEAN DEFAULT FALSE,
    evaluation_date TIMESTAMP,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
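
As an illustration, the schema above can be exercised with Python's built-in `sqlite3` module. The helper names, the in-memory database, and the `'completed'` status value are assumptions made for this sketch; the column names come from the table definition above.

```python
import sqlite3

PAPERS_SCHEMA = """
CREATE TABLE IF NOT EXISTS papers (
    arxiv_id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    authors TEXT NOT NULL,
    abstract TEXT,
    categories TEXT,
    published_date TEXT,
    evaluation_content TEXT,
    evaluation_score REAL,
    overall_score REAL,
    evaluation_tags TEXT,
    evaluation_status TEXT DEFAULT 'not_started',
    is_evaluated BOOLEAN DEFAULT FALSE,
    evaluation_date TIMESTAMP,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and make sure the papers table exists."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # allow access to columns by name
    conn.execute(PAPERS_SCHEMA)
    return conn


def add_paper(conn, arxiv_id, title, authors, abstract=None):
    """Insert a paper; an existing row with the same arxiv_id is left untouched."""
    conn.execute(
        "INSERT OR IGNORE INTO papers (arxiv_id, title, authors, abstract) "
        "VALUES (?, ?, ?, ?)",
        (arxiv_id, title, authors, abstract),
    )
    conn.commit()


def save_evaluation(conn, arxiv_id, content, overall_score):
    """Store an evaluation result; 'completed' is a hypothetical status value."""
    conn.execute(
        "UPDATE papers SET evaluation_content = ?, overall_score = ?, "
        "evaluation_status = 'completed', is_evaluated = 1, "
        "evaluation_date = CURRENT_TIMESTAMP, updated_at = CURRENT_TIMESTAMP "
        "WHERE arxiv_id = ?",
        (content, overall_score, arxiv_id),
    )
    conn.commit()
```

A freshly inserted paper starts with `evaluation_status` set to `not_started` and a NULL `overall_score`, which follows from the column defaults.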

Evaluation Dimensions

The system evaluates papers across 12 key dimensions:

  1. Task Formalization - Clarity of problem definition
  2. Data & Resource Availability - Access to required data
  3. Input-Output Complexity - Complexity of inputs/outputs
  4. Real-World Interaction - Practical applicability
  5. Existing AI Coverage - Current AI capabilities
  6. Automation Barriers - Technical challenges
  7. Human Originality - Creative contribution
  8. Safety & Ethics - Responsible AI considerations
  9. Societal/Economic Impact - Broader implications
  10. Technical Maturity Needed - Development requirements
  11. 3-Year Feasibility - Short-term potential
  12. Overall Automatability - Comprehensive assessment
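
Since evaluations are returned as JSON, a consumer may want to check that a result covers all 12 dimensions. The sketch below assumes a hypothetical layout in which the JSON object has a `scores` mapping keyed by dimension name; the actual format produced by the application may differ.

```python
import json

# Dimension names as listed above.
DIMENSIONS = [
    "Task Formalization",
    "Data & Resource Availability",
    "Input-Output Complexity",
    "Real-World Interaction",
    "Existing AI Coverage",
    "Automation Barriers",
    "Human Originality",
    "Safety & Ethics",
    "Societal/Economic Impact",
    "Technical Maturity Needed",
    "3-Year Feasibility",
    "Overall Automatability",
]


def missing_dimensions(evaluation_json: str) -> list[str]:
    """Return the dimensions absent from an evaluation, in list order."""
    data = json.loads(evaluation_json)
    # Hypothetical layout: {"scores": {"<dimension name>": <value>, ...}}
    scores = data.get("scores", {})
    return [name for name in DIMENSIONS if name not in scores]
```

An empty return value means every dimension is present; anything else names the gaps so the evaluation can be retried or flagged.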

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.