metadata
title: Entropy Harvester
emoji: 🌍
colorFrom: red
colorTo: red
sdk: gradio
sdk_version: 5.44.1
app_file: app.py
pinned: false
license: mit

Dataset Energy & Entropy Analyzer (Gradio)

A lightweight app that analyzes a CSV and reports:

  • Global compressibility (gzip ratio)
  • Per-column entropy (numeric via quantile-binning; categorical via counts)
  • Monotone runs and run-entropy (numeric columns)
  • Sortedness fraction (numeric columns)
  • 2D Pareto maxima (first two numeric columns)
  • kd-partition entropy approximation (first two numeric columns)
  • Overall Harvestable Energy score (0–1)
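As a rough illustration of three of the metrics listed above, here is a minimal sketch of a gzip ratio, a per-column Shannon entropy, and 2D Pareto maxima. It is not taken from app.py: the function names, the default of 16 quantile bins, and the "larger is better on both axes" convention for maxima are all assumptions made for this example.

```python
import gzip
import math
from collections import Counter

import numpy as np


def gzip_ratio(raw: bytes) -> float:
    """Compressed size divided by original size; lower means more compressible."""
    if not raw:
        return 1.0
    return len(gzip.compress(raw)) / len(raw)


def shannon_entropy(labels) -> float:
    """Shannon entropy (bits) of a sequence of hashable labels, e.g. categories."""
    counts = Counter(labels)
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def numeric_entropy(values, bins: int = 16) -> float:
    """Entropy of a numeric column after quantile binning (bin count is assumed)."""
    x = np.asarray(values, dtype=float)
    x = x[~np.isnan(x)]
    if x.size == 0:
        return 0.0
    # Quantile edges; duplicate edges collapse when there are few distinct values.
    edges = np.unique(np.quantile(x, np.linspace(0.0, 1.0, bins + 1)))
    if edges.size < 3:
        return 0.0  # everything falls into a single bin
    # Use interior edges only, so every value gets a bin label.
    labels = np.digitize(x, edges[1:-1], right=True)
    return shannon_entropy(labels.tolist())


def pareto_maxima(points):
    """Points not dominated in both coordinates (assumes larger is better)."""
    ordered = sorted(
        ((float(x), float(y)) for x, y in points),
        key=lambda p: (-p[0], -p[1]),
    )
    best_y = -math.inf
    maxima = []
    # Sweep from largest x to smallest; keep points that raise the best y so far.
    for x, y in ordered:
        if y > best_y:
            maxima.append((x, y))
            best_y = y
    return maxima
```

For a CSV, `gzip_ratio` would be applied to the raw file bytes, `shannon_entropy` to a categorical column's values, and `numeric_entropy` to a numeric column. The app's actual binning and scoring may differ.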

How it relates to "Harvestable Energy" & range-partition entropy

  • Lower entropy and higher compressibility imply more exploitable structure → higher harvestable energy.
  • kd-entropy approximates how many bits are needed to split spatial data into simple blocks (a proxy for range-partition entropy).
  • Run-entropy captures how many monotone runs are present (adaptive sorting lens).

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
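The run-entropy and kd-entropy ideas described in this section can be sketched as follows. This is one plausible reading, not the app's confirmed implementation. It assumes that runs are maximal non-decreasing stretches, that run-entropy is the Shannon entropy of the run-length distribution, that sortedness is the fraction of adjacent pairs already in order, and that kd-entropy is the entropy of cell occupancy after midpoint splits on alternating axes down to a fixed depth. All function names and the `max_depth` default are hypothetical.

```python
import math

import numpy as np


def ascending_runs(values):
    """Lengths of maximal non-decreasing runs, scanning left to right."""
    x = np.asarray(values, dtype=float)
    if x.size == 0:
        return []
    # A new run starts wherever the next value is strictly smaller.
    breaks = np.flatnonzero(np.diff(x) < 0) + 1
    bounds = np.concatenate(([0], breaks, [x.size]))
    return np.diff(bounds).tolist()


def entropy_of_counts(counts) -> float:
    """Shannon entropy (bits) of the distribution given by positive counts."""
    n = sum(counts)
    if n == 0:
        return 0.0
    return sum((c / n) * math.log2(n / c) for c in counts)


def run_entropy(values) -> float:
    """0 bits for a fully sorted column; grows as the runs get shorter."""
    return entropy_of_counts(ascending_runs(values))


def sortedness_fraction(values) -> float:
    """Fraction of adjacent pairs that are already in non-decreasing order."""
    x = np.asarray(values, dtype=float)
    if x.size < 2:
        return 1.0
    return float(np.mean(np.diff(x) >= 0))


def kd_cell_counts(points, max_depth: int = 6):
    """Occupancy of non-empty cells after midpoint splits on alternating axes."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    counts = []

    def split(p, lo, hi, depth):
        if len(p) == 0:
            return
        if depth == max_depth:
            counts.append(len(p))
            return
        axis = depth % 2
        mid = (lo[axis] + hi[axis]) / 2.0
        mask = p[:, axis] <= mid
        hi_left, lo_right = hi.copy(), lo.copy()
        hi_left[axis] = mid
        lo_right[axis] = mid
        split(p[mask], lo, hi_left, depth + 1)
        split(p[~mask], lo_right, hi, depth + 1)

    if len(pts):
        split(pts, pts.min(axis=0), pts.max(axis=0), 0)
    return counts


def kd_entropy(points, max_depth: int = 6) -> float:
    """Low when points cluster in few cells, near max_depth bits when spread evenly."""
    return entropy_of_counts(kd_cell_counts(points, max_depth))
```

Midpoint splits are used here rather than median splits because median splits always balance the point counts, which would make the occupancy entropy depend only on the number of points and not on their spatial structure.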