|
--- |
|
title: Entropy Harvester |
|
emoji: π |
|
colorFrom: red |
|
colorTo: red |
|
sdk: gradio |
|
sdk_version: 5.44.1 |
|
app_file: app.py |
|
pinned: false |
|
license: mit |
|
--- |
|
# Dataset Energy & Entropy Analyzer (Gradio) |
|
|
|
A lightweight app that analyzes a CSV and reports: |
|
- Global compressibility (gzip ratio) |
|
- Per-column entropy (numeric via quantile-binning; categorical via counts) |
|
- Monotone runs and run-entropy (numeric columns) |
|
- Sortedness fraction (numeric columns) |
|
- 2D Pareto maxima (first two numeric columns) |
|
- kd-partition entropy approximation (first two numeric columns) |
|
- Overall **Harvestable Energy** score (0–1) |
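
As a rough illustration of the first two metrics in the list above, the sketch below computes a gzip ratio and a per-column Shannon entropy using only the standard library. The function names (`gzip_ratio`, `shannon_entropy`, `quantile_bin`), the default of four bins, and the definition of the ratio as compressed size divided by raw size are assumptions made for this example; the app's own code may differ.

```python
import bisect
import gzip
import math
from collections import Counter


def gzip_ratio(data: bytes) -> float:
    """Compressed size / raw size (assumed definition); lower = more compressible."""
    if not data:
        return 1.0
    return len(gzip.compress(data)) / len(data)


def shannon_entropy(values) -> float:
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def quantile_bin(xs, bins=4):
    """Map each number to a bin index using empirical quantiles as bin edges."""
    s = sorted(xs)
    # Interior edges at the 1/bins, 2/bins, ... quantiles of the data.
    edges = [s[len(s) * k // bins] for k in range(1, bins)]
    return [bisect.bisect_right(edges, x) for x in xs]


# Categorical column: entropy straight from value counts.
cat_bits = shannon_entropy(["a", "a", "b", "b"])

# Numeric column: bin by quantiles first, then take the entropy of the bin labels.
num_bits = shannon_entropy(quantile_bin([1, 2, 3, 4, 5, 6, 7, 8], bins=4))

# Highly repetitive bytes compress to a small fraction of their raw size.
ratio = gzip_ratio(b"a" * 10_000)
```

Because quantile bins hold roughly equal counts when values are distinct, a numeric column's binned entropy only drops below `log2(bins)` when many values tie, which is one reason the other metrics in the list are useful alongside it.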
|
|
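
The 2D Pareto maxima listed above can be found with a single sweep after sorting. This is a minimal sketch, assuming that "maxima" means points not dominated in both coordinates by any other point; `pareto_maxima` is a hypothetical name, and exact duplicate points are reported only once.

```python
def pareto_maxima(points):
    """Return the points that no other point dominates in both x and y."""
    maxima = []
    best_y = float("-inf")
    # Sweep from largest x to smallest; within equal x, largest y first.
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        # Everything seen so far has x >= this x, so the point survives
        # only if its y beats every y seen so far.
        if y > best_y:
            maxima.append((x, y))
            best_y = y
    return maxima


pts = [(1, 5), (2, 3), (3, 4), (4, 1), (2, 2)]
front = pareto_maxima(pts)
```

Here `(2, 3)` and `(2, 2)` are dropped because `(3, 4)` is at least as large in both coordinates.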
|
## How it relates to "Harvestable Energy" & range-partition entropy |
|
- Lower entropy and higher compressibility imply more exploitable structure → higher harvestable energy. |
|
- kd-entropy approximates how many bits are needed to split spatial data into simple blocks (a proxy for range-partition entropy). |
|
- Run-entropy captures how many monotone runs are present (adaptive sorting lens). |
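
To make the run-based metrics concrete, here is a small sketch that splits a numeric column into maximal non-decreasing runs, takes the entropy of the run-length distribution, and measures sortedness as the fraction of adjacent pairs already in order. These definitions and the names `ascending_runs`, `run_entropy`, and `sortedness` are assumptions for illustration; the app may also count descending runs or define sortedness differently.

```python
import math


def ascending_runs(xs):
    """Lengths of the maximal non-decreasing runs, scanning left to right."""
    if not xs:
        return []
    lengths = [1]
    for prev, cur in zip(xs, xs[1:]):
        if cur >= prev:
            lengths[-1] += 1   # extend the current run
        else:
            lengths.append(1)  # a descent starts a new run
    return lengths


def run_entropy(xs):
    """Entropy (bits) of the run-length distribution; 0 for a single run."""
    n = len(xs)
    if n == 0:
        return 0.0
    return -sum((l / n) * math.log2(l / n) for l in ascending_runs(xs))


def sortedness(xs):
    """Fraction of adjacent pairs that are already in non-decreasing order."""
    pairs = len(xs) - 1
    if pairs <= 0:
        return 1.0
    return sum(b >= a for a, b in zip(xs, xs[1:])) / pairs


runs = ascending_runs([1, 2, 3, 2, 3, 4, 1, 5])
bits = run_entropy([2, 1, 4, 3])
```

A fully sorted column forms one run and has zero run-entropy, which is the case where an adaptive sort has the least work to do.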
|
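
One way to read the kd-entropy idea is sketched below: split the points at the median on alternating axes until every block is "simple", then take the entropy of the block sizes. Treating a block as simple when its points form a monotone chain is an assumption chosen for this example, as are the names `is_monotone`, `kd_blocks`, and `kd_entropy` and the `max_depth` cap; this is not necessarily the rule the app uses.

```python
import math


def is_monotone(points):
    """True if, ordered by x, the y values never change direction."""
    ys = [y for _, y in sorted(points)]
    up = all(b >= a for a, b in zip(ys, ys[1:]))
    down = all(b <= a for a, b in zip(ys, ys[1:]))
    return up or down


def kd_blocks(points, depth=0, max_depth=16):
    """Sizes of the blocks left after median splits on alternating axes."""
    if depth >= max_depth or is_monotone(points):
        return [len(points)]
    axis = depth % 2  # 0 = split on x, 1 = split on y
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return (kd_blocks(pts[:mid], depth + 1, max_depth)
            + kd_blocks(pts[mid:], depth + 1, max_depth))


def kd_entropy(points):
    """Entropy (bits) of the block-size distribution from kd_blocks."""
    n = len(points)
    if n == 0:
        return 0.0
    return -sum((s / n) * math.log2(s / n) for s in kd_blocks(points))


line_bits = kd_entropy([(i, i) for i in range(8)])
zigzag_bits = kd_entropy([(0, 0), (1, 3), (2, 1), (3, 2)])
```

Points on a line need no splitting and score zero bits, while the zigzag needs one split into two equal blocks and scores one bit.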
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference |
|
|