Update README.md
README.md
CHANGED
|
@@ -9,5 +9,19 @@ app_file: app.py
|
|
| 9 |
pinned: false
|
| 10 |
license: mit
|
| 11 |
---
| 12 |
| 13 |
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
|
| 9 |
pinned: false
|
| 10 |
license: mit
|
| 11 |
---
|
| 12 |
+
# Dataset Energy & Entropy Analyzer (Gradio)
|
| 13 |
|
| 14 |
+
A lightweight app that analyzes a CSV and reports:
|
| 15 |
+
- Global compressibility (gzip ratio)
|
| 16 |
+
- Per-column entropy (numeric via quantile-binning; categorical via counts)
|
| 17 |
+
- Monotone runs and run-entropy (numeric columns)
|
| 18 |
+
- Sortedness fraction (numeric columns)
|
| 19 |
+
- 2D Pareto maxima (first two numeric columns)
|
| 20 |
+
- kd-partition entropy approximation (first two numeric columns)
|
| 21 |
+
- Overall **Harvestable Energy** score (0–1)
|
| 22 |
+
|
| 23 |
+
## How it relates to "Harvestable Energy" & range-partition entropy
|
| 24 |
+
- Lower entropy and higher compressibility imply more exploitable structure → higher harvestable energy.
|
| 25 |
+
- kd-entropy approximates how many bits are needed to split spatial data into simple blocks (a proxy for range-partition entropy).
|
| 26 |
+
- Run-entropy reflects how the data splits into monotone runs: fewer, longer runs mean lower entropy (the adaptive-sorting lens).
|
| 27 |
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
|
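Several of the metrics named in the added README text can be illustrated with a minimal, standard-library Python sketch. This is not the app's code. The function names (`gzip_ratio`, `shannon_entropy`, `quantile_bins`, `monotone_runs`, `run_entropy`, `sortedness`) are hypothetical, and the definitions are assumptions: the gzip ratio is taken as compressed size over original size, a monotone run as a maximal non-decreasing stretch, run-entropy as the Shannon entropy of the normalized run lengths, and sortedness as the fraction of adjacent pairs in non-decreasing order. The Pareto, kd-partition, and Harvestable Energy steps are left out because the README does not define them.

```python
import gzip
import math
import statistics
from bisect import bisect_right
from collections import Counter


def gzip_ratio(data: bytes) -> float:
    """Assumed definition: compressed size / original size (lower = more compressible)."""
    if not data:
        return 1.0
    return len(gzip.compress(data)) / len(data)


def shannon_entropy(labels) -> float:
    """Shannon entropy in bits of the empirical distribution of `labels`."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def quantile_bins(values, n_bins: int = 4):
    """Map each numeric value to a bin index using quantile cut points."""
    cuts = statistics.quantiles(values, n=n_bins)  # returns n_bins - 1 cut points
    return [bisect_right(cuts, v) for v in values]


def monotone_runs(values):
    """Lengths of maximal non-decreasing runs (an assumed notion of 'monotone run')."""
    if not values:
        return []
    runs = [1]
    for prev, cur in zip(values, values[1:]):
        if cur >= prev:
            runs[-1] += 1   # extend the current run
        else:
            runs.append(1)  # a descent starts a new run
    return runs


def run_entropy(values) -> float:
    """Entropy in bits of the normalized run lengths; 0 for a single run."""
    return shannon_entropy_from_weights(monotone_runs(values))


def shannon_entropy_from_weights(weights) -> float:
    """Entropy of a distribution given as non-negative weights."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


def sortedness(values) -> float:
    """Fraction of adjacent pairs that are in non-decreasing order."""
    if len(values) < 2:
        return 1.0
    pairs = list(zip(values, values[1:]))
    return sum(b >= a for a, b in pairs) / len(pairs)
```

Under these assumptions, an already sorted column forms a single run with run-entropy 0 and sortedness 1, while a strictly decreasing column of four values splits into four runs of length 1 with run-entropy 2 bits. A categorical column is scored with `shannon_entropy(column)` directly, and a numeric column with `shannon_entropy(quantile_bins(column))`.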