Upload folder using huggingface_hub
README.md
CHANGED
|
@@ -18,13 +18,13 @@ metrics:
|
|
| 18 |
- f1
|
| 19 |
- roc_auc
|
| 20 |
model-index:
|
| 21 |
-
- name: TinyByteCNN-Fiction-
|
| 22 |
results:
|
| 23 |
- task:
|
| 24 |
type: text-classification
|
| 25 |
name: Fiction vs Non-Fiction Classification
|
| 26 |
dataset:
|
| 27 |
-
name: Custom Fiction/Non-Fiction Dataset
|
| 28 |
type: custom
|
| 29 |
split: validation
|
| 30 |
metrics:
|
|
@@ -37,6 +37,20 @@ model-index:
|
|
| 37 |
- type: roc_auc
|
| 38 |
value: 99.99
|
| 39 |
name: ROC AUC
|
| 40 |
---
|
| 41 |
|
| 42 |
# TinyByteCNN Fiction vs Non-Fiction Detector
|
|
@@ -131,17 +145,42 @@ The model was trained on a diverse dataset of 85,000 samples (60k train, 15k val
|
|
| 131 |
| ROC AUC | 0.9999 |
|
| 132 |
| Loss | 0.1194 |
|
| 133 |
|
| 134 |
-
### Test
|
| 135 |
-
|
| 136 |
-
|
| 137 |
-
|
| 138 |
-
|
|
| 139 |
-
|
| 140 |
-
|
|
| 141 |
-
|
|
| 142 |
-
|
|
| 143 |
-
|
| 144 |
-
|
| 145 |
|
| 146 |
### Detailed Test Results
|
| 147 |
|
|
|
|
| 18 |
- f1
|
| 19 |
- roc_auc
|
| 20 |
model-index:
|
| 21 |
+
- name: TinyByteCNN-Fiction-Classifier
|
| 22 |
results:
|
| 23 |
- task:
|
| 24 |
type: text-classification
|
| 25 |
name: Fiction vs Non-Fiction Classification
|
| 26 |
dataset:
|
| 27 |
+
name: Custom Fiction/Non-Fiction Dataset (85k samples)
|
| 28 |
type: custom
|
| 29 |
split: validation
|
| 30 |
metrics:
|
|
|
|
| 37 |
- type: roc_auc
|
| 38 |
value: 99.99
|
| 39 |
name: ROC AUC
|
| 40 |
+
- task:
|
| 41 |
+
type: text-classification
|
| 42 |
+
name: Curated Test Samples
|
| 43 |
+
dataset:
|
| 44 |
+
name: 18 Diverse Fiction/Non-Fiction Samples
|
| 45 |
+
type: curated
|
| 46 |
+
split: test
|
| 47 |
+
metrics:
|
| 48 |
+
- type: accuracy
|
| 49 |
+
value: 100.0
|
| 50 |
+
name: Test Accuracy
|
| 51 |
+
- type: confidence_avg
|
| 52 |
+
value: 96.3
|
| 53 |
+
name: Average Confidence
|
| 54 |
---
|
| 55 |
|
| 56 |
# TinyByteCNN Fiction vs Non-Fiction Detector
|
|
|
|
| 145 |
| ROC AUC | 0.9999 |
|
| 146 |
| Loss | 0.1194 |
|
| 147 |
|
| 148 |
+
### Detailed Test Results on 18 Curated Samples
|
| 149 |
+
|
| 150 |
+
The model classified all 18 curated samples correctly (**100% accuracy**), but its confidence varies by category:
|
| 151 |
+
|
| 152 |
+
| Category | Sample Title/Type | True Label | Predicted | Confidence | Analysis |
|
| 153 |
+
|----------|------------------|------------|-----------|------------|----------|
|
| 154 |
+
| **FICTION - General** | | | | | |
|
| 155 |
+
| Literary | Lighthouse Keeper Storm | Fiction | Fiction | **79.8%** | ⚠️ **Lowest confidence** - realistic setting |
|
| 156 |
+
| Sci-Fi | Time Travel Bedroom | Fiction | Fiction | 97.2% | ✅ Clear fantastical elements |
|
| 157 |
+
| Mystery | Detective Rose Case | Fiction | Fiction | 97.3% | ✅ Strong narrative structure |
|
| 158 |
+
| **FICTION - Children's** | | | | | |
|
| 159 |
+
| Animal Tale | Benny's Carrot Problem | Fiction | Fiction | 97.1% | ✅ Clear storytelling markers |
|
| 160 |
+
| Fantasy | Princess Luna's Paintings | Fiction | Fiction | 97.3% | ✅ Magical elements detected |
|
| 161 |
+
| Magical | Tommy's Dream Sprites | Fiction | Fiction | **96.0%** | ⚠️ Lower confidence - whimsical tone |
|
| 162 |
+
| **FICTION - Fantasy** | | | | | |
|
| 163 |
+
| Epic Fantasy | Shadowgate & Void Lords | Fiction | Fiction | 97.4% | ✅ High fantasy vocabulary |
|
| 164 |
+
| Magic System | Moonlight Weaver Elara | Fiction | Fiction | 96.8% | ✅ Complex world-building |
|
| 165 |
+
| Urban Fantasy | Dragon Memory Markets | Fiction | Fiction | 97.3% | ✅ Supernatural commerce |
|
| 166 |
+
| **NON-FICTION - Academic** | | | | | |
|
| 167 |
+
| Biology | Photosynthesis Process | Non-Fiction | Non-Fiction | 97.8% | ✅ Technical terminology |
|
| 168 |
+
| Mathematics | Calculus Theorem | Non-Fiction | Non-Fiction | 97.8% | ✅ Mathematical concepts |
|
| 169 |
+
| Economics | Market Equilibrium | Non-Fiction | Non-Fiction | 97.9% | ✅ Economic theory |
|
| 170 |
+
| **NON-FICTION - News** | | | | | |
|
| 171 |
+
| Financial | Federal Reserve Decision | Non-Fiction | Non-Fiction | 97.8% | ✅ Factual reporting style |
|
| 172 |
+
| Local Gov | Homeless Crisis Plan | Non-Fiction | Non-Fiction | 97.9% | ✅ Policy announcement format |
|
| 173 |
+
| Science | Exoplanet Discovery | Non-Fiction | Non-Fiction | 97.9% | ✅ Research reporting |
|
| 174 |
+
| **NON-FICTION - Journals** | | | | | |
|
| 175 |
+
| Financial | Wall Street Journal Market | Non-Fiction | Non-Fiction | 97.7% | ✅ Professional journalism |
|
| 176 |
+
| Scientific | Nature Research Report | Non-Fiction | Non-Fiction | 97.7% | ✅ Academic publication style |
|
| 177 |
+
| Personal | Kyoto Travel Log | Non-Fiction | Non-Fiction | **97.5%** | ⚠️ Slightly lower - personal narrative |
|
| 178 |
+
|
| 179 |
+
### Key Insights:
|
| 180 |
+
- **Weakest Performance**: Realistic literary fiction (79.8% confidence) - the lighthouse story lacks obvious fantastical elements
|
| 181 |
+
- **Strongest Performance**: Academic/news content (97.8-97.9% confidence) - clear technical/factual language
|
| 182 |
+
- **Edge Cases**: Personal narratives and whimsical children's stories show slightly lower confidence
|
| 183 |
+
- **Perfect Accuracy**: 18/18 samples correctly classified despite confidence variations
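
The confidence patterns listed above can be checked directly from the table. The following is a minimal sketch, not the author's evaluation code: it copies the 18 rounded confidence values from the table and computes the mean per label, the overall mean, and the samples below a cutoff. The names `RESULTS`, `mean_confidence_by_label`, `below_threshold`, and the 97.0 cutoff are illustrative assumptions, not part of the model's API.

```python
from statistics import mean

# (sample title, true label, confidence in %), copied from the table above.
# Values are the rounded figures shown in the table, not raw model outputs.
RESULTS = [
    ("Lighthouse Keeper Storm", "Fiction", 79.8),
    ("Time Travel Bedroom", "Fiction", 97.2),
    ("Detective Rose Case", "Fiction", 97.3),
    ("Benny's Carrot Problem", "Fiction", 97.1),
    ("Princess Luna's Paintings", "Fiction", 97.3),
    ("Tommy's Dream Sprites", "Fiction", 96.0),
    ("Shadowgate & Void Lords", "Fiction", 97.4),
    ("Moonlight Weaver Elara", "Fiction", 96.8),
    ("Dragon Memory Markets", "Fiction", 97.3),
    ("Photosynthesis Process", "Non-Fiction", 97.8),
    ("Calculus Theorem", "Non-Fiction", 97.8),
    ("Market Equilibrium", "Non-Fiction", 97.9),
    ("Federal Reserve Decision", "Non-Fiction", 97.8),
    ("Homeless Crisis Plan", "Non-Fiction", 97.9),
    ("Exoplanet Discovery", "Non-Fiction", 97.9),
    ("Wall Street Journal Market", "Non-Fiction", 97.7),
    ("Nature Research Report", "Non-Fiction", 97.7),
    ("Kyoto Travel Log", "Non-Fiction", 97.5),
]


def mean_confidence_by_label(rows):
    """Group confidences by true label and return the mean for each label."""
    grouped = {}
    for _title, label, confidence in rows:
        grouped.setdefault(label, []).append(confidence)
    return {label: mean(values) for label, values in grouped.items()}


def below_threshold(rows, threshold):
    """Return the titles whose confidence is strictly below the threshold."""
    return [title for title, _label, confidence in rows if confidence < threshold]


per_label = mean_confidence_by_label(RESULTS)
overall = mean(confidence for _title, _label, confidence in RESULTS)
lowest = min(RESULTS, key=lambda row: row[2])

for label, value in per_label.items():
    print(f"{label}: {value:.2f}%")
print(f"Overall: {overall:.2f}%")
print(f"Lowest: {lowest[0]} ({lowest[2]}%)")
print("Below 97.0% (hypothetical cutoff):", below_threshold(RESULTS, 97.0))
```

Because the inputs are the rounded table values, the overall mean computed this way may differ slightly from the `confidence_avg` of 96.3 reported in the model card metadata, which was presumably derived from different inputs.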
|
| 184 |
|
| 185 |
### Detailed Test Results
|
| 186 |
|