metadata
license: apache-2.0
datasets:
- asobirov/dga-preprocessed
- harpomaxx/dga-detection
- YangYang-Research/dga-detection
metrics:
- accuracy
- f1
tags:
- dga-detection
- streaming
DGA SGD Detector
This model was trained on the asobirov/dga-preprocessed dataset using:
- HashingVectorizer (char 2–5 grams, 2**20 hash buckets)
- SGDClassifier with online partial_fit and incremental learning-rate tuning
- Lightweight stats: length, digit ratio, Shannon entropy
- Streaming training on ~5 M domains, evaluated on ~1 M test domains from harpomaxx/dga-detection and YangYang-Research/dga-detection
- Decision threshold (0.0794) tuned to maximize F1, yielding 85.7% accuracy and 83.6% DGA recall
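
Two of the steps above, the lightweight stats and the F1-based threshold choice, can be illustrated with a minimal standard-library sketch. This is not the released training code: the function names `lightweight_stats` and `best_f1_threshold` are hypothetical, the entropy is assumed to be character-level Shannon entropy in bits, and the threshold search is assumed to be a simple scan over observed scores, where a domain is flagged as DGA when its score is at or above the threshold.

```python
import math
from collections import Counter


def lightweight_stats(domain):
    """Return (length, digit ratio, Shannon entropy in bits per character).

    Assumption: entropy is computed over the characters of the domain string.
    """
    n = len(domain)
    if n == 0:
        return 0, 0.0, 0.0
    digit_ratio = sum(ch.isdigit() for ch in domain) / n
    counts = Counter(domain)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return n, digit_ratio, entropy


def best_f1_threshold(scores, labels):
    """Scan every observed score as a candidate threshold; keep the best F1.

    Assumption: label 1 means DGA, and a domain is predicted DGA
    when its score >= threshold.
    """
    best_t, best_f1 = 0.5, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no true positives
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


if __name__ == "__main__":
    print(lightweight_stats("a1b2"))
    print(best_f1_threshold([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

The quadratic scan is only practical for small validation sets; on roughly 1 M test domains a sorted cumulative sweep would likely be preferable. A low threshold such as 0.0794 is consistent with favoring DGA recall over precision.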