DriftMoE – A Mixture of Experts Approach to Handle Concept Drifts

Model weights for the DriftMoE paper. This repository hosts weights only, so you can plug the model straight into your Python pipeline. These weights correspond to one training run on the LED_g stream. Full training code and utilities live in a separate GitHub repo: https://github.com/miguel-ceadar/drift-moe

📂 Files

There are two folders, one for the MoE-Task variant and one for the MoE-Data variant. Both share this file structure:

router.pth           # PyTorch state_dict for the gating MLP
expert_0.pkl         # pickled CapyMOA HoeffdingTree, one file per expert
expert_1.pkl
…
expert_{N‑1}.pkl
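Since the number of experts differs between the two folders, it can help to read it off the directory instead of hard-coding it. The helper below is a minimal sketch that assumes only the layout listed above; `count_experts` is a hypothetical name and not part of the DriftMoE repo.

```python
import re
from pathlib import Path

# Matches expert_0.pkl, expert_1.pkl, ... as listed in the file structure above.
EXPERT_FILE = re.compile(r"expert_(\d+)\.pkl")

def count_experts(folder) -> int:
    """Return N for a folder holding router.pth and expert_0.pkl .. expert_{N-1}.pkl."""
    folder = Path(folder)
    if not (folder / "router.pth").is_file():
        raise FileNotFoundError(f"router.pth not found in {folder}")

    ids = []
    for path in folder.iterdir():
        match = EXPERT_FILE.fullmatch(path.name)
        if match:
            ids.append(int(match.group(1)))
    ids.sort()

    # The loading loop indexes experts 0..N-1, so gaps would break it.
    if ids != list(range(len(ids))):
        raise ValueError(f"expert files are not numbered 0..N-1: {ids}")
    return len(ids)
```

The returned value can then be used wherever the number of experts is needed, for example as the router's output dimension.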

⚡ Quick Start (CPU or GPU)

1 · Install runtime deps

CapyMOA needs a working Java Runtime Environment, so install Java first: https://openjdk.org/install/

python -m pip install torch capymoa numpy river
# git clone training repo – needed so we can recreate RouterMLP & Expert wrappers
git clone https://github.com/miguel-ceadar/drift-moe drift_moe
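Before importing CapyMOA, you may want to confirm that a `java` executable is reachable. This is a rough sketch using only the Python standard library; `java_version` is a hypothetical helper, and it only looks at `PATH`, so a JRE that is configured solely through `JAVA_HOME` would not be detected by it.

```python
import shutil
import subprocess
from typing import Optional

def java_version() -> Optional[str]:
    """Return the first line printed by `java -version`, or None if java is not on PATH."""
    java = shutil.which("java")
    if java is None:
        return None
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    # `java -version` writes its banner to stderr rather than stdout.
    output = (result.stderr or result.stdout).strip()
    return output.splitlines()[0] if output else ""

if __name__ == "__main__":
    version = java_version()
    print(version if version is not None else "java not found on PATH")
```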

2 · Load the router & experts

import torch
from capymoa.misc import load_model
from drift_moe.driftmoe.moe_model import RouterMLP, Expert

INPUT_DIM   = 24      # ↩ must match dimensions of LED_g stream
NUM_CLASSES = 10      # ↩ idem
N_EXPERTS   = 12      # ↩ number of expert_*.pkl files (12 for MoE_Data and 10 for MoE_Task)
DEVICE      = 'cpu'   # or 'cuda'

# 2‑a) Router
router = RouterMLP(input_dim=INPUT_DIM, hidden_dim=256, output_dim=N_EXPERTS)
router.load_state_dict(torch.load('path/to/router.pth', map_location=DEVICE))
router = router.to(DEVICE).eval()

# 2‑b) Experts (pickled CapyMOA trees)
experts = []
for i in range(N_EXPERTS):
    with open(f'path/to/expert_{i}.pkl', 'rb') as f:
        ex = load_model(f)               # HoeffdingTree object
    experts.append(ex)

Reference: the official CapyMOA save & load notebook.

3 · Single‑sample inference helper

def predict_one(instance) -> int:
    """Route a single feature vector through driftMoE and return class index."""
    x_vec = instance.x
    x_t = torch.tensor(x_vec, dtype=torch.float32).unsqueeze(0).to(DEVICE)
    with torch.no_grad():
        logits = router(x_t)                 # shape [1, N_EXPERTS]
        eid    = int(torch.argmax(logits, 1).item())
    return experts[eid].predict(instance)
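To see how the router spreads traffic across experts, you can tally the top-1 choice over many inputs. The sketch below works on plain lists of logits so it runs without the weights; `top1` and `expert_usage` are hypothetical helper names, and ties are resolved in favor of the lowest index, which is an assumption of this sketch.

```python
from collections import Counter
from typing import Iterable, Sequence

def top1(logits: Sequence[float]) -> int:
    """Index of the largest logit, i.e. the expert a top-1 router would select."""
    return max(range(len(logits)), key=lambda i: logits[i])

def expert_usage(logit_rows: Iterable[Sequence[float]]) -> Counter:
    """Count how often each expert index wins across rows of router logits."""
    return Counter(top1(row) for row in logit_rows)

# Toy logits for three samples and two experts.
usage = expert_usage([[0.1, 0.9], [2.0, 1.0], [0.0, 3.0]])
print(usage)
```

With the router loaded as in step 2, the rows could come from `router(x_batch).tolist()` inside `torch.no_grad()`, where `x_batch` is a hypothetical float tensor of shape [B, INPUT_DIM].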

🚰 Streaming usage

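from capymoa.stream.generator import LEDGenerator

MAX_INSTANCES = 1000   # a synthetic generator can keep producing instances, so cap the demo
stream = LEDGenerator()
n_seen = 0
while stream.has_more_instances() and n_seen < MAX_INSTANCES:
    inst = stream.next_instance()
    y_hat = predict_one(inst)
    print(y_hat)          # predicted class index
    print(inst.y_index)   # true class index
    n_seen += 1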

The experts are frozen: nothing is trained at inference time. Each forward pass runs the router once, and only the selected expert makes the prediction.
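Instead of printing every prediction next to its label, you could accumulate a running accuracy over the stream. This is a minimal sketch, not the evaluation protocol from the paper: `iter_stream` and `stream_accuracy` are hypothetical helpers, and the only assumptions are the `has_more_instances()` / `next_instance()` methods and the `y_index` attribute used in the loop above.

```python
from itertools import islice
from typing import Any, Callable, Iterable, Iterator, Optional

def iter_stream(stream: Any) -> Iterator[Any]:
    """Turn an object with has_more_instances()/next_instance() into a Python iterator."""
    while stream.has_more_instances():
        yield stream.next_instance()

def stream_accuracy(
    instances: Iterable[Any],
    predict_fn: Callable[[Any], Any],
    max_instances: Optional[int] = None,
) -> float:
    """Fraction of instances whose prediction equals inst.y_index."""
    correct = 0
    seen = 0
    # islice with stop=None consumes the whole iterable, so only cap unbounded streams.
    for inst in islice(instances, max_instances):
        if predict_fn(inst) == inst.y_index:
            correct += 1
        seen += 1
    return correct / seen if seen else 0.0
```

For example, `stream_accuracy(iter_stream(stream), predict_one, max_instances=10_000)` would score the frozen model on the first 10,000 instances of `stream`.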

✏️ Citation

@misc{aspis2025driftmoemixtureexpertsapproach,
      title={DriftMoE: A Mixture of Experts Approach to Handle Concept Drifts}, 
      author={Miguel Aspis and Sebastián A. Cajas Ordónez and Andrés L. Suárez-Cetrulo and Ricardo Simón Carbajo},
      year={2025},
      eprint={2507.18464},
      archivePrefix={arXiv},
      primaryClass={stat.ML},
      url={https://arxiv.org/abs/2507.18464}, 
}

Questions or issues? Open an issue on the GitHub repo and we’ll be happy to help.
