
Don't Look Back in Anger: MAGIC Net for Streaming Continual Learning with Temporal Dependence

Federico Giannini, Sandro D'Andrea, Emanuele Della Valle
Politecnico di Milano
arXiv (2026)
Memory Benchmark

📝 Paper Summary

Streaming Continual Learning (SCL), Temporal Dependence in Data Streams
MAGIC Net adapts to concept drifts in data streams by freezing past weights and dynamically choosing between learning masks for existing parameters or expanding the recurrent network architecture.
Core Problem
Existing methods fail to simultaneously address concept drift, catastrophic forgetting, and temporal dependence in data streams, often resorting to offline training or unlimited architecture growth.
Why it matters:
  • Real-world streams (IoT, robotics, finance) have temporal dependencies that standard Continual Learning ignores
  • Streaming Machine Learning methods adapt quickly but suffer from catastrophic forgetting when concepts recur
  • Prior hybrid approaches like cPNN expand the architecture at every drift, leading to unbounded memory growth
Concrete Example: In product demand forecasting, seasonal fluctuations (concept drift) require adapting to new patterns without forgetting the baseline relationship between price and demand. Current models either forget the old season entirely or add a whole new network column for every season change, inflating memory usage.
Key Novelty
Masked, Adaptive, Growing, Intelligent, and Continuous Network (MAGIC Net)
  • Upon detecting drift, the model freezes its current weights and launches a parallel ensemble of adaptation strategies: random masking, mask fine-tuning, and architecture expansion.
  • It automatically selects the best strategy online based on short-term performance, expanding the network only when necessary (unlike cPNN, which always expands).
  • Uses learnable real-valued masks passed through a sigmoid function (soft masking) rather than binary masks, allowing more expressive gradient-based optimization on frozen weights.
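The soft-masking idea can be illustrated with a minimal sketch. It assumes a single linear layer and a squared-error loss, neither of which is taken from the paper; the function names (`masked_forward`, `mask_gradient_step`) and all numeric values are hypothetical. The point is only that the frozen weights never change, while the real-valued mask logits receive gradients through the sigmoid.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def masked_forward(weights, mask_logits, x):
    """Linear layer whose effective weight is sigmoid(mask_logit) * frozen_weight."""
    return [
        sum(sigmoid(m) * w * xj for m, w, xj in zip(m_row, w_row, x))
        for m_row, w_row in zip(mask_logits, weights)
    ]


def mask_gradient_step(weights, mask_logits, x, target, lr=0.5):
    """One gradient step on 0.5 * squared error, updating ONLY the mask logits."""
    y = masked_forward(weights, mask_logits, x)
    new_logits = []
    for i, (m_row, w_row) in enumerate(zip(mask_logits, weights)):
        err = y[i] - target[i]  # derivative of the loss w.r.t. output i
        new_row = []
        for m, w, xj in zip(m_row, w_row, x):
            s = sigmoid(m)
            grad = err * w * xj * s * (1.0 - s)  # chain rule through the sigmoid
            new_row.append(m - lr * grad)
        new_logits.append(new_row)
    return new_logits


def squared_error(weights, mask_logits, x, target):
    y = masked_forward(weights, mask_logits, x)
    return 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, target))


# Hypothetical toy values: weights learned on an old concept stay frozen.
frozen_w = [[1.0, -2.0], [0.5, 1.5]]
logits = [[0.0, 0.0], [0.0, 0.0]]  # sigmoid(0) = 0.5 for every mask entry
x, target = [1.0, 1.0], [1.0, 0.0]

loss_before = squared_error(frozen_w, logits, x, target)
for _ in range(200):
    logits = mask_gradient_step(frozen_w, logits, x, target)
loss_after = squared_error(frozen_w, logits, x, target)
```

Because the mask values stay strictly between 0 and 1, each frozen weight can be scaled smoothly rather than switched fully on or off, which is what a binary mask would do.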
Architecture
Figure 1: The MAGIC Net architecture and its adaptive ensemble mechanism, triggered after drift detection.
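A rough sketch of how an ensemble member might be chosen after a drift is given here. It is not the authors' implementation: the `Candidate` class, the `select_strategy` function, the threshold predictors, and the tie-break that prefers a non-growing strategy are all assumptions made for illustration. The training step that each candidate would perform during the evaluation window is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    """A hypothetical adaptation strategy competing after a drift."""
    name: str
    predict: Callable[[float], int]
    adds_parameters: bool  # True if the strategy expands the architecture
    correct: int = 0
    seen: int = 0

    def accuracy(self) -> float:
        return self.correct / self.seen if self.seen else 0.0


def select_strategy(candidates: List[Candidate],
                    window: List[Tuple[float, int]]) -> Candidate:
    """Score every candidate on a short window and return the best one."""
    for x, y in window:
        for c in candidates:
            c.seen += 1
            c.correct += int(c.predict(x) == y)
    # Assumption: on equal accuracy, prefer a strategy that does not grow the model.
    return max(candidates, key=lambda c: (c.accuracy(), not c.adds_parameters))


# Toy stand-ins for the three strategies named in the summary.
candidates = [
    Candidate("random_mask", lambda x: int(x > 0.5), adds_parameters=False),
    Candidate("mask_fine_tuning", lambda x: int(x > 0.3), adds_parameters=False),
    Candidate("architecture_expansion", lambda x: int(x > 0.5), adds_parameters=True),
]
window = [(0.1, 0), (0.9, 1), (0.4, 0), (0.7, 1)]

winner = select_strategy(candidates, window)
```

In this toy window the masking and expansion candidates tie on accuracy, so the assumed tie-break keeps the model at its current size.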
Evaluation Highlights
  • Outperforms cPNN by +31.5% in Kappa score on the PowerConsumption dataset (start-phase adaptation).
  • Achieves comparable or better accuracy than cPNN on synthetic SineRW benchmarks while requiring significantly fewer parameters (expanding only when necessary).
  • Demonstrates +66.7% improvement in backward transfer (memory retention) compared to standard cGRU on the Weather dataset.
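For readers unfamiliar with the Kappa score, the sketch below assumes it refers to Cohen's Kappa, the chance-corrected agreement measure commonly used in stream learning; the summary does not spell out the exact variant. The function name `cohen_kappa` and the toy labels are illustrative only.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed agreement: plain accuracy.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Expected agreement if predictions were independent of the labels.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return 0.0  # degenerate case: only one class appears on both sides
    return (p_o - p_e) / (1.0 - p_e)


y_true = [1, 1, 1, 1, 0]
kappa_majority = cohen_kappa(y_true, [1, 1, 1, 1, 1])  # always predicts class 1
kappa_perfect = cohen_kappa(y_true, [1, 1, 1, 1, 0])
```

A majority-class predictor reaches 80% accuracy on these labels yet scores a Kappa of 0, which is why Kappa is often preferred over raw accuracy on imbalanced streams.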
Breakthrough Assessment
7/10
Strong conceptual unification of streaming, continual learning, and time-series forecasting. The dynamic expansion mechanism is a smart efficiency improvement over cPNN, though validation is limited to specific benchmarks.