
Memorization to Generalization: Emergence of Diffusion Models from Associative Memory

Bao Pham, Gabriel Raya, M. Negri, M. J. Zaki, Luca Ambrogioni, Dmitry Krotov
Department of Computer Science, Rensselaer Polytechnic Institute, Jheronimus Academy of Data Science, Tilburg University, Department of Physics, University of Rome Sapienza, Donders Institute, Radboud University, MIT-IBM Watson AI Lab, IBM Research
arXiv.org (2025)
Memory Pretraining

📝 Paper Summary

Memory recall, Generative Models, Associative Memory
Diffusion models function as dense associative memories where generalization is the failure of perfect memory recall, characterized by the emergence of spurious states at the transition boundary.
Core Problem
The mechanism by which diffusion models transition from memorizing training data to generating novel samples is poorly understood, particularly whether intermediate states exist between the two regimes and what role they play.
Why it matters:
  • Understanding this transition is crucial for addressing privacy concerns where models replicate sensitive training data
  • Current theories often treat memorization as a side effect rather than a fundamental phase of the generative process
  • Identifying spurious patterns helps distinguish true creative generalization from mere blending or interpolation of training examples
Concrete Example: In a 2D toy model with data on a circle, a model trained on only a few points creates energy wells only at those points (memorization). As the number of training points increases, and before the wells merge into a continuous circle (generalization), the model creates 'spurious' wells at locations between data points that are not in the training set.
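The two ends of this toy picture can be illustrated with a minimal sketch. It is not the paper's model or code: it assumes a log-sum-exp energy of the kind used in Dense Associative Memory, E(x) = -(1/beta) log sum_mu exp(-(beta/2) |x - xi_mu|^2), with equally spaced training points on the unit circle. The function names, `beta = 20`, and the start grid are illustrative choices. Because the points are equally spaced, the sketch does not attempt to reproduce the intermediate spurious phase; it only contrasts wells at the training points with a continuous ring.

```python
import numpy as np

def ring_points(k):
    """k equally spaced training points on the unit circle (an assumption of this sketch)."""
    angles = 2 * np.pi * np.arange(k) / k
    return np.stack([np.cos(angles), np.sin(angles)], axis=1)

def energy(x, data, beta):
    """E(x) = -(1/beta) * log sum_mu exp(-beta/2 * |x - xi_mu|^2), computed stably."""
    z = -0.5 * beta * np.sum((x[None, :] - data) ** 2, axis=1)
    m = z.max()
    return -(m + np.log(np.exp(z - m).sum())) / beta

def descend(x, data, beta, steps=200):
    """Unit-step gradient descent on E: x <- softmax-weighted mean of the data."""
    for _ in range(steps):
        z = -0.5 * beta * np.sum((x[None, :] - data) ** 2, axis=1)
        w = np.exp(z - z.max())
        x = (w / w.sum()) @ data
    return x

def nearest_train_dist(k, beta=20.0, n_starts=64):
    """Distance from each descent endpoint to its nearest training point."""
    data = ring_points(k)
    # Offset start angles so no start sits exactly midway between two training points.
    start_angles = 2 * np.pi * (np.arange(n_starts) + 0.3) / n_starts
    starts = np.stack([np.cos(start_angles), np.sin(start_angles)], axis=1)
    ends = np.array([descend(s, data, beta) for s in starts])
    pairwise = np.linalg.norm(ends[:, None, :] - data[None, :, :], axis=2)
    return pairwise.min(axis=1)

few = nearest_train_dist(4)     # sparse data: endpoints collapse onto training points
many = nearest_train_dist(200)  # dense data: endpoints settle on a ring just inside the circle
print("max distance to a training point, 4 points:", few.max())
print("min distance to a training point, 200 points:", many.min())
```

With 4 points, every start is pulled into the well of a single training point, which is the memorization regime. With 200 points the individual wells overlap and the endpoints settle on a nearly flat ring slightly inside the unit circle, so no endpoint coincides with a training point.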
Key Novelty
Diffusion Models as Associative Memory with Spurious States
  • Conceptualizes diffusion training as writing memories and generation as memory retrieval, applying Dense Associative Memory (DenseAM) theory to diffusion models
  • Predicts and empirically verifies that 'spurious states' (stable patterns not in the training set) emerge specifically at the boundary between the memorization and generalization phases
  • Proposes that generalization effectively arises from the 'failure' of precise memory recall in the large-data limit
Architecture
Figure 1: A conceptual diagram showing the three phases of the energy landscape: Memorization, Spurious, and Generalization.
Evaluation Highlights
  • Confirmed existence of spurious states in diffusion models trained on CIFAR-10 and CelebA using novel distance-based detection metrics
  • Demonstrated that spurious samples have distinct basins of attraction in the energy landscape, differing from both memorized and generalized samples
  • Showed that spurious states appear uniquely at the transition point where training set size is large enough to break pure memorization but too small for perfect generalization
Breakthrough Assessment
8/10
Provides a theoretically grounded and empirically verified link between classical associative memory theory and modern diffusion models, offering a novel perspective on generalization as 'failed recall'.