Generative Flow Networks as Entropy-Regularized RL

Daniil Tiapkin, Nikita Morozov, Alexey Naumov, Dmitry Vetrov
École polytechnique, HSE University, Constructor University
arXiv (2023)
RL

📝 Paper Summary

Generative Flow Networks (GFlowNets) · Entropy-Regularized Reinforcement Learning (Soft RL) · Probabilistic Modeling
This paper proves that training Generative Flow Networks on general graphs is mathematically equivalent to entropy-regularized reinforcement learning, enabling the use of standard Soft RL algorithms for generative modeling.
Core Problem
Standard Reinforcement Learning (RL) maximizes expected return, which yields (near-)deterministic policies unsuitable for sampling diverse objects, while GFlowNet training relies on specialized, often unstable objectives.
Why it matters:
  • GFlowNets are powerful for scientific discovery (e.g., molecule generation) but have a fragmented algorithmic landscape separate from the mature RL field
  • Prior work incorrectly assumed the connection between GFlowNets and RL was limited to tree-structured graphs (autoregressive generation), discouraging the use of standard RL tools for general graph-based generation
Concrete Example: In molecule generation, many different sequences of actions (order of adding atoms) can create the same molecule (a Directed Acyclic Graph structure). Classical RL would find just one optimal path to the highest-reward molecule, failing to sample diverse high-reward candidates proportionally to their rewards.
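The sampling goal behind this example can be written compactly (schematic notation, not quoted from the paper): a GFlowNet targets the reward-proportional distribution over terminal objects, whereas classical RL collapses onto the maximizer.

```latex
P_{\top}(x) \;=\; \frac{R(x)}{Z}, \qquad Z = \sum_{x'} R(x'),
\qquad \text{vs.} \qquad
\pi_{\mathrm{RL}} \;\to\; \arg\max_{x} R(x).
```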
Key Novelty
Equivalence of GFlowNets and Soft RL on DAGs
  • Demonstrates that the GFlowNet flow-matching problem on any Directed Acyclic Graph (DAG) can be strictly reformulated as an entropy-regularized RL problem with specific rewards and regularizers
  • Maps established GFlowNet objectives like Detailed Balance (DB) and Trajectory Balance (TB) directly to Soft RL concepts, allowing algorithms like SoftDQN to replace specialized GFlowNet solvers
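Schematically, the reduction can be sketched as follows (the notation here is a reconstruction, not a quotation from the paper): fix a backward policy $P_B$ and define edge rewards $r(s \to s') = \log P_B(s \mid s')$, adding $\log R(x)$ on transitions into terminal states. The soft Bellman optimality equations (temperature 1) then read:

```latex
V^{*}(s) \;=\; \log \sum_{s' \in \mathrm{Ch}(s)} \exp\bigl(r(s \to s') + V^{*}(s')\bigr),
\qquad
\pi^{*}(s' \mid s) \;=\; \exp\bigl(r(s \to s') + V^{*}(s') - V^{*}(s)\bigr).
```

Identifying $V^{*}(s) = \log F(s)$ and $\pi^{*} = P_F$, the policy equation is exactly the logarithm of the Detailed Balance condition $F(s)\,P_F(s' \mid s) = F(s')\,P_B(s \mid s')$, which is why soft value-based methods like SoftDQN can stand in for DB-style GFlowNet solvers.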
Evaluation Highlights
  • Refutes previous claims by Bengio et al. (2023) that the Soft RL connection holds only for tree structures, proving it holds for general DAGs
  • Demonstrates that standard Soft RL algorithms (SoftDQN, Munchausen DQN) are competitive with or outperform specialized GFlowNet methods (like Trajectory Balance) on probabilistic modeling tasks
  • Establishes a direct reduction allowing off-the-shelf RL algorithms to solve GFlowNet problems without modification
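As an illustration of that reduction, here is a minimal sketch. The toy DAG, the uniform backward policy, and all variable names are my own assumptions, not taken from the paper. It runs the entropy-regularized soft value recursion with per-edge rewards $\log P_B$ and checks that the resulting soft-optimal policy samples terminal states in proportion to their rewards:

```python
import math
from collections import defaultdict

# Hypothetical toy DAG: terminal x2 is reachable via two action orders
# (the "same molecule via different paths" situation from the summary).
children = {
    "s0": ["s1", "s2"],
    "s1": ["x1", "x2"],
    "s2": ["x2", "x3"],
}
reward = {"x1": 1.0, "x2": 2.0, "x3": 3.0}

# Fixed uniform backward policy P_B(s | s') over each state's parents.
parents = defaultdict(list)
for s, cs in children.items():
    for c in cs:
        parents[c].append(s)
pb = {(s, c): 1.0 / len(parents[c]) for c in parents for s in parents[c]}

# Soft value recursion (entropy-regularized Bellman backup with edge
# reward log P_B): V(s) = log sum_{s'} P_B(s|s') * exp(V(s')), with
# V(x) = log R(x) at terminals. The flow is then F(s) = exp(V(s)).
V = {x: math.log(r) for x, r in reward.items()}
for s in ["s2", "s1", "s0"]:  # reverse topological order
    V[s] = math.log(sum(pb[(s, c)] * math.exp(V[c]) for c in children[s]))

def pf(s, c):
    """Soft-optimal (= GFlowNet forward) policy P_F(c | s)."""
    return pb[(s, c)] * math.exp(V[c] - V[s])

# Marginal probability of reaching each terminal under P_F.
prob = defaultdict(float)
def walk(s, p):
    if s not in children:
        prob[s] += p
        return
    for c in children[s]:
        walk(c, p * pf(s, c))
walk("s0", 1.0)

Z = sum(reward.values())
for x, r in reward.items():
    print(x, round(prob[x], 6), round(r / Z, 6))  # marginals equal R(x)/Z
```

The recursion recovers the partition function at the root (`exp(V["s0"]) == Z`), and the terminal marginals match $R(x)/Z$ exactly even though `x2` has two parents, which is the general-DAG case the paper shows the Soft RL view handles.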
Breakthrough Assessment
9/10
Provides a fundamental theoretical unification that bridges two major subfields. By proving GFlowNets are a special case of Soft RL, it unlocks decades of RL research for generative modeling.