VRec: Verifiable Reasoning for Recommendation—the proposed framework using interleaved verification.
Reason4Rec: The baseline 'reason-then-recommend' paradigm where reasoning happens fully before generation without intermediate checks.
Homogeneous reasoning: A failure mode where the model repeats the same trivial or surface-level reasoning patterns without gaining new insights.
Mixture of Verifiers (MoV): A set of specialized evaluation modules, each checking reasoning alignment with a different aspect (e.g., semantic, collaborative).
Monotonicity Regularization: A loss term enforcing that the uncertainty (entropy) of the reasoning process decreases strictly over subsequent steps.
Implicit feedback: Indirect user signals like clicks or views, as opposed to explicit ratings.
Proxy prediction objective: A surrogate task (predicting group-level preferences like genre) used to train the verifier since 'correct reasoning' has no ground truth labels.