Problem Definition
Setting: Generative recommendation, in which an LLM generates a sequence of tokens representing items, conditioned on an instruction and the user's interaction history
Inputs: An instruction and the sequence of the user's historical interactions x = x_1...x_n
Outputs: Sequence of tokens y = y_1...y_m representing a recommended item (or list of items)
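For reference, the standard autoregressive factorization is written out below. It is the usual formulation for LLM decoders and is assumed here, not quoted from the provided text. x stands for the instruction together with the history, and y_{<t} for the tokens generated before step t.

```latex
p(y \mid x) = \prod_{t=1}^{m} p(y_t \mid y_{<t}, x),
\qquad
\log p(y \mid x) = \sum_{t=1}^{m} \log p(y_t \mid y_{<t}, x)
```

Under this assumption, the score of a candidate item is the sum of its per-token log-probabilities. Length normalization would divide that sum by the number of tokens.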
Pipeline Flow
- RecLLM (Main Generator)
- Text-Free Assistant (Score Provider)
- D3 Decoding (Integration & Selection)
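The flow above can be sketched for a single decoding step. The provided text does not give the exact combination rule, so the weighted sum of log-scores below is an assumption. The names combine_step_scores, top_k, and alpha are hypothetical, and the numbers are invented for illustration.

```python
import math

def combine_step_scores(llm_logprobs, assistant_logscores, alpha=1.0):
    """Combine per-token scores for one decoding step.

    Both dicts map candidate next tokens to log-scores and are assumed to
    cover the same candidates. alpha is a hypothetical mixing weight; the
    additive rule itself is an assumption, not the paper's stated formula.
    """
    return {
        tok: lp + alpha * assistant_logscores[tok]
        for tok, lp in llm_logprobs.items()
    }

def top_k(scores, k):
    """Return the k highest-scoring tokens, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy numbers, invented for illustration only.
llm = {"Star": math.log(0.6), "Toy": math.log(0.3), "Up": math.log(0.1)}
assistant = {"Star": math.log(0.2), "Toy": math.log(0.7), "Up": math.log(0.1)}

combined = combine_step_scores(llm, assistant, alpha=1.0)
print(top_k(combined, 2))  # ['Toy', 'Star']
```

With alpha set to 0 the assistant is ignored and the LLM's own ranking is recovered, which makes the weight a convenient knob for checking how much the assistant changes the selection.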
System Modules
RecLLM
Generate token probabilities for the next step based on textual instruction and history
Model or implementation: Large Language Model (Specific architecture not detailed in provided text)
Text-Free Assistant
Provide item scores based on non-textual signals (e.g., collaborative filtering) to encourage diversity
Model or implementation: Text-free recommendation model (details truncated in the provided text)
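One way item-level scores could be turned into token-level scores is to aggregate the scores of all items whose token sequence continues the current prefix. The provided text does not describe this step, so the sketch below is an assumption. The function name assistant_token_logscores and the toy catalogue are hypothetical.

```python
import math

def assistant_token_logscores(item_scores, item_tokens, prefix):
    """Map item-level scores to next-token log-scores for a given prefix.

    item_scores: dict item -> non-negative score (e.g. a probability)
    item_tokens: dict item -> tuple of tokens spelling the item
    prefix:      tuple of tokens generated so far
    """
    n = len(prefix)
    mass = {}
    for item, toks in item_tokens.items():
        # Keep only items that extend the current prefix by at least one token.
        if len(toks) > n and toks[:n] == prefix:
            nxt = toks[n]
            mass[nxt] = mass.get(nxt, 0.0) + item_scores[item]
    total = sum(mass.values())
    if total <= 0:
        return {}
    # Normalize over the items still reachable from this prefix.
    return {tok: math.log(m / total) for tok, m in mass.items() if m > 0}

# Toy catalogue and scores, invented for illustration only.
item_tokens = {
    "A": ("Star", "Wars"),
    "B": ("Star", "Trek"),
    "C": ("Toy", "Story"),
}
item_scores = {"A": 0.2, "B": 0.3, "C": 0.5}

first = assistant_token_logscores(item_scores, item_tokens, ())
second = assistant_token_logscores(item_scores, item_tokens, ("Star",))
```

In this toy case the first step splits the mass evenly between "Star" and "Toy", and after the prefix "Star" the scores are renormalized over items A and B only.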
D3 Decoder
Select final tokens by combining LLM and Assistant scores without applying length normalization
Model or implementation: Modified Beam Search
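A minimal sketch of beam search that ranks hypotheses by their raw cumulative log-score, with no division by length. This is not the authors' implementation: next_logscores is a hypothetical callback that would return the combined LLM and assistant scores, and the lookup table stands in for real models.

```python
import math

EOS = "<eos>"

def beam_search(next_logscores, beam_size, max_len):
    """Beam search without length normalization.

    next_logscores(prefix) -> dict token -> log-score (hypothetical callback).
    Returns finished (sequence, score) pairs, best first. Hypotheses that
    have not emitted EOS after max_len steps are dropped.
    """
    beams = [((), 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for tok, lp in next_logscores(prefix).items():
                candidates.append((prefix + (tok,), score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            if seq[-1] == EOS:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    # Final ranking uses the raw sum; a length-normalized variant would
    # divide each score by len(seq) here.
    return sorted(finished, key=lambda c: c[1], reverse=True)

# Toy scores, invented for illustration only.
TABLE = {
    (): {"Star": math.log(0.6), "Toy": math.log(0.4)},
    ("Star",): {"Wars": math.log(0.7), "Trek": math.log(0.3)},
    ("Toy",): {"Story": 0.0},
    ("Star", "Wars"): {EOS: 0.0},
    ("Star", "Trek"): {EOS: 0.0},
    ("Toy", "Story"): {EOS: 0.0},
}

results = beam_search(lambda prefix: TABLE[prefix], beam_size=2, max_len=4)
for seq, score in results:
    print(" ".join(seq[:-1]), round(score, 3))
```

With a beam size of 2, the toy run keeps "Star Wars" and "Toy Story" and prunes "Star Trek" at the second step.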
Novel Architectural Elements
- Integration of a text-free assistant model directly into the token-level decoding loop of an LLM
- Removal of length normalization to address the 'ghost token' phenomenon in item generation
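A small worked example of why length normalization can matter here. It assumes that 'ghost tokens' are tokens whose conditional probability is close to 1 because the prefix already determines the item; that reading is an assumption, since the provided text does not define the term. All probabilities are invented.

```python
import math

def raw_score(token_probs):
    """Sum of per-token log-probabilities (no length normalization)."""
    return sum(math.log(p) for p in token_probs)

def length_normalized_score(token_probs):
    """Average per-token log-probability (length-normalized score)."""
    return raw_score(token_probs) / len(token_probs)

# Item A: short, two informative tokens.
item_a = [0.5, 0.9]
# Item B: less likely overall, but followed by three tokens with
# probability 1.0 (the assumed 'ghost tokens').
item_b = [0.4, 0.9, 1.0, 1.0, 1.0]

print("raw:       ", round(raw_score(item_a), 3), round(raw_score(item_b), 3))
print("normalized:", round(length_normalized_score(item_a), 3),
      round(length_normalized_score(item_b), 3))
```

The probability-1 tokens add nothing to the raw sum but increase the divisor, so item B overtakes item A only when the score is normalized by length. Ranking by the raw sum keeps item A first.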
Modeling
Base Model: Large Language Model (Specific variant not reported in provided text)
Compute: Not reported in the paper
Comparison to Prior Work
- vs. Standard Beam Search: D3 removes length normalization and injects external text-free signals
- vs. Diverse Beam Search (DBS): D3 uses an auxiliary model for diversity rather than just grouping hypotheses
- vs. Temperature Sampling: D3 addresses structural bias (ghost tokens) rather than just randomizing selection
Limitations
- The approach relies on an auxiliary text-free model, which adds a second model to maintain and to query during decoding
- The 'ghost token' analysis assumes items are a non-uniformly sampled subset of language space, which may vary by dataset
- Removing length normalization entirely assumes that, once ghost tokens are discounted, items have roughly uniform lengths (as claimed); this may not hold for all item spaces
Reproducibility
Code is publicly available at https://github.com/SAI990323/DecodingMatters. The provided text does not contain hyperparameter details or specific model sizes.