
Modeling Trial-and-Error Navigation With a Sequential Decision Model of Information Scent

Xiaofu Jin, Yunpeng Bai, Antti Oulasvirta
Aalto University, National University of Singapore
arXiv (2026)
Memory Agent RL

📝 Paper Summary

Agentic AI Memory recall
This paper models human navigation as a resource-rational sequential decision process where agents balance information scent against memory decay and capacity limits to replicate trial-and-error behaviors.
Core Problem
Existing models of information scent assume users myopically choose the best visible link, failing to explain why users scan partially, make premature errors, or backtrack when cues are ambiguous.
Why it matters:
  • Predicting user struggles in complex information architectures (e.g., websites, menus) is crucial for automated interface optimization
  • Prior models like SNIF-ACT or CoLiDeS cannot simulate error recovery (backtracking) or non-greedy exploration because they lack memory dynamics and long-term planning
  • Understanding navigation requires modeling the cognitive costs (forgetting, time) that force users to accept 'good enough' options rather than searching exhaustively
Concrete Example: A user looking for 'Return Policy' sees a 'Customer Service' link and selects it without reading the rest of the page (premature commitment driven by time cost). On realizing it is wrong, the user must recall the earlier options, which may have decayed from memory, and is forced to backtrack. Myopic models fail to predict this sequence.
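To make the contrast concrete, the sketch below compares a myopic rule (read every link, pick the best) with a simple satisficing rule that commits to the first link whose scent clears a threshold. This is an illustration only: the fixed threshold stands in for the paper's learned policy, and the function names, the threshold of 0.6, and the scent values are all assumptions, not taken from the paper.

```python
def myopic_choice(scents):
    """Read every link on the page, then return the index of the highest-scent one."""
    return max(range(len(scents)), key=lambda i: scents[i])


def satisficing_choice(scents, threshold=0.6):
    """Scan top to bottom and commit to the first link whose scent clears the threshold.

    Returns (chosen_index, links_read). If no link clears the (assumed) threshold,
    falls back to the best link after reading the whole page.
    """
    for i, scent in enumerate(scents):
        if scent >= threshold:
            return i, i + 1
    return myopic_choice(scents), len(scents)


# Made-up scent values: index 1 plays the role of 'Customer Service',
# index 3 the role of the link that actually leads to the return policy.
page = [0.3, 0.65, 0.2, 0.9]
best_index = myopic_choice(page)
early_index, links_read = satisficing_choice(page)
```

With these values the satisficing rule stops at the second link after reading only two of the four, while the myopic rule reads everything and picks the last link. The early stop is the premature commitment described in the example; recovering from it is what requires memory and backtracking.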
Key Novelty
Sequential Decision Model of Information Scent (POMDP formulation)
  • Frames navigation not as a series of isolated greedy choices, but as a Partially Observable Markov Decision Process (POMDP) where the agent plans ahead to minimize wasted time
  • Integrates 'Resource Rationality' by explicitly modeling memory as a constrained resource: cues fade over time (decay) and only a limited number can be retained (capacity)
  • Distinguishes between 'Local Panel' (current screen) and 'Global Memory' (retained cues), allowing the agent to decide when to stop scanning and select or return based on accumulated belief
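The two memory constraints named above, decay and capacity, can be sketched as a small bounded store. This is a minimal illustration under stated assumptions, not the paper's implementation: the class name `CueMemory`, the exponential decay form, the weakest-trace eviction rule, and the default values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CueMemory:
    """Hypothetical bounded memory of link cues with exponential decay."""
    capacity: int = 4      # assumed maximum number of retained cues
    decay: float = 0.8     # assumed per-step retention factor in (0, 1]
    traces: dict = field(default_factory=dict)  # cue -> (scent, strength)

    def step(self):
        """Advance one time step: every stored trace fades by the decay factor."""
        self.traces = {c: (s, w * self.decay) for c, (s, w) in self.traces.items()}

    def observe(self, cue, scent):
        """Store a freshly read cue at full strength; evict the weakest trace if over capacity."""
        self.traces[cue] = (scent, 1.0)
        if len(self.traces) > self.capacity:
            weakest = min(self.traces, key=lambda c: self.traces[c][1])
            del self.traces[weakest]

    def recalled_scent(self, cue):
        """Scent weighted by remaining trace strength; 0.0 if the cue was forgotten."""
        scent, strength = self.traces.get(cue, (0.0, 0.0))
        return scent * strength


memory = CueMemory(capacity=2, decay=0.5)
memory.observe("Customer Service", 0.8)
memory.step()
memory.observe("Shipping", 0.4)
memory.step()
memory.observe("Returns", 0.6)  # third cue exceeds capacity, so the oldest trace is evicted
```

In this toy run the first cue has decayed the most by the time the third arrives, so it is the one dropped. An agent that later wants to go back to it has nothing left to recall, which is the situation that forces a revisit.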
Evaluation Highlights
  • Qualitatively reproduces three key human behaviors: partial scanning of pages, backtracking after errors, and revisiting previously seen items
  • Replicates known empirical effects of information architecture: task difficulty adaptation, hierarchy depth effects, and positional layout biases
  • Demonstrates robustness of the learned policy under parameter perturbations of ±5%, ±10%, and ±25% for memory and noise values
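A perturbation sweep at the levels reported above could be set up roughly as follows. This is a sketch under assumptions: the parameter names and base values are invented for illustration, and varying one parameter at a time is an assumed protocol that the summary does not confirm.

```python
def perturbed_settings(base, levels=(0.05, 0.10, 0.25)):
    """Yield (name, signed_level, params), scaling one parameter at a time by 1 +/- level."""
    for name, value in base.items():
        for level in levels:
            for sign in (-1, 1):
                params = dict(base)  # copy so the base setting is never mutated
                params[name] = value * (1 + sign * level)
                yield name, sign * level, params


# Hypothetical base values for the memory and noise parameters.
base = {"memory_decay": 0.8, "memory_capacity": 4.0, "observation_noise": 0.2}
grid = list(perturbed_settings(base))
```

Each entry in `grid` would then be used to re-run the learned policy and compare its behavior against the unperturbed baseline.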
Breakthrough Assessment
7/10
A significant theoretical advance that unifies Information Foraging Theory with POMDPs and memory constraints. It moves beyond static, myopic models to explain dynamic error recovery, though it is a simulation study rather than a new state-of-the-art LLM.