
Can LLM be a Good Path Planner based on Prompt Engineering? Mitigating the Hallucination for Path Planning

Hourui Deng, Hongjie Zhang, Jie Ou, Chaosheng Feng
College of Computer Science, Sichuan Normal University; School of Information and Software Engineering, University of Electronic Science and Technology of China
arXiv (2024)
Agent · Reasoning · RL · Factuality

📝 Paper Summary

Spatial reasoning · Embodied AI agent · Path planning
S2RCQL improves LLM maze navigation by converting raw coordinates into explicit entity relations to fix spatial hallucinations and using curriculum Q-learning to address long-term reasoning inconsistencies.
Core Problem
LLMs struggle with spatial reasoning and long-term path planning in maze environments due to spatial hallucinations (misunderstanding coordinates) and context inconsistency hallucinations (losing track during long reasoning chains).
Why it matters:
  • Standard prompt engineering such as chain-of-thought (CoT), and even memory-augmented methods such as Rememberer, often fail in simple mazes because LLMs intuitively favor the shortest straight-line path and ignore obstacles.
  • Spatial reasoning is foundational for embodied intelligence, yet current LLMs perform poorly on these tasks compared to humans.
Concrete Example: In a maze with obstacles, an LLM might try to move from (1,0) to (1,1) because the coordinates look similar or geometrically close, even if a wall exists between them. Standard CoT agents often get stuck in forbidden zones or lose direction after a few steps.
Key Novelty
Spatial-to-Relational Transformation and Curriculum Q-Learning (S2RCQL)
  • Transforms implicit spatial coordinates (e.g., '(0,0) to (1,0)') into explicit entity relations (e.g., 'Node A connected to Node F') to prevent LLMs from hallucinating based on coordinate similarity.
  • Integrates Q-learning directly into the prompt context: the agent retrieves Q-values for state-action pairs to guide decision-making, replacing random exploration with LLM prior knowledge.
  • Uses Reverse Curriculum Learning (RCL) to generate simplified intermediate starting points, allowing the LLM to progress from easy to hard tasks and shortening the reasoning chain it must maintain.
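The three components above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the grid encoding, the letter-based node labels, the helper names (`grid_to_relations`, `q_update`, `build_prompt`, `reverse_curriculum_starts`), the prompt wording, and the hyperparameters `alpha` and `gamma` are all hypothetical, and the actual LLM call is omitted.

```python
from collections import deque
import string

# Hypothetical 3x3 maze: 0 = free cell, 1 = obstacle. Cells are indexed (row, col).
GRID = [
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]

def grid_to_relations(grid):
    """Spatial-to-relational step: label each free cell and list its connected labels.

    Assumption: letters A-Z are used as labels, so this sketch handles at most 26 free cells.
    """
    labels = {}
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell == 0:
                labels[(r, c)] = string.ascii_uppercase[len(labels)]
    relations = {name: [] for name in labels.values()}
    for (r, c), name in labels.items():
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            neighbour = (r + dr, c + dc)
            if neighbour in labels:  # obstacles and out-of-bounds cells have no label
                relations[name].append(labels[neighbour])
    return labels, relations

def q_update(q, state, action, reward, next_state, relations, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update; an action is the label of the node moved to."""
    best_next = max((q.get((next_state, a), 0.0) for a in relations[next_state]), default=0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

def build_prompt(state, goal, relations, q):
    """Describe the current node relationally and attach stored Q-values as hints."""
    lines = [f"You are at node {state}. The goal is node {goal}."]
    for nxt in relations[state]:
        value = q.get((state, nxt), 0.0)
        lines.append(f"Node {state} is connected to node {nxt} (Q-value: {value:.2f}).")
    lines.append("Choose the next node.")
    return "\n".join(lines)

def reverse_curriculum_starts(goal, relations):
    """Order candidate start nodes by hop distance from the goal, nearest first."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        node = queue.popleft()
        for nxt in relations[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return sorted((n for n in dist if n != goal), key=lambda n: (dist[n], n))
```

In this sketch the prompt never mentions coordinates, so two cells that are numerically adjacent but separated by an obstacle simply do not appear as connected. Training would start from the nodes returned first by `reverse_curriculum_starts` and move outward as the Q-table fills in.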
Evaluation Highlights
  • Outperforms the 'Rememberer' baseline by 25%–40% in Success Rate across 5x5, 7x7, and 10x10 mazes.
  • Achieves 23%–30% higher Optimality Rate (finding shortest paths) compared to Rememberer.
  • Removing the Spatial-to-Relational (S2R) module causes a ~15% drop in success rates, validating its role in mitigating spatial hallucination.
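As a rough illustration of the two metrics named above, the sketch below computes them on a toy graph. The definitions are assumptions based on this summary rather than the paper's exact protocol: Success Rate is taken as the fraction of episodes whose path is valid and ends at the goal, and Optimality Rate as the fraction of all episodes whose successful path is also a shortest one. The graph, the episode format, and the function names are hypothetical.

```python
from collections import deque

def shortest_path_length(graph, start, goal):
    """Breadth-first search; returns the number of moves on a shortest path, or None."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return dist[node]
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None

def evaluate(graph, episodes):
    """episodes: list of (start, goal, path), where path is the list of nodes visited."""
    if not episodes:
        raise ValueError("at least one episode is required")
    successes = optimal = 0
    for start, goal, path in episodes:
        # A path is valid if it begins at the start and only follows existing connections.
        valid = bool(path) and path[0] == start and all(
            b in graph[a] for a, b in zip(path, path[1:])
        )
        if valid and path[-1] == goal:
            successes += 1
            if len(path) - 1 == shortest_path_length(graph, start, goal):
                optimal += 1
    return successes / len(episodes), optimal / len(episodes)

# Hypothetical relational maze: the shortest route from A to D is A -> B -> D.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "F"],
    "F": ["C", "D"],
    "D": ["B", "F"],
}
EPISODES = [
    ("A", "D", ["A", "B", "D"]),       # success, optimal
    ("A", "D", ["A", "C", "F", "D"]),  # success, but one move longer than needed
    ("A", "D", ["A", "C"]),            # stops before reaching the goal
    ("A", "D", ["A", "D"]),            # invalid: A and D are not connected
]
success_rate, optimality_rate = evaluate(GRAPH, EPISODES)
```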
Breakthrough Assessment
7/10
The paper offers a novel combination of symbolic transformation (spatial-to-relational) and RL-guided prompting. It reports significant empirical gains on maze tasks, though the method was tested on a specific proprietary LLM (ERNIE-Bot) rather than on open models.