LTL: Linear Temporal Logic—a formal language used to specify complex, time-dependent tasks (e.g., 'visit A, then B, never C')
MDP: Markov Decision Process—a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker
LDGBA: Limit-Deterministic Generalized Büchi Automaton—a type of finite state machine used to verify LTL formulas, consisting of deterministic and non-deterministic components
E-LDGBA: Embedded LDGBA—A novel automaton structure proposed here that tracks unvisited accepting sets to enable the use of deterministic policies for LTL satisfaction
Reward Shaping: Modifying the reward function (often using a potential function) to provide more frequent feedback to the agent without changing the optimal policy
Grover's Algorithm: A quantum search algorithm that can find an item in an unsorted database with quadratic speedup; used here to inspire action selection probabilities
EP-MDP: Embedded Product MDP—The combined state space of the environment (MDP) and the task automaton (E-LDGBA)