ToM: Theory of Mind—the ability to impute mental states (beliefs, intents) to oneself and others
False Belief (FB): A scenario where an agent's belief differs from reality (e.g., they didn't see an object move)
True Belief (TB): A scenario where an agent's belief matches reality
MCMC: Markov Chain Monte Carlo—algorithms for sampling from a probability distribution by constructing a Markov chain that has the desired distribution as its equilibrium distribution
Simulated Annealing: An optimization technique that explores a search space broadly at high 'temperature' (more randomness) and gradually cools down, accepting fewer worsening moves, to settle into a good (ideally near-optimal) solution
sequence-level distribution: The joint probability of an entire sequence of tokens, as opposed to the conditional probability of just the next token
power sampling: Sampling from a distribution proportional to p^α with α > 1 (the base distribution p raised to a power and renormalized), which sharpens the peaks (makes likely sequences relatively more likely)
CoT: Chain-of-Thought—prompting the model to generate intermediate reasoning steps before the final answer
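
The power sampling and MCMC entries above can be illustrated with a small sketch. It is a toy only: the three-outcome distribution `base`, the value `alpha = 2.0`, and the helper names `power_distribution` and `metropolis_sample` are hypothetical choices made for illustration, and each outcome stands in for one whole token sequence. The sampler is a basic Metropolis scheme with uniform proposals, not necessarily the scheme used in any particular method.

```python
import random


def power_distribution(probs, alpha):
    """Return the distribution proportional to p**alpha (renormalized)."""
    weights = [p ** alpha for p in probs]
    total = sum(weights)
    return [w / total for w in weights]


def metropolis_sample(target_weights, steps, rng):
    """Estimate a distribution from unnormalized weights with Metropolis.

    Proposals are drawn uniformly over all outcomes, so the proposal is
    symmetric and the acceptance ratio is just the ratio of target weights.
    """
    n = len(target_weights)
    state = rng.randrange(n)
    counts = [0] * n
    for _ in range(steps):
        proposal = rng.randrange(n)
        accept = min(1.0, target_weights[proposal] / target_weights[state])
        if rng.random() < accept:
            state = proposal
        counts[state] += 1  # record the current state after each step
    return [c / steps for c in counts]


# Toy sequence-level distribution over three complete sequences (assumed values).
base = [0.5, 0.3, 0.2]
alpha = 2.0

# Exact sharpened distribution: the most likely sequence gains probability mass.
sharpened = power_distribution(base, alpha)

# MCMC estimate of the same distribution, using only unnormalized p**alpha.
rng = random.Random(0)
estimate = metropolis_sample([p ** alpha for p in base], steps=50_000, rng=rng)
```

The sampler never needs the normalizing constant of p^α, only ratios of unnormalized weights. That property is what makes MCMC a natural fit when the space of sequences is too large to enumerate.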