contextual heads: A specific subset of attention heads in a Transformer that allocate the most attention weight to relevant context spans during correct answer generation.
focus directions: Vectors found in the key and query activation spaces that, when added to the model's activations, increase the attention weights assigned to relevant context spans.
split-softmax: A technique to artificially re-weight attention distributions by applying different scaling factors to specific token spans (e.g., boosting relevant context) before the softmax normalization.
contextual scoring: A metric proposed in this paper that quantifies how strongly an attention head focuses on the gold-standard relevant context tokens while response tokens are generated.
activation addition: A steering method where a specific vector is added to the internal representations (activations) of a model during inference to modify its behavior.
attention sink: The phenomenon in which attention heads allocate a large share of attention to the initial tokens (such as the start-of-sequence token) regardless of their semantic importance.
Exact Match (EM): An evaluation metric that counts a prediction as correct only if it exactly matches one of the ground-truth answers.
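
To make the Exact Match definition concrete, here is a minimal Python sketch. The function names (`normalize_answer`, `exact_match`, `exact_match_score`) are hypothetical. The normalization step (lowercasing, removing punctuation and English articles, collapsing whitespace) is a common question-answering convention assumed here for illustration; the definition above specifies only the exact comparison, which corresponds to `normalize=False`.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this SQuAD-style normalization is a common convention,
    not something the glossary entry itself specifies.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str], normalize: bool = True) -> int:
    """Return 1 if the prediction equals any of the gold answers, else 0."""
    if normalize:
        prediction = normalize_answer(prediction)
        gold_answers = [normalize_answer(g) for g in gold_answers]
    return int(prediction in gold_answers)


def exact_match_score(predictions, gold_answer_sets, normalize=True) -> float:
    """Fraction of predictions that exactly match one of their gold answers."""
    hits = [
        exact_match(pred, golds, normalize)
        for pred, golds in zip(predictions, gold_answer_sets)
    ]
    # An empty evaluation set yields 0.0 rather than a division error.
    return sum(hits) / len(hits) if hits else 0.0
```

With these assumptions, `exact_match("The Eiffel Tower.", ["Eiffel Tower"])` returns 1, while the same call with `normalize=False` returns 0 because the raw strings differ.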