ARL: Agentic Reinforcement Learning, i.e., training LLMs to reason and use tools with reinforcement learning reward signals.
LoRA: Low-Rank Adaptation, a parameter-efficient fine-tuning technique that freezes the pretrained model weights and trains small low-rank decomposition matrices added alongside them.
LEAS: Linear Effect Attribution System, a diagnostic framework proposed in this paper to decompose agent performance into individual capability effects and interaction terms.
Gradient Conflict: When the gradient vectors for two different tasks (e.g., reasoning vs. tool use) point in different, often orthogonal, directions, making joint optimization difficult.
Seesaw Phenomenon: A situation where improving one metric or capability causes a simultaneous decline in another.
Exact Match (EM): A strict evaluation metric where the generated answer must exactly match the ground truth.
Token Routing: The process of directing specific tokens (based on their role) to specific network modules or adapters.
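
To make the Gradient Conflict entry concrete, the sketch below measures the angle between two task gradients with cosine similarity. It is a minimal illustration, not the paper's procedure: the function names, the flat-list gradient representation, and the `tol` threshold are all assumptions. A cosine near 0 corresponds to the near-orthogonal case named in the definition; a clearly negative cosine (directly opposing gradients) is reported separately as an added distinction.

```python
import math


def cosine_similarity(g1, g2):
    """Cosine of the angle between two flattened gradient vectors."""
    if len(g1) != len(g2):
        raise ValueError("gradient vectors must have the same length")
    dot = sum(a * b for a, b in zip(g1, g2))
    norm1 = math.sqrt(sum(a * a for a in g1))
    norm2 = math.sqrt(sum(b * b for b in g2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # a zero gradient has no direction; treat as no alignment
    return dot / (norm1 * norm2)


def classify_gradient_pair(g1, g2, tol=0.1):
    """Label a gradient pair; `tol` is an arbitrary illustrative threshold."""
    cos = cosine_similarity(g1, g2)
    if cos < -tol:
        return "opposing"
    if abs(cos) <= tol:
        return "near-orthogonal"
    return "aligned"


# Hypothetical two-dimensional gradients for a reasoning and a tool-use loss.
reasoning_grad = [1.0, 0.0]
tool_use_grad = [0.0, 1.0]
label = classify_gradient_pair(reasoning_grad, tool_use_grad)
```

In practice the two vectors would be obtained by flattening the per-task gradients of the shared parameters; that step depends on the training framework and is omitted here.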
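
The Exact Match (EM) entry can be sketched as a small scoring function. The strict comparison follows the definition above. The optional normalization step (lowercasing, dropping punctuation and English articles) is a common convention in question-answering evaluation and is an assumption here, not something the glossary specifies; all function names are hypothetical.

```python
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, ground_truth, normalize=False):
    """Return 1 if the two answers match exactly, else 0.

    With normalize=False this is a strict string comparison.
    With normalize=True both strings are normalized first (an assumption).
    """
    if normalize:
        prediction = normalize_answer(prediction)
        ground_truth = normalize_answer(ground_truth)
    return int(prediction == ground_truth)


def em_score(predictions, ground_truths, normalize=False):
    """Average exact-match score over paired predictions and references."""
    pairs = list(zip(predictions, ground_truths))
    if not pairs:
        return 0.0
    return sum(exact_match(p, g, normalize) for p, g in pairs) / len(pairs)


# Illustrative (made-up) predictions and references.
score = em_score(["Paris", "Rome"], ["paris", "Madrid"], normalize=True)
```

Because EM gives no partial credit, a prediction such as "Paris, France" scores 0 against the reference "Paris" with or without normalization.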