LGA: Layered Governance Architecture—the four-layer defense framework proposed in this paper
prompt injection: An attack in which a user supplies malicious instructions that override the model's original system instructions
RAG poisoning: Retrieval-Augmented Generation poisoning—inserting malicious data into a knowledge base so that the agent retrieves and acts on it
malicious skill plugins: Third-party extensions that perform unauthorized operations (e.g., data exfiltration) while executing legitimate functions, similar to supply-chain attacks
intent verification: The process of checking if a proposed tool call semantically aligns with the user's original stated intent
seccomp: Secure Computing Mode—a Linux kernel feature that restricts the system calls a process can make
HMAC-SHA256: Hash-Based Message Authentication Code using SHA-256—a cryptographic method to verify data integrity and authenticity
NLI: Natural Language Inference—a task determining if a hypothesis logically follows from a premise, used here as a baseline for intent verification
RAG: Retrieval-Augmented Generation—enhancing model outputs by retrieving relevant documents from a knowledge base
IR: Interception Rate—the percentage of malicious attacks successfully blocked by the defense
FPR: False Positive Rate—the percentage of benign/legitimate tool calls incorrectly blocked by the defense
TTL: Time To Live—a limit on the period of time or number of hops for which a packet or token remains valid
PPV: Positive Predictive Value—the probability that a positive result (flagged attack) is truly a malicious attack
unshare: A Linux command (and underlying system call) that runs a program in namespaces not shared with its parent process, providing the isolation on which containers are built
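
The three evaluation metrics defined above (IR, FPR, PPV) are all ratios over the same four outcome counts, so a small worked example may help keep them apart. The sketch below is illustrative only: the `Counts` class, its field names, and the sample numbers are hypothetical and are not taken from the paper, and returning 0.0 for an empty denominator is an assumption made to keep the example simple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical outcome counts for a defense evaluated on tool calls."""
    blocked_attacks: int  # malicious calls that were blocked (true positives)
    missed_attacks: int   # malicious calls that got through (false negatives)
    blocked_benign: int   # benign calls that were blocked (false positives)
    allowed_benign: int   # benign calls that were allowed (true negatives)


def _ratio(numerator: int, denominator: int) -> float:
    # Assumption: report 0.0 instead of raising when there is nothing to count.
    return numerator / denominator if denominator else 0.0


def interception_rate(c: Counts) -> float:
    """IR: share of all attacks that the defense blocked."""
    return _ratio(c.blocked_attacks, c.blocked_attacks + c.missed_attacks)


def false_positive_rate(c: Counts) -> float:
    """FPR: share of all benign calls that the defense blocked."""
    return _ratio(c.blocked_benign, c.blocked_benign + c.allowed_benign)


def positive_predictive_value(c: Counts) -> float:
    """PPV: share of all blocked calls that were truly attacks."""
    return _ratio(c.blocked_attacks, c.blocked_attacks + c.blocked_benign)


# Made-up example: 100 attacks (90 blocked) and 100 benign calls (5 blocked).
example = Counts(blocked_attacks=90, missed_attacks=10,
                 blocked_benign=5, allowed_benign=95)
print(f"IR={interception_rate(example):.3f} "
      f"FPR={false_positive_rate(example):.3f} "
      f"PPV={positive_predictive_value(example):.3f}")
```

IR and FPR are each computed within one class of input (attacks only, benign calls only), whereas PPV depends on the mix of attacks and benign calls in the evaluation set, so it can shift even when IR and FPR stay fixed.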