COP: Contrastive Optimization on a Passage—an optimization method that updates an adversarial passage to be similar to triggered queries and dissimilar to clean queries
AaaA: Alignment as an Attack—a technique where the attacker injects content that triggers the LLM's safety filters (e.g., privacy warnings), causing it to refuse to answer
SFaaA: Selective-Fact as an Attack—injecting real but biased factual articles to steer the sentiment of the LLM's response without triggering hallucination filters
DoS: Denial of Service—an attack preventing the system from providing a valid response (in this context, causing the LLM to refuse to answer)
Trigger Scenario: A set of queries sharing specific semantic characteristics (e.g., discussing a specific politician) that activate the attack
Retriever Backdoor: A vulnerability where the retrieval system functions normally for clean queries but always retrieves a specific adversarial passage for triggered queries
Embedding: A vector representation of text used to calculate similarity between queries and documents
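
To make the Embedding and COP entries concrete, here is a minimal sketch of how an embedding-based contrastive score could be computed. It is an illustration under stated assumptions, not the objective used by the original work: the function names (`cosine`, `contrastive_score`), the use of cosine similarity, the simple mean-difference form, and the toy 2-D vectors are all hypothetical choices made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed non-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_score(passage, triggered, clean):
    """Hypothetical COP-style score: mean similarity to triggered queries
    minus mean similarity to clean queries. Higher means the passage is
    closer to the trigger scenario and farther from clean traffic."""
    pos = sum(cosine(passage, q) for q in triggered) / len(triggered)
    neg = sum(cosine(passage, q) for q in clean) / len(clean)
    return pos - neg

# Toy embeddings (made up for illustration only).
triggered_queries = [[1.0, 0.0], [0.9, 0.1]]
clean_queries = [[0.0, 1.0], [0.1, 0.9]]

aligned_passage = [1.0, 0.0]   # points toward the triggered queries
neutral_passage = [1.0, 1.0]   # equally close to both query sets

aligned = contrastive_score(aligned_passage, triggered_queries, clean_queries)
neutral = contrastive_score(neutral_passage, triggered_queries, clean_queries)
print(aligned > neutral)
```

In this toy setup, an optimizer that raises the score would push the passage embedding toward the triggered queries and away from the clean ones, which is the behavior the Retriever Backdoor entry describes: clean queries are unaffected while triggered queries retrieve the adversarial passage.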