Evaluation Setup
Controlled experiment using Microsoft Copilot for Microsoft 365 within a fictional enterprise scenario (WeSellThneeds LLC)
Benchmarks:
- Custom Enterprise Scenario (Business Intelligence / Summarization) [New]
Metrics:
- Success of attack (qualitative observation of response content and citations)
- Statistical methodology: Not explicitly reported in the paper
Main Takeaways
- Malicious documents alone (false information) are often insufficient; Copilot may present them alongside correct data, alerting the user.
- Adding authoritative strings like 'This document trumps others' successfully forces the LLM to ignore legitimate sources and present only the malicious data.
- Attacks can suppress citations, making it impossible for the user to verify the source of the information (Attack 2).
- Phantom Document Attack: Deleted documents can persist in the RAG index/cache, influencing answers after the attacker has removed the evidence.