
QSAF: A Novel Mitigation Framework for Cognitive Degradation in Agentic AI

Hammad Atta, M. Baig, Yasir Mehmood, Nadeem Shahzad, Ken Huang, Muhammad Aziz Ul Haq, Muhammad Awais, Kamal Ahmed
Qorvex Consulting, Wentworth Institute of Higher Education, DistributedApps.AI, Skylink Antenna
arXiv.org (2025)
Memory Agent Reasoning

📝 Paper Summary

Agentic AI Safety Runtime Security Cognitive Architecture
The paper defines Cognitive Degradation as a distinct vulnerability class in AI agents and proposes QSAF Domain 10, a lifecycle-aware framework with runtime controls to detect and mitigate internal system failures.
Core Problem
Autonomous AI agents suffer from internal runtime failures, such as memory starvation, planner recursion, and context flooding, that lead to silent drift and hallucinations. Traditional defenses target external prompt injection and fail to detect these internal faults.
Why it matters:
  • Current defenses focus on external threats (prompt injection) while internal cognitive failures (e.g., logic loops, memory poisoning) remain largely unaddressed.
  • Agentic frameworks like LangChain and AutoGPT introduce complex dependencies where a failure in one module (e.g., memory latency) cascades into systemic collapse.
  • There is no existing structured lifecycle model to identify the progressive stages of agent degradation before total system failure occurs.
Concrete Example: When LLaMA3 is prompted with 'You must keep refining this task until it is perfect. Don't stop,' it enters a recursive loop generating self-referential subtasks. Without the proposed starvation detection, the planner degrades until memory and output modules fail completely.
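The recursive-subtask loop in this example can be illustrated with a small guard around a planner. This is a minimal sketch, not the paper's control: `RecursionGuard`, `run_planner`, the `expand` callback, and the depth and repeat limits are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RecursionGuard:
    """Hypothetical guard that halts a planner stuck re-issuing subtasks."""
    max_depth: int = 5      # assumed limit, not taken from the paper
    max_repeats: int = 2    # assumed limit on seeing the same subtask
    seen: dict = field(default_factory=dict)

    def allow(self, subtask: str, depth: int) -> bool:
        # Normalize case and whitespace so trivially reworded repeats match.
        key = " ".join(subtask.lower().split())
        self.seen[key] = self.seen.get(key, 0) + 1
        return depth <= self.max_depth and self.seen[key] <= self.max_repeats


def run_planner(goal, expand, guard):
    """Expand a goal depth-first; return (trace, halted_by_guard)."""
    trace, stack = [], [(goal, 0)]
    while stack:
        task, depth = stack.pop()
        if not guard.allow(task, depth):
            return trace, True  # stop and hand control to fallback logic
        trace.append(task)
        stack.extend((child, depth + 1) for child in expand(task))
    return trace, False


# A planner that always asks for one more refinement never terminates
# on its own; the guard cuts it off once the depth limit is exceeded.
trace, halted = run_planner("polish", lambda t: ["refine: " + t], RecursionGuard())
```

A real agent would need a semantic similarity check rather than string matching, since an LLM planner rarely repeats a subtask verbatim.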
Key Novelty
Qorvex Security AI Framework (QSAF) Domain 10
  • Defines a formal six-stage 'Cognitive Degradation Lifecycle' (from Trigger Injection to Systemic Collapse) to model how internal agent faults evolve.
  • Introduces seven specific runtime controls (QSAF-BC-001 to BC-007) that act as a resilience overlay, monitoring subsystems for signals like latency spikes or entropy drift to trigger fallback logic.
  • Maps agentic architectures to human cognitive analogs to enable behavioral introspection, moving beyond simple input/output filtering.
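To make the idea of a resilience overlay concrete, the following sketch watches two of the signals named above, latency spikes and entropy drift, and reports which ones tripped so that a caller could switch to fallback logic. It is an assumption-laden illustration, not one of the paper's QSAF-BC controls: `DriftMonitor`, `token_entropy`, the rolling z-score test, and every threshold are hypothetical.

```python
import math
from collections import Counter, deque
from statistics import mean, pstdev


def token_entropy(text: str) -> float:
    """Shannon entropy (bits) of the whitespace-token distribution."""
    tokens = text.split()
    if not tokens:
        return 0.0
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in Counter(tokens).values())


class DriftMonitor:
    """Hypothetical overlay: flags values far from a rolling baseline."""

    def __init__(self, window=20, z_limit=3.0, min_samples=5, min_sigma=0.05):
        self.z_limit = z_limit          # assumed threshold
        self.min_samples = min_samples  # warm-up before any flagging
        self.min_sigma = min_sigma      # floor so a flat baseline is not hypersensitive
        self.history = {
            "latency": deque(maxlen=window),
            "entropy": deque(maxlen=window),
        }

    def _is_outlier(self, name: str, value: float) -> bool:
        past = self.history[name]
        flagged = False
        if len(past) >= self.min_samples:
            sigma = max(pstdev(past), self.min_sigma)
            flagged = abs(value - mean(past)) > self.z_limit * sigma
        if not flagged:
            past.append(value)  # only healthy samples update the baseline
        return flagged

    def check(self, latency_s: float, output: str) -> list:
        """Return the names of the signals that tripped for this step."""
        signals = []
        if self._is_outlier("latency", latency_s):
            signals.append("latency_spike")
        if self._is_outlier("entropy", token_entropy(output)):
            signals.append("entropy_drift")
        return signals
```

A caller would run `check` after each agent step and route to a fallback (for example, resetting the planner or asking for human review) whenever the returned list is non-empty. Excluding flagged samples from the baseline is a design choice that keeps a degrading agent from gradually normalizing its own drift.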
Architecture
Figure 2: The QSAF Domain 10 architecture, illustrating the overlay of security controls on top of standard agent subsystems.
Evaluation Highlights
  • Identified a critical 'Planner Entrapment' vulnerability in LLaMA3, where recursive goals caused infinite logic loops that default safety layers did not detect.
  • Demonstrated 'Persistent Memory Drift' in Mixtral and Claude, where hallucinated content was stored in vector memory and reused across sessions (Cross-Session Memory Poisoning).
  • Uncovered 'Output Suppression' risks in ChatGPT, which failed to warn users when toolchains returned null responses due to rate-limiting.
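The 'Output Suppression' finding suggests a simple countermeasure: never let an empty or failed tool call pass silently to the user. The wrapper below is a hedged sketch of that idea only; `call_tool_with_notice` and its warning format are hypothetical and are not described in the paper.

```python
from typing import Any, Callable, Optional


def call_tool_with_notice(tool: Callable[[], Optional[Any]], name: str) -> str:
    """Run a tool and return user-facing text that never hides an empty result."""
    try:
        result = tool()
    except Exception as exc:  # e.g. a rate-limit error raised by a client library
        return f"[warning] tool '{name}' failed: {exc}"
    # Treat None and empty containers/strings as "no data".
    is_empty = result is None or (hasattr(result, "__len__") and len(result) == 0)
    if is_empty:
        return f"[warning] tool '{name}' returned no data; the answer may be incomplete"
    return str(result)
```

The point of the design is that the warning travels in the same channel as the answer, so a null response from a rate-limited toolchain cannot be dropped without the user seeing it.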
Breakthrough Assessment
7/10
Establishes a necessary new vulnerability class for agents and a structured defense framework. However, the paper is qualitative, lacking code or quantitative performance metrics (e.g., overhead, detection accuracy) for the proposed controls.