← Back to Paper List

Temporal Attack Pattern Detection in Multi-Agent AI Workflows: An Open Framework for Training Trace-Based Security Models

Ron F. Del Rosario
SAP, OWASP Gen AI Security Project - Agentic Security Initiative (ASI)
arXiv (2025)
Agent Benchmark Factuality

📝 Paper Summary

Agentic AI Security Trace-based Analysis
This paper introduces a reproducible framework for training small language models to detect malicious multi-step agent behaviors by fine-tuning on synthetic OpenTelemetry traces and curated cybersecurity datasets.
Core Problem
Existing AI safety mechanisms focus on single-turn text generation (like prompt injection) but fail to detect multi-step attack patterns that emerge across agent workflows, such as stealth privilege escalation or multi-agent coordination attacks.
Why it matters:
  • Commercial vendors use trace-based monitoring but keep their methodologies closed, preventing practitioners from building custom security models adapted to specific threat landscapes
  • Benign individual actions (e.g., 'list directory') can be malicious in aggregate (e.g., reconnaissance), requiring temporal context that single-prompt safety filters miss
  • Current benchmarks focus on harmful task completion rather than the trace-based behavioral analysis needed for operational security monitoring
Concrete Example: A workflow like 'read_file(/etc/passwd) → http_request(attacker.com)' represents data exfiltration. While 'read_file' might be benign in isolation, the sequence reveals malicious intent. Standard safety filters examining only the 'read_file' prompt would miss the broader attack context.
Key Novelty
Trace-Based Temporal Pattern Detection via Fine-Tuning
  • Treats security monitoring as a sequence modeling problem by fine-tuning LLMs on OpenTelemetry traces, allowing the model to analyze timestamps, agent IDs, and tool outputs collectively
  • Generates synthetic 'attack traces' using templates to simulate complex multi-agent scenarios (coordination attacks, regulatory violations) that are scarce in public datasets
Architecture
Architecture Figure Not explicitly labeled as Figure 1 but described in Methodology
Conceptual flow of the security monitoring pipeline: Raw OpenTelemetry Traces → Trace Parser → Prompt Construction → Fine-Tuned LLM → Security Verdict
Evaluation Highlights
  • +31.4% accuracy improvement (42.86% → 74.29%) on a custom cybersecurity benchmark after iterative fine-tuning
  • Achieved statistically significant gains (p < 0.001) using only 0.148 epochs of training on ARM64 hardware
  • Demonstrated that adding just 30 targeted adversarial examples yielded a +7.2 point gain in the final refinement stage
Breakthrough Assessment
7/10
First open methodology for trace-based agentic security with strong educational value. However, the model suffers from severe false positives (66.7%) in practice, limiting autonomous deployment.
×