BFSI: Banking, Financial Services, and Insurance—the specific regulated domain this paper targets
RAHS: Risk-Adjusted Harm Score—a novel metric quantifying the operational severity of a harmful disclosure, accounting for disclaimers and judge consensus
FinRedTeamBench: The proposed benchmark dataset comprising 989 adversarial prompts across 7 financial risk categories
Jailbreak: A prompt or interaction strategy designed to bypass an LLM's safety guardrails
AML: Anti-Money Laundering, the laws and controls intended to prevent illegally obtained funds from being disguised as legitimate income
Ensemble Judging: Using multiple LLMs with different specializations (safety, reasoning, efficiency) to evaluate model outputs
Adaptive Red-Teaming: An attack process where the adversary model updates its strategy based on the target's previous responses
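
To make the last two entries concrete, here is a minimal Python sketch of how ensemble judging and adaptive red-teaming could fit together. It is an illustration under stated assumptions, not the paper's implementation: the function names (`ensemble_verdict`, `adaptive_red_team`), the "harmful"/"safe" labels, the strict-majority vote, and the toy adversary, target, and judges are all hypothetical stand-ins for real LLM calls.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces: a judge maps (prompt, response) to "harmful" or "safe".
Judge = Callable[[str, str], str]
Turn = Tuple[str, str, str]  # (prompt, response, ensemble label)


def ensemble_verdict(judges: List[Judge], prompt: str, response: str) -> Tuple[str, float]:
    """Combine several judges by strict majority (an assumed aggregation rule)."""
    harmful_votes = sum(1 for judge in judges if judge(prompt, response) == "harmful")
    harmful_fraction = harmful_votes / len(judges)
    label = "harmful" if harmful_votes * 2 > len(judges) else "safe"
    return label, harmful_fraction


def adaptive_red_team(
    adversary: Callable[[List[Turn]], str],
    target: Callable[[str], str],
    judges: List[Judge],
    seed_prompt: str,
    max_turns: int = 5,
) -> List[Turn]:
    """Query the target, judge the reply, and let the adversary adapt to the history."""
    history: List[Turn] = []
    prompt = seed_prompt
    for _ in range(max_turns):
        response = target(prompt)
        label, _ = ensemble_verdict(judges, prompt, response)
        history.append((prompt, response, label))
        if label == "harmful":
            break  # the attack succeeded; stop escalating
        prompt = adversary(history)  # next prompt depends on previous responses
    return history


# Toy stand-ins (assumptions for illustration only; no real model is called).
STRATEGIES = ["direct", "role-play", "hypothetical"]


def toy_adversary(history: List[Turn]) -> str:
    # Switch to the next strategy after each refused attempt.
    strategy = STRATEGIES[min(len(history), len(STRATEGIES) - 1)]
    return f"[{strategy}] placeholder request"


def toy_target(prompt: str) -> str:
    return "COMPLIED" if prompt.startswith("[hypothetical]") else "REFUSED"


def strict_judge(prompt: str, response: str) -> str:
    return "harmful" if response == "COMPLIED" else "safe"


def lenient_judge(prompt: str, response: str) -> str:
    return "safe"


history = adaptive_red_team(
    toy_adversary,
    toy_target,
    [strict_judge, strict_judge, lenient_judge],
    seed_prompt="[direct] placeholder request",
)
for prompt, response, label in history:
    print(prompt, "->", response, "->", label)
```

In this toy run the adversary changes strategy after each refusal, and the loop stops once a majority of judges flags a response. A real setup would replace the toy callables with model calls and would likely use a richer aggregation than a simple majority vote.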