RapidAPI: A large marketplace of third-party APIs used here for diverse, real-time financial services
AkShare: An open-source Python library providing reliable, research-oriented financial data interfaces
TMR: Timeliness Mismatch Rate—fraction of questions where the agent used tools with insufficient data freshness (e.g., daily vs. real-time)
IMR: Intent Mismatch Rate—fraction of questions where the agent violated intent constraints (e.g., executing a transaction when only information was requested)
DMR: Domain Mismatch Rate—fraction of questions where the agent used tools from the wrong regulatory domain (e.g., accessing crypto tools for a stock query)
FATR: Finance-Aware Tool Retrieval—a baseline method proposed in the paper that injects finance attributes into tool cards and stabilizes execution
Soft Score: A correctness metric, assigned by an LLM judge, for structured or free-text answers
CSS: Compliance-aware Soft Score—the mean Soft Score over samples with successful tool execution
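
The rate-style metrics above (TMR, IMR, DMR) and CSS reduce to simple averages over per-question records. The following is a minimal sketch of that arithmetic, not the paper's evaluation code. The `Sample` record, its field names, and the `summarize` helper are hypothetical. It also assumes that the mismatch rates are taken over all questions and that Soft Score lies in [0, 1], neither of which is stated in the definitions above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class Sample:
    """One evaluated question (hypothetical record layout)."""
    timeliness_mismatch: bool  # tool data was not fresh enough
    intent_mismatch: bool      # agent violated an intent constraint
    domain_mismatch: bool      # tool came from the wrong regulatory domain
    executed_ok: bool          # tool execution succeeded
    soft_score: float          # LLM-judge score, assumed to lie in [0, 1]


def rate(flags: Iterable[bool]) -> float:
    """Fraction of True flags; 0.0 for an empty input."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


def summarize(samples: List[Sample]) -> Dict[str, Optional[float]]:
    """Compute TMR, IMR, DMR over all samples and CSS over successful runs."""
    ok_scores = [s.soft_score for s in samples if s.executed_ok]
    return {
        "TMR": rate(s.timeliness_mismatch for s in samples),
        "IMR": rate(s.intent_mismatch for s in samples),
        "DMR": rate(s.domain_mismatch for s in samples),
        # CSS: mean Soft Score restricted to samples whose tools executed;
        # None when no sample executed successfully.
        "CSS": sum(ok_scores) / len(ok_scores) if ok_scores else None,
    }


if __name__ == "__main__":
    demo = [
        Sample(True, False, False, True, 1.0),
        Sample(False, False, False, True, 0.5),
        Sample(False, True, False, False, 0.0),
        Sample(False, False, True, True, 0.0),
    ]
    print(summarize(demo))
```

Note that the sample with a failed execution still counts toward the three mismatch rates in this sketch but is excluded from CSS, which is why the two kinds of metric can have different denominators.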