
FinMoE: A MoE-based Large Chinese Financial Language Model

X Zhang, Q Yang
Du Xiaoman Financial
Proceedings of the Joint Workshop of the 9th Financial …, 2025
Tags: Pretraining · Reasoning · QA · Benchmark

📝 Paper Summary

Topics: Financial Large Language Models · Mixture-of-Experts (MoE) Architecture
FinMoE is a dense Mixture-of-Experts Chinese financial language model that activates all experts for every input, balancing specialized financial expertise with general reasoning capabilities.
Core Problem
General-purpose models lack depth in financial specifics, while models tailored exclusively to finance lose broader reasoning abilities needed for complex tasks.
Why it matters:
  • Financial tasks require both precise domain terminology and general common-sense reasoning to solve real-world problems like risk assessment.
  • Sparse MoE models can suffer from training instability and uneven expert utilization, limiting their effectiveness in integrating diverse knowledge types.
Concrete Example: Answering a financial question often requires integrating specific methodologies (domain knowledge) with general contextual awareness; a purely general model might miss the methodology, while a purely financial one might fail the reasoning steps.
Key Novelty
Dense Mixture-of-Experts for Domain Adaptation
  • Unlike standard Sparse MoE that selects top-k experts, FinMoE uses a Dense MoE where all experts are activated for every input token.
  • Outputs from all expert networks are combined via an input-dependent weighted summation, so every expert contributes to the final representation for every token.
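The dense gating described above can be sketched in plain Python. This is a minimal illustration of the general dense-MoE idea (softmax gate over all experts, weighted sum of every expert's output, no top-k selection); the expert functions and gate weights here are made up for the example and are not the paper's implementation:

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of gate logits
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dense_moe(x, experts, gate_weights):
    """Dense MoE layer: run ALL experts on x and mix their outputs
    with input-dependent softmax gate weights (no top-k routing)."""
    # one gate logit per expert: a simple dot product with the input
    logits = [sum(wi * xi for wi, xi in zip(w, x)) for w in gate_weights]
    gates = softmax(logits)          # all gates > 0: every expert contributes
    outputs = [expert(x) for expert in experts]  # every expert is activated
    dim = len(outputs[0])
    return [sum(g * out[d] for g, out in zip(gates, outputs))
            for d in range(dim)]

# Hypothetical toy experts (stand-ins for a "financial" and a "general" expert)
experts = [lambda x: [2 * v for v in x],    # expert 0: scales the input
           lambda x: [v + 1 for v in x]]    # expert 1: shifts the input
gate_weights = [[0.5, -0.2], [0.1, 0.3]]    # one gating row per expert
y = dense_moe([1.0, 2.0], experts, gate_weights)
```

Because the gates come from a softmax, the output is always a convex combination of the expert outputs, which is the property that lets specialized and general experts blend rather than compete. A sparse top-k router would instead zero out all but a few gates.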
Evaluation Highlights
  • Achieves a score of 80 on the Finance benchmark, significantly outperforming Qwen-7B (30.2) and Yi-6B (19.4).
  • Maintains strong general capabilities, scoring 70.6 on Knowledge tasks compared to Qwen-7B's 67.6.
  • Demonstrates balanced performance across 6 domains (Language, Knowledge, Reasoning, Subject, Code, Finance) with an average score of 62.5, higher than baselines.
Breakthrough Assessment
7/10
Strong empirical results on financial benchmarks using a dense MoE approach. While the architecture (Dense MoE) is known, applying it specifically to balance financial vs. general trade-offs is a solid application contribution.