MASE: Multi-Agent Self-Evolving—a paradigm where agents continuously optimize their internal components and interaction patterns based on environmental feedback
MOP: Model Offline Pretraining—traditional static training of models on fixed corpora
MOA: Model Online Adaptation—post-deployment updates via fine-tuning or RLHF
MAO: Multi-Agent Orchestration—coordinating fixed agents to solve tasks without structural evolution
RAG: Retrieval-Augmented Generation—enhancing model responses by retrieving relevant information from external memory
SFT: Supervised Fine-Tuning—training a model on labeled examples to adapt it to specific tasks
RLHF: Reinforcement Learning from Human Feedback—aligning models using rewards derived from human preferences
MCTS: Monte Carlo Tree Search—a heuristic search algorithm that builds a search tree from randomly sampled rollouts to choose among candidate actions
LoRA: Low-Rank Adaptation—efficiently fine-tuning models by training small low-rank matrices added to frozen pretrained weights instead of updating all parameters
topology: The structural configuration defining how agents are connected and communicate within a multi-agent system
meta-rewards: Higher-level reward signals used to guide the long-term evolution and optimization of agent systems
MCP: Model Context Protocol—a standardized communication protocol for connecting AI agents to data sources and tools