SOAEs: State-Owned Assets and Enterprises—the specific Chinese industrial and economic sector targeted by this domain model.
Domain-Progressive SFT: A curriculum learning strategy that gradually shifts training data from weakly relevant general conversations to highly specialized expert data.
Speculative Decoding: An inference acceleration technique where a small, fast model drafts tokens and a larger, accurate model verifies them in parallel.
Logit Distillation: A training technique in which a smaller model learns to match the output probability distributions computed from a larger model's logits (pre-softmax scores), so that the two models' outputs align.
ROUGE-1: A metric that evaluates generated text by measuring the overlap of unigrams (single words) between the generated text and a reference; it is usually reported as recall, precision, or F1.
BLEU-4: A metric that evaluates text quality by combining the clipped precision of 1-grams through 4-grams between the model output and the reference text, with a brevity penalty for outputs that are too short.
SFT: Supervised Fine-Tuning—training a language model on high-quality instruction-response pairs to teach it how to follow user commands.
Catastrophic forgetting: When a neural network loses much of its previously learned general knowledge after being trained on new, specialized data.
Curriculum learning: A training strategy that presents data in a meaningful order (e.g., from general to specific) rather than randomly.
GLM4: General Language Model 4, a proprietary large language model used in this paper to generate synthetic Q&A pairs.
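
To make the two overlap metrics defined above concrete, here is a minimal sketch of ROUGE-1 and of the clipped n-gram precision that BLEU-4 builds on. It is an illustration under stated assumptions, not the evaluation code used in the paper: the function names (`rouge1`, `clipped_precision`) are hypothetical, text is split on whitespace (Chinese text would first need word segmentation or character-level tokens), and the full BLEU-4 score would additionally take the geometric mean over n = 1..4 and apply a brevity penalty.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge1(candidate, reference):
    """Unigram overlap reported as (precision, recall, F1).

    Assumption: whitespace tokenization, single reference.
    """
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    if overlap == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def clipped_precision(candidate, reference, n):
    """Clipped n-gram precision, the per-order component of BLEU.

    Each candidate n-gram is credited at most as many times as it
    appears in the reference.
    """
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    return sum((cand & ref).values()) / total


if __name__ == "__main__":
    hyp = "the cat sat on the mat"
    ref = "the cat is on the mat"
    print(rouge1(hyp, ref))                # unigram precision, recall, F1
    print(clipped_precision(hyp, ref, 2))  # bigram precision
```

In this example five of the six candidate words also appear in the reference, so ROUGE-1 precision and recall are both 5/6, while only three of the five candidate bigrams match, which shows why higher-order n-gram metrics such as BLEU-4 are stricter about word order.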