Depth-up Scaling: A method for increasing a model's depth (number of layers) efficiently by initializing the new layers from existing ones rather than training them from scratch (see the sketch under the DuS entry below)
Korea-centric AI: AI designed to internalize uniquely Korean values, cognitive frameworks, and commonsense reasoning, rather than relying on simple translation
KMMLU: Korean Massive Multitask Language Understanding—a benchmark for evaluating LLMs on various subjects in Korean
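To make the evaluation setup concrete, here is a minimal sketch of how a KMMLU-style four-option item can be rendered into a prompt. The example item, field names, and template are illustrative assumptions, not the official KMMLU schema or evaluation harness.

```python
# Minimal sketch: formatting a KMMLU-style multiple-choice item for evaluation.
# The item and prompt template below are illustrative assumptions; the official
# benchmark ships its own data format and harness.

def format_kmmlu_prompt(item: dict) -> str:
    """Render a four-option multiple-choice question as a single prompt."""
    choices = "\n".join(f"{label}. {item[label]}" for label in "ABCD")
    return f"질문: {item['question']}\n{choices}\n정답:"

# Hypothetical item in a four-option format.
item = {
    "question": "조선의 첫 번째 왕은 누구인가?",  # "Who was the first king of Joseon?"
    "A": "세종", "B": "이성계", "C": "정조", "D": "광해군",
    "answer": "B",
}

prompt = format_kmmlu_prompt(item)
# A model is then scored by whether its most likely continuation
# (or highest-likelihood option) matches item["answer"].
print(prompt)
```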
Organic data: Naturally occurring, human-authored text (e.g., web pages, books, news)
Synthetic data: Text generated or augmented by AI models (e.g., machine translations, rewrites, Chain-of-Thought reasoning traces)
Chain-of-Thought (CoT): A prompting technique where the model is encouraged to generate intermediate reasoning steps before the final answer
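As a concrete illustration, the sketch below contrasts a direct prompt with a zero-shot CoT prompt. The question is invented for illustration; "Let's think step by step" is the commonly used zero-shot CoT trigger phrase.

```python
# Minimal illustration of Chain-of-Thought prompting: the same question asked
# directly versus with an instruction to reason step by step. The question is
# an invented example.

question = (
    "A bakery sells 14 rolls per tray and bakes 6 trays. "
    "23 rolls are unsold. How many rolls were sold?"
)

direct_prompt = f"Q: {question}\nA:"
cot_prompt = f"Q: {question}\nA: Let's think step by step."

# With CoT, the model is expected to emit intermediate steps, e.g.:
#   14 rolls/tray * 6 trays = 84 rolls baked.
#   84 - 23 unsold = 61 rolls sold.
#   The answer is 61.
print(direct_prompt)
print(cot_prompt)
```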
DuS: Depth-up Scaling—the specific scaling strategy used to grow the 8B model to 11.5B
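Below is a minimal sketch of a SOLAR-style depth-up scaling recipe: duplicate the decoder stack, drop the overlapping middle layers, and concatenate, so every new layer starts from trained weights rather than random initialization. The layer counts (32 grown to 48) are assumptions consistent with an 8B-to-11.5B Llama-class model, not figures stated here.

```python
# Sketch of depth-up scaling: copy the decoder stack, trim the overlapping
# middle, and stack the two halves so every layer is initialized from a
# trained one. The counts (32 -> 48) are assumptions for an 8B -> ~11.5B
# growth, not confirmed by this document.
import copy

def depth_up_scale(layers: list, keep: int) -> list:
    """Grow a stack of n layers to 2 * keep layers (keep <= n).

    Takes the first `keep` layers plus a deep copy of the last `keep`
    layers; the overlapping middle layers are dropped from one copy each,
    so no layer starts from random initialization.
    """
    bottom = layers[:keep]               # layers 0 .. keep-1
    top = copy.deepcopy(layers[-keep:])  # layers n-keep .. n-1
    return bottom + top

# Toy stand-in for 32 trained decoder layers (real code would operate on
# the layer list of a loaded checkpoint).
base_layers = [f"layer_{i}" for i in range(32)]
scaled = depth_up_scale(base_layers, keep=24)
print(len(scaled))  # 48 layers; continued pretraining then heals the seam
```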
Perplexity: A measurement of how well a probability model predicts a sample; lower perplexity indicates the model is less surprised by the text
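Concretely, perplexity is the exponential of the average negative log-likelihood per token: PPL = exp(-(1/N) Σ log p(x_i)). The short worked example below uses invented token probabilities.

```python
# Worked example: perplexity as exp of the average negative log-likelihood
# of the tokens. The probabilities below are invented for illustration.
import math

token_probs = [0.25, 0.10, 0.50, 0.05]  # p(x_i | x_<i) for each token

avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(avg_nll)

print(round(perplexity, 2))  # ~6.32: on average the model is about as
                             # "surprised" as a uniform choice over ~6 tokens
```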