SFT: Supervised Fine-Tuning—training a pre-trained model on a specific labeled dataset to adapt it to a downstream task.
LoRA: Low-Rank Adaptation—a parameter-efficient fine-tuning technique that freezes pre-trained weights and injects trainable rank decomposition matrices.
PEFT: Parameter-Efficient Fine-Tuning, a family of methods (such as LoRA) that adapt a pre-trained model by training only a small number of parameters, reducing compute and memory cost.
ROUGE: Recall-Oriented Understudy for Gisting Evaluation, a family of metrics that score generated text (mainly summaries, sometimes translations) by its n-gram or longest-common-subsequence overlap with reference texts.
BERTScore: A metric that computes semantic similarity between candidate and reference sentences using contextual embeddings from BERT.
Llama-3: A family of open-weights large language models developed by Meta.
RNN: Recurrent Neural Network, a class of neural networks that process a sequence one step at a time while carrying a hidden state forward, often used for text and other sequential data.
5-core dataset: A subset of data where each user and item has at least 5 interactions, ensuring sufficient history for modeling.