SAM: Segment Anything Model—a foundation model for image segmentation capable of zero-shot transfer via prompting.
PTQ: Post-Training Quantization, which reduces the numerical precision of a pre-trained model's weights and activations (e.g., 32-bit float to 8-bit integer) without full re-training.
QAT: Quantization-Aware Training, which re-trains a model with quantization simulated during training so that the weights adapt to quantization error.
Bimodal Distribution: A probability distribution with two distinct peaks (modes), separated by a sparse region.
Post-Key-Linear: The activations resulting from the linear projection that produces the 'Key' vectors in a Transformer attention block.
BIG: Bimodal Integration—the proposed method to merge bimodal activation peaks into a single unimodal distribution via sign flipping.
AGQ: Adaptive Granularity Quantization—the proposed method to dynamically adjust quantization precision for softmax outputs using a base-2 log scale.
mIoU: Mean Intersection over Union—a standard metric for segmentation accuracy measuring overlap between predicted and ground truth masks.
FLOPs: Floating-Point Operations, the number of floating-point arithmetic operations a model performs, used as a measure of computational cost.
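
The two proposed methods defined above (BIG and AGQ) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that each channel of the post-Key-Linear activations lies mostly on one side of zero, so flipping the negative channels merges the two peaks, and that the same flip applied to the queries leaves the attention scores unchanged. The function names, the per-channel mean rule for choosing signs, and the `tau` granularity parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def big_sign_flip(q, k):
    """Sketch of sign flipping in the spirit of BIG.

    q, k: arrays of shape (tokens, channels).
    Assumption: each channel of k sits mostly on one side of zero, so the
    tensor-wide histogram of k has two peaks.
    """
    # One +1/-1 per channel, chosen from the channel mean (illustrative rule).
    signs = np.where(k.mean(axis=0) < 0, -1.0, 1.0)
    # The same signs on q and k cancel in q @ k.T because signs**2 == 1.
    return q * signs, k * signs, signs


def log2_quantize(p, bits=4, tau=1):
    """Sketch of log-scale quantization for softmax outputs in (0, 1].

    tau is a hypothetical granularity knob: the grid is 2**(-code / tau),
    so tau=1 is plain base-2 log quantization and larger tau gives finer steps.
    Returns the integer codes and the dequantized values.
    """
    levels = 2 ** bits - 1
    safe_p = np.maximum(p, 1e-12)  # avoid log2(0)
    code = np.clip(np.round(-tau * np.log2(safe_p)), 0, levels)
    return code.astype(np.int64), 2.0 ** (-code / tau)
```

Because the flip is a per-channel multiplication by +1 or -1, it could in principle be folded into the weights of the query and key projections ahead of time, so no extra work would be needed at inference. Whether the original method does exactly this is not stated in the definitions above.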