Knowledge Preservation: The percentage of previously known scientific claims that remain correctly classified after a model update
Knowledge Acquisition: The proportion of new scientific claims successfully learned by the model through the update method
Knowledge Projection: The ability of the updated model to correctly infer or anticipate claims from 'future' papers not yet seen
Distortion: An error type in which a model confidently predicts the wrong label (e.g., asserting that a false claim is true); considered worse than simply not knowing
Atomic Scientific Claim: A statement expressing a finding about one aspect of a scientific entity or process that can be verified against a single source
RAG: Retrieval-Augmented Generation, which supplies new information to a model through the context/prompt rather than by updating its weights
SFT: Supervised Fine-Tuning, i.e., training a model on labeled examples
LoRA: Low-Rank Adaptation, a parameter-efficient fine-tuning technique that freezes the pre-trained weights and injects trainable low-rank decomposition matrices
OLMo: Open Language Model, a series of open-source large language models from the Allen Institute for AI (AI2)
Semantic Scholar API: A service used to retrieve scientific papers and citation graphs for dataset construction
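
The four evaluation terms above (Knowledge Preservation, Acquisition, Projection, and Distortion) can be illustrated with a minimal sketch. The exact formulas are not given in this glossary, so everything here is an assumption for illustration: the `Claim` record, the split names "known", "new", and "future", the use of `None` for an "unknown" answer, and the choice of denominators are all hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional


class Claim(NamedTuple):
    group: str             # hypothetical split name: "known", "new", or "future"
    gold: str              # gold label, e.g. "true" or "false"
    before: Optional[str]  # prediction before the update; None means "unknown"
    after: Optional[str]   # prediction after the update; None means "unknown"


def _rate(hits: int, total: int) -> float:
    # Return NaN rather than raising when a split is empty.
    return hits / total if total else float("nan")


def update_metrics(claims: List[Claim]) -> Dict[str, float]:
    # Assumption: preservation is measured only over claims the model
    # classified correctly before the update.
    known = [c for c in claims if c.group == "known" and c.before == c.gold]
    new = [c for c in claims if c.group == "new"]
    future = [c for c in claims if c.group == "future"]
    # Assumption: a distortion is any committed (non-None) wrong label
    # after the update; answering "unknown" is not counted.
    distorted = [c for c in claims if c.after is not None and c.after != c.gold]
    return {
        "preservation": _rate(sum(c.after == c.gold for c in known), len(known)),
        "acquisition": _rate(sum(c.after == c.gold for c in new), len(new)),
        "projection": _rate(sum(c.after == c.gold for c in future), len(future)),
        "distortion": _rate(len(distorted), len(claims)),
    }


if __name__ == "__main__":
    toy = [
        Claim("known", "true", "true", "true"),    # preserved
        Claim("known", "false", "false", "true"),  # forgotten and distorted
        Claim("new", "true", None, "true"),        # acquired
        Claim("new", "false", None, None),         # not acquired, no distortion
        Claim("future", "true", None, "false"),    # distorted
    ]
    print(update_metrics(toy))
```

Keeping "unknown" separate from a wrong label is what lets distortion be reported apart from a simple failure to acquire a claim, in line with the definition above that treats a confident wrong answer as worse than not knowing.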