falsification: The principle that scientific hypotheses cannot be proven true, only rejected (falsified) by contrary evidence
e-value: A non-negative random variable with expectation ≤ 1 under the null hypothesis, used for accumulating evidence in sequential testing
Type-I error: The probability of incorrectly rejecting a true null hypothesis (false positive)
p-to-e calibrator: A function that converts a standard p-value into an e-value (e.g., e = κ * p^(κ-1) for a fixed κ in (0, 1)) so that evidence from multiple tests can be aggregated
optional stopping: The ability to stop gathering evidence at any time based on the data observed so far without invalidating statistical guarantees
sub-hypothesis: A specific, measurable implication derived from a broad, abstract main hypothesis (e.g., the main hypothesis 'X regulates Y' implies the sub-hypothesis 'Expression of X correlates with Y')
ReAct: Short for Reasoning + Acting, a paradigm in which LLMs interleave reasoning traces with task-specific actions (such as code execution)
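
The e-value, p-to-e calibrator, optional stopping, and Type-I error entries above fit together in a single procedure, which the following minimal Python sketch tries to illustrate. It is not taken from the source: the function names `p_to_e` and `sequential_test`, the defaults κ = 0.5 and α = 0.1, and the choice to aggregate by multiplying e-values are all assumptions made for illustration. The product rule is only valid when each e-value has expectation ≤ 1 under the null given the tests that came before it.

```python
def p_to_e(p: float, kappa: float = 0.5) -> float:
    """Calibrate a p-value into an e-value via e = kappa * p**(kappa - 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if not 0.0 < kappa < 1.0:
        raise ValueError("kappa must lie in (0, 1)")
    return kappa * p ** (kappa - 1.0)


def sequential_test(p_values, alpha: float = 0.1, kappa: float = 0.5):
    """Accumulate evidence one test at a time and stop early when possible.

    Returns (rejected, tests_used, accumulated_evidence). The running
    product of e-values is compared against 1 / alpha after every test,
    which is what makes stopping at a data-dependent time permissible
    (assuming each e-value is valid conditional on the earlier ones).
    """
    evidence = 1.0
    tests_used = 0
    for tests_used, p in enumerate(p_values, start=1):
        evidence *= p_to_e(p, kappa)
        if evidence >= 1.0 / alpha:
            # Enough evidence against the null: reject and stop gathering data.
            return True, tests_used, evidence
    return False, tests_used, evidence


if __name__ == "__main__":
    # Hypothetical p-values from two sub-hypothesis tests.
    print(sequential_test([0.01, 0.04], alpha=0.1))
```

With κ = 0.5, the p-values 0.01 and 0.04 calibrate to roughly 5 and 2.5; their product of about 12.5 exceeds 1/α = 10, so the sketch rejects after the second test. Two unremarkable p-values of 0.5 each calibrate to about 0.71, so the accumulated evidence shrinks and no rejection occurs.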