Atomic statement: A polarized fact expressing a user's opinion about a single attribute or topic of an item (e.g., 'battery is long-lasting')
NLI: Natural Language Inference, the task of determining whether a hypothesis is entailed by, contradicted by, or neutral with respect to a premise
BERTScore: A metric evaluating text generation quality by computing similarity between contextual embeddings of candidate and reference texts
Hallucination: Generated content that is grammatically fluent but factually incorrect or unsupported by the input source
Triplets: Structured representations consisting of (statement, topic, sentiment) extracted from reviews
St2Exp-P: Statement-to-Explanation Precision, the proportion of statements extracted from the ground truth that are supported by the generated explanation
StEnt-P: Statement Entailment Precision, an NLI-based metric that computes the entailment probability between generated statements and ground-truth statements
Coherence score: The entailment probability minus the contradiction probability assigned by the NLI model, so values range from -1 (full contradiction) to 1 (full entailment)
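
The coherence score defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the NLIProbs container and both function names are hypothetical, the three class probabilities are assumed to come from some NLI model and to sum to 1, and averaging over several premise/hypothesis pairs is an assumed aggregation that the definition itself does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NLIProbs:
    """Hypothetical container for one NLI prediction (probabilities sum to 1)."""
    entailment: float
    neutral: float
    contradiction: float


def coherence_score(p: NLIProbs) -> float:
    """Entailment probability minus contradiction probability, in [-1, 1]."""
    return p.entailment - p.contradiction


def mean_coherence(preds: list[NLIProbs]) -> float:
    """Average coherence over several pairs (an assumed aggregation)."""
    if not preds:
        raise ValueError("need at least one NLI prediction")
    return sum(coherence_score(p) for p in preds) / len(preds)


if __name__ == "__main__":
    # Made-up probabilities, for illustration only.
    supported = NLIProbs(entailment=0.7, neutral=0.2, contradiction=0.1)
    conflicting = NLIProbs(entailment=0.1, neutral=0.1, contradiction=0.8)
    print(coherence_score(supported))
    print(coherence_score(conflicting))
    print(mean_coherence([supported, conflicting]))
```

A positive score means the NLI model leans toward entailment, a negative score means it leans toward contradiction, and a score near zero means either a mostly neutral prediction or an even split between the two.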