Long-tail entities: Entities that appear infrequently in training corpora and real-world data, on which models often perform worse than on popular (head) entities
KGQA: Knowledge Graph Question Answering, the task of answering natural language questions by retrieving and reasoning over structured facts in a knowledge graph
Hallucination: The generation of content by an LLM that contradicts ground-truth facts or is nonsensical
StrategyQA: A benchmark dataset requiring multi-step implicit reasoning to answer True/False questions
CREAK: A benchmark dataset for claim verification requiring commonsense reasoning about entities
Wikidata: A large-scale, collaboratively edited, open knowledge graph
SPARQL: The W3C-standard query language for RDF data, used to retrieve specific facts from knowledge graphs such as Wikidata
QID: The unique identifier for an item in Wikidata, written as "Q" followed by a number (e.g., Q42)
Inference rule: A logical statement (axiom) expressing the commonsense knowledge required to answer a query (e.g., 'If X has property Y, then Z')
Head entities: Popular, well-known entities that appear frequently in datasets (e.g., Barack Obama)
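
To make the SPARQL and QID entries concrete, here is a minimal Python sketch that builds a query for one property of one Wikidata item. The function name `build_property_query` and the choice of property P569 (date of birth) are illustrative assumptions, not something the source specifies. The sketch only constructs the query string and does not contact any endpoint.

```python
import re

# Wikidata item IDs are "Q" plus a number; property IDs are "P" plus a number.
QID_PATTERN = re.compile(r"Q[1-9]\d*")
PID_PATTERN = re.compile(r"P[1-9]\d*")


def build_property_query(qid: str, pid: str) -> str:
    """Return a SPARQL query for the values of property `pid` on item `qid`.

    Hypothetical helper for illustration only.
    """
    if not QID_PATTERN.fullmatch(qid):
        raise ValueError(f"not a valid QID: {qid!r}")
    if not PID_PATTERN.fullmatch(pid):
        raise ValueError(f"not a valid property ID: {pid!r}")
    # wd: and wdt: are the standard Wikidata prefixes for items and
    # direct ("truthy") property statements.
    return (
        "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        f"SELECT ?value WHERE {{ wd:{qid} wdt:{pid} ?value . }}"
    )


# Example: the date of birth (P569) of item Q42.
query = build_property_query("Q42", "P569")
print(query)
```

Validating the identifiers before interpolating them keeps malformed or injected text out of the query string.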