LM-as-KB: Language Models as Knowledge Bases, i.e., using an LLM to retrieve facts directly via natural language queries rather than querying a structured database
Exact Match (EM): Evaluation metric where the generated answer must be identical to the ground truth label
Alias Matching (AM): Evaluation metric where the answer is correct if it matches the label or any known alternative names (aliases) from Wikidata
Fuzzy Matching (FM): Evaluation metric where the answer is correct if it *contains* the label or any of its aliases, allowing for verbose responses
SaulLM: A 7-billion-parameter language model based on Mistral-7B, further pre-trained on a large corpus of legal documents
Abstention: The ability of a model to refuse to answer ('I don't know') when uncertain, used to increase precision by reducing hallucinations
Zero-shot: Prompting the model with only the question, without providing example question-answer pairs
Few-shot: Prompting the model with the question plus a few (e.g., 5) example question-answer pairs to guide format and context
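
The three matching metrics and the effect of abstention on precision can be illustrated with a minimal Python sketch. It is not the evaluation code behind these definitions: the function names, the lowercase and whitespace normalization, and the convention of marking an abstained answer as `None` are all assumptions made for illustration. In particular, a strict reading of Exact Match would compare the raw strings without normalizing them.

```python
from typing import Iterable, Optional


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization step)."""
    return " ".join(text.lower().split())


def exact_match(answer: str, label: str) -> bool:
    """EM: the answer must equal the ground-truth label."""
    return normalize(answer) == normalize(label)


def alias_match(answer: str, label: str, aliases: Iterable[str]) -> bool:
    """AM: the answer must equal the label or one of its known aliases."""
    candidates = {normalize(label)} | {normalize(a) for a in aliases}
    return normalize(answer) in candidates


def fuzzy_match(answer: str, label: str, aliases: Iterable[str]) -> bool:
    """FM: the answer must contain the label or an alias as a substring."""
    text = normalize(answer)
    candidates = [normalize(c) for c in (label, *aliases)]
    # Skip empty candidates, which would otherwise match every answer.
    return any(c in text for c in candidates if c)


def precision_with_abstention(outcomes: Iterable[Optional[bool]]) -> float:
    """Share of correct answers among attempted ones.

    Each outcome is True (correct), False (wrong), or None (the model
    abstained). Abstentions are left out of the denominator.
    """
    attempted = [o for o in outcomes if o is not None]
    if not attempted:
        return 0.0
    return sum(attempted) / len(attempted)


if __name__ == "__main__":
    label, aliases = "New York City", ["NYC", "New York"]
    verbose = "The answer is New York."
    print(exact_match("NYC", label))            # fails EM
    print(alias_match("NYC", label, aliases))   # passes AM
    print(fuzzy_match(verbose, label, aliases)) # passes FM
    print(precision_with_abstention([True, False, None, True]))
```

The metrics are progressively more lenient: any answer accepted by EM is also accepted by AM, and any answer accepted by AM is also accepted by FM. Excluding abstentions from the denominator is what lets a model raise its precision by declining to answer when it is uncertain.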