SLR: Systematic Literature Review—a research method that rigorously identifies, selects, and appraises all relevant research on a specific topic
PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses—a standard guideline for reporting systematic reviews
TP/FP/TN/FN: True Positive, False Positive, True Negative, False Negative; the four confusion-matrix outcome counts from which classification metrics such as precision and recall are computed
Recall: The percentage of all relevant papers that the model successfully finds, TP / (TP + FN) (crucial for SLRs to avoid missing work)
Precision: The percentage of papers flagged by the model that are actually relevant, TP / (TP + FP) (important for reducing human workload)
Consensus Voting: A strategy where the final decision is determined by the agreement of multiple different LLMs
Zero-shot learning: The model performs the task without seeing any specific training examples, relying only on the prompt instructions
RAG: Retrieval-Augmented Generation—enhancing LLMs by retrieving relevant external data (mentioned as future work)
F1 score: Harmonic mean of precision and recall, used here to select the best models for consensus
Snowballing: A search technique where references of included papers are recursively checked to find more relevant papers