Coarse-grained keywords: Representative words extracted from query/context used as an initial filter for retrieved chunk relevance (e.g., 'intelligent cars')
Fine-grained keywords: Lists of specific information points (spans of text) extracted from context that must be present to answer the query accurately
Golden chunks: Pre-annotated specific text segments considered the 'correct' retrieval target; fragile because they become invalid if chunking strategies change
Factual query: Queries seeking specific, clear facts or evidence (e.g., 'Where is the capital of the US?')
Analytical query: Queries seeking analysis for concepts or terms (e.g., 'Why is the earth warming?')
Comparative query: Queries seeking comparisons across dimensions (e.g., 'Differences between A and B?')
Tutorial query: Queries seeking steps to perform a task (e.g., 'Steps to install TensorFlow?')
Pass Rate: The ratio of generated examples where the LLM-evaluator assigns a correctness score greater than or equal to 4 (on a 1-5 scale)