| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| On the personalized (Pers) dataset version, where only the user-selected best answer counts as relevant, the TAG model shows significant gains on all three metrics (+0.018 to +0.023). | ||||
| SE-PQA Pers | MAP@100 | 0.510 | 0.528 | +0.018 |
| SE-PQA Pers | P@1 | 0.417 | 0.440 | +0.023 |
| SE-PQA Pers | NDCG@10 | 0.525 | 0.543 | +0.018 |
| On the Base dataset version, where any answer with a positive score counts as relevant, the gain is smaller but still significant (+0.014 MAP@100). | ||||
| SE-PQA Base | MAP@100 | 0.443 | 0.457 | +0.014 |
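
The two relevance definitions in the table can change the score of the same ranking. The sketch below is only an illustration, not the paper's evaluation code. It assumes binary relevance, average precision normalized by the number of relevant answers, and a log2 discount for NDCG. The answer IDs, scores, and helper names are hypothetical.

```python
import math

def precision_at_1(ranked, relevant):
    """1.0 if the top-ranked answer is relevant, else 0.0."""
    return 1.0 if ranked and ranked[0] in relevant else 0.0

def average_precision(ranked, relevant, k=100):
    """AP@k with binary relevance (assumed normalization: number of relevant answers)."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def ndcg(ranked, relevant, k=10):
    """NDCG@k with binary gains and a log2 rank discount."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc in enumerate(ranked[:k], start=1)
              if doc in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical question with four answers: community scores and the asker's pick.
scores = {"a1": 5, "a2": 0, "a3": 2, "a4": -1}
best_answer = "a3"
ranked = ["a1", "a3", "a2", "a4"]  # hypothetical system output

base_relevant = {a for a, s in scores.items() if s > 0}  # Base: any positive score
pers_relevant = {best_answer}                            # Pers: best answer only

for name, rel in [("Base", base_relevant), ("Pers", pers_relevant)]:
    print(name,
          round(precision_at_1(ranked, rel), 3),
          round(average_precision(ranked, rel, k=100), 3),
          round(ndcg(ranked, rel, k=10), 3))
```

In this toy case the ranking is perfect under the Base labels but is penalized under the Pers labels, because the single relevant answer sits at rank 2. MAP@100 is the mean of `average_precision` over all questions.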