| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Reliability (correctness) of the generated dataset compared with baselines. | | | | |
| MemDaily (Reliability) | Correctness (%) | 36.0 | 100.0 | +64.0 |
| MemDaily (Reliability) | Correctness (%) | 86.0 | 99.0 | +13.0 |
| Benchmarking of memory mechanisms on the generated MemDaily dataset. | | | | |
| MemDaily (Simp.) | Accuracy | 56.3 | 93.3 | +37.0 |
| MemDaily (Aggr.) | Accuracy | 39.0 | 55.0 | +16.0 |