| Benchmark | Metric | Baseline | VSP (This Paper) | Δ (VSP − Baseline) |
|---|---|---|---|---|
| Vulnerability identification: VSP improves F1 over the baseline on both datasets, with the largest gain on real-world CVE data. | ||||
| CVE Dataset | F1 | 8.96 | 58.48 | +49.52 |
| SARD Dataset | F1 | 56.28 | 65.29 | +9.01 |
| Vulnerability discovery and patching: VSP again outperforms the baseline, including on complex real-world cases. | ||||
| CVE Dataset | F1 (Discovery) | 33.16 | 45.25 | +12.09 |
| CVE Dataset | F1 (Patching) | 15.29 | 20.00 | +4.71 |
| Zero-day Discovery | Count of True Positives | 9 | 22 | +13 |