| Benchmark | Metric | Baseline | This Paper | Δ (This Paper - Baseline) |
|---|---|---|---|---|
| Low-data regime experiments on CWRU (10-way classification) showing the benefit of pretraining when training samples are scarce. | ||||
| CWRU (100 samples) | Accuracy | 91.8 | 95.2 | +3.4 |
| CWRU (100 samples) | Accuracy | 92.5 | 95.2 | +2.7 |
| CWRU (200 samples) | Accuracy | 95.8 | 97.1 | +1.3 |
| Full dataset performance comparisons using different tokenizers and augmentations. | ||||
| CWRU (Full Data) | Accuracy | 59.2 | 98.8 | +39.6 |
| CWRU (Full Data) | Accuracy | 99.1 | 98.8 | -0.3 |
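
The Δ column is consistent with a plain difference, This Paper minus Baseline, in accuracy points. A minimal sketch that recomputes it from the rows above is given here as a sanity check. The `rows` list and the `delta` helper are illustrative names, not part of the original work, and the values are assumed to be percentages. Because the table does not name the baselines, the rows are identified only by their benchmark label.

```python
# Rows copied from the table: (benchmark, baseline, this_paper, reported_delta).
# Baseline method names are not given in the table, so none are assumed here.
rows = [
    ("CWRU (100 samples)", 91.8, 95.2, 3.4),
    ("CWRU (100 samples)", 92.5, 95.2, 2.7),
    ("CWRU (200 samples)", 95.8, 97.1, 1.3),
    ("CWRU (Full Data)", 59.2, 98.8, 39.6),
    ("CWRU (Full Data)", 99.1, 98.8, -0.3),
]


def delta(baseline: float, ours: float) -> float:
    """Return ours - baseline, rounded to one decimal like the table."""
    return round(ours - baseline, 1)


for name, baseline, ours, reported in rows:
    computed = delta(baseline, ours)
    status = "ok" if computed == reported else "MISMATCH"
    print(f"{name:20s} baseline={baseline:5.1f} ours={ours:5.1f} "
          f"delta={computed:+.1f} [{status}]")
```

A negative Δ, as in the last row, means the baseline scored higher than the proposed method on that comparison.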