SemEval-SS results demonstrate the model's ability to disambiguate word senses. The 'Frozen' setting tests intrinsic knowledge without task-specific training.

| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| SemEval-SS | Accuracy | 65.1 | 75.6 | +10.5 |
| SemEval-SS | Accuracy | 67.3 | 79.5 | +12.2 |
| SemEval-SS | Accuracy | 81.1 | 83.7 | +2.6 |
| Word in Context (WiC) | Accuracy | 69.6 | 72.1 | +2.5 |
| Word in Context (WiC) | Accuracy | 70.9 | 72.1 | +1.2 |
| GLUE (average) | Score | 77.5 | 77.9 | +0.4 |
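
The Δ column appears to be the absolute difference in points between the "This Paper" and "Baseline" columns. A minimal sketch that recomputes it from the rows above, assuming that reading is correct; the names `ROWS`, `delta`, and `check_rows` are illustrative and not taken from the paper:

```python
# Rows copied from the table above:
# (benchmark, baseline, this_paper, reported_delta)
ROWS = [
    ("SemEval-SS", 65.1, 75.6, 10.5),
    ("SemEval-SS", 67.3, 79.5, 12.2),
    ("SemEval-SS", 81.1, 83.7, 2.6),
    ("Word in Context (WiC)", 69.6, 72.1, 2.5),
    ("Word in Context (WiC)", 70.9, 72.1, 1.2),
    ("GLUE (average)", 77.5, 77.9, 0.4),
]


def delta(baseline, ours):
    """Absolute improvement in points, rounded to one decimal like the table."""
    return round(ours - baseline, 1)


def check_rows(rows):
    """Return the rows whose reported delta differs from the recomputed one."""
    return [row for row in rows if delta(row[1], row[2]) != row[3]]


# An empty list means every reported delta matches the recomputed value.
mismatches = check_rows(ROWS)
```

Rounding to one decimal place before comparing avoids spurious mismatches from floating-point subtraction.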