| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| DeepSpeed-HE reports higher throughput than existing systems in the RLHF training phase (Step 3); throughput is normalized so that the baseline equals 1.0. | | | | |
| RLHF Step 3 | Throughput (relative to baseline) | 1.0 | 6.0 | +5.0 (6.0x, minimum speedup claimed) |
| RLHF Step 3 | Throughput (relative to baseline) | 1.0 | 1.4 | +0.4 (1.4x, minimum speedup claimed) |
| Training-time benchmarks for various OPT model sizes on Azure A100 GPUs. | | | | |
| OPT-13B Training | Time (Hours) | Not reported in the paper | 9 | Not reported in the paper |
| OPT-30B Training | Time (Hours) | Not reported in the paper | 18 | Not reported in the paper |
| OPT-175B Training | Time (Hours) | Not reported in the paper | 20 | Not reported in the paper |
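
For the two RLHF Step 3 rows, the Δ values follow directly from the normalized throughputs: the speedup factor is the ratio to the baseline, and Δ is the absolute difference. A minimal sketch of that arithmetic, assuming the baseline is normalized to 1.0 as in the table (the helper name `speedup_and_delta` is illustrative and does not come from the paper):

```python
def speedup_and_delta(baseline: float, measured: float) -> tuple[float, float]:
    """Return (speedup factor, absolute difference) for two throughput values."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return measured / baseline, measured - baseline


# Throughput values copied from the two RLHF Step 3 rows of the table.
for measured in (6.0, 1.4):
    factor, delta = speedup_and_delta(1.0, measured)
    print(f"speedup {factor:.1f}x, delta {delta:+.1f}")
```

Because the baseline is 1.0, the speedup factor equals the reported throughput and Δ is always one less than it. The same calculation cannot be applied to the training-time rows, since no baseline time is reported for them.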