Evaluation Setup
Field experiment with staggered rollout across 5,172 customer support agents
Benchmarks:
- Pre-adoption baseline (Customer Support Resolution)
Metrics:
- Resolutions per hour (Productivity)
- Average Handle Time (AHT)
- Customer Sentiment
- Request for Manager
- Worker Attrition
Statistical methodology:
- Difference-in-Differences (DiD) and event study specifications
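As a rough illustration of the DiD idea named above, the sketch below computes the canonical two-group, two-period estimate from cell means: the change in the treated group minus the change in the comparison group. It is a simplification, not the paper's specification. A staggered rollout needs a regression with agent and time fixed effects, which is not shown here. The function name `did_estimate`, the row layout, and all numbers are hypothetical and synthetic.

```python
from statistics import mean


def did_estimate(rows):
    """Two-by-two difference-in-differences from cell means.

    rows: iterable of (treated: bool, post: bool, outcome: float).
    Returns (treated post - treated pre) - (control post - control pre).
    """
    cells = {}
    for treated, post, outcome in rows:
        cells.setdefault((treated, post), []).append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    treated_change = m[(True, True)] - m[(True, False)]
    control_change = m[(False, True)] - m[(False, False)]
    return treated_change - control_change


# Synthetic resolutions-per-hour values, for illustration only (not from the paper).
rows = [
    (True, False, 2.0), (True, False, 2.2),    # treated agents, before access
    (True, True, 2.5), (True, True, 2.7),      # treated agents, after access
    (False, False, 2.0), (False, False, 2.0),  # comparison agents, early period
    (False, True, 2.1), (False, True, 2.1),    # comparison agents, late period
]
effect = did_estimate(rows)
print(round(effect, 3))
```

The estimate only has a causal reading under the parallel-trends assumption, meaning that both groups would have changed by the same amount without the tool. An event study specification is one common way to examine the pre-adoption periods for evidence against that assumption.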
Key Results
Productivity analysis shows significant gains for the average worker, driven heavily by improvements among low-skilled and novice agents.

| Benchmark | Metric | Baseline | This Paper | Δ |
|---|---|---|---|---|
| Firm Data | Resolutions per hour | Not reported in the paper | Not reported in the paper | +15% |
| Firm Data | Resolutions per hour (Low-skill) | Not reported in the paper | Not reported in the paper | +30% |
| Firm Data | Performance equivalence | 6 months | 2 months | 4 months acceleration |
Main Takeaways
- Generative AI compresses the productivity distribution: low performers improve dramatically, while high performers stay flat or decline slightly
- The tool acts as a mechanism for 'upskilling', effectively transferring the tacit knowledge of experienced workers to novices
- AI assistance improves the experience of work: customers are more polite, less likely to ask for a manager, and worker attrition decreases (especially for newer workers)
- Gains persist even during software outages, suggesting that workers learn durably from using the tool