Evaluation Setup
Prediction of real-valued targets from semantic inputs (text/metadata)
Benchmarks:
- House Price Prediction (Hedonic Regression)
- Netflix Movie Embeddings (Cold-start Collaborative Filtering)
Metrics:
- Median Relative Error
- Cosine Similarity
Statistical methodology: Not explicitly reported in the paper
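The summary names the two metrics without giving formulas. The sketch below uses the conventional definitions, which are an assumption here: median relative error as the median of |prediction - target| / |target|, and cosine similarity as the normalized dot product of two embedding vectors. The function names are illustrative and do not come from the paper.

```python
import math
from statistics import median


def median_relative_error(y_true, y_pred):
    """Median of |prediction - target| / |target| (assumed definition)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return median(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred))


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector has similarity 0.
        return 0.0
    return dot / (norm_u * norm_v)


# Toy values for illustration only; these are not data from the paper.
prices_true = [100.0, 200.0, 400.0]
prices_pred = [110.0, 150.0, 400.0]
print(median_relative_error(prices_true, prices_pred))
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))
```

Under these definitions, lower is better for median relative error and higher is better for cosine similarity.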
Key Results
| Benchmark | Metric | Baseline | This Paper | Δ |
| --- | --- | --- | --- | --- |
| House Price Prediction | Median Relative Error | 0.38 | 0.12 | -0.26 |
| Netflix Movie Embeddings | Cosine Similarity | 0.0 | 0.59 | +0.59 |
Main Takeaways
- Domain knowledge in LLMs is insufficient for specific datasets: on house prices, LLMs alone yield a median relative error of 0.38 (38%), while data-driven discovery yields 0.12 (12%).
- Semantic features discovered by GenZ can effectively proxy for massive amounts of interaction data (4000 ratings) in recommender systems.
- The iterative 'add-feature' and 'remove-feature' algorithms successfully refine the semantic description of latent variables to better explain statistical outliers.
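The summary does not describe how the add-feature and remove-feature steps work internally. The sketch below is therefore not GenZ's procedure. It swaps in a generic greedy stepwise selection over a fixed pool of binary features, purely to illustrate the add/remove pattern. The names `sse`, `add_feature`, `remove_feature`, and `refine`, the group-mean predictor, and the toy data are all assumptions made for this example.

```python
from collections import defaultdict


def sse(rows, targets, selected):
    """Sum of squared errors when each item is predicted by the mean target
    of the items that share its values on the selected features."""
    groups = defaultdict(list)
    for row, y in zip(rows, targets):
        groups[tuple(row[f] for f in selected)].append(y)
    total = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        total += sum((y - mean) ** 2 for y in ys)
    return total


def add_feature(rows, targets, selected, candidates, tol=1e-12):
    """Return the unused candidate that lowers the error most, or None."""
    best, best_err = None, sse(rows, targets, selected)
    for f in candidates:
        if f in selected:
            continue
        err = sse(rows, targets, selected + [f])
        if err < best_err - tol:
            best, best_err = f, err
    return best


def remove_feature(rows, targets, selected, tol=1e-12):
    """Return a selected feature whose removal does not raise the error, or None."""
    base = sse(rows, targets, selected)
    for f in selected:
        rest = [g for g in selected if g != f]
        if sse(rows, targets, rest) <= base + tol:
            return f
    return None


def refine(rows, targets, candidates, max_rounds=10):
    """Alternate one add step with remove steps until nothing improves."""
    selected = []
    for _ in range(max_rounds):
        f = add_feature(rows, targets, selected, candidates)
        if f is None:
            break
        selected.append(f)
        g = remove_feature(rows, targets, selected)
        while g is not None:
            selected.remove(g)
            g = remove_feature(rows, targets, selected)
    return selected


# Hypothetical toy data: "pool" explains the price, "red_door" does not.
rows = [
    {"pool": 1, "red_door": 0},
    {"pool": 1, "red_door": 1},
    {"pool": 0, "red_door": 0},
    {"pool": 0, "red_door": 1},
]
targets = [300.0, 300.0, 100.0, 100.0]
print(refine(rows, targets, ["pool", "red_door"]))
```

In this toy setting the candidate pool is fixed in advance, whereas the takeaway above describes features that are discovered during refinement, so the sketch should be read as an analogy for the control flow only.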