MLE: Machine Learning Engineer—the human role this system aims to automate
North Star Metrics: High-level business goals (e.g., long-term user retention, total watch time) that are often delayed and non-differentiable
Proxy Reward: A differentiable function used during training to approximate the non-differentiable North Star metrics
Inner Loop: The offline phase where agents generate and filter candidates using cheap proxy metrics (e.g., offline loss, SQL analysis) before live testing
Outer Loop: The online phase where surviving candidates are deployed to live traffic to measure actual North Star metrics
DCN: Deep & Cross Network, a neural architecture that learns explicit feature interactions through dedicated cross layers running alongside a standard deep network
RL: Reinforcement Learning—training agents to take actions that maximize cumulative reward
AutoML: Automated Machine Learning, tools that automate parts of the ML pipeline, typically limited to narrow tasks such as hyperparameter tuning
Gemini 2.5 Pro: A specific multimodal Large Language Model from Google with advanced reasoning capabilities