SLM: Small Language Model—a compact neural network (here ~100M parameters) designed for efficiency
UDRL: Upside-Down Reinforcement Learning—a paradigm in which desired rewards or goals (such as a target length) are given to the model as inputs, so that RL is treated as a supervised learning problem
T2I: Text-to-Image—generating images from text descriptions
T2T: Text-to-Template—generating design templates from text descriptions
Distillation: Training a smaller student model to mimic the behavior or outputs of a larger teacher model
vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
nanoGPT: A simple, clean repository for training GPT-2 style models
MSE: Mean Squared Error—a metric measuring the average squared difference between predicted values and the corresponding actual values
LLM-as-a-judge: Using a strong LLM (like GPT-4) to evaluate the quality of outputs from other models
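
As a small illustration of the MSE entry above, the sketch below computes the metric over paired values. The function name `mean_squared_error` and the sample numbers are hypothetical. Comparing requested lengths with produced lengths is only an assumed use of the metric, chosen because length is the example goal in the UDRL entry, and is not taken from the source.

```python
def mean_squared_error(predicted, actual):
    """Return the average of squared differences between paired values."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("inputs must be non-empty")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical data: lengths requested as goals vs. lengths actually produced.
requested_lengths = [10, 20, 30]
produced_lengths = [12, 18, 30]

mse = mean_squared_error(produced_lengths, requested_lengths)
print(f"MSE: {mse:.4f}")
```

Squaring the differences means large misses dominate the score, so a single output that is far from its target costs more than several that are slightly off.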