Hard Samples: Data points that are difficult for the model to classify correctly (high loss) but contain valuable information for refining the decision boundary.
Noisy Samples: Data points with incorrect labels (e.g., accidental clicks) that confuse the model and should be removed or down-weighted.
False Positive Noise: Items interacted with by the user that do not reflect actual preference (e.g., misclicks).
False Negative Noise: Items not interacted with by the user despite being preferred (e.g., due to position bias).
Semantic Relevance: Similarity between user and item derived from their textual descriptions/embeddings.
Logical Relevance: Reasoning-based assessment of whether an item logically fits a user's past behavior and preferences.
Objective Alignment: Projecting embeddings from one task space (e.g., language modeling) to another (e.g., recommendation) to ensure compatibility.