InfoNCE: A loss function used in contrastive learning that maximizes the probability of picking the correct positive sample out of a candidate set containing that positive and a number of negatives.
Max-Margin Loss: A loss function that requires each positive pair to be closer (or score higher) than each negative pair by at least a fixed margin in the embedding space.
Temperature (tau): A positive scalar in InfoNCE that divides the similarity scores before the softmax and so controls the sharpness of the resulting probability distribution; a low tau concentrates the loss on hard negatives, while a high tau treats negatives more uniformly.
Long-tail data: Data distributions where a few classes are very frequent (head) and many classes are rare (tail).
Instance discrimination: Learning to distinguish every single data point as unique (encouraged by low temperature).
Group-wise discrimination: Learning to group semantically similar data points together (encouraged by high temperature).
CC3M: Conceptual Captions 3M, a large-scale dataset of image-caption pairs.
EPIC-KITCHENS-100: A large-scale egocentric video dataset recording daily kitchen activities.
YouCook2: A dataset of cooking videos with recipe texts.
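
To make the InfoNCE, Max-Margin Loss, and Temperature entries concrete, here is a minimal sketch in plain Python for a single anchor. It assumes similarities are already computed as plain floats (for example, cosine similarities). The function names and the default values tau=0.07 and margin=0.2 are illustrative choices, not taken from any particular paper or library.

```python
import math

def info_nce(sim_pos, sim_negs, tau=0.07):
    """InfoNCE for one anchor: negative log softmax probability of the positive.

    tau=0.07 is an illustrative default, not a recommended setting.
    """
    logits = [sim_pos / tau] + [s / tau for s in sim_negs]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]

def max_margin(sim_pos, sim_negs, margin=0.2):
    """Hinge-style loss: penalize every negative whose similarity comes
    within `margin` of the positive's similarity (margin=0.2 is illustrative)."""
    return sum(max(0.0, margin - sim_pos + s) for s in sim_negs)

def negative_weights(sim_negs, tau):
    """Softmax over the negatives only: the relative share of the InfoNCE
    gradient that each negative receives at temperature tau."""
    m = max(sim_negs)
    exps = [math.exp((s - m) / tau) for s in sim_negs]
    total = sum(exps)
    return [e / total for e in exps]
```

Calling `negative_weights([0.8, 0.1], tau)` with a small tau puts almost all of the weight on the hard negative (similarity 0.8), while a large tau spreads the weight more evenly across both negatives. This is the behavior the Temperature entry describes.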