Transformer: A neural network architecture built on self-attention mechanisms; it dominates modern NLP
Pretraining: Training a model on a massive corpus of text (e.g., Wikipedia) to learn general language features before adapting it to a specific task
Fine-tuning: Taking a pretrained model and training it further on a smaller, task-specific dataset
Model Hub: A centralized repository hosted by Hugging Face where users can upload and download pretrained model weights
Tokenizer: A component that breaks raw text into smaller units (tokens) and maps them to numerical indices for the model
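To make the token-to-index mapping concrete, here is a minimal sketch of a toy word-level tokenizer (real tokenizers use subword algorithms like BPE below; the class name and `[UNK]` convention here are illustrative assumptions, not a specific library's API):

```python
class SimpleTokenizer:
    """Toy word-level tokenizer: split on whitespace, map tokens to ids."""

    def __init__(self, corpus):
        # Build the vocabulary from a training corpus.
        # Id 0 is reserved for unknown tokens.
        self.vocab = {"[UNK]": 0}
        for text in corpus:
            for token in text.lower().split():
                self.vocab.setdefault(token, len(self.vocab))

    def encode(self, text):
        # Map each token to its numerical index, falling back to [UNK].
        return [self.vocab.get(t, 0) for t in text.lower().split()]
```

For example, after training on `["hello world", "hello there"]`, `encode("hello world")` yields the indices of `hello` and `world`, while an out-of-vocabulary word maps to the unknown id 0.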
Head: A final neural network layer added on top of the base Transformer to project outputs into task-specific formats (e.g., classification labels)
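At its simplest, a classification head is a linear projection from the model's hidden state to one logit per label. A dependency-free sketch (the function name and shapes are illustrative assumptions):

```python
def classification_head(hidden, weights, bias):
    """Project a hidden-state vector to per-class logits: logits = W @ h + b.

    hidden:  list of floats, the base model's output vector
    weights: one row of floats per output class
    bias:    one float per output class
    """
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(weights, bias)
    ]
```

In practice this is a framework layer (e.g. a single dense layer) whose output size equals the number of classification labels.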
ONNX: Open Neural Network Exchange—an open format for representing machine learning models, allowing interoperability between different frameworks and optimization tools
TorchScript: An intermediate representation of a PyTorch model that allows it to be run in high-performance environments (like C++) independent of Python
BPE: Byte-Pair Encoding—a subword tokenization algorithm that starts from individual characters and iteratively merges the most frequent adjacent pair into a new symbol, building a vocabulary of subword units
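The merge loop at the heart of BPE can be sketched in a few lines. This is a simplified illustration of the training phase (function names and the tie-breaking rule are assumptions; production implementations also handle byte-level input and special tokens):

```python
from collections import Counter

def bpe_merges(word_freqs, num_merges):
    """Learn BPE merge rules from a {word: frequency} dict.

    Each word starts as a tuple of characters; every iteration merges
    the most frequent adjacent pair of symbols into a single new symbol.
    Returns the learned merge pairs and the final symbol-level vocabulary.
    """
    vocab = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the merge everywhere the pair occurs.
        new_vocab = {}
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] = freq
        vocab = new_vocab
    return merges, vocab
```

For a corpus like `{"low": 5, "lower": 2, "lowest": 2}`, the first merges fuse `l`+`o` and then `lo`+`w`, so the frequent word `low` becomes a single subword unit.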
SOTA: State-of-the-Art—the best published performance on a given task or benchmark at a given time