Lakehouse: A data architecture combining the low-cost, flexible storage of data lakes with the management features of data warehouses (transactions, schemas)
MVCC: Multi-Version Concurrency Control, a database method where multiple versions of data exist simultaneously, so readers see a consistent snapshot while writers create new versions, without readers and writers blocking each other
DAG: Directed Acyclic Graph—a representation of a data pipeline where nodes are processing steps and edges are dependencies
FaaS: Function-as-a-Service—a cloud computing model where users write code functions and the platform manages the infrastructure, scaling, and isolation
Copy-on-Write: An optimization strategy where data is shared between snapshots until it is modified, at which point a copy is made, so creating a branch or snapshot does not duplicate unchanged data
RBAC: Role-Based Access Control—restricting system access based on the roles of individual users or agents
ReAct: Reasoning + Acting—a paradigm where LLMs alternate between generating reasoning traces and executing actions (like calling an API)
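
As an illustration of the Copy-on-Write entry above, the following is a minimal sketch, not taken from any particular lakehouse system. The Branch class, its methods, and the file names are hypothetical. It assumes a branch is just a mapping from table names to immutable tuples of data-file names, so a fork shares every file list with its parent until one side writes.

```python
class Branch:
    """Hypothetical branch: maps table name -> immutable tuple of data-file names."""

    def __init__(self, tables=None):
        # Shallow copy: the new dict points at the same tuples as the source.
        self._tables = dict(tables or {})

    def fork(self):
        # Forking copies only the small name -> file-list mapping;
        # the tuples (and the files they name) are shared, not duplicated.
        return Branch(self._tables)

    def append(self, table, filename):
        # Write path: build a new tuple for this table only, leaving the
        # tuple still referenced by other branches untouched.
        self._tables[table] = self._tables.get(table, ()) + (filename,)

    def files(self, table):
        return self._tables.get(table, ())


main = Branch()
main.append("orders", "orders-0001.parquet")

dev = main.fork()
# Before any write on dev, both branches reference the same tuple object.
shared_before_write = dev.files("orders") is main.files("orders")

dev.append("orders", "orders-0002.parquet")
# After the write, dev has its own file list and main is unchanged.
```

In this sketch the cost of fork() depends on the number of tables, not on the amount of data, which is the property the glossary entry describes as efficient branching.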