Evaluation Setup
Theoretical analysis of retrieval error, together with a proposed Active Exemplar Selection strategy
Metrics:
- Retrieval Error (epsilon)
- Instance Error (||Delta z||)
- Contextual Error
- Statistical methodology: Not explicitly reported in the paper
Main Takeaways
- In-context learning (ICL) is retrieval, not learning: The paper proves theoretically that ICL with self-attention is equivalent to retrieving patterns from a Hopfield network.
- More exemplars are not always better: Increasing the number of exemplars (M) increases the Contextual Error term (interference) unless the new exemplars have very low Instance Error.
- Random selection works via mode approximation: Random selection relies on the law of large numbers to approximate the mode of the pattern distribution, requiring many exemplars to be effective.
- Active selection is more efficient: By explicitly selecting exemplars with high expected value (low Instance Error), the model can achieve lower retrieval error with fewer exemplars.
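
The takeaways above can be illustrated with a small toy simulation. This is a minimal sketch under stated assumptions, not the paper's method or experimental setup. It uses the standard modern Hopfield update (a softmax-weighted average of stored patterns, which has the same form as softmax attention when keys and values are both the stored patterns). It assumes each exemplar is a target pattern plus an offset whose norm plays the role of the Instance Error, and that an oracle knows those norms for active selection. All function names, dimensions, and noise scales are hypothetical choices made for illustration.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(v):
    return math.sqrt(dot(v, v))


def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def unit_vector(rng, dim):
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    n = norm(v)
    return [x / n for x in v]


def hopfield_retrieve(patterns, query, beta=1.0):
    """One modern-Hopfield update: softmax-weighted average of the patterns."""
    weights = softmax([beta * dot(p, query) for p in patterns])
    return [sum(w * p[j] for w, p in zip(weights, patterns))
            for j in range(len(query))]


def retrieval_error(patterns, target, beta=1.0):
    """Distance between the retrieved pattern and the target (toy epsilon)."""
    retrieved = hopfield_retrieve(patterns, target, beta)
    return norm([r - t for r, t in zip(retrieved, target)])


def compare_selection(seed=0, dim=32, pool_size=64, k=4, trials=200):
    """Compare oracle 'active' selection with random selection of k exemplars.

    Assumption: exemplar i equals target + scale_i * (unit direction), so
    scale_i is exactly its instance error ||Delta z||.
    """
    rng = random.Random(seed)
    target = unit_vector(rng, dim)
    scales = [0.05 + 1.45 * i / (pool_size - 1) for i in range(pool_size)]
    pool = []
    for s in scales:
        direction = unit_vector(rng, dim)
        pool.append([t + s * d for t, d in zip(target, direction)])

    # Active selection (oracle): keep the k exemplars with lowest instance error.
    active_idx = sorted(range(pool_size), key=lambda i: scales[i])[:k]
    active_err = retrieval_error([pool[i] for i in active_idx], target)

    # Random selection: average the error over many random draws of k exemplars.
    random_errs = []
    for _ in range(trials):
        idx = rng.sample(range(pool_size), k)
        random_errs.append(retrieval_error([pool[i] for i in idx], target))
    mean_random_err = sum(random_errs) / trials

    max_active_instance_err = max(scales[i] for i in active_idx)
    return active_err, mean_random_err, max_active_instance_err


if __name__ == "__main__":
    active, rand_mean, bound = compare_selection()
    print("active selection error:", active)
    print("mean random selection error:", rand_mean)
    print("largest instance error among active exemplars:", bound)
```

Because the retrieved pattern is a convex combination of the selected exemplars, its error in this toy model cannot exceed the largest instance error among them, which is why picking low-error exemplars helps even when only a few are used. Random selection with the same small budget mixes in high-error exemplars and has to rely on averaging to cancel them out.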