← Back to Paper List

Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval

(CMU, AWS) Uri Alon, Frank F. Xu, Junxian He, Sudipta Sengupta, Dan Roth, Graham Neubig
Language Technologies Institute, Carnegie Mellon University, Amazon AWS, AWS AI Labs
ICML (2022)
RAG Memory

📝 Paper Summary

Modularized RAG pipeline Retrieval efficiency
RETOMATON approximates expensive nearest-neighbor searches in retrieval-based language models by traversing a weighted finite automaton built from datastore clusters and pointers.
Core Problem
Retrieval-based language models (like kNN-LM) perform a computationally expensive nearest-neighbor search over a massive datastore for every single generated token, causing severe inference latency.
Why it matters:
  • High inference costs hinder the practical deployment of retrieval-based models despite their accuracy gains
  • Standard kNN search is repetitive; retrieving a specific neighbor at step t strongly suggests which neighbors will be relevant at t+1, but current models ignore this structure
  • Current approximate methods often sacrifice perplexity for speed or require training auxiliary networks
Concrete Example: In the phrase 'The U.S. president is Joe Biden', if the model retrieves the context for 'The U.S. president is', the next token 'Joe' and its subsequent context are physically adjacent in the training corpus. Standard kNN-LM ignores this and re-searches the whole datastore for 'Joe', whereas RETOMATON simply follows a pointer to the next entry.
Key Novelty
RETOMATON (Retrieval Automaton)
  • Constructs a Weighted Finite Automaton (WFA) over the datastore by saving pointers between consecutive text entries and clustering similar contexts into states
  • Replaces frequent kNN searches with graph traversal: if the model follows a known path (pointer), it skips the search; if it deviates, it restarts with a fresh kNN search
  • Completely unsupervised construction that requires no additional training or auxiliary networks, unlike concurrent adaptive retrieval methods
Architecture
Architecture Figure Figure 1
Concept of RETOMATON traversal. Shows how context 'The U.S.' triggers a kNN search, finding 'president' nodes. Instead of searching again for 'Joe', the model follows pointers (transitions) within the automaton.
Evaluation Highlights
  • Saves 81% of nearest neighbor searches on WIKITEXT-103 while matching standard kNN-LM perplexity
  • Reduces perplexity by 1.85 (from 16.65 to 14.80 roughly estimated from chart, or 16.08 at FoSS=0) on WIKITEXT-103 when used purely for accuracy enhancement
  • Outperforms fine-tuning on Law-MT domain adaptation: 17.5% relative perplexity reduction compared to a fine-tuned Transformer baseline
Breakthrough Assessment
8/10
Offers a significant efficiency breakthrough for kNN-LMs without requiring extra training. The graph-based approximation is an elegant neuro-symbolic solution to a brute-force bottleneck.
×