Multilingual Knowledge Editing with Language-Agnostic Factual Neurons

📝 Paper Summary

Knowledge Editing · Multilingual LLMs

LU-LAFNs enables simultaneous multilingual knowledge editing by identifying and updating shared 'language-agnostic' neurons that encode the same fact across different languages.

Core Problem

Existing Multilingual Knowledge Editing (MKE) methods often treat languages separately or fail to model the shared semantic connections of facts, leading to conflicts where updating one language degrades performance in others.

Why it matters:

Large Language Models (LLMs) are increasingly deployed in multilingual settings, requiring synchronized updates of facts across all supported languages.
Ignoring cross-lingual connections causes 'knowledge conflicts' where updates fail to propagate correctly or damage the model's general abilities.
Directly applying monolingual editing methods (like ROME or MEMIT) to multilingual contexts often degrades edit reliability and locality.

Concrete Example: When a monolingual editor such as MEMIT is adapted to update the same fact in several languages at once, its edit success rate drops well below what it achieves in single-language editing, because the method does not account for the internal representation of that fact that the languages share.

Key Novelty

Locating and Updating Language-Agnostic Factual Neurons (LU-LAFNs)

First, the paper discovers 'Language-Agnostic Factual Neurons' (LAFNs)—a shared set of neurons in Feed-Forward Networks (FFNs) that activate for the same fact regardless of the input language.
Second, it proposes a method to locate these specific shared neurons using paraphrased inputs and then optimize update values to modify them.
Finally, instead of permanently altering weights, it caches these update values and retrieves them during inference when the relevant subject is detected.

Architecture

The overall framework of LU-LAFNs.

Evaluation Highlights

Outperforms state-of-the-art MKE methods on the Bi-ZsRE benchmark, achieving the highest reliability and generality scores.
Achieves superior performance on the MzsRE benchmark across multiple language pairs (e.g., English-Chinese, English-French), significantly reducing knowledge conflicts.
Demonstrates that editing shared LAFNs is more effective than editing language-specific neurons, validating the existence of cross-lingual knowledge sharing in LLMs.

Breakthrough Assessment

7/10

Solid contribution that identifies a specific, biologically inspired mechanism (shared neurons) for multilingual editing. The caching mechanism is practical, though its reliance on exact subject matching for retrieval may limit robustness in real-world use.

⚙️ Technical Details

Problem Definition

Setting: Multilingual Knowledge Editing (MKE) aiming to update a model to answer queries about a specific fact correctly across multiple languages while preserving other knowledge.

Inputs: A set of edit descriptors containing questions and new answers in multiple languages.

Outputs: An edited model (or inference mechanism) that produces the new answer for the target fact in all languages.

Pipeline Flow

Neuron Localization (Locate LAFNs using multilingual paraphrases)
Update Optimization (Calculate delta values for neurons)
Inference (Retrieve updates from cache upon subject match)

System Modules

Neuron Locator

Identify shared FFN neurons (LAFNs) activated by the target fact across languages

Model or implementation: Runs on the original, unedited LLM (Llama-3.1-8B or Qwen2-7B)
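
The intersection idea behind the Neuron Locator can be sketched as follows. This is a minimal illustration, not the paper's implementation. It assumes (hypothetically) that per-neuron activations are averaged over a language's paraphrases and that a neuron counts as activated when it reaches a fraction `beta` of that language's peak activation; the paper's exact scoring rule may differ. The function names and toy activations are made up for the example.

```python
def top_neurons(activations, beta):
    """Return indices whose activation is at least beta * the peak activation.

    The relative-threshold rule is an assumption made for this sketch.
    """
    peak = max(activations)
    return {i for i, a in enumerate(activations) if a >= beta * peak}


def locate_lafns(acts_by_lang, beta):
    """Intersect the activated-neuron sets of all languages.

    acts_by_lang maps a language code to a list of activation vectors,
    one vector per paraphrase of the same fact.
    """
    per_language_sets = []
    for paraphrase_acts in acts_by_lang.values():
        n = len(paraphrase_acts)
        # Average each neuron's activation over the paraphrases (assumption).
        mean_acts = [sum(col) / n for col in zip(*paraphrase_acts)]
        per_language_sets.append(top_neurons(mean_acts, beta))
    return set.intersection(*per_language_sets)


# Toy activations for 4 neurons, 2 paraphrases per language.
acts_by_lang = {
    "en": [[0.1, 0.9, 0.8, 0.0], [0.2, 1.0, 0.9, 0.1]],
    "zh": [[0.9, 0.1, 1.0, 0.0], [1.0, 0.0, 0.9, 0.1]],
}
lafns = locate_lafns(acts_by_lang, beta=0.8)
print(lafns)
```

In this toy data, neuron 1 fires only for English and neuron 0 only for Chinese, so only neuron 2 survives the intersection.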

Update Optimizer

Compute the value update vector (Delta V) for the located neurons

Model or implementation: Optimization process (not a separate model)
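
As a rough illustration of what optimizing an update value means, the sketch below runs plain gradient descent on the target term only (the negative log-probability of the new answer) for a single located neuron. Everything here is a simplifying assumption: the toy unembedding matrix `U`, the hidden state `h`, the single scalar `activation`, the learning rate, and the omission of the KL term. It is not the paper's optimizer.

```python
import math


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def logits(U, h):
    # One logit per vocabulary row of the (toy) unembedding matrix U.
    return [sum(u * x for u, x in zip(row, h)) for row in U]


def optimize_delta(U, h, activation, target, steps=200, lr=0.5):
    """Gradient descent on -log P(target) w.r.t. a delta added to one
    neuron's value vector; the neuron contributes activation * delta to h."""
    delta = [0.0] * len(h)
    for _ in range(steps):
        h_edit = [x + activation * d for x, d in zip(h, delta)]
        p = softmax(logits(U, h_edit))
        # Gradient of the loss w.r.t. the logits: p - onehot(target).
        dz = [pk - (1.0 if k == target else 0.0) for k, pk in enumerate(p)]
        # Chain rule through the logits and the activation scaling.
        grad = [
            activation * sum(dz[k] * U[k][j] for k in range(len(U)))
            for j in range(len(h))
        ]
        delta = [d - lr * g for d, g in zip(delta, grad)]
    return delta


U = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # 3 toy tokens, hidden size 2
h = [2.0, 0.0]                               # initially favors token 0
activation, target = 0.8, 1                  # new answer is token 1

p_before = softmax(logits(U, h))[target]
delta = optimize_delta(U, h, activation, target)
h_after = [x + activation * d for x, d in zip(h, delta)]
p_after = softmax(logits(U, h_after))[target]
print(p_before, p_after)
```

The probability of the target token rises after the update. In the full objective, a locality term would pull against this so that unrelated predictions stay close to the original model.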

Cache Retriever

Intercepts inference to inject updates if the query subject matches a cached edit

Model or implementation: Lookup mechanism
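
A minimal sketch of the cache-and-inject idea, under stated assumptions: the cache is a plain dictionary keyed by subject string, matching is an exact, case-sensitive substring test, and the FFN output is modeled as a weighted sum of value vectors. The class `EditCache`, the function `ffn_output`, and the subject "Acme Corp" are hypothetical names for illustration, not taken from the paper's code.

```python
class EditCache:
    """Stores per-subject update values; the base weights are never modified."""

    def __init__(self):
        self._edits = {}

    def add(self, subject, deltas):
        # deltas maps a neuron index to the vector added to its value vector.
        self._edits[subject] = deltas

    def lookup(self, query):
        # Exact, case-sensitive substring match on the subject (assumption).
        for subject, deltas in self._edits.items():
            if subject in query:
                return deltas
        return None


def ffn_output(activations, values, deltas=None):
    """Toy FFN output: sum_i a_i * (v_i + delta_i), with delta_i = 0 if absent."""
    deltas = deltas or {}
    dim = len(values[0])
    out = [0.0] * dim
    for i, (a, v) in enumerate(zip(activations, values)):
        d = deltas.get(i, [0.0] * dim)
        for j in range(dim):
            out[j] += a * (v[j] + d[j])
    return out


cache = EditCache()
cache.add("Acme Corp", {1: [2.0, -2.0]})

activations = [1.0, 0.5]
values = [[1.0, 0.0], [0.0, 2.0]]

hit = cache.lookup("Who founded Acme Corp?")
miss = cache.lookup("Who founded acme corp?")  # casing differs, so no match

print(ffn_output(activations, values))       # unedited path
print(ffn_output(activations, values, hit))  # edited path
```

The lowercase query misses the cache, which illustrates why exact subject matching can be brittle.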

Novel Architectural Elements

Identification and explicit targeting of 'Language-Agnostic Factual Neurons' (intersection of activation sets across languages).
Cache-based injection of updates specifically into these shared neurons during inference rather than permanent weight modification.

Modeling

Base Model: Llama-3.1-8B and Qwen2-7B

Training Method: Optimization of neuron update values (similar to MEMIT's closed-form or gradient-based updates, but applied to specific neurons)

Objective Functions:

Purpose: Maximize probability of the new answer.
Formally: L_target = -log P_theta'(y^e | x^e)

Purpose: Preserve predictions for unrelated facts/relations.
Formally: L_kl = KL(P_theta(. | q) || P_theta'(. | q))

Purpose: Combined objective.
Formally: L_MKE = lambda1 * L_target + lambda2 * L_kl
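
To make the combined objective concrete, the sketch below evaluates it on toy numbers. The probability values and the default weights `lambda1 = lambda2 = 1.0` are placeholders chosen for illustration, since the paper's weights are not reported in this summary.

```python
import math


def target_loss(p_new_answer):
    """L_target: negative log-probability of the new answer under the edited model."""
    return -math.log(p_new_answer)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mke_loss(p_new_answer, p_orig, p_edit, lambda1=1.0, lambda2=1.0):
    """L_MKE = lambda1 * L_target + lambda2 * L_kl (weights are placeholders)."""
    l_target = target_loss(p_new_answer)
    l_kl = kl_divergence(p_orig, p_edit)
    return lambda1 * l_target + lambda2 * l_kl


# Next-token distributions on an unrelated query q, before and after the edit.
p_orig = [0.7, 0.2, 0.1]
p_edit = [0.6, 0.3, 0.1]

loss_weak_edit = mke_loss(0.6, p_orig, p_edit)
loss_strong_edit = mke_loss(0.9, p_orig, p_edit)
print(loss_weak_edit, loss_strong_edit)
```

The loss falls as the edited model assigns more probability to the new answer, and rises as its predictions on the unrelated query drift away from the original model's.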

Adaptation: Neuron-specific value modification (stored in cache)

Key Hyperparameters:

beta_threshold: 0.9 (Llama-3), 0.8 (Qwen2) for neuron selection
lambda1: Not explicitly reported in the paper
lambda2: Not explicitly reported in the paper

Compute: Not reported in the paper

Comparison to Prior Work

vs. M-MEMIT: LU-LAFNs targets shared language-agnostic neurons instead of treating languages as separate optimization constraints or simply aggregating them.
vs. ReMaKE: LU-LAFNs modifies internal representations (via cached updates) rather than retrieving external context.
vs. Bi-LoRA: LU-LAFNs avoids 'knowledge conflicts' inherent in parameter-efficient fine-tuning by isolating specific factual neurons.

Limitations

Relies on exact string matching for subjects during inference to trigger the cached update, which may be brittle.
The intersection method for finding LAFNs requires parallel data or translation, which might introduce noise.
Evaluation is limited to specific benchmarks (Bi-ZsRE, MzsRE) and may not cover all types of knowledge or languages.

Reproducibility

Code: https://github.com/XZhang00/LU-LAFNs

The code is publicly available at the repository above. The paper specifies the LLMs used (Llama-3.1-8B, Qwen2-7B) and the benchmarks (Bi-ZsRE, MzsRE), and it provides the neuron-selection thresholds.

📊 Experiments & Results

Evaluation Setup

Multilingual Knowledge Editing on factual QA tasks.

Benchmarks:

Bi-ZsRE (Bilingual Zero-Shot Relation Extraction, in question-answering form)
MzsRE (Multilingual Zero-Shot Relation Extraction)

Metrics:

Reliability (Success rate on the edited language)
Generality (Success rate on other languages / paraphrases)
Locality (Preservation of unrelated knowledge)
Statistical methodology: Not explicitly reported in the paper
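
The three metrics above can be sketched as simple ratios. The exact-match scoring and the definition of locality as agreement between pre-edit and post-edit predictions are assumptions in line with common knowledge-editing practice; the benchmarks may score differently (for example at the token level). All strings are toy placeholders.

```python
def success_rate(predictions, answers):
    """Fraction of predictions that exactly match the expected new answer."""
    hits = sum(p == a for p, a in zip(predictions, answers))
    return hits / len(answers)


def locality(pre_edit, post_edit):
    """Fraction of unrelated prompts whose prediction is unchanged by the edit."""
    same = sum(a == b for a, b in zip(pre_edit, post_edit))
    return same / len(pre_edit)


# Post-edit predictions on the edit prompts in the edited language.
reliability = success_rate(["new", "new", "old", "new"], ["new"] * 4)

# Post-edit predictions on paraphrases and other-language versions of the prompt.
generality = success_rate(["new", "old"], ["new"] * 2)

# Predictions on unrelated prompts before and after the edit.
loc = locality(["x", "y", "z", "w"], ["x", "y", "z", "q"])

print(reliability, generality, loc)
```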

Experiment Figures

Distribution of factual neurons (English, Chinese, and Shared/LAFNs) across layers for Llama-3 and Qwen2.

Main Takeaways

LU-LAFNs consistently outperforms baselines (M-MEMIT, Bi-LoRA, ReMaKE) across Reliability, Generality, and Locality metrics.
The method is effective because it targets the intersection of neurons (LAFNs), which represent the semantic core of the fact shared across languages, avoiding conflicts arising from disjoint updates.
Performance is sensitive to the number of updated neurons: editing too few limits reliability, while editing too many harms locality, so the best results come from an intermediate number.
Analyses show that LAFNs are primarily located in the middle and late layers of the LLMs (e.g., layers 12-18 for Llama-3/Qwen2).

📚 Prerequisite Knowledge

Prerequisites

Understanding of Transformer architectures, specifically Feed-Forward Networks (FFNs)
Knowledge Editing concepts (Locate-then-Edit)
Basic linear algebra (matrix operations in neural networks)

Key Terms

MKE: Multilingual Knowledge Editing—updating factual knowledge simultaneously across multiple languages in an LLM.

LAFNs: Language-Agnostic Factual Neurons—neurons in the FFN layers that activate for a specific fact regardless of the language used to express it.

FFN: Feed-Forward Network—a component within Transformer layers where knowledge is often hypothesized to be stored.

ROME: Rank-One Model Editing—a monolingual method that locates and edits knowledge by viewing MLP modules as key-value memories.

MEMIT: Mass-Editing Memory in a Transformer—a method enabling mass editing of factual knowledge in LLMs.

LoRA: Low-Rank Adaptation—a parameter-efficient fine-tuning technique.

Bi-ZsRE: A bilingual Zero-Shot Relation Extraction benchmark for evaluating knowledge editing.

MzsRE: A multilingual Zero-Shot Relation Extraction benchmark.

KL divergence: Kullback-Leibler divergence—a statistical distance measure used here to ensure the model's predictions on unrelated facts don't drift.