One Persona, Many Cues, Different Results: How Sociodemographic Cues Impact LLM Personalization

📝 Paper Summary

LLM Personalization Bias and Fairness Prompt Robustness

Different methods of signaling a user's sociodemographic identity to an LLM (names, explicit mentions, conversation history) yield significantly different personalization biases, suggesting reliance on single cues is methodologically flawed.

Core Problem

Prior research typically evaluates LLM personalization bias using a single persona cue (e.g., just a name or just a system prompt statement), ignoring how sensitive models are to prompt variations.

Why it matters:

Reliance on single cues threatens external validity: findings might be artifacts of the specific prompt rather than true model behavior
High-stakes domains like health and legal advice show disparate outcomes based on persona, but the extent of this bias varies by how the persona is introduced
Some common cues (explicit mentions) are unnatural in real-world usage, potentially overestimating bias compared to natural cues (conversation history)

Concrete Example: When a user asks 'Should I go to the emergency room?', Gemma-3-27B answers 'No' if prompted with 'The user is female', but 'Yes' if prompted with a conversation history typical of a female user. This inconsistency means bias audits using only one method could be misleading.

Key Novelty

Systematic Multi-Cue Personalization Evaluation

Compare six different persona cues (names, explicit mentions, conversation histories) across three sociodemographic variables (gender, race, age) to measure consistency
Introduce the concept of 'external validity' for persona cues, distinguishing between artificial explicit prompts and natural implicit identity markers like conversation history

Evaluation Highlights

Explicit mentions in user prompts cause significant disparities across personas in 20/24 experimental combinations, vastly more than names in system prompts (1/24)
High correlation between cues (ρ > 0.9) masks significant distributional differences: on medical advice tasks, explicit cues trigger different decisions than natural conversation histories
Non-binary personas face significant disparities in 6/8 tasks (e.g., lower accuracy on AITA verdict prediction), often receiving more liberal or cautious advice than male/female personas

Breakthrough Assessment

7/10

Strong methodological critique that invalidates single-cue bias studies. Provides actionable recommendations for future personalization research, though the technical novelty is in the rigorous comparison rather than a new model architecture.

⚙️ Technical Details

Problem Definition

Setting: Controlled generation from LLMs conditioned on specific sociodemographic personas using various prompting strategies

Inputs: A prompt containing a question q and a persona cue c representing a specific attribute a (age, gender, race)

Outputs: Generated response r (either open-ended text or closed-ended decision)

Pipeline Flow

Prompt Construction (combine Question + Persona Cue)
LLM Inference (generate 3 responses per prompt)
Response Parsing (extract Yes/No, number, or text)
Evaluation (Ground truth comparison or LLM-as-a-judge)

System Modules

Prompt Constructor

Injects persona information via one of 6 cues (e.g., name-system, history-human) into the prompt template

Model or implementation: Rule-based templates

Target LLM

Generates responses to the personalized prompts

Model or implementation: Various (Llama 3.1, Gemma 3, Qwen 2.5, GPT-4o-mini)

Evaluator

Scores the response based on task type (Accuracy, Stance, or Salary value)

Model or implementation: Regex matching or Llama-3.3-70B (Judge)

Novel Architectural Elements

Comparison framework integrating 6 distinct persona cues across varying levels of explicitness and external validity

Modeling

Base Model: Evaluated 7 models: Llama-3.1-70B-Instruct, Llama-3.1-8B-Instruct, Gemma-3-12b-it, Gemma-3-27b-it, Qwen2.5-14B-Instruct, Qwen2.5-72B-Instruct, gpt-4o-mini

Reproducibility

Code: https://github.com/frawee/persona_cues/

📚 Prerequisite Knowledge

Prerequisites

Familiarity with prompt engineering and persona-based prompting
Understanding of LLM bias and fairness metrics
Basic knowledge of statistical significance testing (ANOVA, Tukey-Kramer)

Key Terms

persona cue: The specific method used to introduce a sociodemographic profile to the model (e.g., a name, a system prompt, or a chat history)

external validity: The extent to which research findings (here, bias measurements) generalize to real-world settings (real user interactions)

implicit identity markers: Subtle cues in language use or conversation history that signal demographics without explicitly stating them

Spearman correlation: A statistical measure of rank correlation, used here to check if different cues lead to similar ranking of model outputs

LLM-as-a-judge: Using an LLM (here Llama-3.3-70B) to evaluate the quality or stance of text generated by another LLM

Tukey-Kramer test: A post-hoc statistical test used after ANOVA to determine exactly which means differ significantly from each other, accounting for unequal sample sizes

AITA: Am I The Asshole? — a dataset based on Reddit posts where users describe a conflict and ask for moral judgment