Using Contrastive Learning to Improve Two-Way Reasoning in Large Language Models: The Obfuscation Task as a Case Study

SL Nikiema, J Samhi, MB Moumoula, AE Djiré…
University of Luxembourg
arXiv, September 2025
Reasoning Benchmark

📝 Paper Summary

Code Understanding · LLM Evaluation · Fine-tuning Techniques
The paper proposes bidirectional reasoning (performing both obfuscation and deobfuscation) as a test for true understanding and introduces Contrastive Fine-Tuning to overcome the 'cognitive specialization' that limits current models to unidirectional pattern matching.
Core Problem
Standard fine-tuning creates 'cognitive specialization,' where models learn to perform a forward task (like obfuscation) but lose or fail to develop the ability to reverse it (deobfuscation), indicating mere pattern matching rather than genuine semantic understanding.
Why it matters:
  • Models deployed in high-stakes software engineering require genuine semantic understanding, not just surface-level pattern replication, to be reliable and robust
  • Current evaluation benchmarks often fail to distinguish between sophisticated memorization of training patterns and true comprehension of underlying logic
  • Adversarial robustness studies show code models are brittle to simple semantic-preserving transformations, limiting their generalizability
Concrete Example: A model fine-tuned to obfuscate variable names (changing 'userIndex' to 'i') achieves 81% success, but when asked to reverse this process (deobfuscate 'i' back to 'userIndex' or a meaningful name), it fails completely (~0% success), even though the transformation is logically reversible.
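The transformation in this example can be made concrete with a small sketch. The rename table and code snippet below are illustrative assumptions; the paper's actual obfuscation is performed by the model on full programs, not by a lookup table. The point is that the mapping is logically invertible, so a model that understands it should be able to run it in either direction.

```python
import re

# Hypothetical rename map: meaningful identifier -> opaque identifier.
RENAME_MAP = {"userIndex": "i", "totalSum": "t"}
# The reverse mapping is well-defined, so deobfuscation is logically possible.
INVERSE_MAP = {v: k for k, v in RENAME_MAP.items()}

def rename(code: str, mapping: dict) -> str:
    # Replace whole-word identifier occurrences only, so 'i' inside
    # 'in' or 'range' is left untouched.
    for old, new in mapping.items():
        code = re.sub(rf"\b{re.escape(old)}\b", new, code)
    return code

src = "for userIndex in range(10): totalSum += userIndex"
obf = rename(src, RENAME_MAP)           # forward: obfuscate
assert rename(obf, INVERSE_MAP) == src  # reverse: deobfuscation recovers src
```

A standard fine-tuned model masters only the forward call of this pair; the paper's finding is that it cannot perform the reverse one, even though the inverse mapping exists.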
Key Novelty
Bidirectional Reasoning Hypothesis & Contrastive Fine-Tuning
  • Proposes that true understanding implies reversibility: if a model understands a transformation (like obfuscation), it should naturally be able to reverse it (deobfuscate) without explicit training
  • Identifies 'cognitive specialization' as a pathology where models optimize for one direction at the expense of the other
  • Adapts Contrastive Fine-Tuning (CFT) from vision learning to code, using triplets (original, obfuscated, and negative examples) to force the model to learn deep semantic representations rather than surface patterns
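The triplet objective behind CFT can be sketched as follows. This is a minimal illustration of a standard triplet margin loss, not the paper's implementation: the toy 3-d embeddings, the `embed` placeholder, and the margin value are assumptions made for readability. In a real setup the embeddings would come from the model's hidden states and the loss would be backpropagated during fine-tuning.

```python
def embed(label: str) -> list[float]:
    # Placeholder encoder: maps each triplet member to a toy embedding.
    # Semantically equivalent code (original vs. obfuscated) sits nearby;
    # the negative (an unrelated program) sits far away.
    table = {
        "original":   [1.0, 0.0, 0.0],
        "obfuscated": [0.9, 0.1, 0.0],
        "negative":   [0.0, 1.0, 0.0],
    }
    return table[label]

def dist(a: list[float], b: list[float]) -> float:
    # Euclidean distance between two embeddings.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def triplet_loss(anchor: str, positive: str, negative: str,
                 margin: float = 0.5) -> float:
    # Pull (original, obfuscated) together, push the negative apart
    # by at least `margin`; loss is zero once the triplet is separated.
    a, p, n = embed(anchor), embed(positive), embed(negative)
    return max(dist(a, p) - dist(a, n) + margin, 0.0)

loss = triplet_loss("original", "obfuscated", "negative")
```

Because the loss depends only on relative distances between semantically equivalent and inequivalent programs, it rewards representations of meaning rather than of surface token patterns, which is the mechanism the paper credits for unlocking the reverse direction.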
Evaluation Highlights
  • Standard fine-tuning achieves ~0% success on reverse (deobfuscation) tasks despite high forward performance (>80% for some models), confirming cognitive specialization
  • Contrastive Fine-Tuning (CFT) enables 39-52% reverse (deobfuscation) performance across multiple models without explicit reverse training, compared to ~0% for standard fine-tuning
  • CFT maintains forward task capabilities while unlocking bidirectional reasoning, effectively bridging the gap between pattern matching and semantic understanding
Breakthrough Assessment
8/10
Identifies a fundamental 'cognitive specialization' failure mode in LLMs and provides a successful training fix (CFT) that unlocks zero-shot reversibility. Significant for understanding vs. memorization debates.