Evaluation Setup
Qualitative assessment of scientific derivation and quantitative assessment of fact-checking
Benchmarks:
- Deconfabulation Trials (Fact-checking/Hallucination Detection) [New]
- Physics Derivation (Scientific Reasoning/Problem Solving) [New]
Metrics:
- Success rate (detection of confabulations)
- Qualitative accuracy (physics derivation vs. published literature)
- Statistical methodology: Not explicitly reported in the paper
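The success-rate metric can be illustrated with a minimal sketch. It assumes each trial is scored as a single pass/fail outcome (confabulation detected or not) and that the rate is reported as a percentage; the paper does not spell out its scoring procedure, so the function name `success_rate` and the sample outcomes below are hypothetical.

```python
from typing import Sequence


def success_rate(detected: Sequence[bool]) -> float:
    """Return the percentage of trials in which the confabulation was detected.

    Assumption: one boolean outcome per trial (True = detected).
    """
    if not detected:
        raise ValueError("at least one trial is required")
    return 100.0 * sum(detected) / len(detected)


# Hypothetical outcomes for four trials (illustrative, not data from the paper).
example_trials = [True, True, True, False]
print(success_rate(example_trials))  # prints 75.0
```

Under the same assumption, the Δ column in a results table is simply the difference between the two rates being compared.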
Key Results
Deconfabulation experiments tested the ability of simulated Sherlock Holmes and Watson personae to correctly identify unsupported claims. Physics derivation experiments assessed the model's ability to solve a problem outside its training horizon.

| Benchmark | Metric | Baseline | This Paper | Δ |
| --- | --- | --- | --- | --- |
| Deconfabulation Trials | Success rate | 0 | 90 | +90 |
| Physics Derivation | Qualitative match | Low quality/Vague | High quality/Exact Match | Qualitative improvement |
Main Takeaways
- Simulated personae can access and synthesize knowledge more effectively than direct prompting, likely due to behavioral cues encoded in the training data
- The strategy scales to complex tasks like deriving new physics equations and generating visualization code without specialized fine-tuning
- Adding 'props' (such as a whiteboard) and 'stage directions' keeps the model from giving lazy summaries and encourages detailed, step-by-step derivation
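The persona, prop, and stage-direction ideas in the takeaways above can be sketched as a small prompt builder. This is an illustration only, not the paper's actual template: the function name `build_scene_prompt`, the "Characters / Props / Task" layout, and the example task and stage direction are all assumptions. It only assembles a string and does not call any model API.

```python
from typing import Sequence


def build_scene_prompt(
    personae: Sequence[str],
    task: str,
    props: Sequence[str] = (),
    stage_directions: Sequence[str] = (),
) -> str:
    """Assemble a scene-style prompt (hypothetical layout, not the paper's)."""
    if not personae:
        raise ValueError("at least one persona is required")
    lines = ["Characters: " + ", ".join(personae)]
    if props:
        lines.append("Props: " + ", ".join(props))
    # Stage directions are written in brackets, as in a play script.
    for direction in stage_directions:
        lines.append(f"[{direction}]")
    lines.append("Task: " + task)
    # End on the first persona's speaker tag so a model would continue in character.
    lines.append(f"{personae[0]}:")
    return "\n".join(lines)


# Illustrative pairing of personae and prop; the task wording is made up.
prompt = build_scene_prompt(
    personae=["Sherlock Holmes", "Dr. Watson"],
    task="Work through the problem step by step.",
    props=["whiteboard"],
    stage_directions=["Holmes walks to the whiteboard and writes out each step"],
)
print(prompt)
```

Ending the prompt on a speaker tag is one common way to nudge a completion-style model into continuing the scene; whether the paper formats its prompts this way is not stated in this summary.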