Official code for "Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models", published in Findings of EMNLP 2025

Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models

Matteo Bortoletto, Constantin Ruhdorfer, Lei Shi, Andreas Bulling

Findings of EMNLP 2025, Suzhou, China
ICML 2024 Workshop on Mechanistic Interpretability, Vienna, Austria

[Paper]

Under construction