public-projects/mental-states-in-LMs

Official code for "Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models", published in Findings of EMNLP 2025.




Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models

Matteo Bortoletto, Constantin Ruhdorfer, Lei Shi, Andreas Bulling

Findings of EMNLP 2025, Suzhou, China
ICML 2024 Workshop on Mechanistic Interpretability, Vienna, Austria

[Paper]

Under construction