Code for the paper "Benchmarking Mental State Representations in Language Models", ICML 2024 Workshop on Mechanistic Interpretability
Benchmarking Mental State Representations in Language Models
Matteo Bortoletto, Constantin Ruhdorfer, Lei Shi, Andreas Bulling
ICML 2024 Workshop on Mechanistic Interpretability, Vienna, Austria
[Paper]
Citation
@inproceedings{bortoletto2024benchmarking,
  title={Benchmarking Mental State Representations in Language Models},
  author={Matteo Bortoletto and Constantin Ruhdorfer and Lei Shi and Andreas Bulling},
  booktitle={ICML 2024 Workshop on Mechanistic Interpretability},
  year={2024},
  url={https://openreview.net/forum?id=yEwEVoH9Be}
}
This repository is under construction.