
<div align="center">
<h1> Limits of Theory of Mind Modelling in Dialogue-Based Collaborative Plan Acquisition </h1>

**[Matteo Bortoletto][1], &nbsp; [Constantin Ruhdorfer][5], &nbsp; [Adnen Abdessaied][6], &nbsp; [Lei Shi][2], &nbsp; [Andreas Bulling][3]** <br> <br>
**ACL 2024, Bangkok, Thailand** <br>
**[[Paper][4]]**

</div>

## Citation

```bibtex
@inproceedings{bortoletto24_acl,
  author = {Bortoletto, Matteo and Ruhdorfer, Constantin and Abdessaied, Adnen and Shi, Lei and Bulling, Andreas},
  title = {Limits of Theory of Mind Modelling in Dialogue-Based Collaborative Plan Acquisition},
  booktitle = {Proc. 62nd Annual Meeting of the Association for Computational Linguistics (ACL)},
  year = {2024},
  pages = {1--16},
  doi = {}
}
```

## Dataset

[Original Dataset](https://huggingface.co/datasets/sled-umich/MindCraft)

[Extended Dataset](https://huggingface.co/datasets/sled-umich/MindCraft2)
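
Both datasets are hosted on the Hugging Face Hub, so one way to fetch them is `snapshot_download` from the `huggingface_hub` package. The sketch below is an assumption about a convenient workflow, not part of this repository: the `download_dataset` helper, the `DATASETS` mapping, and the `data/` target folder are hypothetical names, and the training scripts may expect the data in a different location.

```python
from pathlib import Path

# Dataset repositories linked above; the short keys are illustrative only.
DATASETS = {
    "original": "sled-umich/MindCraft",
    "extended": "sled-umich/MindCraft2",
}


def download_dataset(name: str, target_root: str = "data") -> Path:
    """Download one dataset snapshot into <target_root>/<name> and return that path.

    The folder layout is an assumption, not something the repository prescribes.
    """
    repo_id = DATASETS[name]  # raises KeyError for unknown names
    local_dir = Path(target_root) / name

    # Imported lazily so this module loads even if huggingface_hub is missing.
    from huggingface_hub import snapshot_download

    snapshot_download(repo_id=repo_id, repo_type="dataset", local_dir=str(local_dir))
    return local_dir
```

Calling `download_dataset("original")` would place the files under `data/original`; adjust the path to wherever the scripts read their data.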


## Code overview

The code is based on the [original implementation](https://github.com/sled-group/collab-plan-acquisition/tree/main).

### Theory of Mind (ToM) tasks

Baselines:
- training code: `baselines_with_dialogue_moves.py`
- bash script: `baselines_with_dialogue_moves.sh`
- model: `src/models/model_with_dialogue_moves.py`

Graph models:
- training code: `baselines_with_dialogue_moves_graphs.py`
- bash script: `baselines_with_dialogue_moves_graphs.sh`
- model: `src/models/model_with_dialogue_moves_graphs.py`

To extract ToM features, run `intermediate_representations.py`.

### Collaborative plan acquisition (CPA) tasks

Baselines:
- training code: `plan_predictor.py`
- bash script: `run_plan_predictor.sh`
- model: `src/models/plan_model.py`

Baselines with ToM ground-truth as input:
- training code: `plan_predictor_oracle.py`
- bash script: `run_plan_predictor_oracle.sh`
- model: `src/models/plan_model_oracle.py`

Graph models:
- training code: `plan_predictor_graphs.py`
- bash script: `run_plan_predictor_graphs.sh`
- model: `src/models/plan_model_graphs.py`

Graph models with ToM ground-truth as input:
- training code: `plan_predictor_graphs_oracle.py`
- bash script: `run_plan_predictor_graphs_oracle.sh`
- model: `src/models/plan_model_graphs_oracle.py`

Graph models with full candidate sampling (i.e. considering all possible candidate edges):
- training code: `plan_predictor_graphs_allcand.py`
- bash script: `run_plan_predictor_graphs_allcand.sh`
- model: `src/models/plan_model_graphs.py` (same as the constrained selection)
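
To picture the difference between constrained selection and full candidate sampling, the following sketch enumerates candidate edges over a small node set. It is illustrative only: the function names, the node representation, the choice of directed edges without self-loops, and the `is_allowed` predicate are all assumptions and are not taken from `plan_predictor_graphs_allcand.py`.

```python
from itertools import permutations
from typing import Callable, Hashable, Iterable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def all_candidate_edges(nodes: Iterable[Hashable]) -> List[Edge]:
    """Full candidate set: every ordered pair of distinct nodes.

    Assumes directed edges and no self-loops (an illustrative choice).
    """
    return list(permutations(nodes, 2))


def constrained_candidate_edges(
    nodes: Iterable[Hashable],
    is_allowed: Callable[[Hashable, Hashable], bool],
) -> List[Edge]:
    """Constrained candidate set: keep only pairs accepted by a predicate.

    `is_allowed` is a hypothetical stand-in for whatever constraint
    the constrained selection applies.
    """
    return [(u, v) for u, v in all_candidate_edges(nodes) if is_allowed(u, v)]
```

With `n` nodes the full set contains `n * (n - 1)` ordered pairs, so it grows quadratically, whereas the constrained set can be much smaller depending on the predicate.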

To train the graph model with int0 features as input, run `run_plan_predictor_graphs_int0.sh`.

### Analysis
- correlation analysis: `compare_tom_cpa.ipynb`
- diagnostic probing: `logistic_regression_tom_feats.py`. To extract CPA features, use `plan_predictor_graphs_test.py`
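
As a rough illustration of diagnostic probing, the sketch below fits a logistic-regression probe on fixed feature vectors and reports how well a binary label can be read out from them. It is a minimal, dependency-free stand-in and not the code in `logistic_regression_tom_feats.py`, which may well use a library implementation; the function names, hyperparameters, and the binary-label setup are assumptions.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_probe(feats, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression probe with full-batch gradient descent.

    feats: list of equal-length feature vectors (kept fixed, never updated).
    labels: list of 0/1 targets.
    """
    n, d = len(feats), len(feats[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(feats, labels):
            # Prediction error on this example under the current probe.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(d):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def probe_accuracy(w, b, feats, labels):
    """Fraction of examples whose label the probe predicts correctly."""
    preds = [int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0) for x in feats]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)
```

In practice the probe would be fitted on features from a training split and evaluated on held-out data, so that its accuracy reflects what the representations encode and not memorisation by the probe.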

[1]: https://mattbortoletto.github.io/
[2]: https://perceptualui.org/people/shi/
[3]: https://perceptualui.org/people/bulling/
[4]: https://arxiv.org/pdf/2405.12621
[5]: https://perceptualui.org/people/ruhdorfer/
[6]: https://perceptualui.org/people/abdessaied/