Int-HRL/README.md

# Int-HRL

This is the official repository for [Int-HRL: Towards Intention-based Hierarchical Reinforcement Learning](https://collaborative-ai.org/publications/penzkofer24_ncaa/)<br> 

Int-HRL uses eye gaze from human demonstration data on the Atari game Montezuma's Revenge to extract human player's intentions and converts them to sub-goals for Hierarchical Reinforcement Learning (HRL). For further details take a look at the corresponding paper. 

## Dataset 
Atari-HEAD: Atari Human Eye-Tracking and Demonstration Dataset available at [https://zenodo.org/record/3451402#.Y5chr-zMK3J](https://zenodo.org/record/3451402#.Y5chr-zMK3J) <br>
![Atari-HEAD Montezuma's Revenge](supplementary/291_RZ_7364933_May-08-20-23-25.gif)

To pre-process the Atari-HEAD data run [Preprocess_AtariHEAD.ipynb](Preprocess_AtariHEAD.ipynb), yielding the `all_trials.pkl` file needed for the following steps. 

## Sub-goal Extraction Pipeline

1. [RAM State Labeling](RAMStateLabeling.ipynb): annotate Atari-HEAD data with room id and level information, as well as agent and skull location    
2. [Subgoals From Gaze](SubgoalsFromGaze.ipynb): run sub-goal proposal extraction by generating saliency maps
3. [Alignment with Trajectory](TrajectoryMatching.ipynb): run expert trajectory to get order of subgoals 

## Intention-based Hierarchical RL Agent
The Int-HRL agent is based on the hierarchically guided Imitation Learning method (hg-DAgger/Q), where we adapted code from [https://github.com/hoangminhle/hierarchical_IL_RL](https://github.com/hoangminhle/hierarchical_IL_RL) <br>

![results](supplementary/results_updated.png)

Due to the novel sub-goal extraction pipeline, our agent does not require experts during training and is more than three times more sample efficient compared to hg-DAgger/Q. <br>

To run the full agent with 12 separate low-level agents for sub-goal execution, run [agent/run_experiment.py](agent/run_experiment.py), for single agents (one low-level agent for all sub-goals) run [agent/single_agent_experiment.py](agent/single_agent_experiment.py). 

## Extension to Venture and Hero 
under construction


## Citation 
Please consider citing these paper if you use Int-HRL or parts of this repository in your research:
```
@article{penzkofer24_ncaa,
  author = {Penzkofer, Anna and Schaefer, Simon and Strohm, Florian and Bâce, Mihai and Leutenegger, Stefan and Bulling, Andreas},
  title = {Int-HRL: Towards Intention-based Hierarchical Reinforcement Learning},
  journal = {Neural Computing and Applications (NCAA)},
  year = {2024},
  pages = {1--7},
  doi = {10.1007/s00521-024-10596-2},
  volume = {36}
}
@inproceedings{penzkofer23_ala,
  author = {Penzkofer, Anna and Schaefer, Simon and Strohm, Florian and Bâce, Mihai and Leutenegger, Stefan and Bulling, Andreas},
  title = {Int-HRL: Towards Intention-based Hierarchical Reinforcement Learning},
  booktitle = {Proc. Adaptive and Learning Agents Workshop (ALA)},
  year = {2023},
  doi = {10.48550/arXiv.2306.11483},
  pages = {1--7}
}
```
Initial commit 2024-02-27 16:09:59 +01:00			`# Int-HRL`

add Int-HRL agent scripts 2025-03-12 18:46:09 +01:00			`This is the official repository for [Int-HRL: Towards Intention-based Hierarchical Reinforcement Learning](https://collaborative-ai.org/publications/penzkofer24_ncaa/)<br>`
add subgoal generation pipeline 2025-03-12 18:20:56 +01:00
			`Int-HRL uses eye gaze from human demonstration data on the Atari game Montezuma's Revenge to extract human player's intentions and converts them to sub-goals for Hierarchical Reinforcement Learning (HRL). For further details take a look at the corresponding paper.`

			`## Dataset`
			`Atari-HEAD: Atari Human Eye-Tracking and Demonstration Dataset available at [https://zenodo.org/record/3451402#.Y5chr-zMK3J](https://zenodo.org/record/3451402#.Y5chr-zMK3J) <br>`
			`![Atari-HEAD Montezuma's Revenge](supplementary/291_RZ_7364933_May-08-20-23-25.gif)`

			To pre-process the Atari-HEAD data run [Preprocess_AtariHEAD.ipynb](Preprocess_AtariHEAD.ipynb), yielding the `all_trials.pkl` file needed for the following steps.

			`## Sub-goal Extraction Pipeline`

			`1. [RAM State Labeling](RAMStateLabeling.ipynb): annotate Atari-HEAD data with room id and level information, as well as agent and skull location`
			`2. [Subgoals From Gaze](SubgoalsFromGaze.ipynb): run sub-goal proposal extraction by generating saliency maps`
			`3. [Alignment with Trajectory](TrajectoryMatching.ipynb): run expert trajectory to get order of subgoals`

			`## Intention-based Hierarchical RL Agent`
add Int-HRL agent scripts 2025-03-12 18:46:09 +01:00			`The Int-HRL agent is based on the hierarchically guided Imitation Learning method (hg-DAgger/Q), where we adapted code from [https://github.com/hoangminhle/hierarchical_IL_RL](https://github.com/hoangminhle/hierarchical_IL_RL) <br>`

add results figure 2025-03-12 18:50:01 +01:00			`![results](supplementary/results_updated.png)`
add results figure 2025-03-12 18:48:39 +01:00
add Int-HRL agent scripts 2025-03-12 18:46:09 +01:00			`Due to the novel sub-goal extraction pipeline, our agent does not require experts during training and is more than three times more sample efficient compared to hg-DAgger/Q. <br>`

			`To run the full agent with 12 separate low-level agents for sub-goal execution, run [agent/run_experiment.py](agent/run_experiment.py), for single agents (one low-level agent for all sub-goals) run [agent/single_agent_experiment.py](agent/single_agent_experiment.py).`

			`## Extension to Venture and Hero`
add subgoal generation pipeline 2025-03-12 18:20:56 +01:00			`under construction`

add Int-HRL agent scripts 2025-03-12 18:46:09 +01:00
add subgoal generation pipeline 2025-03-12 18:20:56 +01:00			`## Citation`
			`Please consider citing these paper if you use Int-HRL or parts of this repository in your research:`
			```
			`@article{penzkofer24_ncaa,`
			`author = {Penzkofer, Anna and Schaefer, Simon and Strohm, Florian and Bâce, Mihai and Leutenegger, Stefan and Bulling, Andreas},`
			`title = {Int-HRL: Towards Intention-based Hierarchical Reinforcement Learning},`
			`journal = {Neural Computing and Applications (NCAA)},`
			`year = {2024},`
			`pages = {1--7},`
			`doi = {10.1007/s00521-024-10596-2},`
			`volume = {36}`
			`}`
			`@inproceedings{penzkofer23_ala,`
			`author = {Penzkofer, Anna and Schaefer, Simon and Strohm, Florian and Bâce, Mihai and Leutenegger, Stefan and Bulling, Andreas},`
			`title = {Int-HRL: Towards Intention-based Hierarchical Reinforcement Learning},`
			`booktitle = {Proc. Adaptive and Learning Agents Workshop (ALA)},`
			`year = {2023},`
			`doi = {10.48550/arXiv.2306.11483},`
			`pages = {1--7}`
			`}`
			```