From 08b730f342347fbbeb32bf3bdb8cb4a0148a4f82 Mon Sep 17 00:00:00 2001
From: Zhiming Hu
Date: Tue, 3 Jun 2025 14:06:04 +0200
Subject: [PATCH] update readme

---
 README.md | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/README.md b/README.md
index e69de29..1764412 100644
--- a/README.md
+++ b/README.md
@@ -0,0 +1,57 @@
+# HaHeAE: Learning Generalisable Joint Representations of Human Hand and Head Movements in Extended Reality
+
+
+## Abstract
+```
+Human hand and head movements are the most pervasive input modalities in extended reality (XR) and are significant for a wide range of applications.
+However, prior works on hand and head modelling in XR only explored a single modality or focused on specific applications.
+We present HaHeAE - a novel self-supervised method for learning generalisable joint representations of hand and head movements in XR.
+At the core of our method is an autoencoder (AE) that uses a graph convolutional network-based semantic encoder and a diffusion-based stochastic encoder to learn the joint semantic and stochastic representations of hand-head movements.
+It also features a diffusion-based decoder to reconstruct the original signals.
+Through extensive evaluations on three public XR datasets, we show that our method 1) significantly outperforms commonly used self-supervised methods by up to 74.1% in terms of reconstruction quality and is generalisable across users, activities, and XR environments, 2) enables new applications, including interpretable hand-head cluster identification and variable hand-head movement generation, and 3) can serve as an effective feature extractor for downstream tasks.
+Together, these results demonstrate the effectiveness of our method and underline the potential of self-supervised methods for jointly modelling hand-head behaviours in extended reality.
+```
+
+
+## Environment:
+Ubuntu 22.04
+Python 3.8+
+PyTorch 1.8.1
+
+
+## Usage:
+Step 1: Create the environment.
+```
+conda env create -f ./environment/haheae.yaml -n haheae
+conda activate haheae
+```
+
+Step 2: Follow the instructions at the [Pose2Gaze project][1] to process the datasets.
+
+
+Step 3: Set 'data_dir' in 'config.py' and 'main.py' to the directory of the processed datasets. Run 'train.sh' to evaluate the pre-trained models. To train the model from scratch, remove the pre-trained models and uncomment the training command (the command with "mode" set to "train").
+
+
+## Citation
+
+```bibtex
+@inproceedings{hu25hoigaze,
+  title={HOIGaze: Gaze Estimation During Hand-Object Interactions in Extended Reality Exploiting Eye-Hand-Head Coordination},
+  author={Hu, Zhiming and Haeufle, Daniel and Schmitt, Syn and Bulling, Andreas},
+  booktitle={Proceedings of the 2025 ACM Special Interest Group on Computer Graphics and Interactive Techniques},
+  year={2025}}
+
+@article{hu24pose2gaze,
+  author={Hu, Zhiming and Xu, Jiahui and Schmitt, Syn and Bulling, Andreas},
+  journal={IEEE Transactions on Visualization and Computer Graphics},
+  title={Pose2Gaze: Eye-body Coordination during Daily Activities for Gaze Prediction from Full-body Poses},
+  year={2024}}
+```
+
+
+## Acknowledgements
+Our work is built on the codebase of [Diffusion Autoencoders][2] and [DisMouse][3]. Thanks to the authors for sharing their code.
+
+[1]: https://github.com/CraneHzm/Pose2Gaze
+[2]: https://diff-ae.github.io/
+[3]: https://git.hcics.simtech.uni-stuttgart.de/public-projects/DisMouse
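
As a quick orientation for Step 3 of the patched README, below is a minimal sketch of how the evaluate/train switch might look. It assumes that 'train.sh' simply wraps 'main.py' and that the mode is selected via a command-line option; the actual script contents and option names are defined in the repository, and 'data_dir' is configured inside 'config.py' and 'main.py' rather than passed on the command line.

```
# Hypothetical sketch of train.sh -- illustration only; the real script and
# option names live in the repository. 'data_dir' is set in config.py and
# main.py (Step 3) and must point to the datasets processed in Step 2.

# Evaluate the released pre-trained models (the default described in Step 3):
python main.py --mode test

# Train from scratch: remove the pre-trained models first, then uncomment:
# python main.py --mode train
```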