# VD-GR: Boosting Visual Dialog with Cascaded Spatial-Temporal Multi-Modal GRaphs
**[Adnen Abdessaied][5], [Lei Shi][6], [Andreas Bulling][7]**
**WACV'24, Hawaii, USA**
**[[Paper][8]]**
-------------------
# Citation
If you find our code useful or use it in your own projects, please cite our paper:
```bibtex
@inproceedings{abdessaied_vdgr,
author = {Abdessaied, Adnen and Shi, Lei and Bulling, Andreas},
title = {{VD-GR: Boosting Visual Dialog with Cascaded Spatial-Temporal Multi-Modal GRaphs}},
booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
year = {2024},
}
```
# Table of Contents
* [Setup and Dependencies](#Setup-and-Dependencies)
* [Download Data](#Download-Data)
* [Pre-trained Checkpoints](#Pre-trained-Checkpoints)
* [Training](#Training)
* [Results](#Results)
# Setup and Dependencies
We implemented our model using Python 3.7 and PyTorch 1.11.0 (CUDA 11.3, CuDNN 8.2.0). We recommend setting up a virtual environment using Anaconda.
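
Once the environment is set up, a quick sanity check can confirm that the interpreter and PyTorch build match the versions listed above. The following is a minimal sketch, not part of the released code: the function name `check_environment` and the `EXPECTED_*` constants are hypothetical and only mirror the version numbers stated in this section.

```python
import sys

# Versions stated in this README. The names below are illustrative
# assumptions, not part of the VD-GR code base.
EXPECTED_PYTHON = (3, 7)
EXPECTED_TORCH = "1.11.0"
EXPECTED_CUDA = "11.3"


def check_environment():
    """Return a list of human-readable mismatches (empty if all checks pass)."""
    problems = []

    # Compare only the major.minor part of the running interpreter.
    found_python = sys.version_info[:2]
    if found_python != EXPECTED_PYTHON:
        problems.append(
            "Python %d.%d found, expected %d.%d"
            % (found_python + EXPECTED_PYTHON)
        )

    # Import lazily so the check still reports something without PyTorch.
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
        return problems

    # torch.__version__ may carry a build suffix such as "+cu113".
    if not torch.__version__.startswith(EXPECTED_TORCH):
        problems.append(
            "PyTorch %s found, expected %s" % (torch.__version__, EXPECTED_TORCH)
        )

    # torch.version.cuda is None for CPU-only builds.
    if torch.version.cuda != EXPECTED_CUDA:
        problems.append(
            "CUDA build %s found, expected %s" % (torch.version.cuda, EXPECTED_CUDA)
        )

    if not torch.cuda.is_available():
        problems.append("No CUDA device is visible to PyTorch")

    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment matches the versions listed in this README.")
```

Running the script inside the activated environment prints one warning per mismatch. Other version combinations may still work; the check only flags deviations from the configuration we used.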