diff --git a/README.md b/README.md
index 19969a5..c7904f9 100644
--- a/README.md
+++ b/README.md
@@ -40,14 +40,15 @@ of the art —while only requiring a fraction of training data. Moreover, we dem
 
 ## Scene Data
 
-We used CLEVR and Minecraft images in this project. The raw images have a large footprint and we won't upload them. However, we provide their json file as well as their derendedred versions. They can be found in :
+We used CLEVR and Minecraft images in this project. The raw images have a large footprint and we won't upload them. However, we provide their json file as well as their derendedred versions:
 
-- ``data/scenes/raw``
-- ``data/scenes/derendered``
+- Original clevr-dialog training and validation raw scenes: [⬇](https://dl.fbaipublicfiles.com/clevr/CLEVR_v1.0.zip)
+- Raw scenes we used in our experiments: [⬇](https://1drv.ms/u/s!AlGoPLjLV-BOh1fdB30GscvRnFAt?e=Xtorzr)
+- All derendered scenes: [⬇](https://1drv.ms/u/s!AlGoPLjLV-BOh0d00ynwnXQO14da?e=Ub6k33)
 
 ## Dialog Data
 
-The dialog data we used can be found in ``data/dialogs``.
+The dialog data we used can be found here [⬇](https://1drv.ms/u/s!AlGoPLjLV-BOhzaYs3s2qSLbGTL_?e=oGGrxr)
 You can also create your own data using the ``generate_dataset.py`` script.
 
 # Preprocessing
@@ -60,15 +61,21 @@ The derendered scenes do not need any further preprocessing and can be diretly u
 
 To preprocess the dialogs, follow these steps:
 
-- ```cd preprocess_dialogs```
+```bash
+cd preprocess_dialogs
+```
 
 For the stack encoder, execute
 
-- ```python preprocess.py --input_dialogs_json --input_vocab_json '' --output_vocab_json --output_h5_file --split --mode stack```
+```python
+python preprocess.py --input_dialogs_json --input_vocab_json '' --output_vocab_json --output_h5_file --split --mode stack
+```
 
 For the concat encoder, execute
 
-- ```python preprocess.py --input_dialogs_json --input_vocab_json '' --output_vocab_json --output_h5_file --split --mode concat```
+```python
+python preprocess.py --input_dialogs_json --input_vocab_json '' --output_vocab_json --output_h5_file --split --mode concat
+```
 
 # Training
 
@@ -80,17 +87,23 @@ First, change directory
 
 To train the caption parser, execute
 
-- ```python train_caption_parser.py --mode train --run_dir --res_path --dataPathTr --dataPathVal --dataPathTest --vocab_path ```
+```python
+python train_caption_parser.py --mode train --run_dir --res_path --dataPathTr --dataPathVal --dataPathTest --vocab_path
+```
 
 ## Question Program Parser
 
 To train the question program parser with the stack encoder, execute
 
-- ```python train_question_parser.py --mode train --run_dir --text_log_dir --dataPathTr --dataPathVal --dataPathTest --scenePath --vocab_path --encoder_type 2```
+```python
+python train_question_parser.py --mode train --run_dir --text_log_dir --dataPathTr --dataPathVal --dataPathTest --scenePath --vocab_path --encoder_type 2
+```
 
 To train the question program parser with the concat encoder, execute
 
-- ```python train_question_parser.py --mode train --run_dir --text_log_dir --dataPathTr --dataPathVal --dataPathTest --scenePath --vocab_path --encoder_type 1```
+```python
+python train_question_parser.py --mode train --run_dir --text_log_dir --dataPathTr --dataPathVal --dataPathTest --scenePath --vocab_path --encoder_type 1
+```
 
 ## Baselines
 
@@ -102,11 +115,15 @@ To train the question program parser with the concat encoder, execute
 
 To evaluate using the *Hist+GT* scheme, execute
 
-- ```python train_question_parser.py --mode test_with_gt --run_dir --text_log_dir --dataPathTr --dataPathVal --dataPathTest --scenePath --vocab_path --encoder_type <1/2> --questionNetPath --captionNetPath --dialogLen --last_n_rounds ```
+```python
+python train_question_parser.py --mode test_with_gt --run_dir --text_log_dir --dataPathTr --dataPathVal --dataPathTest --scenePath --vocab_path --encoder_type <1/2> --questionNetPath --captionNetPath --dialogLen --last_n_rounds
+```
 
 To evaluate using the *Hist+Pred* scheme, execute
 
-- ```python train_question_parser.py --mode test_with_pred --run_dir --text_log_dir --dataPathTr --dataPathVal --dataPathTest --scenePath --vocab_path --encoder_type <1/2> --questionNetPath --captionNetPath --dialogLen --last_n_rounds ```
+```python
+python train_question_parser.py --mode test_with_pred --run_dir --text_log_dir --dataPathTr --dataPathVal --dataPathTest --scenePath --vocab_path --encoder_type <1/2> --questionNetPath --captionNetPath --dialogLen --last_n_rounds
+```
 
 # Results
 