code to train classifiers
parent 0403f2ce55
commit 34ff6100e6
7 changed files with 438 additions and 0 deletions
18
README.md
|
@@ -25,7 +25,25 @@ reproducing the paper results step by step:
|
|||
1. __Extract features from raw gaze data__:
|
||||
`python 00_compute_features.py` to compute gaze features for all participants
|
||||
Once extracted, the features are stored in `features/ParticipantXX/window_features_YY.npy` where XX is the participant number and YY the length of the sliding window in seconds.
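As a rough sketch, a feature file could be located and loaded as follows. The two-digit zero padding of the participant number, the integer window length, and the helper names `feature_path` and `load_features` are assumptions for illustration; the repository may format these differently.

```python
from pathlib import Path


def feature_path(participant, window_s, root="features"):
    """Build the path features/ParticipantXX/window_features_YY.npy.

    Assumption: XX is zero-padded to two digits and YY is written as-is.
    """
    folder = f"Participant{participant:02d}"
    return Path(root) / folder / f"window_features_{window_s}.npy"


def load_features(participant, window_s, root="features"):
    """Load one feature array.

    NumPy is imported lazily so that the path helper above also works
    in an environment without NumPy.
    """
    import numpy as np

    return np.load(feature_path(participant, window_s, root))
```

For instance, `feature_path(3, 15)` points at `features/Participant03/window_features_15.npy` under these assumptions.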
|
||||
2. __Train random forest classifiers__
|
||||
`./01_train_classifiers.sh` to reproduce the evaluation setting described in the paper in which each classifier was trained 100 times.
|
||||
`./02_train_specialized_classifiers.sh` to train specialized classifiers on parts of the data (specifically on data from inside the shop or on the way).
|
||||
|
||||
If the scripts cannot be executed, you might lack execute permission for them. On Linux, you can try `chmod +x 01_train_classifiers.sh`, `chmod +x 02_train_specialized_classifiers.sh` and `chmod +x 03_label_permutation_test.sh` (see below for when and how to use the last script).
|
||||
|
||||
If you want to call the training code differently, e.g. to speed up the computation or to try different parameters, you can pass the following arguments to `classifiers.train_classifier`:
|
||||
`-t` trait index between 0 and 6
|
||||
`-l` lowest number of repetitions, e.g. 0
|
||||
`-m` max number of repetitions, e.g. 100
|
||||
`-a` using partial data only: 0 (all data), 1 (way data), 2 (shop data)
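For example, the repetitions could be split into chunks so that several invocations run side by side. The sketch below only builds the command lines. The `python -m classifiers.train_classifier` invocation style and the reading of `-l` as inclusive and `-m` as exclusive are assumptions that the text above does not confirm, and `build_commands` is a hypothetical helper.

```python
import sys


def build_commands(trait, lowest, highest, chunk, subset=0):
    """Return one argument list per chunk of repetitions.

    Assumptions (not confirmed by the repository): the module is run
    with `python -m`, and repetitions cover lowest..highest-1.
    """
    if not 0 <= trait <= 6:
        raise ValueError("trait index must be between 0 and 6")
    if subset not in (0, 1, 2):
        raise ValueError("subset must be 0 (all), 1 (way) or 2 (shop)")
    if chunk <= 0:
        raise ValueError("chunk must be positive")

    commands = []
    for start in range(lowest, highest, chunk):
        stop = min(start + chunk, highest)
        commands.append([
            sys.executable, "-m", "classifiers.train_classifier",
            "-t", str(trait),
            "-l", str(start),
            "-m", str(stop),
            "-a", str(subset),
        ])
    return commands


if __name__ == "__main__":
    # Print the four commands for trait 0 on all data, 25 repetitions each.
    for cmd in build_commands(trait=0, lowest=0, highest=100, chunk=25):
        print(" ".join(cmd))
```

Each returned list can be handed to `subprocess.run` once the invocation style has been checked against the shell scripts.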
|
||||
|
||||
In case of performance issues, it might be useful to check `_conf.py` and change `max_n_jobs` to restrict the number of jobs (i.e. threads) running in parallel.
|
||||
|
||||
The results will be saved in `results/A0` for all data, `results/A1` for way data only and `results/A2` for data inside a shop. Each file is named `TTT_XXX.npz`, where TTT is the abbreviation of the personality trait (`O`,`C`,`E`,`A`,`N` for the Big Five and `CEI` or `PCS` for the two curiosity measures). XXX enumerates the classifiers (remember that we always train 100 classifiers for evaluation because there is some randomness involved in the training process).
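A small helper can collect the result files for one trait and one data subset. This is a sketch that relies only on the naming scheme described above; the helper name `result_files` is hypothetical, and the glob pattern deliberately makes no assumption about how XXX is padded.

```python
from pathlib import Path

# Trait abbreviations as listed above.
TRAITS = ("O", "C", "E", "A", "N", "CEI", "PCS")


def result_files(trait, subset=0, root="results"):
    """List results/A<subset>/<trait>_*.npz, sorted by path."""
    if trait not in TRAITS:
        raise ValueError(f"unknown trait abbreviation: {trait!r}")
    if subset not in (0, 1, 2):
        raise ValueError("subset must be 0 (all), 1 (way) or 2 (shop)")
    # The underscore in the pattern keeps "C" from matching "CEI" files.
    return sorted((Path(root) / f"A{subset}").glob(f"{trait}_*.npz"))
```

Each returned file can then be opened with `numpy.load`; since the stored array names are not documented here, inspecting the `.files` attribute of the loaded archive is a reasonable first step.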
|
||||
|
||||
3. __Evaluate Baselines__
|
||||
* To train a classifier that always predicts the most frequent personality score range from its current training set, please execute `python 03_train_baseline.py`
|
||||
* To train classifiers on permuted labels, i.e. perform the so-called label permutation test, please execute `./04_label_permutation_test.sh`
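The idea behind both baselines can be sketched in a few lines. This is an illustration of the general technique, not the code in `03_train_baseline.py` or the permutation script; the function names are hypothetical.

```python
import random
from collections import Counter


def majority_baseline(train_labels, n_test):
    """Predict the most frequent training label for every test sample."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return [most_common_label] * n_test


def permute_labels(labels, seed=None):
    """Return a shuffled copy of the labels.

    Shuffling breaks any link between features and labels while keeping
    the class distribution intact, so a classifier trained on the result
    shows what performance is reachable by chance.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled
```

A classifier trained on the true labels should clearly outperform both the majority prediction and classifiers trained on `permute_labels(...)` output; otherwise its score may not reflect anything learned from the gaze features.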
|
||||
|
||||
|
||||
## Citation
|
||||
|
|