classifiers | ||
config | ||
data | ||
featureExtraction | ||
info | ||
.gitignore | ||
00_compute_features.py | ||
01_train_classifiers.sh | ||
02_train_specialized_classifiers.sh | ||
03_train_baseline.py | ||
04_label_permutation_test.sh | ||
05_plot_weights.py | ||
06_baselines.py | ||
07_evaluation_across_contexts.py | ||
08_descriptive.py | ||
09_plot_ws_hist.py | ||
__init__.py | ||
LICENSE | ||
README.md |
Eye movements during everyday behavior predict personality traits
Sabrina Hoppe, Tobias Loetscher, Stephanie Morey and Andreas Bulling
This repository provides all data and code used for the publication in Frontiers in Human Neuroscience.
Dataset
- Gaze data recorded at 60Hz from 42 participants is stored in
data/ParticipantXX
.
For each participant there are three files:events.csv
is a list of gaze events as provided by the SMI eye tracker software. The list contains saccades, fixations and blinks but only the blink information was used in the code.gaze_positions.csv
is a table with three columns: time in seconds, x gaze coordinate and y gaze coordinate. The x and y coordinates describe the participants' gaze direction normalised to the range from 0 to 1.pupil_diameter.csv
is another table with three columns: time in seconds, diameter of the right eye and diameter of the left eye. The diameter values are absolute gaze estimates in mm.
All files are of the same length and each row corresponds to one data sample. That is, the n-th row in all three files belongs to the same point in time.
-
Ground truth personality scores from the respective questionnaires, participant age and sex (1: male, 2: female) can be found in
info/personality_sex_age.csv
. -
Personality score ranges that were obtained by binning the questionnaire scores are provided in
info/binned_personality.csv
. -
Timestamps indicating the times when participants entered and left the shop are given in
info/annotation.csv
in seconds.
Code
reproducing the paper results step by step:
-
Extract features from raw gaze data:
python 00_compute_features.py
to compute gaze features for all participants
Once extracted, the features are stored infeatures/ParticipantXX/window_features_YY.npy
where XX is the participant number and YY the length of the sliding window in seconds. -
Train random forest classifiers
./01 train_classifiers.sh
to reproduce the evaluation setting described in the paper in which each classifier was trained 100 times.
./02_train_specialized_classifiers.sh
to train specialized classifiers on parts of the data (specifically on data from inside the shop or on the way).If the scripts cannot be executed, you might not have the right access permissions to do so. On Linux, you can try
chmod +x 01_train_classifiers.sh
,chmod +x 02_train_specialized_classifiers.sh
andchmod +x 03_label_permutation_test.sh
(see below for when/how to use the last script).In case you want to call the script differently, e.g. to speed-up the computation or try with different parameters, you can pass the following arguments to
classifiers.train_classifier
:
-t
trait index between 0 and 6
-l
lowest number of repetitions, e.g. 0
-m
max number of repetitions, e.g. 100
-a
using partial data only: 0 (all data), 1 (way data), 2(shop data)In case of performance issues, it might be useful to check
_conf.py
and changemax_n_jobs
to restrict the number of jobs (i.e. threads) running in parallel.The results will be saved in
results/A0
for all data,results/A1
for way data only andresults/A2
for data inside a shop. Each file is namedTTT_XXX.npz
, where TTT is the abbreviation of the personality trait (O
,C
,E
,A
,N
for the Big Five andCEI
orPCS
for the two curiosity measures). XXX enumerates the classifiers (remember that we always train 100 classifiers for evaluation because there is some randomness involved in the training process). -
Train baselines
- To train a classifier that always predicts the most frequent personality score range from its current training set, please execute
python 03_train_baseline.py
- To train classifiers on permuted labels, i.e. perform the so-called label permutation test, please execute
./04_label_permutation_test.sh
- To train a classifier that always predicts the most frequent personality score range from its current training set, please execute
-
Performance analysis
- Run
python 05_plot_weights.py
to extract feature importance scores. These scores will be visualized infigures/figure2.pdf
which corresponds to Figure 2 in the paper andfigures/table2.tex
which is shown in Table 2 in the supplementary information. (additionally this step computes F1 scores which are required for the next step, so do not skip it) - The results obtained from both baselines will be written to disk and read once you execute
python 06_baselines.py
. A figure illustrating the actual classifiers' performance along with the random results will be written tofigures/figure1.pdf
as well asfigures/figure1.csv
and correspond to Figure 1 in the paper.
- Run
-
Context comparison
python 07_evaluation_across_contexts.py
to compute the average correlation coefficients between predictions based on data from different contexts. The table with all coefficients will be written tofigures/table1-5.csv
which can be found in Table 1 and Table 5 in supplementary information.
If (some) files in the results folder are missing, try re-running all one of the bash (*.sh) scripts again. -
Descriptive analysis
python 08_descriptive.py
to compute the correlation between each participant's average feature for the most frequently chosen time window and their personality score range. Results are written to four filesfigures/table4-1.tex
,figures/table4-2.tex
,figures/table4-3.tex
,figures/table4-4.tex
and are shown together in Table 4 in the supplementary information. -
Window Size Histogram
python 09_plot_ws_hist.py
to plot a histogram of window sizes chosen during the nested cross validation routine tofigures/ws_hist.pdf
.
All these scripts write intermediate results to disk, i.e. if you start a script a second time, it will be much faster - but the first run can take some time, e.g. up to 8 hours to train classifiers for one context on a 16 core machine; 1 hour to compute correlations between contexts.
Citation
If you want to cite this project, please use the following Bibtex format:
@article{hoppe18_fhns,
title = {Eye Movements During Everyday Behavior Predict Personality Traits},
author = {Sabrina Hoppe and Tobias Loetscher and Stephanie Morey and Andreas Bulling},
url = {https://perceptual.mpi-inf.mpg.de/files/2018/04/hoppe18_fhns.pdf
https://github.molgen.mpg.de/sabrina-hoppe/everyday-eye-movements-predict-personality
https://www.newscientist.com/article/2167850-ai-can-predict-your-personality-just-by-how-your-eyes-move/
http://www.dailymail.co.uk/sciencetech/article-5686817/An-incredible-mind-reading-AI-predict-personality-just-studying-eyes-move.html
https://www.digitaltrends.com/cool-tech/ai-personality-eye-movement/
http://www.newsweek.com/artificial-intelligence-algorithm-can-work-out-your-personality-simply-909752
https://www.12news.com/video/syndication/veuer/new-ai-can-predict-personality-from-eye-movements/602-8116328
https://www.usatoday.com/videos/tech/news/2018/05/03/new-ai-can-predict-personality-eye-movements/34526179/},
doi = {10.3389/fnhum.2018.00105},
year = {2018},
date = {2018-03-05},
journal = {Frontiers in Human Neuroscience},
volume = {12},
pages = {105:1-105:8},
abstract = {Besides allowing us to perceive our surroundings, eye movements are also a window into our mind and a rich source of information on who we are, how we feel, and what we do. Here we show that eye movements during an everyday task predict aspects of our personality. We tracked eye movements of 42 participants while they ran an errand on a university campus and subsequently assessed their personality traits using well-established questionnaires. Using a state-of-the-art machine learning method and a rich set of features encoding different eye movement characteristics, we were able to reliably predict four of the Big Five personality traits (neuroticism, extraversion, agreeableness, conscientiousness) as well as perceptual curiosity only from eye movements. Further analysis revealed new relations between previously neglected eye movement characteristics and personality. Our findings demonstrate a considerable influence of personality on everyday eye movement control, thereby complementing earlier studies in laboratory settings. Improving automatic recognition and interpretation of human social signals is an important endeavor, enabling innovative design of human–computer systems capable of sensing spontaneous natural user behavior to facilitate efficient interaction and personalization.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}