Processing Pipeline
Conda Environment Setup
conda env create -f conan_windows.yml
conda activate conan_windows_env
Usage
Run ConAn_RunProcessing.ipynb to extract all frames from the video and run the processing models.
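Frame extraction itself is conceptually simple; the following is a minimal sketch of the idea using OpenCV (the helper name, input file, and output layout are illustrative, not necessarily what the notebook uses):

import cv2
import os

def extract_frames(video_path, out_dir):
    # Write every frame of the video to out_dir as a numbered PNG.
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    count = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of video (or read error)
            break
        cv2.imwrite(os.path.join(out_dir, f"frame_{count:06d}.png"), frame)
        count += 1
    cap.release()
    return count

extract_frames("session.mp4", "frames")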
Body Movement
For body movement detection we selected OpenPose. We use the 18-keypoint model,
which takes the full frame as input and jointly predicts anatomical keypoints and a measure
of the degree of association between them (part affinity fields).
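For illustration, invoking the 18-keypoint (COCO) model through OpenPose's Python bindings looks roughly like the sketch below; the model folder and frame path are placeholders, older OpenPose versions pass a plain list to emplaceAndPop, and this is not the exact code of our pipeline:

import cv2
import pyopenpose as op

# Select the 18-keypoint COCO model instead of the default BODY_25.
params = {"model_folder": "openpose/models/", "model_pose": "COCO"}
wrapper = op.WrapperPython()
wrapper.configure(params)
wrapper.start()

datum = op.Datum()
datum.cvInputData = cv2.imread("frames/frame_000000.png")
wrapper.emplaceAndPop(op.VectorDatum([datum]))

# datum.poseKeypoints has shape (num_people, 18, 3): x, y, confidence.
print(datum.poseKeypoints)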
If you use this processing step in your research, please cite:
@article{8765346,
  author  = {Z. {Cao} and G. {Hidalgo Martinez} and T. {Simon} and S. {Wei} and Y. A. {Sheikh}},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title   = {OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields},
  year    = {2019}
}
Eye Gaze
For eye gaze estimation we selected RT-GENE. In addition to feeding each video frame to the model,
we also input a version of the frame in which the left and right edges are wrapped together.
This enables us to detect a person who crosses the edge of the video, since none of the models account for this case.
Because RT-GENE estimates gaze on single frames, we then track all subjects throughout the video using a minimal Euclidean distance heuristic.
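To make both tricks concrete, here is a minimal sketch, assuming frames are numpy arrays and per-frame subject positions are 2D points; the function names are illustrative and not those of process_RTGene.py:

import numpy as np

def wrap_frame(frame):
    # Shift the frame horizontally by half its width so that the left and
    # right edges meet in the middle of the new image; a face straddling
    # the original edge becomes a contiguous region here.
    return np.roll(frame, frame.shape[1] // 2, axis=1)

def match_subjects(prev_positions, curr_positions):
    # Greedy minimal-Euclidean-distance assignment between subject positions
    # in consecutive frames, since single-frame estimates carry no identities.
    prev = np.asarray(prev_positions, dtype=float)
    curr = np.asarray(curr_positions, dtype=float)
    dists = np.linalg.norm(prev[:, None, :] - curr[None, :, :], axis=2)
    assignment = {}
    while np.isfinite(dists).any():
        i, j = np.unravel_index(np.argmin(dists), dists.shape)
        assignment[i] = j      # previous subject i continues as detection j
        dists[i, :] = np.inf   # match each previous subject at most once
        dists[:, j] = np.inf   # match each current detection at most once
    return assignment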
If you use this processing step in your research, please cite:
@inproceedings{FischerECCV2018,
  author    = {Tobias Fischer and Hyung Jin Chang and Yiannis Demiris},
  title     = {{RT-GENE: Real-Time Eye Gaze Estimation in Natural Environments}},
  booktitle = {European Conference on Computer Vision},
  year      = {2018},
  month     = {September},
  pages     = {339--357}
}
Notes:
- Before using process_RTGene.py, you need to run install_RTGene.py!
- [OPTIONAL] You can provide a camera calibration file calib.pkl to improve detections.
- You need to provide the maximum number of people in the video for the sorting algorithm.
Facial Expression
Under construction
Speaking Activity
Under construction
Object Tracking
Since you will most likely define your own study procedure, we decided to simplify object tracking by employing the visual fiducial system AprilTag 2; tag positions are extracted with its dedicated detector.
Note: For Windows we use pupil_apriltags.
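On Windows, tag detection with pupil_apriltags looks roughly like this sketch (the tag family and image path are assumptions, so adapt them to your setup):

import cv2
from pupil_apriltags import Detector

# The detector expects a grayscale uint8 image.
img = cv2.imread("frames/frame_000000.png", cv2.IMREAD_GRAYSCALE)

detector = Detector(families="tag36h11")
for detection in detector.detect(img):
    # Each detection carries the tag id, its center, and the four corners.
    print(detection.tag_id, detection.center, detection.corners)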