We are developing new computational models and a system for the accurate measurement of human motion. Our primary goal is to develop markerless vision-based tracking algorithms for use with the industry-proven anthropometric human model RAMSIS (in collaboration with the TUM Ergonomics Department, Faculty of Mechanical Engineering). Equipping RAMSIS with markerless tracking capabilities opens up new fields of application in ergonomic studies and industrial design. Conversely, we believe that a mature, flexible, and accurate model such as RAMSIS benefits human motion tracking, given the ergonomic expertise that has informed its design.
The human model consists of an inner model that is accurately modeled after a real human skeleton, and an outer model that can be adapted to different body types (anthropometries) and genders. The model is parametrized via the joint angles of the inner model, and absolute motion limits in the joints ensure physiologically realistic postures. The model originally comprised 65 degrees of freedom (DOF); we have reduced it to 51 DOF by applying an ergonomically sound interpolation of the spine joints.
Figure: Inner model with corresponding joints (left) and outer model for different anthropometries and genders (right).
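To make the parametrization concrete, the following is a minimal sketch of a joint-angle-driven kinematic chain with absolute joint limits, roughly in the spirit of the inner model. All names (`Joint`, `KinematicChain`, `rot_z`) are illustrative placeholders rather than RAMSIS or MeMoMan APIs, and real joints have up to three rotational DOF instead of the single axis used here.

```python
# Sketch: a joint-angle-parametrized kinematic chain with absolute
# joint limits. Illustrative only; not the RAMSIS data structures.
import numpy as np

def rot_z(theta):
    """Homogeneous rotation about the local z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0, 0],
                     [s,  c, 0, 0],
                     [0,  0, 1, 0],
                     [0,  0, 0, 1]])

class Joint:
    def __init__(self, offset, lo, hi):
        self.offset = offset       # fixed 4x4 transform to the parent segment
        self.lo, self.hi = lo, hi  # absolute motion limits (radians)

class KinematicChain:
    def __init__(self, joints):
        self.joints = joints

    def pose(self, angles):
        """Global pose of each segment; angles are clamped to the
        physiologically plausible range before composing transforms."""
        T = np.eye(4)
        poses = []
        for joint, theta in zip(self.joints, angles):
            theta = np.clip(theta, joint.lo, joint.hi)
            T = T @ joint.offset @ rot_z(theta)
            poses.append(T)
        return poses
```

The 65-to-51 DOF reduction can be pictured in the same terms: instead of treating every spine joint as an independent parameter, a small set of spine parameters is interpolated across the individual vertebral joints.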
To improve the performance of the model in tracking applications, we have incorporated optimizations into the model, such as caching of body-part-relative pose calculations and body-part-dependent inter-frame motion limits. We plan to extend the model with biomechanical preferences and cost functions related to internal and external forces, as well as postural discomfort. Such extensions are already present in the original RAMSIS model.
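As an illustration of these two optimizations, here is a hedged sketch under assumed data structures (none of these names come from the actual implementation): relative body part poses are memoized and only recomputed when the angles driving them change, and a per-part inter-frame limit bounds how far each joint may move between consecutive frames.

```python
# Sketch of the two optimizations: pose caching and per-part
# inter-frame motion limits. Interfaces are assumed, not MeMoMan code.
import numpy as np

class CachedPart:
    def __init__(self, compute_relative_pose):
        self._compute = compute_relative_pose
        self._key = None
        self._pose = None

    def relative_pose(self, angles):
        """Return the cached relative pose, recomputing it only when
        the joint angles that influence this part have changed."""
        key = tuple(np.round(angles, 6))
        if key != self._key:
            self._pose = self._compute(angles)
            self._key = key
        return self._pose

def limit_interframe_motion(prev, proposed, max_delta):
    """Clamp proposed joint angles so that no joint moves more than
    its body-part-dependent per-frame limit (max_delta, radians)."""
    delta = np.clip(proposed - prev, -max_delta, max_delta)
    return prev + delta
```

In a particle-filter setting, such inter-frame limits also act as a cheap motion prior: proposals that would require implausibly fast movement are never generated in the first place.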
Tracking is performed in a Bayesian framework using a set of hierarchically coupled local particle filters. This makes it possible to sample efficiently from the high-dimensional space of articulated human poses without constraining the allowed movements. We currently use a minimum of three cameras for tracking, to account for self-occlusions of the model. We will also investigate other setups, such as stereo cameras, to facilitate future use, for instance on mobile robots. Our current research focuses on robust weight functions suitable for changing environments and on reliable motion prediction based on extracted image features. We apply our methods in ergonomic studies conducted in collaboration with the TUM Ergonomics Department, as well as for the recognition of manipulation tasks in the Assistive Kitchen demonstration scenario of the CoTeSys cluster of excellence.
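The hierarchical sampling scheme can be sketched as a partitioned particle filter over the kinematic tree: the torso partition is filtered first, and each limb partition is then filtered conditioned on its parent's current estimate, so the particle count grows with the number of partitions rather than exponentially with the full 51 DOF. The interfaces below (`predict`, `image_likelihood`, the partition objects) are placeholders for illustration, not the actual tracker code.

```python
# Sketch of hierarchically coupled local particle filters
# (partitioned sampling along the kinematic tree). Illustrative only.
import numpy as np

def local_particle_filter(particles, weights, predict, weight_fn, rng):
    """One resample-predict-weight step for a single partition."""
    # Resample indices proportionally to the current normalized weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    # Propagate each surviving particle through the motion model.
    particles = np.array([predict(particles[i], rng) for i in idx])
    # Re-weight against the image evidence and normalize.
    w = np.array([weight_fn(p) for p in particles])
    return particles, w / w.sum()

def track_frame(partitions, rng):
    """Filter each partition in kinematic order (torso before limbs),
    conditioning every child on the current parent estimate."""
    estimate = {}
    for part in partitions:  # ordered root -> leaves
        pred = lambda p, r, part=part: part.predict(p, estimate, r)
        wfn = lambda p, part=part: part.image_likelihood(p, estimate)
        part.particles, part.weights = local_particle_filter(
            part.particles, part.weights, pred, wfn, rng)
        estimate[part.name] = part.particles[np.argmax(part.weights)]
    return estimate
```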
Here are several videos showcasing the performance and different aspects of our work:
A short presentation video for a kitchen sequence with manipulation activities:
Tracking two subjects in a joint action scenario in the kitchen:
Tracking results on the HumanEva2 benchmark:
Ergonomic case studies on a car mock-up sequence:
A 6.5 minute long sequence tracked at once:
Showing the effect of implicit environment modeling on the tracking outcome:
Self-training of motion models on the HumanEva2 S4 sequence. The tracker learns to improve its prediction of repetitive motion patterns over time. Previously observed motion patterns are shown in greyscale:
Using learned motion models to improve the prediction for environment- and task-specific activities. Trained motion patterns are shown in greyscale. The learned motion models can also be used for activity recognition:
A kitchen sequence with random pick and place actions:
Another kitchen sequence with random pick and place actions:
Another kitchen sequence, different actor:
21 DOF upper-body-only motions showing the accuracy of the model:
This project was partly funded by the DFG as part of the MeMoMan project.
A Self-Training Approach for Visual Tracking and Recognition of Complex Human Activity Patterns, In International Journal of Computer Vision, volume 99, 2012. [bib]
Towards Automated Models of Activities of Daily Life, In Technology and Disability, IOS Press, volume 22, 2010. [bib] [pdf]
Human Action Recognition using Global Point Feature Histograms and Action Shapes, In Advanced Robotics, Robotics Society of Japan (RSJ), 2009. [bib]
MeMoMan - Model Based Markerless Capturing of Human Motion, In The 17th World Congress on Ergonomics (International Ergonomics Association, IEA), 2009. [bib]
Towards Automated Models of Activities of Daily Life, In First International Symposium on Quality of Life Technology -- Intelligent Systems for Better Living, 2009. [bib] [pdf]
The Assistive Kitchen -- A Demonstration Scenario for Cognitive Technical Systems, In IEEE 17th International Symposium on Robot and Human Interactive Communication (RO-MAN), Muenchen, Germany, 2008. (Invited paper.) [bib] [pdf]
Evaluation of Hierarchical Sampling Strategies in 3D Human Pose Estimation, In Proceedings of the 19th British Machine Vision Conference (BMVC), 2008. [bib] [pdf]
Accurate Human Motion Capture Using an Ergonomics-Based Anthropometric Human Model, In Proceedings of the Fifth International Conference on Articulated Motion and Deformable Objects (AMDO), 2008. [bib] [pdf]
Action Recognition in Intelligent Environments using Point Cloud Features Extracted from Silhouette Sequences, In IEEE 17th International Symposium on Robot and Human Interactive Communication (RO-MAN), Muenchen, Germany, 2008. [bib] [pdf]
The TUM Kitchen Data Set of Everyday Manipulation Activities for Motion Tracking and Action Recognition, In IEEE International Workshop on Tracking Humans for the Evaluation of their Motion in Image Sequences (THEMIS), in conjunction with ICCV 2009, 2009. [bib] [pdf]
Tracking Humans Interacting with the Environment Using Efficient Hierarchical Sampling and Layered Observation Models, In IEEE International Workshop on Human-Computer Interaction (HCI), in conjunction with ICCV 2009, 2009. [bib] [pdf]
The Assistive Kitchen -- A Demonstration Scenario for Cognitive Technical Systems, In Proceedings of the 4th COE Workshop on Human Adaptive Mechatronics (HAM), 2007. [bib] [pdf]