Multimodal Sleep Posture Estimation and Recognition

Background
Abnormal sleep posture can affect the symptoms of many medical conditions, so posture needs to be monitored consistently throughout the night. The need for automated behavior monitoring has grown, especially since the recent pandemic. This calls for a solution that can be deployed at home or in medical centers to help hospital staff monitor patients.

Challenges in Existing Research
Most in-bed sleep posture monitoring systems rely heavily on wearable devices or manual reports. Wearables are costly and are difficult to offer outside a professional hospital setting. Computer vision-based approaches face illumination and occlusion challenges that degrade pose estimation and detection. In addition, existing data is limited to a few participants, and few datasets have been publicly released.

Project Goal
The goal is an inexpensive vision-based sleep posture recognition system that combines multiple modalities (depth, infrared, and RGB) to overcome the challenges that hinder accurate pose estimation. The system can be installed unobtrusively in any room to detect abnormal sleep behaviors despite darkness and occlusions.

Proposed Methodology 
To deal with adverse vision conditions, we propose a system that combines RGB with two additional image modalities: depth and long-wavelength infrared (IR). Depth is used because it is unaffected by illumination problems, whereas IR handles occlusions caused by blankets and sheets.

Dataset
Because no suitable public datasets were available, we collected a large-scale dataset in which multiple sensing modalities were recorded simultaneously in different bedroom settings to ensure diversity. A visualization tool was created to inspect and correct the joint labels provided by the sensors.

Algorithm
Homography transformations were used to align data from the sensors of the different modalities. A deep convolutional network, similar to architectures built on Pyramid Residual Modules and the stacked Hourglass, learns human joint heatmaps as well as labels for pose estimation.
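
As a rough illustration of the alignment step, the sketch below maps joint coordinates from one modality's image plane into another's using a 3x3 homography. It is not the project's implementation: the function name apply_homography, the matrix H_IR_TO_RGB, and the sample points are all hypothetical, and in practice the matrix would have to be estimated from corresponding points in the two sensors' images.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix3 = Sequence[Sequence[float]]


def apply_homography(H: Matrix3, points: Sequence[Point]) -> List[Point]:
    """Map 2D pixel coordinates through a 3x3 homography H."""
    mapped = []
    for x, y in points:
        # Lift (x, y) to homogeneous coordinates (x, y, 1) and multiply by H.
        xh = H[0][0] * x + H[0][1] * y + H[0][2]
        yh = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        if abs(w) < 1e-12:
            raise ValueError("point maps to infinity under H")
        # Divide by w to return to ordinary pixel coordinates.
        mapped.append((xh / w, yh / w))
    return mapped


# Hypothetical matrix (an assumption, not a calibrated value): scales IR
# pixel coordinates by 2 and shifts them by (10, 20) to land in the RGB frame.
H_IR_TO_RGB = [
    [2.0, 0.0, 10.0],
    [0.0, 2.0, 20.0],
    [0.0, 0.0, 1.0],
]

ir_joints = [(5.0, 5.0), (0.0, 0.0)]
rgb_joints = apply_homography(H_IR_TO_RGB, ir_joints)
print(rgb_joints)
```

A homography only relates two views exactly when the scene is roughly planar or the cameras share a center, so treating the bed surface as a plane is an assumption of this sketch.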

Results
We used the Probability of Correct Keypoint (PCK) accuracy as our metric and achieved 80% at PCKh@0.5. The results are approximately the same across multiple settings, which shows the robustness of multimodal data under various vision conditions.
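
For readers unfamiliar with the metric, PCKh@0.5 counts a predicted joint as correct when it lies within 0.5 times the head segment length of its ground-truth location. The sketch below is a minimal, assumed version for a single person: the function name pckh, the way head_size is obtained, and the sample coordinates are illustrative and are not taken from the project.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def pckh(pred: Sequence[Point], gt: Sequence[Point],
         head_size: float, alpha: float = 0.5) -> float:
    """Fraction of predicted joints within alpha * head_size of ground truth."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of joints")
    if not pred:
        raise ValueError("at least one joint is required")
    threshold = alpha * head_size
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(pred)


# Illustrative values only: a head segment length of 20 pixels gives a
# 10-pixel tolerance at alpha = 0.5.
predicted = [(100.0, 100.0), (150.0, 160.0), (200.0, 200.0), (50.0, 50.0)]
ground_truth = [(103.0, 104.0), (150.0, 175.0), (200.0, 209.0), (50.0, 50.0)]
score = pckh(predicted, ground_truth, head_size=20.0)
print(score)
```

Here the joint errors are 5, 15, 9, and 0 pixels, so three of the four joints fall inside the tolerance. A dataset-level score would average this over all annotated joints and images.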

Conclusion
The results show that combining data from multiple modalities helps deal with adverse vision conditions across multiple settings. Because the solution is robust, it can be mounted in a variety of settings to continuously detect abnormal sleep poses throughout the day.
