DOI: 10.1145/3568562.3568639
research-article

A study on skeleton-based action recognition and its application to physical exercise recognition

Published: 01 December 2022

Abstract

In recent years, human action recognition (HAR) has been an attractive research topic in computer vision, as it is widely applied in fields such as gaming, healthcare, surveillance, and human-machine interaction. In this paper, we present a feasible solution for developing a real-time HAR application that helps users monitor their physical exercises. As an end-to-end solution, the proposed framework comprises techniques that recognize physical exercises, spot each exercise in real time, and assess the quality of the practicing user's performance. Firstly, we define nine common physical exercises, such as arm circles, squats, and jumping jacks. We then construct a dataset named COMVIS-FITNESS, consisting of these exercises performed by nine subjects. We evaluate the advantages of two different deep neural networks for HAR in recognizing the exercises: a compact HAR network named DD-Net and a high-performance graph convolutional network called FF-AAGCN. To spot the exercises in an image sequence, a sliding-window method is combined with these two networks to perform real-time classification. A technique to evaluate the workout of the practicing user is also proposed. Experimental results on COMVIS-FITNESS show that the deep neural networks recognize all exercises in the dataset with high accuracy: the accuracy and F1-score of DD-Net are 99.24% and 99.23%, while those of FF-AAGCN are 98.48% and 98.32%, respectively. The complete pipeline of the proposed method is integrated into an application that runs in real time on an edge device such as the Jetson Xavier AGX. The proposed techniques and dataset are made publicly available and can be downloaded from https://bit.ly/3F9J1qb.
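The abstract describes spotting exercises in a continuous image sequence by sliding a fixed-length window of skeleton frames over the stream and classifying each window with DD-Net or FF-AAGCN. The sketch below illustrates only that windowing idea; the window length, stride, joint count, confidence threshold, and the classify_window placeholder are assumptions made for illustration and do not reproduce the authors' implementation.

```python
# Minimal sketch of sliding-window exercise spotting over a skeleton stream.
# All constants and the classify_window placeholder are illustrative assumptions;
# in the paper the window contents would be fed to DD-Net or FF-AAGCN.
from collections import deque

import numpy as np

WINDOW = 32           # skeleton frames per classification window (assumed)
STRIDE = 8            # run the classifier every STRIDE new frames (assumed)
NUM_JOINTS = 25       # joints per skeleton frame (assumed)
CONF_THRESHOLD = 0.8  # minimum confidence to report a detection (assumed)


def classify_window(window):
    """Stand-in for DD-Net / FF-AAGCN inference on a (WINDOW, NUM_JOINTS, 2) array."""
    # A real implementation would normalize the skeleton sequence and run the network.
    return "squat", 0.95


def spot_exercises(skeleton_stream):
    """Yield (exercise, confidence) each time a full window classifies above threshold."""
    buffer = deque(maxlen=WINDOW)
    frames_since_last = STRIDE
    for frame in skeleton_stream:  # frame: (NUM_JOINTS, 2) array of 2D joint coordinates
        buffer.append(frame)
        frames_since_last += 1
        if len(buffer) == WINDOW and frames_since_last >= STRIDE:
            frames_since_last = 0
            label, confidence = classify_window(np.stack(buffer))
            if confidence >= CONF_THRESHOLD:
                yield label, confidence


if __name__ == "__main__":
    # Feed a synthetic stream of 100 random skeleton frames as a smoke test.
    stream = (np.random.rand(NUM_JOINTS, 2) for _ in range(100))
    for label, confidence in spot_exercises(stream):
        print(f"detected {label} (confidence {confidence:.2f})")
```

In a live setting, the skeleton stream would come from a per-frame pose estimator, and the reported label and confidence would drive the exercise spotting and quality-assessment steps described in the abstract.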

Supplementary Material

MP4 File (demo_video.mp4)
Demo video



Information

Published In

SoICT '22: Proceedings of the 11th International Symposium on Information and Communication Technology
December 2022
474 pages
ISBN:9781450397254
DOI:10.1145/3568562

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. action recognition
  2. dataset of physical exercises
  3. fitness coaching

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SoICT 2022

Acceptance Rates

Overall Acceptance Rate 147 of 318 submissions, 46%


Article Metrics

  • Downloads (last 12 months): 63
  • Downloads (last 6 weeks): 8
Reflects downloads up to 07 Jan 2025


Cited By

  • (2024) Recognition and Scoring Physical Exercises via Temporal and Relative Analysis of Skeleton Nodes Extracted from the Kinect Sensor. Sensors 24(20), 6713. DOI: 10.3390/s24206713. Online publication date: 18-Oct-2024.
  • (2024) PAR-Net: An Enhanced Dual-Stream CNN–ESN Architecture for Human Physical Activity Recognition. Sensors 24(6), 1908. DOI: 10.3390/s24061908. Online publication date: 16-Mar-2024.
  • (2024) Part-Aware Unified Representation of Language and Skeleton for Zero-Shot Action Recognition. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 18761-18770. DOI: 10.1109/CVPR52733.2024.01775. Online publication date: 16-Jun-2024.
  • (2023) Use K-Means-Generated Nodes to Distinguish Learned from Non-Learned Exercises. IECON 2023 - 49th Annual Conference of the IEEE Industrial Electronics Society, 1-6. DOI: 10.1109/IECON51785.2023.10312453. Online publication date: 16-Oct-2023.
  • (2023) Accurate continuous action and gesture recognition method based on skeleton and sliding windows techniques. 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 284-290. DOI: 10.1109/APSIPAASC58517.2023.10317368. Online publication date: 31-Oct-2023.
