Authors: Farida Asriani, Azhari Azhari, Wahyono
Problem and Challenge
Recognizing complex and dynamic movements in sports such as badminton remains a major challenge in human action recognition (HAR). Traditional recognition models often fail to capture the fast-paced nature of strokes, similarities in player posture, and temporal dependencies across motion sequences. Moreover, the lack of accurate motion segmentation and contextual reasoning further complicates the classification process. These challenges are illustrated in Figure 1, which shows overlapping body postures commonly observed during badminton stroke execution.
Goal of Experimentation
This research aims to develop an intelligent simulation model that combines agent-based reasoning and ensemble learning to recognize badminton strokes with high precision. The model simulates how players move and perform actions in space and time, mimicking real game conditions to support sports analytics.
Methods
A hybrid approach is applied in this study to improve the recognition of badminton strokes. Spatial features are extracted using 3D skeleton coordinates obtained from pose estimation, with the right hip serving as the anchor point for consistent joint positioning. Temporal dynamics are captured through Fast Dynamic Time Warping (FDTW), which aligns motion sequences to reflect the progression of movement over time. For classification, an ensemble learning strategy is employed by combining Support Vector Machine (SVM), Logistic Regression (LR), and AdaBoost using a weighted soft voting mechanism to boost performance and stability. The entire simulation framework is developed using real video datasets of badminton athletes, with each action segmented into 15 representative frames.
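The two feature steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the right-hip index 24 follows MediaPipe Pose landmark numbering and is assumed to match the skeleton layout used here, and the alignment shown is classic exact DTW, which FDTW approximates at lower cost.

```python
import math

# Assumption: joints follow MediaPipe Pose numbering, where 24 is the right hip.
RIGHT_HIP = 24


def center_on_anchor(frame, anchor=RIGHT_HIP):
    """Translate every (x, y, z) joint so the anchor joint sits at the origin."""
    ax, ay, az = frame[anchor]
    return [(x - ax, y - ay, z - az) for x, y, z in frame]


def frame_distance(frame_a, frame_b):
    """Sum of Euclidean distances between corresponding joints of two frames."""
    return sum(math.dist(p, q) for p, q in zip(frame_a, frame_b))


def dtw_distance(seq_a, seq_b):
    """Exact O(n*m) dynamic time warping cost between two frame sequences.

    Classic DTW is used here for clarity; it is a stand-in for the faster,
    approximate FDTW named in the text.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

In use, each of the 15 frames of a segmented stroke would first be passed through center_on_anchor, and the resulting sequence compared against reference sequences with dtw_distance to obtain temporal features.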
System Architecture
Figure 2 illustrates the system architecture. The pipeline begins with pose extraction from RGB video frames using MediaPipe. From the extracted skeletons, spatial features are derived from a key frame, while temporal features are computed with FDTW across 15 frames. These features are then used to train and test a weighted ensemble model combining SVM, Logistic Regression, Random Forest, and AdaBoost classifiers. The final output is one of six badminton stroke classes.
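The fusion step at the end of this pipeline, weighted soft voting, can be illustrated as follows. The sketch assumes each base classifier has already produced a probability for every stroke class; the function names, the weights, and the class labels passed in are hypothetical, and the code works for any number of base models.

```python
def weighted_soft_vote(probas, weights):
    """Fuse per-model class probabilities into one weighted average per class.

    probas  -- one list of class probabilities per base model
    weights -- one non-negative weight per base model (hypothetical values)
    """
    if len(probas) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    n_classes = len(probas[0])
    return [
        sum(w * p[c] for p, w in zip(probas, weights)) / total
        for c in range(n_classes)
    ]


def predict_stroke(probas, weights, labels):
    """Return the label with the highest fused probability, plus the fused scores."""
    fused = weighted_soft_vote(probas, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best], fused
```

Giving a larger weight to a base model that validates better lets it dominate close decisions, while the averaging still dampens any single model's overconfident errors.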
Results and Discussion
Figure 3 shows an overhead forehand stroke with the 3D skeleton pose estimate used to extract spatiotemporal features. The overlaid joints mark the key motion points used for action classification. The accompanying bar chart reports high accuracy, precision, recall, and F1-score for the weighted ensemble model, indicating that it recognizes dynamic badminton strokes reliably from skeleton-based input.
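For reference, the four reported metrics can be computed from true and predicted labels as sketched below. Macro averaging over classes is an assumption, since the averaging scheme is not stated here, and the function name is hypothetical.

```python
def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 over all classes."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Undefined ratios (zero denominator) are scored as 0.0.
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

Macro averaging weights every stroke class equally, which matters when some strokes appear far less often than others in the dataset.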
Value Proposition
SimBadAI transforms sports action data into intelligent insights:
– Enables coaches and analysts to monitor performance.
– Enhances training feedback with reliable simulation.
– Supports innovative sports systems for future tournaments.
SimBadAI is applicable to sports science, real-time feedback, augmented coaching, and academic research.