The imputation of missingness in cyclic and non-cyclic electromyography signaling data

2024-12-16
Sarasir, Fatemeh
Multidimensional datasets in healthcare and life sciences often reflect temporal variations but are frequently incomplete, which complicates analysis and reduces statistical accuracy. Imputation techniques are widely used to address missing data; common approaches include machine learning algorithms such as Random-Forest (RF) and K-Nearest Neighbors (K-NN) and nonparametric methods such as Spline and Linear interpolation. This study examines Electromyography (EMG) data, a time-series biomedical dataset, by evaluating eleven imputation methods across four types of EMG datasets. We introduce four innovative imputation approaches, Normal-Ratio (NR), Weighted-Normal-Ratio (WNR), Expectation-Maximization (EM), and Gibbs Sampling, and assess each for accuracy and computational efficiency in handling the specific characteristics of EMG data. Two scenarios were simulated, unaltered and down-sampled EMG data, each with data loss in scattered and intermittent missingness patterns. The comparative assessment highlights the notable imputation accuracy of the EM method, with Random Forest emerging as a robust alternative that ranks after EM. Moreover, the NR and WNR methods demonstrate computational efficiency comparable to the basic Mean and Median imputation techniques while improving accuracy. Additionally, we address the cyclic nature of EMG data, an overlooked yet critical factor for enhancing imputation accuracy. Using Fourier transformation, Spline, and Autoregressive models, we identify frequencies in periodic EMG data and propose two novel approaches, Pattern-based and Sinusoidal-based, for restructuring EMG data into cyclic form to improve imputation outcomes in the K-NN and EM techniques. Results indicate that the Pattern-based approach improves accuracy with EM and K-NN imputation, while the Sinusoidal-based approach offers computational efficiency, particularly for K-NN, across random and partial missing patterns.
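As a rough illustration of two building blocks named in the abstract, Linear interpolation and Fourier-based frequency identification, the sketch below fills scattered gaps in a periodic signal and then estimates its dominant frequency. It is not the thesis's implementation: the signal is a synthetic 5 Hz sine wave rather than real EMG data, and the function names, the sampling rate, and the 10% missingness level are all assumptions chosen for the example.

```python
import numpy as np


def linear_impute(signal):
    """Fill NaN entries by linear interpolation between observed neighbours."""
    filled = signal.copy()
    missing = np.isnan(filled)
    idx = np.arange(len(filled))
    # np.interp holds the nearest observed value constant at the edges.
    filled[missing] = np.interp(idx[missing], idx[~missing], filled[~missing])
    return filled


def dominant_frequency(signal, sampling_rate):
    """Return the frequency (Hz) with the largest spectral magnitude, skipping the DC bin."""
    centered = signal - np.mean(signal)
    spectrum = np.abs(np.fft.rfft(centered))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sampling_rate)
    return freqs[np.argmax(spectrum[1:]) + 1]


# Synthetic stand-in for a cyclic signal (assumed values, not EMG recordings).
sampling_rate = 1000  # samples per second
t = np.arange(0, 2.0, 1.0 / sampling_rate)
clean = np.sin(2 * np.pi * 5.0 * t)

# Scattered missingness: drop 10% of the samples at random positions.
rng = np.random.default_rng(0)
observed = clean.copy()
observed[rng.choice(len(t), size=200, replace=False)] = np.nan

imputed = linear_impute(observed)
rmse = np.sqrt(np.mean((imputed - clean) ** 2))
estimated_hz = dominant_frequency(imputed, sampling_rate)

print(f"RMSE of linear imputation: {rmse:.5f}")
print(f"Estimated dominant frequency: {estimated_hz:.2f} Hz")
```

Comparing the imputed signal against the known clean signal with RMSE mirrors the usual simulation setup for imputation studies, in which values are removed artificially so that the true values remain available for scoring.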
Citation Formats
F. Sarasir, “The imputation of missingness in cyclic and non-cyclic electromyography signaling data,” M.S. - Master of Science, Middle East Technical University, 2024.