The imputation of missingness in cyclic and non-cyclic Electromyography(EMG) signaling data

2025-01-01
Sarasir, Fatemeh
Purutçuoğlu Gazi, Vilda
Multidimensional datasets in healthcare and life sciences often reflect temporal variations but are frequently incomplete, which complicates analysis and reduces statistical accuracy. To address missing data, imputation techniques are widely used; common approaches include machine learning algorithms such as random forest and k-nearest neighbors, and nonparametric methods such as spline and linear interpolation. This study examines electromyography (EMG) data, a time-series biomedical data set, by evaluating 11 imputation methods on four datasets. We introduce four approaches (normal ratio, weighted normal ratio, expectation maximization, and Gibbs sampling) and assess each for accuracy and computational efficiency. Two scenarios were simulated: unaltered and down-sampled data, each with scattered and intermittent missingness. The comparative assessment highlights the notable precision of the expectation maximization method, with random forest emerging as a robust alternative. Moreover, the normal ratio and weighted normal ratio methods achieve computational efficiency comparable to mean and median imputation while improving accuracy. We also address cyclic data, whose periodic structure is a critical factor for improving accuracy. Using Fourier transformation, spline, and autoregressive models, we propose pattern-based and sinusoidal-based approaches to improve imputation. Results indicate that the pattern-based approach improves accuracy, while the sinusoidal-based approach offers efficiency, particularly for k-nearest neighbors.
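As a rough illustration of the evaluation setup described in the abstract, the sketch below simulates scattered and intermittent missingness on a synthetic cyclic signal and compares two of the baseline imputers named there (mean imputation and linear interpolation) by RMSE on the masked points. It is not the authors' code: the synthetic sine signal, the function names, the missingness rates, and the gap lengths are all assumptions chosen for illustration, and none of the four introduced methods is implemented here.

```python
import numpy as np


def make_cyclic_signal(n=1000, period=100, noise=0.05, seed=0):
    """Synthetic stand-in for a cyclic recording (assumption: noisy sine)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    return np.sin(2 * np.pi * t / period) + noise * rng.standard_normal(n)


def scattered_mask(n, frac, seed=0):
    """Mark isolated, randomly placed samples as missing."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(n, dtype=bool)
    # Keep both endpoints observed so interpolation never extrapolates.
    idx = rng.choice(np.arange(1, n - 1), size=int(frac * n), replace=False)
    mask[idx] = True
    return mask


def intermittent_mask(n, gap_len, n_gaps, seed=0):
    """Mark contiguous gaps as missing (gaps may overlap)."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(n, dtype=bool)
    starts = rng.choice(np.arange(1, n - gap_len - 1), size=n_gaps, replace=False)
    for s in starts:
        mask[s:s + gap_len] = True
    return mask


def impute_mean(x, mask):
    """Replace missing samples with the mean of the observed samples."""
    out = x.copy()
    out[mask] = x[~mask].mean()
    return out


def impute_linear(x, mask):
    """Replace missing samples by linear interpolation between observed ones."""
    out = x.copy()
    t = np.arange(len(x))
    out[mask] = np.interp(t[mask], t[~mask], x[~mask])
    return out


def rmse(truth, estimate, mask):
    """Root-mean-square error evaluated only on the imputed positions."""
    return float(np.sqrt(np.mean((truth[mask] - estimate[mask]) ** 2)))


if __name__ == "__main__":
    x = make_cyclic_signal()
    masks = {
        "scattered": scattered_mask(len(x), frac=0.10),
        "intermittent": intermittent_mask(len(x), gap_len=20, n_gaps=5),
    }
    for name, m in masks.items():
        print(name,
              "mean RMSE:", round(rmse(x, impute_mean(x, m), m), 3),
              "linear RMSE:", round(rmse(x, impute_linear(x, m), m), 3))
```

Scoring only the masked positions keeps observed samples from diluting the error, and the two mask generators make it possible to compare how each imputer degrades when isolated missing points turn into long gaps.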
HACETTEPE JOURNAL OF MATHEMATICS AND STATISTICS
Citation Formats
F. Sarasir and V. Purutçuoğlu Gazi, “The imputation of missingness in cyclic and non-cyclic Electromyography(EMG) signaling data,” HACETTEPE JOURNAL OF MATHEMATICS AND STATISTICS, vol. 54, no. 5, pp. 2036–2067, 2025, Accessed: 00, 2025. [Online]. Available: https://hdl.handle.net/11511/117097.