Harnessing Large Language Models for Automatic Evaluation of Mobile Health Applications Based on Persuasive System Design Principles and Mobile Application Rating Scale

2024-01-01
Mobile applications have seen a growing prevalence in the healthcare sector, yet the absence of comprehensive regulations and preliminary assessments can lead to significant frustration and time loss for users. To address this, Persuasive System Design (PSD) principles and the Mobile App Rating Scale (MARS) have emerged as popular tools for gauging application quality and user engagement. However, their manual assessment requirements hinder scalability, especially given the high volume of mobile health applications in the market. This study introduces a novel automatic evaluation approach designed to enhance the assessment of mobile health applications, leveraging PSD and MARS. The proposed method mainly relies on large language models to filter user reviews and generate sentence embeddings for classifying the PSD principles implemented in these applications. The results, calculated using performance metrics that compare the model’s predictions with expert evaluations, demonstrate the feasibility of predicting the application’s implementation of PSD principles based on user reviews while also highlighting the limitations of using application descriptions alone for successful prediction. Furthermore, the study augments the predicted classification probabilities of PSD principles with supplementary descriptive data, such as installation counts and user ratings, to predict MARS scores. Regression models, trained using these techniques, consistently outperform basic models, with feature importance scores showing the significant contribution of predicted classification probabilities of PSD principles to the models. In summary, this study suggests that automatic evaluation techniques can effectively assess the quality and user engagement of mobile health applications, offering a viable alternative to manual assessments.
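The feature-augmentation step mentioned in the abstract, where predicted PSD classification probabilities are combined with installation counts and user ratings before regression, might look roughly like the minimal sketch below. The function names, the log scaling of install counts, the mean-score baseline, and all example values are assumptions made for illustration; none of them are taken from the paper.

```python
import math


def build_features(psd_probs, installs, rating):
    """Join predicted PSD-principle probabilities with descriptive app data."""
    # Assumption: log-scale the install count, since raw counts span many
    # orders of magnitude; the source does not say how counts are encoded.
    return list(psd_probs) + [math.log10(installs + 1), rating]


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between expert scores and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_baseline(train_scores, n):
    """Hypothetical 'basic model': always predict the mean training score."""
    mean = sum(train_scores) / len(train_scores)
    return [mean] * n


# Hypothetical values for one app: probabilities for three PSD principles,
# an install count, and a store rating.
features = build_features([0.9, 0.1, 0.4], installs=99_999, rating=4.5)
print(features)
```

Any regressor could then be fit on such vectors against expert MARS scores, with its error compared to that of the baseline using `mean_absolute_error`.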
19th International Conference on Persuasive Technology, PERSUASIVE 2024
Citation Formats
Y. Afşin and T. Taşkaya Temizel, “Harnessing Large Language Models for Automatic Evaluation of Mobile Health Applications Based on Persuasive System Design Principles and Mobile Application Rating Scale,” Wollongong, Australia, 2024, vol. 14636 LNCS, Accessed: 00, 2024. [Online]. Available: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85192165505&origin=inward.