Privacy-preserving federated machine learning on FAIR health data: A real-world application

Date

2024-12-01

Author

Sınacı, Ali Anıl
Gencturk, Mert
Alvarez-Romero, Celia
Laleci Erturkmen, Gokce Banu
Martinez-Garcia, Alicia
Escalona-Cuaresma, María José
Parra-Calderon, Carlos Luis

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Objective: This paper introduces a privacy-preserving federated machine learning (ML) architecture built on Findable, Accessible, Interoperable, and Reusable (FAIR) health data. The architecture executes classification algorithms in a federated manner, so that health data owners can build models collaboratively without sharing their datasets.

Materials and methods: Using an agent-based architecture, a privacy-preserving federated ML algorithm was developed to create a global predictive model from multiple local models. The algorithm was formally defined in two steps, data preparation and federated model training on FAIR health data, and the architecture was constructed from multiple components that support the algorithm's execution. The solution was validated by five healthcare organizations using their own health datasets.

Results: Five organizations transformed their datasets into Health Level 7 Fast Healthcare Interoperability Resources (HL7 FHIR) via a common FAIRification workflow and software set, thereby generating FAIR datasets. Each organization deployed a Federated ML Agent within its secure network, connected to a cloud-based Federated ML Manager. The system was tested on a use case predicting 30-day readmission risk for patients with chronic obstructive pulmonary disease, where the federated model achieved an accuracy of 87%.

Discussion: The paper demonstrates a practical application of privacy-preserving federated ML among five distinct healthcare entities. It highlights the value of FAIR health data for machine learning when the data are used in a federated manner that protects privacy because no data are shared.

Conclusion: The solution leverages FAIR datasets from multiple healthcare organizations for federated ML while safeguarding sensitive health datasets and meeting legislative privacy and security requirements.
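The abstract does not specify how the local models are combined into the global model, so the following is only an illustrative sketch of the general agent/manager pattern, not the authors' algorithm. It assumes federated averaging of logistic regression weights, which is a swapped-in technique chosen for brevity. All names (`local_train`, `aggregate`, `federated_train`, `make_site`) and the synthetic data are hypothetical. Each "agent" trains on its own rows and returns only model weights and a sample count; the "manager" never sees the rows.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_train(weights, rows, labels, lr=0.1, epochs=20):
    """Agent side (hypothetical): gradient descent on local data only.

    Returns the updated weights and the local sample count; the rows
    themselves never leave this function.
    """
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w, len(rows)


def aggregate(updates):
    """Manager side (hypothetical): sample-size-weighted average of weights."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[j] * n for w, n in updates) / total for j in range(dim)]


def federated_train(sites, dim, rounds=10):
    """Alternate local training at every site with central averaging."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_train(global_w, rows, labels) for rows, labels in sites]
        global_w = aggregate(updates)
    return global_w


def make_site(rng, n):
    """Synthetic stand-in for one organization's dataset (not real health data)."""
    rows, labels = [], []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        rows.append([1.0, x1, x2])  # leading 1.0 is the bias term
        labels.append(1 if x1 + x2 > 0 else 0)
    return rows, labels


rng = random.Random(0)
# Five sites of different sizes, mirroring the five organizations in the abstract.
sites = [make_site(rng, n) for n in (40, 60, 80, 50, 70)]
model = federated_train(sites, dim=3)
print("global weights:", model)
```

Weighting each site's update by its sample count keeps a small site from pulling the global model as strongly as a large one. A real deployment would additionally need secure transport, authentication, and whatever privacy safeguards the applicable legislation requires, none of which this sketch attempts.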

Subject Keywords

Distributed datasets, FAIR data, Federated machine learning, Privacy-preserving machine learning

URI

https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85186119906&origin=inward
https://hdl.handle.net/11511/109037

Journal

Computational and Structural Biotechnology Journal

DOI

https://doi.org/10.1016/j.csbj.2024.02.014

Collections

Department of Computer Engineering, Article

Citation Formats

A. A. Sınacı et al., “Privacy-preserving federated machine learning on FAIR health data: A real-world application,” Computational and Structural Biotechnology Journal, vol. 24, pp. 136–145, 2024. [Online]. Available: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85186119906&origin=inward.