BOFRF: A Novel Boosting-Based Federated Random Forest Algorithm on Horizontally Partitioned Data

Date

2022-1-01

Author

Gencturk, Mert
Sınacı, Ali Anıl
Cicekli, Nihan Kesim

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Federated learning is commonly applied to ensemble methods with the goal of increasing the predictive power of local models. Existing federated solutions that use ensemble methods can achieve this when the sites' datasets are balanced and of good quality, i.e., when the local models are already above a certain accuracy threshold. However, they usually fail to provide the same level of improvement to sites whose local classifier is unsuccessful because of poor-quality or imbalanced data. To address this challenge, we propose a novel federated ensemble classification algorithm for horizontally partitioned data, namely Boosting-based Federated Random Forest (BOFRF), which not only increases the predictive power of all participating sites but also provides a significantly high level of improvement for sites with unsuccessful local models. We implement a federated version of random forest, a well-known bagging algorithm, by adapting the idea of boosting to it. We introduce a novel aggregation and weight calculation methodology that assigns weights to local classifiers based on their classification performance at each site without increasing the communication or computation cost. We evaluate the proposed algorithm in different federated environments set up using four healthcare datasets. The empirical results show that BOFRF improves the predictive power of local random forest models in all cases. Unlike existing solutions, BOFRF provides a significantly high level of improvement for sites with unsuccessful local models.
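As a rough illustration of the idea of weighting local classifiers by their classification performance at each site, the sketch below scores every classifier on every site's data and uses the mean accuracy as its voting weight. This is a minimal sketch under stated assumptions, not the weight calculation defined in the paper: the mean-accuracy weighting rule, the toy threshold classifiers, the site datasets, and all function names are hypothetical.

```python
from collections import defaultdict


def accuracy(classifier, samples):
    """Fraction of (features, label) pairs that the classifier labels correctly."""
    correct = sum(1 for x, y in samples if classifier(x) == y)
    return correct / len(samples)


def classifier_weights(classifiers, site_datasets):
    """Assumed weighting rule: mean accuracy of each classifier over all sites."""
    weights = []
    for clf in classifiers:
        scores = [accuracy(clf, data) for data in site_datasets]
        weights.append(sum(scores) / len(scores))
    return weights


def weighted_vote(classifiers, weights, x):
    """Return the label whose supporting classifiers carry the most total weight."""
    totals = defaultdict(float)
    for clf, w in zip(classifiers, weights):
        totals[clf(x)] += w
    return max(totals, key=totals.get)


# Hypothetical local classifiers standing in for trees trained at different sites.
def stump_a(x):
    return int(x[0] > 0.5)


def stump_b(x):
    return int(x[1] > 0.5)


def stump_c(x):
    return 1  # always predicts the positive class


# Toy horizontally partitioned data: same feature space, different samples per site.
site_1 = [((0.9, 0.1), 1), ((0.2, 0.8), 0), ((0.7, 0.6), 1), ((0.1, 0.3), 0)]
site_2 = [((0.6, 0.2), 1), ((0.4, 0.9), 0), ((0.8, 0.4), 1), ((0.3, 0.7), 0)]

classifiers = [stump_a, stump_b, stump_c]
weights = classifier_weights(classifiers, [site_1, site_2])
prediction = weighted_vote(classifiers, weights, (0.2, 0.9))

print(weights)     # per-classifier voting weights
print(prediction)  # label chosen by the weighted vote
```

In this toy setup, two of the three classifiers vote for label 1 on the query point, yet the weighted vote returns label 0 because the single accurate classifier outweighs the two weak ones. In a real federated deployment only models and scores would be exchanged, never the raw site data; the lists above are shared in one process purely for illustration.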

Subject Keywords

Data models, Random forests, Predictive models, Classification algorithms, Collaborative work, Boosting, Prediction algorithms, Machine learning, Privacy, Ensemble learning, federated learning, privacy-preservation, random forest classification

URI

https://hdl.handle.net/11511/99668

Journal

IEEE ACCESS

DOI

https://doi.org/10.1109/access.2022.3202008

Collections

Department of Computer Engineering, Article

Suggestions

Privacy-preserving horizontal federated learning methodology through a novel boosting-based federated random forest algorithm Gençtürk, Mert; Çiçekli, Fehime Nihan; Department of Computer Engineering (2023-1-04) In this thesis, a novel federated ensemble classification algorithm for horizontally partitioned data called Boosting-based Federated Random Forest (BOFRF) is proposed, which not only increases the predictive power of all participating sites, but also provides significantly high improvement on the predictive power of sites having unsuccessful local models. In this regard, a federated version of random forest, which is a well-known bagging algorithm, is implemented by adapting the idea of boosting to it. In ...
EXTRACTION OF INTERPRETABLE DECISION RULES FROM BLACK-BOX MODELS FOR CLASSIFICATION TASKS GALATALI, EGEMEN BERK; ALEMDAR, HANDE; Department of Computer Engineering (2022-8-31) In this work, we have proposed a new method and ready-to-use workflow to extract simplified rule sets for a given Machine Learning (ML) model trained on a classification task. Those rules are both human readable and in the form of software code pieces thanks to the syntax of the Python programming language. We were inspired by the power of Shapley values, which serve as our source of truth for selecting the most prominent features for our rule sets. The aim of this work is to select the key interval points in given data in order t...
A Bayesian Approach to Learning Scoring Systems Ertekin Bolelli, Şeyda (2015-12-01) We present a Bayesian method for building scoring systems, which are linear models with coefficients that have very few significant digits. Usually the construction of scoring systems involves manual effort: humans invent the full scoring system without using data, or they choose how logistic regression coefficients should be scaled and rounded to produce a scoring system. These kinds of heuristics lead to suboptimal solutions. Our approach is different in that humans need only specify the prior over what the ...
A method for quadruplet sample selection in deep feature learning Derin Öznitelik Öǧrenme için Dördüz Örnek Seçme Yöntemi Karaman, Kaan; Gundogdu, Erhan; Koc, Aykut; Alatan, Abdullah Aydın (2018-07-05) Recently, the deep learning based feature learning methodologies have been developed to recognize the objects in fine-grained detail. In order to increase the discriminativeness and robustness of the utilized features, this paper proposes a sample selection methodology for the quadruplet based feature learning. The feature space is manipulated by using the hierarchical structure of the training set. In the training process, the quadruplets are selected by considering the distances between the samples in the...
Closed-form sample probing for training generative models in zero-shot learning Çetin, Samet; Cinbiş, Ramazan Gökberk; Department of Computer Engineering (2022-2-10) Generative modeling based approaches have led to significant advances in generalized zero-shot learning over the past few years. These approaches typically aim to learn a conditional generator that synthesizes training samples of classes conditioned on class embeddings, such as attribute based class definitions. The final zero-shot learning model can then be obtained by training a supervised classification model over the real and/or synthesized training samples of seen and unseen classes, combined. Therefor...

Citation Formats

M. Gencturk, A. A. Sınacı, and N. K. Cicekli, “BOFRF: A Novel Boosting-Based Federated Random Forest Algorithm on Horizontally Partitioned Data,” IEEE ACCESS, vol. 10, pp. 89835–89851, 2022, Accessed: 00, 2022. [Online]. Available: https://hdl.handle.net/11511/99668.