Işık Polat, Ece
In federated learning (FL), collaborators train a global model collectively without sharing their local data. The local model parameters obtained from each collaborator's local training process are collected on a trusted server to form the global model. In order to preserve privacy, the server has no authority over the local training procedure. Therefore, the global model is vulnerable to attacks such as data poisoning and model poisoning. Even though many defense strategies have been proposed against these attacks, they often make strong assumptions that are not compatible with the characteristics of FL. Moreover, these proposals have not been analyzed thoroughly. In this thesis, I propose an assumption-free defense mechanism called Byzantine Attack Robust Federated Learning (BARFED). BARFED makes no assumptions about the federated learning setting, such as the malicious collaborator ratio, the data distributions of the collaborators, or gradient update similarity. BARFED examines the distance between the global model and the collaborators' local models on a per-layer basis and decides whether each collaborator will participate in the aggregation step based on its outlier status. In other words, only the collaborators that are not labeled as outliers in any layer of the model architecture participate in the aggregation step. I show that BARFED provides a robust defense against different attacks through comprehensive experiments covering many aspects, such as data distribution and whether the attackers are organized.
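The layer-wise filtering described in the abstract can be sketched roughly as follows. This is a minimal illustrative sketch, not the thesis's exact procedure: the function name `barfed_aggregate`, the Euclidean distance, the Tukey IQR outlier rule, and the fallback when every collaborator is flagged are all assumptions made for the example.

```python
import numpy as np

def barfed_aggregate(global_model, local_models):
    """Illustrative sketch of BARFED-style aggregation.

    global_model: dict mapping layer name -> np.ndarray
    local_models: list of dicts with the same layer names
    """
    n = len(local_models)
    flagged = np.zeros(n, dtype=bool)

    # For each layer, measure each collaborator's distance to the
    # global model, then flag outliers with Tukey's IQR rule
    # (an assumed choice of outlier test for this sketch).
    for layer in global_model:
        dists = np.array([
            np.linalg.norm(m[layer] - global_model[layer])
            for m in local_models
        ])
        q1, q3 = np.percentile(dists, [25, 75])
        iqr = q3 - q1
        lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        flagged |= (dists < lower) | (dists > upper)

    # Only collaborators not flagged in ANY layer join the average.
    keep = [m for m, bad in zip(local_models, flagged) if not bad]
    if not keep:  # assumed fallback: if all are flagged, keep all
        keep = local_models
    return {layer: np.mean([m[layer] for m in keep], axis=0)
            for layer in global_model}

# Usage: four benign updates near the global model and one far-away
# poisoned update; the poisoned one is excluded from the average.
g = {"w": np.zeros(3)}
locals_ = [{"w": np.full(3, 0.1)} for _ in range(4)] + [{"w": np.full(3, 100.0)}]
agg = barfed_aggregate(g, locals_)
```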


ARFED: Attack-Resistant Federated averaging based on outlier elimination
Işık Polat, Ece; Polat, Gorkem; Koçyiğit, Altan (2023-04-01)
In federated learning, each participant trains its local model with its own data, and a global model is formed at a trusted server by aggregating model updates coming from these participants. Since the server, to ensure privacy, has no influence on or visibility into the participants' training procedures, the global model becomes vulnerable to attacks such as data poisoning and model poisoning. Although many defense algorithms have recently been proposed to address these attacks, they often make strong assumptio...
Age of information and unbiased federated learning in energy harvesting error-prone channels
Çakır, Zeynep; Ceran Arslan, Elif Tuğçe; Department of Electrical and Electronics Engineering (2022-8-29)
Federated learning is a communication-efficient and privacy-preserving learning technique for collaborative training of machine learning models on vast amounts of data produced and stored locally on the distributed users. In this thesis, unbiased federated learning methods that achieve a similar convergence as state-of-the-art federated learning methods in scenarios with various constraints like error-prone channel or intermittent energy availability are investigated. In addition, a prevalent metric call...
A Similarity Based Oversampling Method for Multi-Label Imbalanced Text Data
Karaman, İsmail Hakkı; Köksal, Gülser; Erişkin, Levent; Department of Industrial Engineering (2022-9-1)
In the real world, while the amount of data increases, it is not easy to find labeled data for Machine Learning projects because of the compelling cost and effort requirements for labeling data. Also, most Machine Learning projects, especially multi-label classification problems, struggle with the data imbalance problem. In these problems, some classes do not even have enough data to train a classifier. In this study, an oversampling method for multi-label text classification problems is developed and s...
Detection of clean samples in noisy labelled datasets via analysis of artificially corrupted samples
Yıldırım, Botan; Ulusoy, İlkay; Department of Electrical and Electronics Engineering (2022-8-22)
Recent advances in supervised deep learning methods have shown great success in image classification, but these methods are known to owe their success to massive amounts of data with reliable labels. However, constructing large-scale datasets inevitably results in varying levels of label noise, which degrades the performance of supervised deep learning based classifiers. In this thesis, we make an analysis of sample selection based label noise robust approaches by providing extensive experimental evaluatio...
Feature Dimensionality Reduction with Variational Autoencoders in Deep Bayesian Active Learning
Ertekin Bolelli, Şeyda (2021-06-09)
Data annotation for training of supervised learning algorithms has been a very costly procedure. The aim of deep active learning methodologies is to acquire the highest performance in supervised deep learning models by annotating as few data points as possible. As the feature space of data grows, the application of linear models in active learning settings has become insufficient. Therefore, Deep Bayesian Active Learning methodology which represents model uncertainty has been widely studied. In this paper, ...
Citation Formats
E. Işık Polat, “BYZANTINE ATTACK ROBUST FEDERATED LEARNING,” M.S. - Master of Science, Middle East Technical University, 2021.