Federated Learning for Model Optimization

Federated learning - shared performance despite separate data

Tim Filzinger

The accuracy of machine learning stands and falls with the data used. As a rule, the more data is available, the better the results. This often requires combining different data sources; however, mixing them can be problematic for data protection reasons. Federated learning aims to resolve this dilemma by allowing model training to take place simultaneously on separate devices. In this way, private information remains private and still generates a general benefit.

What is federated learning?

Federated learning is a machine learning technique in which several local data sets are used to train a shared AI model. Its special feature is the absence of the central database that is typical of classic learning methods. Sensitive data remains on the respective end devices, while only the information needed to adapt the model is shared. In the case of neural networks, this means, for example, changes to the network's weights. Several cooperation partners can therefore benefit from a learning effect that is based on a larger pool of data. This is attractive because companies in the same industry often pursue similar goals in their AI and data science projects.
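To make the idea of sharing only model adaptations concrete, here is a minimal sketch in Python. It assumes that a model is just a flat list of weights; the function names compute_update and apply_update are hypothetical and not taken from any particular library.

```python
def compute_update(global_weights, local_weights):
    """Return the weight change produced by local training (assumed update format)."""
    return [local - glob for glob, local in zip(global_weights, local_weights)]

def apply_update(global_weights, update, step=1.0):
    """Apply a received weight change to a copy of the global model."""
    return [glob + step * delta for glob, delta in zip(global_weights, update)]

# Hypothetical values: the client trained locally and its weights moved.
global_weights = [0.5, -1.0]
local_weights = [0.75, -1.5]

# Only this list of differences leaves the device, not the training data.
update = compute_update(global_weights, local_weights)
new_global = apply_update(global_weights, update)
```

The raw samples never appear in the exchanged message; the receiver only sees how far each weight moved.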

Because of the progress in machine learning and the growing volumes of data that are subject to corresponding regulations, federated learning is being used more and more frequently. It can now be implemented on a variety of clients. These include:

  • Edge devices (IoT)
  • Server and cloud infrastructure
  • Desktop computers and laptops
  • In-home devices
  • Smartphones and tablets

In order to tap into more data while maintaining a high level of data security, the technology is becoming increasingly differentiated and now reaches into the pockets of end users. This is a further sign of the increasing applicability and user-centricity of artificial intelligence.

How does federated learning work?

In principle, federated learning is machine learning in the true sense of the word. During training, the model analyzes the data provided for certain correlations and derives its own predictions from them. The ultimate aim is to minimize a loss function, which usually goes hand in hand with improving the accuracy of the predictions. The difference is that the data is not collected in one place but remains distributed across the clients. The process usually comprises the following steps, although individual approaches can vary greatly:

  • Initialization of the model
  • Distribution to the clients
  • Local training
  • Transfer of updates and model weights
  • Aggregation
  • Iteration

Figure: Centralized server for model training. The use of central servers is one of the typical implementation options.
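The six steps listed above can be sketched as a small simulation. This is an illustration under stated assumptions, not a reference implementation: the model is a single weight for y = weight * x, the three clients and their data are invented, and the helper names make_client, local_train and run_federated_rounds are hypothetical.

```python
import random

def make_client(rng, n_samples=50, true_slope=3.0, noise=0.1):
    """Create one client's private samples (x, y) of y = true_slope * x plus noise."""
    samples = []
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0)
        samples.append((x, true_slope * x + rng.gauss(0.0, noise)))
    return samples

def local_train(weight, data, lr=0.5, epochs=5):
    """Gradient descent on the mean squared error of the model y = weight * x."""
    for _ in range(epochs):
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight

def run_federated_rounds(client_data, rounds=10):
    global_weight = 0.0                               # initialization
    for _ in range(rounds):                           # iteration
        # Distribution and local training: each client starts from the global weight.
        local_weights = [local_train(global_weight, data) for data in client_data]
        # Transfer and aggregation: only the weights are collected and averaged.
        global_weight = sum(local_weights) / len(local_weights)
    return global_weight

rng = random.Random(0)
client_data = [make_client(rng) for _ in range(3)]    # three hypothetical clients
final_weight = run_federated_rounds(client_data)
print(f"learned slope: {final_weight:.3f}")
```

In this toy setting the shared weight ends up close to the slope of 3.0 that generated the data, although no client ever hands over its samples.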

Centralized FL

The term may sound paradoxical for a federated learning process, but it refers only to the coordination and orchestration of the participating devices. Local model versions are trained on these devices, and the training is initiated by a central server. The server also aggregates the respective updates and weight changes into a global model, for example by averaging them or by weighting them according to the size of the respective data sets. The updated central model is then redistributed to the end devices for the next iteration.
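The aggregation step described here, averaging weighted by data set size, can be illustrated with a few lines of Python. This follows the general idea of federated averaging; the function name weighted_average and the example numbers are assumptions made for illustration.

```python
def weighted_average(client_weights, client_sizes):
    """Average the clients' weight vectors, weighting each by its data set size."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

# Hypothetical local models (two parameters each) and local data set sizes.
client_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
client_sizes = [100, 100, 200]

global_weights = weighted_average(client_weights, client_sizes)
print(global_weights)  # [3.5, 4.5]
```

Passing equal sizes reduces this to a plain mean, which corresponds to the simpler averaging option mentioned in the text.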

Decentralized FL

This type of learning process does not require centralized coordination via a server. Instead, coordination takes place between the individual clients, which also exchange the model updates among themselves. This avoids a single point of failure, for example a server that is overwhelmed by a large increase in data volume. However, it places higher demands on the network architecture, which has a major influence on the decentralized orchestration of the transfer. If the system and network environments differ too much, problems arise that can only be solved using newer approaches.
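One way to picture serverless coordination is gossip-style averaging, in which every client repeatedly averages its model with those of its direct neighbors. The sketch below is only one possible scheme, chosen here for illustration; it assumes a ring of four clients, a model consisting of a single weight, and a hypothetical helper gossip_round.

```python
def gossip_round(weights, neighbors):
    """Each client replaces its weight with the mean of its own and its neighbors' weights."""
    return [
        (weights[i] + sum(weights[j] for j in neighbors[i])) / (1 + len(neighbors[i]))
        for i in range(len(weights))
    ]

# Hypothetical ring topology: every client talks to exactly two peers.
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
weights = [1.0, 2.0, 3.0, 6.0]           # local models before any exchange

for _ in range(30):
    weights = gossip_round(weights, neighbors)

print([round(w, 6) for w in weights])
```

Because every client has the same number of neighbors, the values converge to the overall mean of 3.0 without any server ever seeing all models. With irregular or unreliable links this simple scheme no longer behaves as neatly, which reflects the network sensitivity described above.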

Heterogeneous FL

When it was first developed, federated learning often assumed homogeneous data sets, clients and transfer structures. In the meantime, the requirements have become more complex. Just because two companies want to carry out the same model training does not mean that they operate under the same conditions. The differences can be serious, and the researchers Mang Ye, Xiuwen Fang et al. (2023) categorize them into four dimensions:

  • Distribution of data
  • Architecture of the models
  • Network environments
  • Hardware devices
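The first of these dimensions, the distribution of data, can be made tangible with a small simulation. The sketch below assumes an invented labeled data set and a hypothetical helper label_skewed_split that deliberately gives each client only some of the classes.

```python
from collections import Counter

def label_skewed_split(samples, num_clients):
    """Sort samples by label and cut them into contiguous shards, one per client."""
    ordered = sorted(samples, key=lambda sample: sample[1])
    shard = len(ordered) // num_clients
    return [ordered[i * shard:(i + 1) * shard] for i in range(num_clients)]

# Hypothetical data set: 40 samples with labels 0 to 3, ten per label.
samples = [(f"sample_{i}", i % 4) for i in range(40)]

clients = label_skewed_split(samples, num_clients=2)
histograms = [Counter(label for _, label in client) for client in clients]
for index, histogram in enumerate(histograms):
    print(f"client {index}: {dict(histogram)}")
```

The first client only ever sees labels 0 and 1, the second only labels 2 and 3, so a model trained on either data set alone would be blind to half of the classes.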

As with any federated system, heterogeneity creates certain hurdles for federated learning. The main motivation of current research is to extend the successes already achieved in simple centralized settings to more complex environments.

Overcoming heterogeneity

There is now a whole range of methods that can be used to resolve structural differences or limit their negative effects. They are often aimed at problems caused by differences in data quality or performance:

Synthetic data

Generative models can calculate additional data points on the basis of a small private data set. These are estimates that reproduce previously learned relationships. The new synthetic set is therefore generally not subject to the same data protection restrictions and can be used jointly. As a rule, however, the cooperation partners must prove that complete anonymization has taken place. The process is also described as data augmentation.
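As a deliberately simple stand-in for a generative model, the following sketch fits a normal distribution to a small private data set and samples new points from it. The values and the helper names fit_gaussian and sample_synthetic are assumptions. Real generative models are far more expressive, and sampling from a fitted distribution does not by itself prove that anonymization is complete.

```python
import random
import statistics

def fit_gaussian(values):
    """Estimate mean and standard deviation from the private values."""
    return statistics.mean(values), statistics.stdev(values)

def sample_synthetic(mean, std, n, seed=0):
    """Draw n synthetic values from the fitted normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(n)]

# Hypothetical private measurements that must not leave the client.
private_values = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]

mean, std = fit_gaussian(private_values)
synthetic_values = sample_synthetic(mean, std, n=1000)
print(f"private mean {mean:.2f}, synthetic mean {statistics.mean(synthetic_values):.2f}")
```

Only the synthetic values would be passed on; they follow the learned distribution but are not copies of the original measurements.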

Knowledge Distillation

Synthetic data is not the only way to pass on knowledge anonymously. "Knowledge" that has already been acquired can also be shared in compliance with data protection regulations by applying a kind of teacher-student principle. In this way, already trained, high-performance models can help weaker models to perform better. The loss function used is based on the differences between the predictions of the two models. This method is well suited if individual participants have limited computing power or other resources.
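A loss based on the difference between teacher and student predictions can be sketched as follows. The text does not name a specific formula; a common choice, assumed here, is the Kullback-Leibler divergence between softmax outputs that have been softened with a temperature. The logits below are invented.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; higher temperatures give softer distributions."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)                       # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the softened teacher and student predictions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

teacher = [4.0, 1.0, 0.5]                    # hypothetical teacher output for one sample
loss_close = distillation_loss(teacher, [3.5, 1.2, 0.4])
loss_far = distillation_loss(teacher, [0.2, 3.0, 1.0])
print(f"close student: {loss_close:.4f}, distant student: {loss_far:.4f}")
```

The loss is zero when both models agree exactly and grows as the student's predictions drift away from the teacher's, so minimizing it pulls the weaker model toward the stronger one. Only predictions are exchanged, not the underlying training data.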

Matrix Factorization

Matrix factorization decomposes a matrix that relates different entities to one another into a product of smaller matrices. These contain new latent features, that is, certain characteristics of the objects in a database. Depending on the setup, this can increase or reduce the dimensionality. Corresponding algorithms therefore act as filters and are used for recommendation systems, among other things. In the context of federated learning, this technology can calculate and share statistical correlations, whereby the exact information behind them remains anonymous.
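The following sketch shows matrix factorization on a tiny, invented rating matrix, as it might appear in a recommendation system. It uses plain stochastic gradient descent; the rank, learning rate, regularization and the helper names factorize and rmse are assumptions made for illustration. Which factors stay local and which are shared in a federated setting is a design decision that this sketch does not cover.

```python
import random

def factorize(ratings, rank=2, lr=0.01, reg=0.02, epochs=2000, seed=0):
    """Approximate the observed entries of ratings by the product of P and Q."""
    rng = random.Random(seed)
    n_rows, n_cols = len(ratings), len(ratings[0])
    P = [[rng.random() for _ in range(rank)] for _ in range(n_rows)]   # row factors
    Q = [[rng.random() for _ in range(rank)] for _ in range(n_cols)]   # column factors
    observed = [(i, j) for i in range(n_rows) for j in range(n_cols)
                if ratings[i][j] is not None]
    for _ in range(epochs):
        for i, j in observed:
            err = ratings[i][j] - sum(P[i][k] * Q[j][k] for k in range(rank))
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                P[i][k] += lr * (2 * err * q - reg * p)
                Q[j][k] += lr * (2 * err * p - reg * q)
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error on the observed entries only."""
    errors = [
        (ratings[i][j] - sum(p * q for p, q in zip(P[i], Q[j]))) ** 2
        for i in range(len(ratings))
        for j in range(len(ratings[0]))
        if ratings[i][j] is not None
    ]
    return (sum(errors) / len(errors)) ** 0.5

# Hypothetical ratings of four items by five users; None marks a missing rating.
ratings = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, None, 5],
    [1, None, None, 4],
    [None, 1, 5, 4],
]
P, Q = factorize(ratings)
final_rmse = rmse(ratings, P, Q)
print(f"reconstruction error on observed ratings: {final_rmse:.3f}")
```

Each row of P and Q is a short vector of latent features. Multiplying a user vector with an item vector yields a predicted rating, including for the entries that were missing.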

Architecture sharing

These methods bridge differences in the structural design of the models and networks. Backbone sharing, for example, is intended to reduce computing costs without disregarding individual requirements. Similarly, certain components of neural network structures can be reproduced in order to increase the uniformity of data processing. Complete models that have already undergone training can also be made available to several participants in order to carry out federated fine-tuning.
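Backbone sharing can be pictured as averaging only the common part of each model while every client keeps its own task-specific part. In the sketch below, the split of a model into "backbone" and "head", the dictionary layout and the function share_backbone are assumptions made for illustration.

```python
def share_backbone(client_models):
    """Average the backbone weights across clients; leave each client's head untouched."""
    n_clients = len(client_models)
    size = len(client_models[0]["backbone"])
    shared = [
        sum(model["backbone"][i] for model in client_models) / n_clients
        for i in range(size)
    ]
    return [
        {"backbone": list(shared), "head": list(model["head"])}
        for model in client_models
    ]

# Hypothetical clients with identical backbone shapes but differently sized heads.
client_models = [
    {"backbone": [1.0, 2.0], "head": [0.1]},
    {"backbone": [3.0, 4.0], "head": [0.5, 0.7]},
]
updated = share_backbone(client_models)
print(updated[0])
print(updated[1])
```

After the exchange both clients hold the same backbone, while their heads remain different in size and content, which is how individual requirements can be preserved.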

Advantages of the principle

Even though this type of machine learning can be associated with immense effort and resource consumption, successful implementation is often accompanied by numerous benefits:

  • Data protection: Private and sensitive data remain with the respective clients.
  • Data diversity: Different data sources provide more informative content.
  • Speed: The simultaneous analysis of small data sets can be faster and more efficient than a single, joint run.
  • Traceability and timeliness: Training can be carried out with real-time data, and regular updates are possible.
  • Optimization options: The optimum settings and model properties can be identified and shared across the different network environments. This can improve results and reduce costs.

Application areas

As federated learning is a rather general paradigm of machine learning, the possible applications are not limited to specific industries or project types. In some cases, however, the advantages mentioned become particularly relevant.

Public health

Much of the relevant data in hospitals and other healthcare facilities is personal, so it is subject to special protection. However, it is also a valuable source of information when it comes to capacity utilization and resource consumption. The requirements of hospitals are very similar, so they benefit in particular from joint model training and increased data diversity. The latter is also helpful for the diagnosis of particularly rare diseases.

Manufacturing industry

Federated learning is well suited to predictive maintenance models that use machine data to forecast wear and possible failures. If the same devices are used in different production facilities, their data is particularly easy to compare. The challenges of heterogeneity are therefore less pronounced, and federated cooperation in AI forecasting is the obvious choice. The process is worthwhile even if the IT infrastructure is too weak in some places to process sufficient proprietary data. Company secrets and individual production processes remain protected.

Mobile applications

Because of its pronounced real-time focus and the possible inclusion of mobile devices, federated learning is suitable for analyzing user behavior. At the same time, various artificial intelligence models are being used directly in more and more applications, for example for speech recognition, word prediction, facial recognition and many other cases. Their performance can be improved without the user's input having to be shared. Tech companies such as Meta are currently focusing on this type of federated learning. Meta, the company behind Facebook, primarily cites data protection as its motive. However, its main interest is likely to be data diversity: more than 77 percent of Internet users can be found on Meta's platforms.


Conclusion

The main advantage of federated learning is that it improves AI models through greater data diversity without compromising data protection. A wide range of clients can now be included in the process. Training typically takes place separately on small data sets; only the model adaptation is shared. A central server is often responsible for this step, but the implementation options vary greatly, and a decentralized approach is also possible. Because network structures are now highly differentiated, most current research focuses on heterogeneous settings. The aim is to overcome significant differences in the prerequisites and make federated learning as universally applicable as possible. This paradigm and its methods are therefore likely to remain relevant for the foreseeable future.
