Towards more secure and efficient federated learning

Artificial intelligence is transforming entire industries, from healthcare to defense, but one central question remains: how can we train models while ensuring the confidentiality of sensitive data? A promising answer lies in the combination of three complementary technologies: Federated Learning, Homomorphic Encryption, and Knowledge Distillation.

Date: 5 May 2025

Expertise: Data Science

Innovation theme: Artificial Intelligence

Author: Xavier Lessage

Federated learning at a glance

Federated learning allows multiple institutions — such as hospitals, government agencies, or companies — to collaborate on building a shared model without ever exchanging raw data. Each participant trains a local model on-site and sends only the learned parameters to a central server for aggregation. This paradigm is particularly well-suited for sensitive environments where data privacy is critical.
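The aggregation step described above is most often implemented as federated averaging (FedAvg). A minimal sketch, with toy scalar parameters and an illustrative one-step "local training" stand-in (none of these names come from the article's actual pipeline):

```python
# Minimal sketch of federated averaging (FedAvg): each client trains
# locally and shares only model parameters; the server averages them.
# The gradients and learning rate here are toy values for illustration.

def local_update(weights, local_gradient, lr=0.1):
    """One local training step on a client's private data (toy)."""
    return [w - lr * g for w, g in zip(weights, local_gradient)]

def federated_average(client_weights):
    """Server-side aggregation: element-wise mean of client parameters."""
    n = len(client_weights)
    return [sum(ws) / n for ws in zip(*client_weights)]

# Three clients start from the same global model and train locally;
# only the updated parameters ever leave each site.
global_model = [0.0, 0.0]
gradients = [[1.0, -2.0], [3.0, 0.0], [2.0, 2.0]]  # private to each client
updated = [local_update(global_model, g) for g in gradients]
global_model = federated_average(updated)
print(global_model)  # ≈ [-0.2, 0.0]
```

In a real deployment each `local_update` would be many epochs of training on the client's dataset, but the data-flow is the same: raw data stays local, parameters travel.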

Of course, its applications go well beyond healthcare. In the defense sector, for instance, data is often classified and inaccessible to external partners. Federated learning enables these actors to leverage collective intelligence while maintaining strict separation between data sources. The same logic applies to domains like finance, justice, and industrial R&D.

However, this distributed model comes with its own set of vulnerabilities: inference attacks, data poisoning, and information leakage through the models themselves. This highlights the growing need for advanced protective mechanisms to ensure both privacy and security.

Protecting models with homomorphic encryption

Homomorphic encryption allows computations to be performed directly on encrypted data, without ever needing to decrypt it. Applied to federated learning, this means that model weights can be encrypted before transmission and then securely aggregated on the server.
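To make the idea concrete, here is a toy demonstration of additively homomorphic aggregation using a textbook Paillier scheme with deliberately tiny, insecure parameters. The article's actual pipeline uses CKKS (which supports approximate arithmetic on real-valued weights); Paillier is substituted here only because it fits in a few self-contained lines. Every name and parameter below is illustrative:

```python
# Toy Paillier scheme: the server multiplies ciphertexts and obtains an
# encryption of the SUM of the plaintexts, without ever decrypting.
# Tiny primes, demo only -- a real system would use a hardened library.
import math
import random

def keygen(p=293, q=433):                      # insecure toy primes
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    g = n + 1
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    n, g = pub
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n) * mu % n

pub, priv = keygen()
# Clients encrypt their (integer-scaled) weight updates...
ciphertexts = [encrypt(pub, w) for w in (12, 7, 25)]
# ...and the server aggregates them while they stay encrypted:
aggregate = 1
for c in ciphertexts:
    aggregate = (aggregate * c) % (pub[0] ** 2)
print(decrypt(pub, priv, aggregate))  # 44 = 12 + 7 + 25
```

Only the key holder can decrypt the aggregated result; the server never sees any individual client's update in the clear.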

However, this enhanced security comes at a cost: homomorphic operations are computationally and memory-intensive, especially when dealing with large models. It thus becomes essential to design lighter models that maintain high performance.

Making models lighter without compromising performance

Two main techniques are commonly used to compress models:

  • Pruning, which involves removing the least useful parameters or neurons.
  • Knowledge distillation, where a smaller “student” model learns to replicate the behavior of a larger “teacher” model.
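The first technique, magnitude pruning, can be sketched in a few lines: weights closest to zero are assumed to contribute least and are removed. The 50% sparsity level below is an arbitrary choice for illustration:

```python
# Minimal sketch of magnitude pruning on a flat list of weights.

def prune_by_magnitude(weights, sparsity=0.5):
    """Zero out the `sparsity` fraction of weights closest to zero."""
    k = int(len(weights) * sparsity)
    # Indices of the k weights with the smallest absolute value.
    smallest = set(sorted(range(len(weights)),
                          key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in smallest else w for i, w in enumerate(weights)]

weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]
print(prune_by_magnitude(weights))  # [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```

In practice pruning is applied per layer or per channel and is usually followed by fine-tuning to recover any lost accuracy, but the selection criterion is the same.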

Not all distillation methods are equally effective. In our approach, we combine two complementary forms:

  • Logit-based distillation: the student learns to mimic the teacher’s output (logits). This method is simple to implement but provides limited insight into the model’s reasoning process.
  • Feature-based distillation: the student learns to reproduce the teacher’s internal representations (features), enabling a deeper transfer of knowledge — though it requires a closer architectural match between the two models.
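The two forms can be combined into a single training loss. A minimal sketch, assuming access to both the logits and one internal feature vector of teacher and student; the temperature `T`, the weighting factor `alpha`, and the mean-squared-error feature term are illustrative choices, not the article's exact formulation:

```python
# Combined logit-based + feature-based distillation loss (toy version).
import math

def softmax(logits, T=1.0):
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(t_logits, s_logits, t_feats, s_feats, T=2.0, alpha=0.5):
    # Logit term: student mimics the teacher's temperature-softened outputs.
    logit_term = kl_divergence(softmax(t_logits, T), softmax(s_logits, T)) * T * T
    # Feature term: student reproduces internal representations (MSE).
    feat_term = sum((t - s) ** 2 for t, s in zip(t_feats, s_feats)) / len(t_feats)
    return alpha * logit_term + (1 - alpha) * feat_term

loss = distillation_loss([2.0, 0.5], [1.0, 0.8], [0.3, -0.1], [0.2, 0.0])
```

The loss is zero when the student matches the teacher exactly and grows with either output or representation mismatch, which is what lets the feature term enforce the "closer architectural match" mentioned above.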

Our approach

We applied this innovative approach to breast cancer detection from mammography images, using an architecture built on three key pillars:

  • Smart compression: By combining pruning and knowledge distillation, we designed an ultra-lightweight model based on MobileNetV2, specifically optimized for encrypted learning.
  • Reverse distillation: Each client uses a teacher-student pair. The student models are aggregated centrally, and their knowledge is then redistributed back to the teachers, creating a bidirectional learning cycle.
  • Full encryption: Thanks to the model’s compactness, it can be fully encrypted using the CKKS homomorphic encryption scheme, ensuring complete confidentiality — even during training.
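The bidirectional cycle of the second pillar can be illustrated with a deliberately simplified simulation, where each model is reduced to a single scalar so the information flow stays visible. The `distill` step and all rates and values below are stand-ins, not the article's actual training procedure:

```python
# Toy simulation of the "reverse distillation" cycle: student models
# distill from local teachers, are aggregated centrally, and the
# aggregated knowledge is then redistributed back to the teachers.

def distill(target, model, rate=0.5):
    """Move `model` part of the way toward `target` (a stand-in for a
    real distillation step on actual training data)."""
    return model + rate * (target - model)

teachers = [4.0, 2.0, 6.0]          # large local models, one per client
students = [0.0, 0.0, 0.0]          # lightweight, encryptable models

for _ in range(3):                  # one federated round per iteration
    # 1. On each client, the student distills from its local teacher.
    students = [distill(t, s) for t, s in zip(teachers, students)]
    # 2. The compact students are aggregated centrally (this is the
    #    step that can run under homomorphic encryption).
    aggregated = sum(students) / len(students)
    students = [aggregated] * len(students)
    # 3. Knowledge flows back: each teacher distills from the aggregate.
    teachers = [distill(aggregated, t) for t in teachers]

print(teachers)  # teachers drift toward a shared consensus
```

Even in this toy form, the teachers, which never leave their clients, end up closer together after each round: collective knowledge circulates while only the compact, encryptable students are ever transmitted.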

Want to learn more about this innovative approach? Don’t hesitate to contact our experts.