We are excited to share that the CyberSEAS publication “DGA Detection Using Similarity-Preserving Bloom Encodings” by Lasse Nitz (Fraunhofer FIT, RWTH Aachen University) and Avikarsha Mandal (Fraunhofer FIT) has won the Best Paper Award at the 2023 European Interdisciplinary Cybersecurity Conference in Stavanger, Norway.

A short description of what the paper is about and why it matters can be found here, and the paper itself is accessible here (open access).

Publication

Using an approach for data linkage to enhance privacy in a deep learning cybersecurity use case

Lasse Nitz, Avikarsha Mandal

June 2023, ACM (Association for Computing Machinery)

DOI: 10.1145/3590777.3590795

We used an approach from the area of privacy-preserving record linkage to encode training data samples for the machine learning-based detection of algorithmically generated domain names, which are used to enable communication in botnets. The evaluated approach preserves the similarity of data samples, as required, while allowing the encodings to be tuned with regard to the privacy-utility trade-off. We discuss the requirements of different machine learning scenarios as well as the privacy implications of this encoding approach for those scenarios. We further evaluated the encoding approach by training deep learning models on encodings generated with different parameter values, and compared their performance to that of a model trained on cleartext samples.
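
To give a rough idea of what a similarity-preserving Bloom encoding can look like, here is a minimal Python sketch. It is not the construction or the parameter choice from the paper: the function names (ngrams, bloom_encode, dice), the use of salted SHA-256 as the hash family, and the defaults m=1024 bits, k=2 hash functions and n=2 (character bigrams) are all assumptions made for illustration. Each domain name is split into character n-grams, every n-gram sets k positions in an m-bit vector, and two encodings are compared with the Dice coefficient.

```python
import hashlib


def ngrams(text, n=2):
    """Return the set of character n-grams of text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def bloom_encode(domain, m=1024, k=2, n=2):
    """Encode a domain name as an m-bit Bloom filter over its n-grams.

    m, k and n are illustrative defaults, not values from the paper.
    """
    bits = [0] * m
    for gram in ngrams(domain.lower(), n):
        for i in range(k):
            # Salting the input with i simulates k different hash functions.
            digest = hashlib.sha256(f"{i}:{gram}".encode("utf-8")).digest()
            bits[int.from_bytes(digest, "big") % m] = 1
    return bits


def dice(a, b):
    """Dice coefficient of two equal-length bit vectors."""
    common = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2 * common / total if total else 1.0


if __name__ == "__main__":
    reference = bloom_encode("example.com")
    # A near-duplicate shares most bigrams, a random-looking name shares none.
    print(dice(reference, bloom_encode("examp1e.com")))
    print(dice(reference, bloom_encode("xkqzjvwpt.net")))
```

In a sketch like this, domain names that share many n-grams also share many set bits, so their encodings remain similar. Shrinking m or raising k causes more hash collisions, which blurs the encoding; this is one plausible way to picture the kind of tuning the privacy-utility trade-off refers to.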

Why is it important?

For many applications related to classification, machine learning has become the go-to solution. Its use in scenarios involving sensitive training data and the rise of privacy regulations such as the GDPR, however, have led to concerns about potential leakage of sensitive information. We contributed to the goal of improving the understanding of privacy approaches for machine learning by evaluating an approach from the area of privacy-preserving record linkage in the cybersecurity use case of detecting algorithmically generated domains via deep learning. We hope that building bridges between these research areas helps to find innovative solutions for technical privacy protection.