Research on Privacy Patterns

Next week, the 43rd Euromicro Conference on Software Engineering and Advanced Applications (SEAA) will take place in Vienna, Austria. One of the tracks that this long-standing conference offers is on Systematic Literature Reviews and Mapping Studies in Software Engineering (SMSE), and I am happy to announce that I have a paper in this track.


This contribution is titled A Literature Study on Privacy Patterns Research. It is joint work with my colleagues Lothar Fritsch and Sebastian Herold at Karlstad University. You can find a preprint of the paper on my publications page here or on ResearchGate. You can even access the replication package with the raw data for the paper here. Since I am tied up with loads of lecturing in the software engineering course that starts the same week, Sebastian is going to present the paper in Vienna.

These days, I (unfortunately) no longer have the time to introduce each of my papers on my blog, but I feel that this one deserves it. Methods that help to engineer privacy requirements into software are in particularly high demand at the moment. Credit for that goes to the new General Data Protection Regulation of the European Union, which comes into effect in May next year. This law not only improves privacy protection for individuals across the union, it also sets sizeable fines for those who violate it. That is why it is important for companies to make sure right now that their software is compliant with this regulation. Our paper contributes to that, since we identify, structure, and map techniques that are supposed to help in implementing privacy requirements in software. In a nutshell, we’ve built a systematic map of the current state of peer-reviewed research on patterns for privacy in software engineering.

The new General Data Protection Regulation of the European Union will have far-reaching effects and marks a significant improvement in the protection of the privacy of individuals across the continent

We’ve queried Scopus, the ACM Digital Library, IEEE Xplore, and SpringerLink for publications relating to privacy, patterns, and software engineering or one of the many related terms for these concepts. On the resulting set of papers, we executed the “usual” mapping process, i.e., screening, backward and forward snowballing (using citation data from Google Scholar), keywording, classification, and mapping. In the end, we arrived at a total of 49 relevant articles and we counted no fewer than 148 patterns that were proposed in these articles.

We’ve structured the articles along the type of study they perform (e.g. solution proposal, case study, experiment…), the software engineering activity they address (e.g. requirements engineering, design and implementation, …) and the type of contribution they provide (e.g. a pattern proposal, a pattern catalog, a modeling notation for patterns, …). I won’t re-post the maps here in this blog entry, because I am not sure how much the IEEE would like that, but you can find them in the preprints that I have linked above. Essentially, there are a great many pattern proposals out there, but very little empirical evidence to support a pattern or even to justify its relevance. Only twelve out of the 49 studies provide empirical evidence at all and most patterns violate the “rule of three” (there should be at least three independent observations of something in practice before you can call it a pattern). What’s more, there are few studies that link patterns across different activities in software engineering, e.g. from architecture to implementation, which is something that’s ultimately necessary if you want to achieve privacy-by-design. I could go on, but essentially this is a strong wake-up call for research in this area. We need to discover patterns that are proven to work in practice, and not just patterns that seem to make sense on the strength of a nice line of argumentation. And we need to build approaches that support practitioners all the way from requirements to working software and not just in one of the phases.
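For readers who have not come across a systematic map before: at its core it is a cross-tabulation of the classified articles over two of the facets mentioned above. The following is only a minimal Python sketch of that idea, not the tooling we used for the paper. The records in it are invented for illustration and are not taken from our data set, and the helper name build_map is my own.

```python
from collections import Counter

# Hypothetical classified records: (study type, SE activity, contribution type).
# These rows are made up for illustration and are NOT data from the paper.
records = [
    ("solution proposal", "design and implementation", "pattern proposal"),
    ("solution proposal", "requirements engineering", "pattern catalog"),
    ("case study", "design and implementation", "pattern proposal"),
    ("solution proposal", "design and implementation", "modeling notation"),
    ("experiment", "requirements engineering", "pattern proposal"),
]


def build_map(rows, facet_a, facet_b):
    """Count how many records fall into each (facet_a, facet_b) cell."""
    return Counter((row[facet_a], row[facet_b]) for row in rows)


# Cross-tabulate study type (index 0) against SE activity (index 1).
study_vs_activity = build_map(records, 0, 1)

for (study, activity), count in sorted(study_vs_activity.items()):
    print(f"{study:20} | {activity:28} | {count}")
```

Each cell of such a map corresponds to one bubble in the usual bubble plot. Sparse or empty cells are where the research gaps show up.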

We also had a look at the concrete patterns themselves and categorized them according to the software engineering activity they address (as above), their class (e.g. an antipattern, a dark pattern, or a “regular” pattern) and the privacy design strategy according to Jaap-Henk Hoepman that they address. This makes for some interesting insights as well. For example, there are relatively few patterns for the “demonstrate” strategy (nine out of 148). That’s surprising since the GDPR is going to force enterprises to demonstrate that they actually comply with the regulation.

This blog entry is a teaser and if I got you interested, I suggest you have a look at the full paper (linked above). There’s more along the lines of what I have indicated here. At Karlstad University, we are taking the current state as a challenge. There is high research potential in the area and there is a clear need for more practically oriented research. We’re working on that, so stay tuned for updates.