Project contact: jan.willemson (at) stacc.ee

Privacy-preserving data mining handles data in such a way that none of the computing parties has access to the microdata. To make computation possible, the data is shared between several computers (so-called miners) in such a way that each share, considered in isolation, is indistinguishable from random noise. This approach facilitates research projects in which no one has to disclose real data and in which only aggregate results are important. Such a solution may be appropriate in sensitive domains, such as the planning and implementation of public auctions, the analysis of indicators that are subject to non-disclosure to protect business secrets, or the processing of highly sensitive personal (e.g. medical) data.
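
The sharing idea described above can be illustrated with a minimal sketch. It assumes additive secret sharing modulo 2^32 among three miners; the text does not name the exact scheme, so the modulus, the number of miners, and the function names `share`, `reconstruct` and `secure_sum` are illustrative assumptions rather than the platform's actual API.

```python
import secrets

MODULUS = 2 ** 32   # assumed ring size, chosen only for illustration
NUM_MINERS = 3      # assumed number of miners


def share(value, n=NUM_MINERS):
    """Split value into n additive shares that sum to value mod MODULUS."""
    # The first n-1 shares are uniformly random, so each looks like noise.
    shares = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    # The last share is fixed so that all shares together add up to value.
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recombine all shares; any proper subset alone reveals nothing."""
    return sum(shares) % MODULUS


def secure_sum(values):
    """Compute an aggregate (a sum) without any miner seeing an input."""
    per_miner = [0] * NUM_MINERS
    for value in values:
        # Each miner receives one share and adds it to its local total.
        for i, s in enumerate(share(value)):
            per_miner[i] = (per_miner[i] + s) % MODULUS
    # Only the aggregated per-miner totals are combined and published.
    return reconstruct(per_miner)


if __name__ == "__main__":
    print(secure_sum([10, 20, 30]))
```

In this sketch each miner only ever holds values that are uniformly distributed on their own, while the published result is the aggregate, which matches the setting where only aggregate results matter.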

The three main sub-projects aim to develop methods for handling highly private data, such as health or financial information or any other sensitive data. Most parts of the project are linked to the Sharemind secure distributed computing platform. The objectives are:

  • to establish a convenient environment for programming and testing secure distributed algorithms (primarily an IDE and a debugger),
  • to test secure computing for processing survey data as realistically as possible,
  • to develop a general tool for processing security proofs of cryptographic protocols, to be used to prove the security properties of the distributed computing algorithms.

The fourth sub-project addresses the need to test major information systems before they are adopted. Software developers are naturally interested in testing their applications with data that are as realistic as possible, which is why test data are often drawn from real operating environments. These data, however, might include various sensitive components that software testers do not need to see. Therefore, real databases should first be disguised by generating a base of artificial values following particular rules, and the main objective of this research is to find a way to integrate such a solution into the software development workflow.
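
As a rough illustration of disguising a record with rule-generated artificial values, the following sketch replaces selected fields with generated ones. Everything in it is a hypothetical assumption made for the example: the field names, the rules `fake_name` and `fake_salary`, the name pools, and the salt. It is not the method developed in the sub-project.

```python
import hashlib
import random

# Hypothetical pools of artificial values, used only for this example.
FIRST_NAMES = ["Alex", "Kim", "Robin", "Sam"]
LAST_NAMES = ["Birch", "Maple", "Oak", "Pine"]


def _rng_for(salt, field, original):
    """Derive a repeatable random generator from the original value."""
    digest = hashlib.sha256(f"{salt}|{field}|{original}".encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def fake_name(rng):
    """Rule: an artificial name drawn from the fixed pools."""
    return rng.choice(FIRST_NAMES) + " " + rng.choice(LAST_NAMES)


def fake_salary(rng):
    """Rule: an artificial salary in a plausible range, in steps of 50."""
    return rng.randrange(1000, 5001, 50)


# Hypothetical rule set: which fields to disguise and how.
RULES = {"name": fake_name, "salary": fake_salary}


def disguise(record, rules=RULES, salt="test-env"):
    """Return a copy of record with rule-covered fields replaced."""
    masked = dict(record)
    for field, generate in rules.items():
        if field in masked:
            # The same original always maps to the same artificial value,
            # so references between tables stay consistent.
            masked[field] = generate(_rng_for(salt, field, masked[field]))
    return masked


if __name__ == "__main__":
    print(disguise({"id": 7, "name": "Real Person", "salary": 3210}))
```

Deriving the generator from the original value keeps the disguised database internally consistent across repeated runs, which is one way such a step could fit into an automated test-data pipeline. A keyed hash with a secret salt would be needed before relying on this for real data.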