Evaluating the Privacy Implications of Frequent Itemset Disclosure

Edoardo Serra, Jaideep Vaidya, Haritha Akella, Ashish Sharma

Research output: Chapter in Book/Report/Conference proceeding › Chapter


Abstract

Frequent itemset mining is a fundamental data analytics task. In many cases, due to privacy concerns, only the frequent itemsets are released instead of the underlying data. However, it is not clear how to evaluate the privacy implications of disclosing the frequent itemsets. To this end, we define the k-distant-IFM-solutions problem, which aims to find k transaction datasets, each consistent with the released frequent itemsets, whose pairwise distance is maximized. The degree of difference between the reconstructed datasets provides a way to evaluate the privacy risk. Since the problem is NP-hard, we propose a 2-approximate solution as well as faster heuristics, and evaluate them on real data.
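
The idea behind the abstract can be illustrated with a toy sketch: two different transaction datasets may yield exactly the same frequent itemsets, and the distance between such datasets indicates how much uncertainty an attacker faces when trying to reconstruct the original data. This is not the paper's algorithm. The brute-force miner, the names `frequent_itemsets` and `dataset_distance`, the sample data, and the distance measure (size of the multiset symmetric difference of transactions) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(dataset, min_support):
    """Brute-force miner: map each itemset whose support count is at least
    min_support to that count. Only suitable for tiny examples."""
    items = sorted(set().union(*dataset))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for transaction in dataset if itemset <= transaction)
            if support >= min_support:
                result[itemset] = support
    return result


def dataset_distance(d1, d2):
    """Hypothetical distance (an assumption, not taken from the paper):
    number of transactions in the multiset symmetric difference."""
    c1, c2 = Counter(d1), Counter(d2)
    return sum(((c1 - c2) + (c2 - c1)).values())


# Two made-up datasets over the items a, b, c, d.
D1 = [frozenset("abc"), frozenset("ab"), frozenset("d")]
D2 = [frozenset("ab"), frozenset("abd"), frozenset("c")]

# Both datasets release the same frequent itemsets at min_support = 2 ...
same_release = frequent_itemsets(D1, 2) == frequent_itemsets(D2, 2)
# ... yet they differ in four transactions under the assumed distance.
distance = dataset_distance(D1, D2)
```

In this toy case `same_release` is true and `distance` is 4. Under this assumed reading, a larger distance between datasets that share the same release means the release pins down the original data less precisely.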

Original language: American English
Title of host publication: IFIP International Conference on ICT Systems Security and Privacy Protection
State: Published - 1 Jan 2017

Keywords

  • column generation
  • inverse frequent itemset mining

EGS Disciplines

  • Computer Sciences
