Multi-sorted inverse frequent itemsets mining for generating realistic no-SQL datasets

Domenico Saccà, Edoardo Serra, Antonino Rullo

Research output: Contribution to journal › Conference article › peer-review

Abstract

The development of novel platforms and techniques for emerging “Big Data” applications requires real-life datasets for data-driven experiments. In most cases, however, such datasets are not accessible, for reasons such as confidentiality, privacy, or simply insufficient availability. One way to preserve the quality of experimental findings is to synthesize datasets that reflect the patterns of real ones. A promising approach is based on inverse mining techniques such as inverse frequent itemset mining (IFM), which generates a transactional dataset that satisfies given support constraints on a set of input itemsets, typically the frequent and infrequent ones. This paper describes an extension of IFM that considers more structured schemes for the datasets to be generated, as required in emerging big data applications, e.g., social network analytics.
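
To make the notion of "support constraints" concrete, the following is a minimal Python sketch of the check that an IFM output has to pass. It is not the authors' generation algorithm and does not cover the multi-sorted extension; it only verifies that a candidate transactional dataset respects lower and upper support bounds on given itemsets. The names `support`, `satisfies`, and the example bounds are hypothetical, chosen purely for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Transaction = FrozenSet[str]
Bounds = Tuple[int, int]  # assumed form of a constraint: (min support, max support)


def support(dataset: List[Transaction], itemset: FrozenSet[str]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for transaction in dataset if itemset <= transaction)


def satisfies(dataset: List[Transaction],
              constraints: Dict[FrozenSet[str], Bounds]) -> bool:
    """Return True if each itemset's support lies within its bounds."""
    return all(lo <= support(dataset, itemset) <= hi
               for itemset, (lo, hi) in constraints.items())


# Toy candidate dataset: four transactions over the items a, b, c.
dataset = [
    frozenset({"a", "b"}),
    frozenset({"a", "b", "c"}),
    frozenset({"a"}),
    frozenset({"c"}),
]

# Illustrative constraints: {a, b} should be frequent (support of 2 to 3),
# while {b, c} should stay infrequent (support of at most 1).
constraints = {
    frozenset({"a", "b"}): (2, 3),
    frozenset({"b", "c"}): (0, 1),
}

ok = satisfies(dataset, constraints)
```

An IFM procedure would search for a dataset for which such a check succeeds; here the dataset is simply given and tested.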

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 2994
State: Published - 2021
Event: 29th Italian Symposium on Advanced Database Systems, SEBD 2021 - Pizzo Calabro, Italy
Duration: 5 Sep 2021 to 9 Sep 2021

Keywords

  • IFM
  • Itemset mining
  • No-SQL
