Abstract
The Inverse Frequent itemset Mining (IFM) is the problem of computing a transaction database D satisfying specified support constraints on a given set S of itemsets, that are typically the frequent ones. Earlier studies focused on investigating computational and approximability properties of this problem, that is NP-hard. However, this classical formulation of IFM does not enforce any constraint on the other itemsets (i.e., the ones that are not listed in S); D may therefore happen to contain additional (and, perhaps, unsuspected or even undesired) frequent itemsets. A possibility for removing this anomaly is to introduce a more general formulation of IFM in which the supports of itemsets that do not belong to S are explicitly constrained by a given threshold in order not to eventually get unexpected frequent itemsets. This paper investigates this formulation, shows how it can be encoded as an integer linear program, and introduces a no-integer version of it solvable by a decomposition technique, that is a method designed to handle optimization problems with a huge number of variables by a using a limited memory space. As the decomposition technique requires at each step the solution of an auxiliary NP-hard integer linear program, a constructive heuristic for this auxiliary problem has also been defined, which enjoys very good scaling, thereby paving the way for its application over real-life scenarios.
Original language | English |
---|---|
Pages | 66-75 |
Number of pages | 10 |
State | Published - 2011 |
Event | 19th Italian Symposium on Advanced Database Systems, SEBD 2011 - Maratea, Italy Duration: 26 Jun 2011 → 29 Jun 2011 |
Conference
Conference | 19th Italian Symposium on Advanced Database Systems, SEBD 2011 |
---|---|
Country/Territory | Italy |
City | Maratea |
Period | 26/06/11 → 29/06/11 |