TY - GEN
T1 - An effective approach to inverse frequent set mining
AU - Guzzo, Antonella
AU - Saccà, Domenico
AU - Serra, Edoardo
PY - 2009
Y1 - 2009
N2 - The inverse frequent set mining problem is the problem of computing a database on which a given collection of itemsets must emerge to be frequent. Earlier studies focused on investigating computational and approximability properties of this problem. In this paper, we face it under the pragmatic perspective of defining heuristic solution approaches that are effective and scalable in real scenarios. In particular, a general formulation of the problem is considered where minimum and maximum support constraints can be defined on each itemset, and where no bound is given beforehand on the size of the resulting output database. Within this setting, an algorithm is proposed that always satisfies the maximum support constraints, but which treats minimum support constraints as soft ones that are enforced as long as possible. A thorough experimentation evidences that minimum support constraints are hardly violated in practice, and that such negligible degradation in accuracy (which is unavoidable due to the theoretical intractability of the problem) is compensated by very good scaling performances.
AB - The inverse frequent set mining problem is the problem of computing a database on which a given collection of itemsets must emerge to be frequent. Earlier studies focused on investigating computational and approximability properties of this problem. In this paper, we face it under the pragmatic perspective of defining heuristic solution approaches that are effective and scalable in real scenarios. In particular, a general formulation of the problem is considered where minimum and maximum support constraints can be defined on each itemset, and where no bound is given beforehand on the size of the resulting output database. Within this setting, an algorithm is proposed that always satisfies the maximum support constraints, but which treats minimum support constraints as soft ones that are enforced as long as possible. A thorough experimentation evidences that minimum support constraints are hardly violated in practice, and that such negligible degradation in accuracy (which is unavoidable due to the theoretical intractability of the problem) is compensated by very good scaling performances.
KW - Complexity
KW - Data reconstruction
KW - Inverse frequent mining
UR - http://www.scopus.com/inward/record.url?scp=77951154415&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2009.123
DO - 10.1109/ICDM.2009.123
M3 - Conference contribution
AN - SCOPUS:77951154415
SN - 9780769538952
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 806
EP - 811
BT - ICDM 2009 - The 9th IEEE International Conference on Data Mining
T2 - 9th IEEE International Conference on Data Mining, ICDM 2009
Y2 - 6 December 2009 through 9 December 2009
ER -