DISCOVERING CONFUSING FREQUENT ITEMSETS
Keywords: Correlation, Data mining, Frequent itemset, Taxonomy.
Abstract
Frequent itemset mining is one of the most important research areas in association rule mining. Mining frequent itemsets at different abstraction levels of the data can yield valuable knowledge. However, the mined set may include some Confusing Frequent Itemsets (CFIs): itemsets that represent knowledge contrasting with that of their low-level descendants. Experts need to distinguish CFIs from traditional frequent itemsets to make more accurate recommendations. In this paper, we present a definition of a CFI, an interestingness measure for CFIs, and a way to apply existing frequent itemset mining techniques to discover CFIs from data by exploiting a taxonomy.
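To make the idea of a high-level itemset contrasting with its low-level descendants concrete, here is a minimal sketch in Python. It is not the paper's method: the abstract does not state the CFI definition or its interestingness measure, so the sketch assumes the Kulczynski measure as a stand-in for correlation and flags a frequent, positively correlated high-level pair whenever some pair of its direct children is negatively correlated. The taxonomy, the transactions, the thresholds, and every function name are hypothetical.

```python
from itertools import product

# Hypothetical toy taxonomy (child -> parent); not taken from the paper.
TAXONOMY = {
    "cola": "drinks", "beer": "drinks",
    "chips": "snacks", "nuts": "snacks",
}

# Hypothetical toy transactions over leaf-level items.
TRANSACTIONS = [
    {"cola", "chips"}, {"cola", "chips"}, {"cola", "chips"},
    {"beer", "nuts"}, {"beer", "nuts"}, {"beer", "nuts"},
    {"cola"}, {"nuts"},
]


def ancestors(item):
    """Return every ancestor of an item by walking up the taxonomy."""
    found = []
    while item in TAXONOMY:
        item = TAXONOMY[item]
        found.append(item)
    return found


def extend(transaction):
    """Add the ancestors of each item so high-level itemsets can be counted."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item))
    return extended


def support(itemset, extended_transactions):
    """Count the extended transactions that contain the whole itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in extended_transactions)


def kulczynski(a, b, extended_transactions):
    """Kulczynski measure of a pair (assumed here; the paper's measure may differ)."""
    sup_a = support({a}, extended_transactions)
    sup_b = support({b}, extended_transactions)
    if sup_a == 0 or sup_b == 0:
        return 0.0
    sup_ab = support({a, b}, extended_transactions)
    return 0.5 * (sup_ab / sup_a + sup_ab / sup_b)


def children(node):
    """Direct children of a taxonomy node, in sorted order."""
    return sorted(c for c, p in TAXONOMY.items() if p == node)


def contrasting_descendants(a, b, extended_transactions, min_sup, high, low):
    """List child-level pairs that contradict a frequent, correlated pair (a, b)."""
    if support({a, b}, extended_transactions) < min_sup:
        return []
    if kulczynski(a, b, extended_transactions) < high:
        return []
    return [
        (x, y)
        for x, y in product(children(a), children(b))
        if kulczynski(x, y, extended_transactions) <= low
    ]


EXTENDED = [extend(t) for t in TRANSACTIONS]
flagged = contrasting_descendants(
    "drinks", "snacks", EXTENDED, min_sup=3, high=0.8, low=0.2
)
print(flagged)
```

On this toy data the pair (drinks, snacks) is frequent and strongly correlated, yet (beer, chips) and (cola, nuts) never occur together, which is the kind of contrast the abstract describes. Extending each transaction with its ancestors is a common way to reuse an ordinary frequent itemset miner on generalized items; a real pipeline would replace the brute-force support counting with such a miner.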
Copyright & License
Copyright (c) 2018 Huỳnh Thành Lộc
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.