The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
Frequent Itemsets Mining
Posted by: Pablo
Date: April 06, 2014 11:45AM

Hello everyone, I'm trying to solve the following problem: given a 0/1 matrix, find every submatrix of 1s having at least k rows.

According to this paper
http://jmlr.org/papers/volume9/sun08a/sun08a.pdf
it appears that my problem has already been studied as the Frequent Itemsets Mining (FIM) problem.
In this form the FIM problem can be stated as follows: given X and k >= 1, find every submatrix of 1s in X having at least k rows, and report the associated set of columns. If the threshold k is allowed to vary, then FIM algorithms essentially seek to find every maximal submatrix of 1s in the data matrix X.

Is there any algorithm that solves this problem?
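
To make the correspondence concrete, here is a minimal Python sketch of the problem as stated above. It is not taken from the paper or from SPMF. It treats each row of the 0/1 matrix as a transaction and each column as an item, then searches column sets level by level, keeping those whose all-1 rows number at least k. The function name `frequent_column_sets` and the toy matrix `X` are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def frequent_column_sets(matrix, k):
    """Map each column set whose all-1 submatrix has >= k rows
    to the set of row indices that support it."""
    n_cols = len(matrix[0]) if matrix else 0

    # Level 1: rows containing a 1 in each single column.
    current = {}
    for c in range(n_cols):
        rows = frozenset(r for r, row in enumerate(matrix) if row[c] == 1)
        if len(rows) >= k:
            current[frozenset([c])] = rows

    result = dict(current)
    while current:
        next_level = {}
        # Join two frequent sets of size n that differ in one column.
        for (a, rows_a), (b, rows_b) in combinations(current.items(), 2):
            union = a | b
            if len(union) != len(a) + 1 or union in next_level:
                continue
            # Rows supporting the union are the rows supporting both parts.
            rows = rows_a & rows_b
            if len(rows) >= k:
                next_level[union] = rows
        result.update(next_level)
        current = next_level
    return result


# Hypothetical toy matrix: 4 rows (transactions), 3 columns (items).
X = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
    [1, 1, 0],
]

for cols, rows in frequent_column_sets(X, 2).items():
    print(sorted(cols), "rows:", sorted(rows))
```

With k = 2, columns {0, 1} are kept because rows 0, 1 and 3 contain 1s in both, while columns {0, 2} are dropped because only row 1 does. This exhaustive search is only practical for small matrices; dedicated FIM algorithms such as Apriori, Eclat or FP-Growth target the same output with far better pruning and data structures.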

Re: Frequent Itemsets Mining
Posted by: khairy
Date: April 15, 2014 12:39AM

Hi Sir

It seems that on all datasets the H-Mine algorithm records the worst execution time compared to FP-Growth, Apriori and Eclat. Is this true?
Thanks

Re: Frequent Itemsets Mining
Date: April 15, 2014 04:00AM

Yes... But it is most likely because the implementation of HMine in SPMF is not good. I should re-implement HMine.

Re: Frequent Itemsets Mining
Posted by: khairy
Date: May 16, 2014 10:01AM

Hi Sir,


When I try to run FP-Growth with a minimum support of 0.9 on the Pumsb dataset, I receive this message:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

As shown, I give a high support value but the algorithm crashes.
How can I run the algorithm on this dataset to get the frequent itemsets?

Re: Frequent Itemsets Mining
Date: May 16, 2014 02:22PM

I think that you should download the latest version of SPMF. There is an optimization of FPGrowth that was added a few weeks ago.

On my computer with the Pumsb dataset and minsup = 0.9:

============= FP-GROWTH - STATS =============
Transactions count from database : 49046
Max memory usage: 8.296775817871094 mb
Frequent itemsets count : 2607
Total time ~ 6084 ms
===================================================
4404 #SUP: 45106
4404 4428 #SUP: 44325
4404 4428 4438 #SUP: 44229
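
As a sanity check on these numbers, the relative threshold can be converted to an absolute transaction count. The sketch below assumes the threshold is rounded up to the next whole transaction; the thread does not state how SPMF rounds, so treat that as an assumption. The helper name `absolute_minsup` is hypothetical.

```python
import math


def absolute_minsup(relative_minsup, n_transactions):
    """Smallest number of transactions an itemset must appear in.
    Assumes rounding up; the actual rounding rule may differ."""
    return math.ceil(relative_minsup * n_transactions)


# Values taken from the stats printed above.
threshold = absolute_minsup(0.9, 49046)
print(threshold)

# Supports of the three itemsets listed above.
for support in (45106, 44325, 44229):
    print(support, support >= threshold)
```

Since 0.9 × 49046 = 44141.4, an itemset needs at least 44142 supporting transactions under this assumption, and the three listed supports (45106, 44325, 44229) all clear that bar.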



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).