The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
Endless loop in FPGrowth using a file with 1 transaction
Posted by: Marc
Date: October 28, 2013 02:54AM

Hello

I am using the FPGrowth algorithm on a set of 500 files. All files but 2 seem to work perfectly, but the 2 files that do not work correctly eventually result in an out-of-memory exception.

I started investigating the issue by inspecting the GC and a heap dump I had taken, and found that 16 million Itemset instances and 16 million int arrays had been created.

I built a source jar from your provided sources so I could debug more easily, and I found that the problem occurs because those 2 files contain only 1 transaction each. I was able to trace the error down to the 'addAllCombinationsForPathAndPrefix' method in the AlgoFPGrowth class. This method seems to call itself recursively without end until the JVM runs out of memory.

I tried to look at it in more detail but I could not exactly figure out what was going wrong. I was hoping that you could help me identify the root cause.

Thanks in advance

Marc

Re: Endless loop in FPGrowth using a file with 1 transaction
Posted by: Philippe
Date: October 28, 2013 04:30AM

Hi Marc,

Could you send me the input file (philippe.fv AT gmail.com)?

The FPGrowth algorithm is recursive. The problem may just be that your transaction is too large and that the algorithm is taking a lot of time.

For example, suppose you have a single transaction of 25 items. If you want to discover all frequent itemsets with minsup = 1, then every non-empty subset of that transaction is frequent, so there will be 2^25 - 1 = 33,554,431 itemsets. This number doubles with each additional item in the transaction, and the more itemsets are found, the more execution time and memory the algorithm will require.
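To make this arithmetic concrete, here is a minimal Java sketch. It is not the SPMF AlgoFPGrowth code: the class and method names (SingleTransactionItemsets, countNonEmptyItemsets, allItemsets) are hypothetical and exist only for illustration. It assumes a single transaction of distinct items and minsup = 1, so every non-empty combination of items counts as a frequent itemset. The recursive enumeration shows why both the number of calls and the number of stored itemsets grow as 2^n.

```java
import java.util.ArrayList;
import java.util.List;

public class SingleTransactionItemsets {

    // Number of non-empty itemsets that can be formed from n distinct items: 2^n - 1.
    public static long countNonEmptyItemsets(int n) {
        if (n < 0 || n > 62) {
            throw new IllegalArgumentException("n must be between 0 and 62");
        }
        return (1L << n) - 1;
    }

    // Recursively extends the current prefix with each remaining item,
    // recording one new itemset per loop iteration.
    private static void enumerate(int[] items, int start,
                                  List<Integer> prefix, List<List<Integer>> out) {
        for (int i = start; i < items.length; i++) {
            prefix.add(items[i]);
            out.add(new ArrayList<>(prefix));   // store a copy of the current itemset
            enumerate(items, i + 1, prefix, out);
            prefix.remove(prefix.size() - 1);   // backtrack before trying the next item
        }
    }

    // Returns every non-empty combination of the given items.
    public static List<List<Integer>> allItemsets(int[] items) {
        List<List<Integer>> out = new ArrayList<>();
        enumerate(items, 0, new ArrayList<>(), out);
        return out;
    }

    public static void main(String[] args) {
        int[] transaction = {1, 2, 3, 4};
        // 4 items give 2^4 - 1 = 15 itemsets.
        System.out.println(allItemsets(transaction).size());
        // 25 items give 2^25 - 1 = 33554431 itemsets.
        System.out.println(countNonEmptyItemsets(25));
    }
}
```

With 4 items the enumeration is instant, but every extra item doubles the work, so keeping all itemsets of a long single transaction in memory can exhaust the heap even though the recursion does terminate.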

Best,
Philippe

Re: Endless loop in FPGrowth using a file with 1 transaction
Posted by: Marc
Date: October 28, 2013 04:47AM

Philippe

Thanks for the fast reply. I have sent the file with additional information to the gmail address.

Kind regards

Marc



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).