The Data Mining Forum                             open-source data mining software data mining conferences Data Science for Social and Behavioral Analytics DSSBA 2022 data science journal
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
EUCS structure in FHM Algorithm
Posted by: sharare1370
Date: July 26, 2020 06:48AM

Hi
i have a question about EUCS structure in FHM Algorithm.we can prune the itemset that contains two item like ab in this structure in FHM algorithm or it can prune the itemset that contains 3 items like abc by using EUCS structure?i read FHM paper the EUCS matrix in the page 5 is contains only the itemset with 2 items.
Thank you for he

Options: ReplyQuote
Re: EUCS structure in FHM Algorithm
Date: July 27, 2020 12:04AM

Hi,

Yes, that EUCP structure is a matrix that stores the TWU of all pairs of items (two items). It would be possible to build a structure to precalculate the TWU of itemsets containing more than 2 items, but perhaps that it would require too much memory so that is why we did not try it.

But although, the EUCP structure only stores the TWU of two items, it can be used to reduce the search space for itemsets that have two or more items in this way:

- When FHM combines the itemset "a" and "b" to create the itemset "a,b", FHM will check that the TWU of "a,b" is no less than min_util in the EUCP. If lower, then "a,b" can be eliminated"

- When FHM combines the itemset "a" and "c" to create the itemset "a,c", FHM will check that the TWU of "a,c" is no less than min_util in the EUCP. If lower, then "a,c" can be eliminated"

- When FHM combines the itemset "a,b" and "a,c" to create the itemset "a,b,c", FHM will check that the TWU of "b,c" is no less than min_util in the EUCP. If lower, then "a,b,c" can be eliminated"

By this example, what I want to highlight is that to get to the point of generating "a,b,c", FHM will actually have to check three times the EUCP:
- for "a,b"
- for "a,c"
- for "b,c"

So hope that my explanation makes this more clear. As I said above, we dont use more then 2 items in the EUCP. If we would use more, maybe it would be faster, but may also it will consume much more memory... Maybe for small datasets that don't have many different items it could make sense to use more than 2 items in the EUCP.

Best regards,

Philippe

Options: ReplyQuote
Re: EUCS structure in FHM Algorithm
Posted by: sharare1370
Date: July 27, 2020 09:24AM

thank you for your response my problem is solved.
Regards

Options: ReplyQuote
Re: EUCS structure in FHM Algorithm
Date: July 27, 2020 06:13PM

You are welcome!

Options: ReplyQuote


This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).
Terms of use.