Re: EUCS structure in FHM Algorithm
Date: July 27, 2020 12:04AM
Hi,
Yes, that EUCP structure is a matrix that stores the TWU of all pairs of items (two items). It would be possible to build a structure to precalculate the TWU of itemsets containing more than 2 items, but perhaps that it would require too much memory so that is why we did not try it.
But although, the EUCP structure only stores the TWU of two items, it can be used to reduce the search space for itemsets that have two or more items in this way:
- When FHM combines the itemset "a" and "b" to create the itemset "a,b", FHM will check that the TWU of "a,b" is no less than min_util in the EUCP. If lower, then "a,b" can be eliminated"
- When FHM combines the itemset "a" and "c" to create the itemset "a,c", FHM will check that the TWU of "a,c" is no less than min_util in the EUCP. If lower, then "a,c" can be eliminated"
- When FHM combines the itemset "a,b" and "a,c" to create the itemset "a,b,c", FHM will check that the TWU of "b,c" is no less than min_util in the EUCP. If lower, then "a,b,c" can be eliminated"
By this example, what I want to highlight is that to get to the point of generating "a,b,c", FHM will actually have to check three times the EUCP:
- for "a,b"
- for "a,c"
- for "b,c"
So hope that my explanation makes this more clear. As I said above, we dont use more then 2 items in the EUCP. If we would use more, maybe it would be faster, but may also it will consume much more memory... Maybe for small datasets that don't have many different items it could make sense to use more than 2 items in the EUCP.
Best regards,
Philippe