Re: closure Jumping in EFIM Close Algorithm
Date: July 27, 2020 12:28AM
Hi,
I will try to explain this optimization with a simple example.
Let's say that we have a database containing only three transactions:
Transaction 1: A B C D E F
Transaction 2: A B C D E F
Transaction 3: A B C
Then, the algorithm will star to search for the closed itemsets. At some point the algorithm will find {A,B,C} and will then want to extend that itemset further to find larger itemsets starting with {A,B,C}.
The algorithm will count what items could extend {A,B,C}. The answer is:
"D" and it appears two times after {A,B,C}
"E" and it appears two times after {A,B,C}
"F" and it appears two times after {A,B,C}
Because all the items D, E, and F that could extend {A,B,C} have the same support (two times), then we can directly jump to the itemset {A,B,C,D,E,F} and we know that this itemset will be closed.
The other itemsets starting with ABC like ABCD, ABCE, ABCF, ABCDE, ABCDF etc cannot be closed because they will all have a support of 2, and {ABCDEF} has a support of 2.
This is the reason why we can jump directly to ABCDEF... This will save time because we dont need to explicitly check ABCD, ABCE etc...
So this is an optimization for closed itemset mining.
Hope that this is more clear.
Best regards,
Philippe