>>Dear Sir
>>i think local utility and remaining utility are the same in mathematically why we introduce local utility ? and when WE use local utility pruning if utility a<minutil we prune only the child of itemset a or we prune itemset a with the child of item set a .
Thank you for your help
Good evening,
Just to make sure, I think that your question is about the EFIM algorithm. In EFIM, there are two upper-bounds : the
local utility and
sub-tree utility.
I will try to answer your question using the definition from the journal paper about EFIM:
http://philippe-fournier-viger.com/EFIM_JOURNAL_VERSION%20KAIS%202016.pdf
and i will try to give the intuition about these upper-bounds.
The
local utility is defined in Def. 4.10.
Now if we want to calculate the local utility of an itemset, you can consider that there are two cases.
- Let's first consider the case where the itemset has a single item.
For example, we want to calculate the utility of a single item like "a".
In that case, you can consider ALPHA = {} and z = "a" in that definition.
Then, the local utility will be the sum of the utility of all items in the transactions containing "a". This will be
equal to the TWU.- Now let's consider the second case where the local utility is calculated for an itemset containing more than 1 item. For example, lets look at the calculation of the local utility of {a,b}. If we use the definition of local utility then:
ALPHA = {a} and z = {b}.
You can do the calculation.
But intuitively, it will be like if you do the projection of the database by ALPHA, and then you calculate the TWU of {b} in that database.
So what I want to say is that the idea of the local utility is actually a generalization of the TWU. It is not like the remaining utility but instead like the TWU. The difference is that the TWU is always calculated using the whole database. But in EFIM, when an itemset is bigger, we calculate the TWU only using the projected database rather than the whole database. This is the local utility.
The
sub-tree utility is presented in def. 4.11. This one is mathematically equivalent to the remaining utility. However, as I explain in this paragraph in the paper, they are not calculated at the same moment as other algorithms like HUI-Miner, FHM that use the remaining utility:
About the su upper-bound, one can ask what is the difference between this
upper-bound and the reu upper-bound of HUI-Miner and FHM since they are
mathematically equivalent. The major difference between the remaining-utility
upper bound and the proposed su upper-bound is that the su upper-bound is
calculated when the depth-first search is at itemset α in the search tree rather than at the child itemset Y . Thus, if su(α, z) < minutil, EFIM prunes the whole
sub-tree of z including node Y rather than only pruning the descendant nodes
of Y . This is illustrated in Fig. 2, which compares the nodes pruned in the subtree of Y using the su and reu upper-bounds. Thus, as explained here, using su
instead of reu upper-bound, is more effective for pruning the search space.
And then if you look at Definition 4.13, we further changed the definition of sub-tree utility to make it better than the remaining utility. We call it the "
redefined subtree utility". The difference between remaining utilty and redefined subtree utility is that we remove some items in the calculation (the items not in Secondary(ALPHA). Thus, this can be smaller than the remaining utility.
Hope that this helps a little bit to understand the paper. I know the paper is maybe not so easy to understand especially if you are new in this topic :-)
Best regards,
Edited 1 time(s). Last edit at 08/18/2020 05:09AM by webmasterphilfv.