Re: Pruning strategy

The Data Mining Forum

IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php

Goto Topic: Previous•Next

Goto: Forum List•Message List•New Topic•Search•Log In•Print View

EFIM Pruning strategy - local utility vs remaining utility

Posted by: maria567

Date: August 18, 2020 01:43AM

Dear Sir
i think local utility and remaining utility are the same in mathematically why we introduce local utility ? and when WE use local utility pruning if utility a<minutil we prune only the child of itemset a or we prune itemset a with the child of item set a .
Thank you for your help

Edited 2 time(s). Last edit at 08/18/2020 09:21AM by webmasterphilfv.

Options: Reply•Quote

Re: Pruning strategy

Posted by: webmasterphilfv

Date: August 18, 2020 05:07AM

>>Dear Sir
>>i think local utility and remaining utility are the same in mathematically why we introduce local utility ? and when WE use local utility pruning if utility a<minutil we prune only the child of itemset a or we prune itemset a with the child of item set a .
Thank you for your help

Good evening,

Just to make sure, I think that your question is about the EFIM algorithm. In EFIM, there are two upper-bounds : the local utility and sub-tree utility.

I will try to answer your question using the definition from the journal paper about EFIM:

http://philippe-fournier-viger.com/EFIM_JOURNAL_VERSION%20KAIS%202016.pdf

and i will try to give the intuition about these upper-bounds.

The local utility is defined in Def. 4.10.

Now if we want to calculate the local utility of an itemset, you can consider that there are two cases.
- Let's first consider the case where the itemset has a single item.
For example, we want to calculate the utility of a single item like "a".
In that case, you can consider ALPHA = {} and z = "a" in that definition.
Then, the local utility will be the sum of the utility of all items in the transactions containing "a". This will be equal to the TWU.

- Now let's consider the second case where the local utility is calculated for an itemset containing more than 1 item. For example, lets look at the calculation of the local utility of {a,b}. If we use the definition of local utility then:
ALPHA = {a} and z = {b}.
You can do the calculation.
But intuitively, it will be like if you do the projection of the database by ALPHA, and then you calculate the TWU of {b} in that database.

So what I want to say is that the idea of the local utility is actually a generalization of the TWU. It is not like the remaining utility but instead like the TWU. The difference is that the TWU is always calculated using the whole database. But in EFIM, when an itemset is bigger, we calculate the TWU only using the projected database rather than the whole database. This is the local utility.

The sub-tree utility is presented in def. 4.11. This one is mathematically equivalent to the remaining utility. However, as I explain in this paragraph in the paper, they are not calculated at the same moment as other algorithms like HUI-Miner, FHM that use the remaining utility:

About the su upper-bound, one can ask what is the difference between this
upper-bound and the reu upper-bound of HUI-Miner and FHM since they are
mathematically equivalent. The major difference between the remaining-utility
upper bound and the proposed su upper-bound is that the su upper-bound is
calculated when the depth-first search is at itemset α in the search tree rather than at the child itemset Y . Thus, if su(α, z) < minutil, EFIM prunes the whole
sub-tree of z including node Y rather than only pruning the descendant nodes
of Y . This is illustrated in Fig. 2, which compares the nodes pruned in the subtree of Y using the su and reu upper-bounds. Thus, as explained here, using su
instead of reu upper-bound, is more effective for pruning the search space.

And then if you look at Definition 4.13, we further changed the definition of sub-tree utility to make it better than the remaining utility. We call it the "redefined subtree utility". The difference between remaining utilty and redefined subtree utility is that we remove some items in the calculation (the items not in Secondary(ALPHA). Thus, this can be smaller than the remaining utility.

Hope that this helps a little bit to understand the paper. I know the paper is maybe not so easy to understand especially if you are new in this topic :-)

Best regards,

Edited 1 time(s). Last edit at 08/18/2020 05:09AM by webmasterphilfv.

Options: Reply•Quote

Re: Pruning strategy

Posted by: maria567

Date: August 18, 2020 08:27AM

i specially thank you for spending a lot of time to answer my question .in page 22 of the power point of EFIM and EFIM close
EFIM &EFIM close
i want to know pruning with local utility is like the right picture or the left one? and why pruning with SU is stronger than pruning with LU ?
Thank you

Edited 1 time(s). Last edit at 08/18/2020 08:47AM by maria567.

Options: Reply•Quote

Re: Pruning strategy

Posted by: webmasterphilfv

Date: August 18, 2020 08:59AM

Hi again,

The local utility pruning would be different from both pictures on Page 22.

Let me explain with a simple example.

Let say that the order between items is a < b < c < d < e ...

We will look at the picture of the tree on page 22. Let's say that we want to apply the local utility pruning for the itemset {b} containing a single item, then
ALPHA = {} empty set
z = b

Then we calculate the local utility lu(ALPHA,z) = lu({},b). If lu({},b) < minutil, it means that we cannot make any high utility itemsets with item "b".

So what can we remove from the tree that you see on picture 22 ?

The answer is as follows.

First, we can remove the whole subtree that has ALPHA as root. This is like what you see on the left picture. That means all the itemsets starting with "b" like "bc" "bd" "bcd" etc.

But that is not everything! You could also eliminated any node from the whole tree that contain "b". For example, the itemset "ab" which is not in the tree of ALPHA could also be eliminated.

So the local utility is quite powerful. It can remove an item from many subtrees in the picture of Figure 22 (left). Not just the one that is in gray.

Hope this helps.

Options: Reply•Quote