RuleGrowth/ERMiner

The Data Mining Forum

RuleGrowth/ERMiner

Posted by: Hao Lei

Date: March 13, 2017 08:00AM

Hello, Prof. Philippe Fournier-Viger. I have a question about the ERMiner and RuleGrowth algorithms. Which sequential pattern mining algorithm do you use to generate the frequent sequential patterns? I assume you have to generate the sequential patterns first, before generating the sequential rules. I would greatly appreciate your help. Thanks.


Re: RuleGrowth/ERMiner

Posted by: webmasterphilfv

Date: March 13, 2017 04:16PM

Hello,

For these algorithms, we find the rules directly, without mining sequential patterns first.

The reason is that the rules found by ERMiner and RuleGrowth are of the form X --> Y, where the items within X and within Y are unordered, but X must appear before Y. For example, the rule {a,b,c} --> {e} means that a, b, and c appeared in any order and were then followed by e.
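To make the rule form concrete, here is a minimal Python sketch (not the SPMF implementation) that checks whether a rule X --> Y occurs in a sequence and computes its support and confidence. The function names and the toy database are assumptions made for illustration; the confidence is taken as the number of sequences where the rule occurs divided by the number of sequences that contain every item of X.

```python
def rule_occurs(sequence, antecedent, consequent):
    """Return True if all items of the antecedent appear (in any order)
    and all items of the consequent appear strictly afterwards."""
    remaining = set(antecedent)
    for k, itemset in enumerate(sequence):
        remaining -= set(itemset)
        if not remaining:
            # Earliest point where the antecedent is complete: this leaves
            # the longest possible suffix for the consequent.
            after = set().union(*sequence[k + 1:])
            return set(consequent) <= after
    return False


def support_and_confidence(database, antecedent, consequent):
    """Relative support and confidence of the rule over a list of sequences."""
    rule_count = sum(rule_occurs(s, antecedent, consequent) for s in database)
    antecedent_count = sum(set(antecedent) <= set().union(*s) for s in database)
    support = rule_count / len(database)
    confidence = rule_count / antecedent_count if antecedent_count else 0.0
    return support, confidence


# Hypothetical toy database: each sequence is a list of itemsets.
database = [
    [{"a"}, {"b"}, {"c"}, {"e"}],  # a, b, c then e: the rule occurs
    [{"c"}, {"a"}, {"b"}, {"e"}],  # different order of a, b, c: still occurs
    [{"b"}, {"e"}, {"a"}, {"c"}],  # e comes before a and c: does not occur
    [{"a"}, {"d"}],                # antecedent is incomplete
]

print(support_and_confidence(database, {"a", "b", "c"}, {"e"}))
```

The first two sequences count as the same rule even though a, b, and c arrive in a different order, which is the point of leaving the antecedent unordered.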

Some other algorithms, such as RuleGen, find the sequential patterns first and then produce rules X --> Y such that X and Y are sequential patterns. For example, they could find a rule such as a,b,c --> d,e,f, which means that if a, b, c appear one after the other, they will be followed by d, e, f in exactly that order. However, the problem with these rules is that they are too specific and not noise tolerant. For example, the rule a,b,c --> d,e,f would be seen as different from b,a,c --> d,e,f, a,c,b --> d,e,f, or a,b,c --> d,f,e, but they probably describe the same relationship, because real-life data is noisy. In my experiments on web log prediction, I found that these rules are too specific and do not work well for prediction (see my TKDE paper about RuleGrowth or my ADMA 2012 paper about sequence prediction for details about these experiments). That is why, in my work such as ERMiner and RuleGrowth, I use rules X --> Y where the antecedent and consequent are unordered.
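As a small illustration of the noise-tolerance argument, the four ordered rules mentioned above collapse into a single rule once the antecedent and consequent are treated as unordered sets. This is only a sketch of the idea in Python; the variable names are illustrative.

```python
# The four ordered rules from the example, as (antecedent, consequent) tuples.
ordered_rules = [
    (("a", "b", "c"), ("d", "e", "f")),
    (("b", "a", "c"), ("d", "e", "f")),
    (("a", "c", "b"), ("d", "e", "f")),
    (("a", "b", "c"), ("d", "f", "e")),
]

# Dropping the internal order turns each side into a set, so variants that
# differ only by a permutation become the same rule.
unordered_rules = {(frozenset(x), frozenset(y)) for x, y in ordered_rules}

print(len(set(ordered_rules)))  # distinct rules when order matters
print(len(unordered_rules))     # distinct rules when order is ignored
```

With ordered rules, the evidence for one relationship is split across four rules, each with lower support; with unordered sides, all of it counts toward one rule.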

So, because the rules found by RuleGrowth and ERMiner have an unordered antecedent and consequent, these algorithms do not find sequential patterns first and instead find the rules directly. The ERMiner algorithm is inspired by the Eclat algorithm for itemset mining; my goal was to apply a similar idea to rules. The RuleGrowth algorithm, on the other hand, is a pattern-growth algorithm inspired by PrefixSpan for sequential pattern mining.

Best regards,



Re: RuleGrowth/ERMiner

Posted by: Hao Lei

Date: March 29, 2017 12:52PM

Thanks for getting back to me. I ran RuleGrowth or ERMiner on the following data, and it returned more than one million rules. Could you run it on your system to see whether you get the same result?

0 -1 100 -1 200 -1 300 -1 400 -1 500 -1 600 -1 700 -1 800 -1 900 -1 1000 -1 1100 -1 1200 -1 1300 -1 1400 -1 1500 -1 1600 -1 1700 -1 1800 -1 1900 -1 2000 -1 2100 -1 2200 -1 -2
0 -1 100 -1 201 -1 300 -1 401 -1 500 -1 600 -1 701 -1 800 -1 900 -1 1001 -1 1100 -1 1200 -1 1300 -1 1400 -1 1500 -1 1600 -1 1700 -1 1800 -1 1901 -1 2001 -1 2101 -1 2201 -1 -2
1 -1 100 -1 202 -1 300 -1 402 -1 500 -1 600 -1 701 -1 801 -1 900 -1 1001 -1 1100 -1 1200 -1 1300 -1 1400 -1 1500 -1 1600 -1 1700 -1 1800 -1 1901 -1 2001 -1 2102 -1 2201 -1 -2
0 -1 101 -1 202 -1 300 -1 400 -1 500 -1 600 -1 700 -1 801 -1 900 -1 1000 -1 1100 -1 1200 -1 1300 -1 1400 -1 1500 -1 1600 -1 1700 -1 1800 -1 1900 -1 2000 -1 2100 -1 2200 -1 -2
0 -1 100 -1 203 -1 301 -1 403 -1 500 -1 601 -1 701 -1 800 -1 901 -1 1000 -1 1100 -1 1200 -1 1300 -1 1400 -1 1500 -1 1600 -1 1700 -1 1801 -1 1901 -1 2002 -1 2101 -1 2201 -1 -2
0 -1 101 -1 201 -1 300 -1 401 -1 500 -1 600 -1 701 -1 801 -1 900 -1 1001 -1 1100 -1 1200 -1 1300 -1 1400 -1 1500 -1 1600 -1 1700 -1 1800 -1 1900 -1 2001 -1 2101 -1 2201 -1 -2
1 -1 100 -1 202 -1 300 -1 401 -1 500 -1 600 -1 701 -1 802 -1 900 -1 1001 -1 1100 -1 1200 -1 1300 -1 1400 -1 1500 -1 1600 -1 1700 -1 1800 -1 1900 -1 2001 -1 2102 -1 2201 -1 -2
1 -1 101 -1 202 -1 300 -1 402 -1 500 -1 600 -1 701 -1 801 -1 900 -1 1001 -1 1100 -1 1200 -1 1300 -1 1400 -1 1500 -1 1600 -1 1700 -1 1800 -1 1901 -1 2000 -1 2102 -1 2201 -1 -2
0 -1 101 -1 202 -1 300 -1 400 -1 500 -1 600 -1 700 -1 803 -1 900 -1 1000 -1 1100 -1 1200 -1 1300 -1 1400 -1 1500 -1 1600 -1 1700 -1 1800 -1 1900 -1 2003 -1 2101 -1 2200 -1 -2
1 -1 100 -1 201 -1 300 -1 401 -1 500 -1 600 -1 701 -1 802 -1 900 -1 1001 -1 1100 -1 1200 -1 1300 -1 1400 -1 1500 -1 1600 -1 1700 -1 1800 -1 1900 -1 2000 -1 2102 -1 2201 -1 -2
0 -1 101 -1 201 -1 300 -1 402 -1 500 -1 600 -1 701 -1 802 -1 900 -1 1001 -1 1100 -1 1200 -1 1300 -1 1400 -1 1500 -1 1600 -1 1700 -1 1800 -1 1901 -1 2001 -1 2101 -1 2201 -1 -2
0 -1 101 -1 201 -1 300 -1 401 -1 500 -1 600 -1 701 -1 801 -1 900 -1 1001 -1 1100 -1 1200 -1 1300 -1 1400 -1 1500 -1 1600 -1 1700 -1 1800 -1 1900 -1 2000 -1 2102 -1 2201 -1 -2
1 -1 100 -1 201 -1 300 -1 401 -1 500 -1 600 -1 701 -1 804 -1 900 -1 1001 -1 1100 -1 1200 -1 1300 -1 1400 -1 1500 -1 1600 -1 1700 -1 1800 -1 1901 -1 2000 -1 2101 -1 2201 -1 -2

Thanks.
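For readers who want to load data like the above outside SPMF, here is a minimal Python sketch of a parser. It assumes the SPMF sequence format, in which -1 closes an itemset and -2 closes a sequence; the function name is hypothetical, and the sample uses shortened versions of two of the lines above rather than the full data.

```python
def parse_spmf_sequences(text):
    """Parse SPMF-style sequence lines into a list of sequences,
    where each sequence is a list of itemsets (lists of ints)."""
    database = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and metadata/comment lines.
        if not line or line[0] in "#%@":
            continue
        sequence, itemset = [], []
        for token in line.split():
            value = int(token)
            if value == -1:      # end of the current itemset
                sequence.append(itemset)
                itemset = []
            elif value == -2:    # end of the sequence
                break
            else:
                itemset.append(value)
        database.append(sequence)
    return database


# Shortened sample (first three events of two sequences from the post).
sample = "0 -1 100 -1 200 -1 -2\n1 -1 100 -1 202 -1 -2"
print(parse_spmf_sequences(sample))
```

In this dataset every itemset holds a single item, so each sequence is simply an ordered list of 23 events.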


Re: RuleGrowth/ERMiner

Posted by: Hao Lei

Date: March 29, 2017 01:06PM

I use min. support = 0.3 and min. confidence = 0.3.


Re: RuleGrowth/ERMiner

Posted by: webmasterphilfv

Date: March 30, 2017 07:59AM

Yes, I tried it, and it runs very slowly because of the size of the search space.

If you use TRuleGrowth, you can set several constraints to reduce the search space.

For example, I ran TRuleGrowth on your dataset with minsup = 0.3, minconf = 0.8, window size = 3, max antecedent size = 4, and max consequent size = 1 (to predict a single event).

The result is 144 rules, and it takes just 16 ms to get them.

So I would recommend setting some constraints to reduce the number of rules.
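To give a rough sense of why size constraints shrink the search space, the Python sketch below counts how many rules X --> Y with disjoint, non-empty, unordered sides can be formed from n distinct items, with and without limits on the antecedent and consequent sizes. This is only a loose combinatorial upper bound: it ignores the support, confidence, and window constraints, so it says nothing about how many rules are actually valid. The function name and n = 23 are assumptions for illustration, not a count taken from the dataset.

```python
from math import comb


def count_candidate_rules(n_items, max_antecedent=None, max_consequent=None):
    """Count pairs (X, Y) of disjoint non-empty item sets drawn from
    n_items items, optionally bounding the size of X and of Y."""
    max_a = n_items if max_antecedent is None else min(max_antecedent, n_items)
    total = 0
    for a in range(1, max_a + 1):
        rest = n_items - a  # items still available for the consequent
        max_c = rest if max_consequent is None else min(max_consequent, rest)
        for c in range(1, max_c + 1):
            total += comb(n_items, a) * comb(rest, c)
    return total


n = 23  # hypothetical number of distinct items
print(count_candidate_rules(n))                                      # no size limits
print(count_candidate_rules(n, max_antecedent=4, max_consequent=1))  # limits as in the reply
```

Without limits the count grows roughly like 3^n, while bounding the antecedent at 4 items and the consequent at 1 item keeps it polynomial in n.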

Here are a few of the rules found:

0 ==> 100 #SUP: 1 #CONF: 1.0
0 ==> 200 #SUP: 1 #CONF: 1.0
0,100 ==> 200 #SUP: 1 #CONF: 1.0
0 ==> 300 #SUP: 1 #CONF: 1.0
0,100 ==> 300 #SUP: 1 #CONF: 1.0
0,100,200 ==> 300 #SUP: 1 #CONF: 1.0
0,200 ==> 300 #SUP: 1 #CONF: 1.0
800 ==> 900 #SUP: 1 #CONF: 1.0
800 ==> 1000 #SUP: 1 #CONF: 1.0
800,900 ==> 1000 #SUP: 1 #CONF: 1.0
800 ==> 1100 #SUP: 1 #CONF: 1.0
800,900 ==> 1100 #SUP: 1 #CONF: 1.0
800,900,1000 ==> 1100 #SUP: 1 #CONF: 1.0
800,1000 ==> 1100 #SUP: 1 #CONF: 1.0
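If you want to post-process rules in this output format, a minimal Python sketch of a line parser could look like the following. It assumes every line follows the layout shown above (comma-separated items, ==>, then #SUP: and #CONF:); the function name and the dictionary keys are illustrative choices.

```python
import re

# Matches lines such as "0,100 ==> 200 #SUP: 1 #CONF: 1.0".
RULE_PATTERN = re.compile(
    r"^\s*([\d,]+)\s*==>\s*([\d,]+)\s*#SUP:\s*(\d+)\s*#CONF:\s*(\S+)\s*$"
)


def parse_rule(line):
    """Turn one output line into a dict with the rule's parts."""
    match = RULE_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized rule line: {line!r}")
    antecedent, consequent, support, confidence = match.groups()
    return {
        "antecedent": [int(item) for item in antecedent.split(",")],
        "consequent": [int(item) for item in consequent.split(",")],
        "support": int(support),
        "confidence": float(confidence),
    }


print(parse_rule("0,100,200 ==> 300 #SUP: 1 #CONF: 1.0"))
```

From there the rules can be filtered or sorted, for example by confidence or by antecedent length.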


Re: RuleGrowth/ERMiner

Posted by: vikas

Date: March 28, 2017 02:24AM

Is there a Python library or Python implementation of the RuleGrowth or ERMiner algorithm?


Re: RuleGrowth/ERMiner

Posted by: webmasterphilfv

Date: March 28, 2017 08:22AM

I know someone who may have a Python implementation of ERMiner, but it is not public. If you send me an e-mail at philfv8 AT yahoo.com, I can give you that person's contact information, and you may ask him whether he can share it with you.
