Re: Mining infrequent association rules

The Data Mining Forum



Mining infrequent association rules

Posted by: Al Mahdi

Date: November 19, 2018 09:33AM

Hello everyone,

My motivation for using SPMF was to mine infrequent association rules.

Other platforms, Java libraries, and SPMF itself have already allowed me to mine association rules (with confidence as low as 12%).

After some data analysis, I found that some interesting association rules happen at a confidence of 1%.

Neither AprioriInverse, AprioriRare, nor the similar algorithms I tried let me go to such low confidence. First, is that what they are meant for? Or are there other SPMF algorithms intended for mining infrequent association rules?

What would you suggest? Should I call the SPMF library functions from my Java code and override them somehow?


Re: Mining infrequent association rules

Posted by: webmasterphilfv

Date: November 19, 2018 03:27PM

Do you really mean low confidence or low support?

The problem with finding low-support rules is that there may be a very large number of rules once you decrease the support threshold. If you want to make mining faster and reduce the number of rules, you can apply some constraints. For example, if you use FPGrowth_Association_rules in SPMF, you can set a maximum antecedent length and a maximum consequent length. This caps the size of the rules and reduces the number of rules that may be found, so you may then be able to lower the minimum support threshold further.
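As a rough illustration of why length caps help, the sketch below counts how many rules X --> Y (X and Y non-empty and disjoint) can be formed from n distinct items, with and without caps on the two sides. It is a toy calculation, not SPMF code: the class and method names are made up for this example, and a real miner only explores a fraction of this space because of the support threshold.

```java
final class RuleSpaceSize {

    /** Binomial coefficient C(n, k), exact for the small n used here. */
    static long binomial(int n, int k) {
        if (k < 0 || k > n) {
            return 0;
        }
        long result = 1;
        for (int i = 1; i <= k; i++) {
            // After this step, result equals C(n, i), so the division is exact.
            result = result * (n - i + 1) / i;
        }
        return result;
    }

    /**
     * Number of rules X --> Y over n items with 1 <= |X| <= maxAntecedent,
     * 1 <= |Y| <= maxConsequent, and X, Y disjoint.
     */
    static long countRules(int n, int maxAntecedent, int maxConsequent) {
        long total = 0;
        for (int i = 1; i <= Math.min(maxAntecedent, n); i++) {
            for (int j = 1; j <= Math.min(maxConsequent, n - i); j++) {
                // Choose i items for the antecedent, then j of the rest for the consequent.
                total += binomial(n, i) * binomial(n - i, j);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countRules(10, 10, 10)); // no effective cap: 57002
        System.out.println(countRules(10, 2, 1));   // at most 2 items --> 1 item: 450
    }
}
```

With only 10 items, capping rules at two antecedent items and one consequent item shrinks the space of possible rules from 57002 to 450, which is the kind of reduction that can make a lower support threshold affordable.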

Another possibility is to modify the code to add other constraints. For example, if you are only interested in some specific items in your rules, you could modify the code to only find rules with specific items. This would also reduce the number of possibilities and let you decrease the minimum support or confidence threshold.
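A minimal sketch of such an item constraint is shown below, assuming rules are represented as a pair of integer item sets. The `Rule` class and the method names are hypothetical and do not come from SPMF. Here the test is applied as a filter over rules that were already generated; to actually save time, the same test would have to be moved inside the mining loop so that unwanted candidates are never built.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ItemConstraintSketch {

    /** Hypothetical rule holder for this sketch; not an SPMF class. */
    static final class Rule {
        final Set<Integer> antecedent;
        final Set<Integer> consequent;

        Rule(List<Integer> antecedent, List<Integer> consequent) {
            this.antecedent = new HashSet<>(antecedent);
            this.consequent = new HashSet<>(consequent);
        }
    }

    /** True if every required item occurs somewhere in the rule. */
    static boolean mentionsAll(Rule rule, Set<Integer> required) {
        Set<Integer> items = new HashSet<>(rule.antecedent);
        items.addAll(rule.consequent);
        return items.containsAll(required);
    }

    /** Keeps only the rules that mention all required items. */
    static List<Rule> keepRulesWith(List<Rule> rules, Set<Integer> required) {
        List<Rule> kept = new ArrayList<>();
        for (Rule rule : rules) {
            if (mentionsAll(rule, required)) {
                kept.add(rule);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Rule> rules = Arrays.asList(
                new Rule(Arrays.asList(1, 4), Arrays.asList(15)),
                new Rule(Arrays.asList(2), Arrays.asList(3)),
                new Rule(Arrays.asList(15), Arrays.asList(7)));
        Set<Integer> required = new HashSet<>(Arrays.asList(1, 15));
        // Only the rule {1, 4} --> {15} mentions both required items.
        System.out.println(keepRulesWith(rules, required).size());
    }
}
```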

But still, is a rule with 1% confidence any good? If you have a rule X --> Y with 1% confidence, it means that when X appears, Y is also present only 1% of the time. This is not a strong rule! So do you really need to find low-confidence rules?
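To make the definition concrete, here is a small sketch that computes confidence as support(X and Y together) divided by support(X) on a toy database. The data and all names are invented for illustration: item 1 plays the role of X and appears in 200 transactions, and item 2 plays the role of Y and accompanies it in only 2 of them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ConfidenceDemo {

    /** Number of transactions that contain every item of the itemset. */
    static int support(List<Set<Integer>> database, Set<Integer> itemset) {
        int count = 0;
        for (Set<Integer> transaction : database) {
            if (transaction.containsAll(itemset)) {
                count++;
            }
        }
        return count;
    }

    /** Confidence of X --> Y: support(X union Y) / support(X), or 0 if X never occurs. */
    static double confidence(List<Set<Integer>> database, Set<Integer> x, Set<Integer> y) {
        Set<Integer> both = new HashSet<>(x);
        both.addAll(y);
        int supportX = support(database, x);
        return supportX == 0 ? 0.0 : (double) support(database, both) / supportX;
    }

    /** Toy data: 200 transactions with item 1 (2 of them also with item 2), 50 with item 3 only. */
    static List<Set<Integer>> toyDatabase() {
        List<Set<Integer>> database = new ArrayList<>();
        for (int i = 0; i < 200; i++) {
            Set<Integer> transaction = new HashSet<>();
            transaction.add(1);
            if (i < 2) {
                transaction.add(2);
            }
            database.add(transaction);
        }
        for (int i = 0; i < 50; i++) {
            database.add(new HashSet<>(Collections.singleton(3)));
        }
        return database;
    }

    public static void main(String[] args) {
        List<Set<Integer>> database = toyDatabase();
        Set<Integer> x = Collections.singleton(1);
        Set<Integer> y = Collections.singleton(2);
        // 2 of the 200 transactions containing item 1 also contain item 2.
        System.out.println(confidence(database, x, y)); // 0.01
    }
}
```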

You could also try the MEIT (Memory Efficient Itemset Tree). It will let you do some targeted queries to find association rules. For example, you can ask to find all rules containing the items 1 and 15. Maybe this is what you need!

Hope this helps.
