Re: Candidates in TNR algorithm

Candidates in TNR algorithm

Posted by: Alex

Date: June 02, 2013 07:27AM

Hello Philippe and forum users,
First of all, Philippe, let me thank you for your work, because it is really interesting!

I am wondering: why do we register rules as candidates when their confidence is below the minimum?

Here is a snippet from your implementation:

if(confidence >= minConfidence){
    // save the rule in current top-k rules
    save(candidate, ruleSupport);
}
// register the rule as a candidate for future expansion
registerAsCandidate(true, candidate);

Thank you.

Re: Candidates in TNR algorithm

Posted by: webmasterphilfv

Date: June 02, 2013 09:15AM

Hi Alex,

Thank you, and welcome to the forum.

The reason for keeping candidate rules that do not meet the minconf threshold is that confidence is neither monotone nor anti-monotone, so it cannot be used to prune the search space.

I will try to explain this in more detail.

Algorithms like TNR and TopKRules explore the search space of rules by rule expansion: to generate new rules, they take a previously found rule and add an item to it to create a larger rule.

Consider a rule {a} --> {b, c}.

If you add an item x to this rule, the support of the resulting rule will decrease or stay the same.

Because of this, we don't need to keep rules with a support lower than minsup: any larger rule obtained by expanding such a rule would also be infrequent.

However, if we add an item to a rule, the confidence of the resulting rule may be lower than, equal to, or higher than the confidence of the original rule. For this reason, we cannot prune the search space based on confidence.
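
To make this concrete, here is a small illustrative sketch. It is not taken from the TNR or TopKRules source code: the five-transaction database, the class name ConfidenceDemo and its helper methods (count, support, confidence) are all made up for this example, and support is assumed to be an absolute transaction count. It computes the support and confidence of the rule {a} --> {b, c} and of two rules obtained by adding the item x to it.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: a toy database and helper names invented for this example,
// not code from the TNR/TopKRules implementation.
public class ConfidenceDemo {

    // Hypothetical database of five transactions.
    static final List<Set<String>> DB = List.of(
            Set.of("a", "b", "c", "x"),
            Set.of("a", "b", "c", "x"),
            Set.of("a", "b", "c"),
            Set.of("a"),
            Set.of("a"));

    // Number of transactions that contain every item of the given itemset.
    static int count(Set<String> items) {
        int n = 0;
        for (Set<String> transaction : DB) {
            if (transaction.containsAll(items)) {
                n++;
            }
        }
        return n;
    }

    // Support of the rule left --> right: transactions containing both sides.
    static int support(Set<String> left, Set<String> right) {
        Set<String> all = new HashSet<>(left);
        all.addAll(right);
        return count(all);
    }

    // Confidence of the rule: support(left --> right) / count(left).
    static double confidence(Set<String> left, Set<String> right) {
        return (double) support(left, right) / count(left);
    }

    static void print(Set<String> left, Set<String> right) {
        System.out.println(left + " --> " + right
                + "  support=" + support(left, right)
                + "  confidence=" + confidence(left, right));
    }

    public static void main(String[] args) {
        print(Set.of("a"), Set.of("b", "c"));      // original rule
        print(Set.of("a", "x"), Set.of("b", "c")); // x added to the left side
        print(Set.of("a"), Set.of("b", "c", "x")); // x added to the right side
    }
}
```

On this toy database, {a} --> {b, c} has support 3 and confidence 0.6. Adding x to the left side gives support 2 and confidence 1.0, while adding x to the right side gives support 2 and confidence 0.4. In both cases the support does not increase, but the confidence moves in either direction. With a minconf of 0.8, for example, the original rule would not be saved as a result, yet one of its expansions would be, which is why the original rule still has to be registered as a candidate.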

Lastly, note that although rules with low confidence may be kept as candidates for further exploration, they are not kept in the set of current top-k rules. Therefore, they will not appear in the output (they are only used for generating other rules).

Hope this helps!

Philippe

Edited 2 time(s). Last edit at 06/02/2013 04:24PM by webmasterphilfv.

Re: Candidates in TNR algorithm

Posted by: Alex

Date: June 02, 2013 09:55AM

Oh, thank you for such a quick and detailed answer. Now it's clear to me. Thank you and good luck.

Re: Candidates in TNR algorithm

Posted by: webmasterphilfv

Date: June 02, 2013 04:26PM

You are welcome.

Note that I have edited my answer. I had referred to the wrong paper (RuleGrowth), but TNR is actually based on TopKRules.

Best,

Philippe
