TopKClassRules for rare association rules
Posted by: Irfan
Date: July 28, 2020 11:42PM

Greetings,

I use TopKClassRules in my project and have found it very useful. However, it only provides the top-k frequent class association rules.

Is there any way to modify it so that it provides the top-k rare class association rules? Or is there another algorithm that works like TopKClassRules but provides rare class association rules?

Thank you



Edited 1 time(s). Last edit at 08/18/2020 09:23AM by webmasterphilfv.

Re: TopKClassRules
Date: July 29, 2020 12:00AM

Hi Irfan,

Glad it is useful.

I think it could be modified for rare class association rules. However, there are several definitions of what "rare" means.

If "rare" just means to have a support lower than some threshold maxsup, then I think it would not be hard to do. But if you use other definitions of what is a rare rule, maybe it is more complicated.

But in any case, when changing an algorithm to do something else, problems can always arise that we did not think about ;-) I think it is possible, but maybe there is something I have not thought of.

Best regards,

Philippe

Re: TopKClassRules
Posted by: Irfan
Date: July 29, 2020 05:35PM

Thank you Prof for your quick reply,

Yes, what I mean is just rare, which has support lower than some threshold maxsup.

So I would like to have that option in this algorithm: to choose whether the rules are rare or frequent.

The other inputs, such as k, minconf (%), and the fixed consequent items, should remain as they are presented in the algorithm.

Re: TopKClassRules
Posted by: Irfan
Date: July 29, 2020 06:24PM

But the support of a rare rule found by TopKClassRules should not be 0%.

Re: TopKClassRules
Posted by: Irfan
Date: July 31, 2020 04:37PM

Dear Prof,

Do you have any suggestion about which file, or which part of a file, I should edit to add this functionality?

Or is there anyone who can help with this?

Sorry to ask, but I am not familiar with Java.


Thank you.

Re: TopKClassRules
Date: July 31, 2020 04:43PM

Hi,

I will try to do it for you in the next few days.

Best regards,

Philippe

Re: TopKClassRules
Posted by: Irfan
Date: July 31, 2020 05:46PM

Thank you Prof.

Re: TopKClassRules
Date: August 09, 2020 09:08AM

Hi Irfan,

I have added the "maxsup" parameter:



You can try it by downloading the spmf.jar file again from the website.

For the source code version of SPMF, I will upload spmf.zip maybe in a few hours because I want to also update another algorithm.

Best,

Philippe



Edited 1 time(s). Last edit at 08/09/2020 09:10AM by webmasterphilfv.

Re: TopKClassRules
Posted by: Irfan
Date: August 09, 2020 04:43PM

Dear Prof,

Thank you very much.

Re: TopKClassRules
Posted by: Irfan
Date: August 10, 2020 04:24PM

Dear Prof,

I downloaded the new GUI version of SPMF and tested it with a max conf of 25% and a min conf of 2%, but the rules generated are still frequent rules, not rare rules within that interval. That is, with the conf interval (2%-25%) I still get a rule with 90% conf, which is also generated when I look for frequent rules with a minconf of 60%.

Re: TopKClassRules
Date: August 10, 2020 06:11PM

Hi Irfan,

I would like to understand more clearly.

You said "maxconf", but a rare rule is defined by its support, not by its confidence. The parameter is "maxsup", not "maxconf".

I have implemented a parameter "maxsup" for TopKClassRules in the version that you have downloaded.

Do you mean that you would like to have a parameter "maxconf"?

Best regards,

Philippe

Re: TopKClassRules
Posted by: Irfan
Date: August 10, 2020 11:34PM

Yes, Prof,

After reading the paper on this algorithm, it seems that its advantage is that the user only has to enter the confidence.

So I think that having minconf and maxconf would handle the issue of rare rules. With maxsup and minconf, the result obtained is not rare, because it still gives the same output as for frequent rules.

Re: TopKClassRules
Date: August 11, 2020 04:38AM

Hi again,

But rare means infrequent... It means something that has a low frequency.

It depends on how you set the parameter. If you set maxsup very low, you will find infrequent (rare) rules. For example, if you set maxsup = 25%, the algorithm should not give you rules more frequent than 25%. If you set maxsup = 0.01%, the algorithm will not give you rules more frequent than 0.01%, and so on. Maybe you need to decrease it further if you still find frequent rules.

The confidence is not related to frequent or rare. You can have frequent rules with a low or high confidence and you can also have rare rules that have a high or a low confidence...
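To make the difference concrete, here is a small sketch in Python (not SPMF code; the toy transactions and the function name are made up for illustration). It computes the support and the confidence of a rule, and shows a rare rule with a high confidence next to a frequent rule with a low confidence.

```python
def support_and_confidence(transactions, antecedent, consequent):
    """Return (relative support, confidence) of the rule antecedent ==> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Number of transactions containing the antecedent
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    # Number of transactions containing both the antecedent and the consequent
    n_rule = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_rule / len(transactions)
    confidence = n_rule / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical toy data: 100 transactions in total
transactions = (
    [["a", "c"]] * 2      # "a" is rare and always appears with "c"
    + [["b", "c"]] * 20   # "b" is frequent but seldom appears with "c"
    + [["b", "d"]] * 60
    + [["d"]] * 18
)

print(support_and_confidence(transactions, ["a"], ["c"]))  # rare, high confidence
print(support_and_confidence(transactions, ["b"], ["c"]))  # frequent, low confidence
```

Here a ==> c has a support of 2% and a confidence of 100%, while b ==> c has a support of 20% and a confidence of 25%. A maxconf constraint would remove the first rule and keep the second one, which is the opposite of what is wanted when looking for rare rules.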

For me it doesn't matter. I can add the constraint of maxconfidence for you to the algorithm. But it will not help you much to find rare (infrequent) rules. To find infrequent rules, you should use maxsup and decrease it.

Using maxconfidence will just help you find rules that are less "strong", but not necessarily rare rules.

Yes, in the paper, the point is that we want to use k to avoid setting a constraint on the support. But internally, the algorithm still uses a minimum support. By using maxsup, you force the algorithm to search for less frequent rules. But you need to set maxsup low enough; otherwise, you will indeed get the same rules!
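A rough sketch of this idea in Python, assuming the candidate rules are already known with their relative support and confidence. This is not the actual TopKClassRules implementation, which raises its internal minimum support while exploring the search space; the function name and the example rules are hypothetical.

```python
import heapq


def top_k_rules(candidates, k, minconf, maxsup=1.0):
    """Keep the k rules with the highest support among those allowed.

    candidates: iterable of (name, support, confidence), with support and
    confidence given as fractions between 0 and 1.
    """
    heap = []     # min-heap on support holding the best k rules so far
    minsup = 0.0  # internal threshold, raised once k rules are held
    for name, sup, conf in candidates:
        # Skip rules that never occur, are too frequent, are too weak,
        # or cannot enter the current top k.
        if sup <= 0 or sup > maxsup or conf < minconf or sup < minsup:
            continue
        heapq.heappush(heap, (sup, name, conf))
        if len(heap) > k:
            heapq.heappop(heap)   # drop the rule with the lowest support
        if len(heap) == k:
            minsup = heap[0][0]   # lowest support still in the top k
    return [name for sup, name, conf in sorted(heap, reverse=True)]


# Hypothetical candidate rules: (name, support, confidence)
rules = [
    ("r1", 0.50, 0.90),
    ("r2", 0.30, 0.70),
    ("r3", 0.05, 0.80),
    ("r4", 0.01, 0.65),
    ("r5", 0.02, 0.40),
]

print(top_k_rules(rules, k=2, minconf=0.6))               # no maxsup
print(top_k_rules(rules, k=2, minconf=0.6, maxsup=0.60))  # maxsup too high
print(top_k_rules(rules, k=2, minconf=0.6, maxsup=0.25))  # maxsup low enough
```

With maxsup = 60% the result is the same as without maxsup (r1 and r2), because no rule in this example is more frequent than 60%. Only with maxsup = 25% do the rarer rules r3 and r4 appear.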

Best,

Philippe



Edited 2 time(s). Last edit at 08/11/2020 04:41AM by webmasterphilfv.

Re: TopKClassRules
Posted by: Irfan
Date: August 12, 2020 12:23AM

Thank you for the quick reply and the further explanation.

Let me experiment more with datasets covering different scenarios, then I will come back.

Thank you very much.

Re: TopKClassRules
Posted by: Irfan
Date: August 28, 2020 04:52PM

Greetings Prof,
I am back after working on several datasets.

I can say that nothing changes when I set the maxsup parameter to capture rare association rules. The approach is good, but the maxsup input seems to have no impact on the output. Maybe there is a small mistake in the code that handles maxsup.

As you can see:

1. Rare association rule with its input and corresponding output

INPUT: minconf 60%, maxsup 30%

OUTPUT:
Patient1 ==> admitted #SUP: 165 #CONF: 0.6470588235294118
Patient2 ==> admitted #SUP: 177 #CONF: 0.6941176470588235
Patient4 ==> admitted #SUP: 225 #CONF: 0.8823529411764706
Patient3 ==> admitted #SUP: 255 #CONF: 1.0


2. Frequent association rule with its input and corresponding output

INPUT: minconf 60%

OUTPUT:

Patient1 ==> admitted #SUP: 165 #CONF: 0.6470588235294118
Patient2 ==> admitted #SUP: 177 #CONF: 0.6941176470588235
Patient4 ==> admitted #SUP: 225 #CONF: 0.8823529411764706
Patient3 ==> admitted #SUP: 255 #CONF: 1.0

Re: TopKClassRules
Date: August 30, 2020 01:50AM

Hi,

I see.

How many lines are there in your data?
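The number of lines matters because maxsup is given as a percentage, while #SUP in the output is a count of lines. Here is a small Python sketch (not part of SPMF; the line counts tried below are assumptions, since the real size of the dataset is not known here) that parses the output lines posted above and converts #SUP to a relative support:

```python
import re

# Matches lines such as "Patient1 ==> admitted #SUP: 165 #CONF: 0.647"
RULE_LINE = re.compile(r"^(.+?)\s+#SUP:\s*(\d+)\s+#CONF:\s*([0-9.]+)\s*$")


def parse_rule(line):
    """Split one output line into (rule text, support count, confidence)."""
    match = RULE_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a rule line: {line!r}")
    return match.group(1), int(match.group(2)), float(match.group(3))


def rules_above_maxsup(lines, n_lines, maxsup):
    """Return the rules whose relative support is greater than maxsup."""
    above = []
    for line in lines:
        rule, sup_count, _conf = parse_rule(line)
        if sup_count / n_lines > maxsup:
            above.append(rule)
    return above


output = [
    "Patient1 ==> admitted #SUP: 165 #CONF: 0.6470588235294118",
    "Patient2 ==> admitted #SUP: 177 #CONF: 0.6941176470588235",
    "Patient4 ==> admitted #SUP: 225 #CONF: 0.8823529411764706",
    "Patient3 ==> admitted #SUP: 255 #CONF: 1.0",
]

# Two assumed dataset sizes, for illustration only
for n_lines in (255, 1000):
    print(n_lines, rules_above_maxsup(output, n_lines, 0.30))
```

Since #CONF is #SUP divided by the number of lines containing the antecedent, each antecedent above appears in 255 lines, so the data has at least 255 lines. All four rules respect maxsup = 30% only if the data has at least 850 lines (255 / 0.30); with fewer lines, at least the rule with #SUP: 255 should have been filtered out.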

Maybe you can also send me your data to my e-mail: philfv8@yahoo.com and I can also try it if you want.

Best regards,

Philippe
