The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
some cases with confidence>1
Posted by: Ankit
Date: September 12, 2014 12:17AM

Hi,

I've been using SPMF for some time now. Lately, I came across some cases in which the SPMF Java package produced rules with confidence > 1. I was using the FP-Growth algorithm with lift to mine the rules. Please comment.

Thanks

Re: some cases with confidence>1
Date: September 12, 2014 03:05AM

Hi,

Thanks for using SPMF.

Confidence is a value in [0,1].

Lift is a value in [0, +infinity), so it can be > 1.

Thus, if the lift is > 1, that is normal.

If the confidence is > 1, it would be a bug. If this is the case, please tell me when it happens (if possible, send me your data and tell me which parameters you used) so that I can reproduce the problem and fix it (philippe.fv AT gmail.com).

Best,

Philippe
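
To make the ranges above concrete, here is a minimal sketch of how confidence and lift follow from support counts. It uses the standard textbook definitions, not SPMF's actual code; the toy database and the function names are made up for illustration. Since every transaction that contains X and Y also contains X, sup(X ∪ Y) can never exceed sup(X), which is why confidence cannot go above 1, whereas lift has no upper bound.

```python
def support(itemset, db):
    """Number of transactions in db that contain every item of itemset."""
    return sum(1 for t in db if itemset <= t)

def confidence(sup_xy, sup_x):
    """Confidence of X ==> Y: sup(X u Y) / sup(X). Always in [0, 1]."""
    return sup_xy / sup_x

def lift(sup_xy, sup_x, sup_y, n):
    """Lift of X ==> Y: confidence divided by the relative support of Y."""
    return confidence(sup_xy, sup_x) / (sup_y / n)

# Hypothetical toy database: each transaction is a set, so no item repeats.
db = [{1, 2, 3}, {1, 2, 3}, {1, 2}, {4}, {4, 5}, {5}]
x, y = {1, 2}, {3}

sup_x = support(x, db)        # 3
sup_y = support(y, db)        # 2
sup_xy = support(x | y, db)   # 2

conf_value = confidence(sup_xy, sup_x)           # 2/3, never above 1
lift_value = lift(sup_xy, sup_x, sup_y, len(db)) # 2.0, above 1 is fine
print(conf_value, lift_value)
```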

Re: some cases with confidence>1
Posted by: Ankit
Date: September 12, 2014 03:50AM

100000202 100015013 ==> 12004051 #SUP: 61 #CONF: 1.79412 #LIFT: 1,576.02818
100000202 100015069 ==> 12004051 #SUP: 51 #CONF: 1.88889 #LIFT: 1,659.27921
100000202 100015294 ==> 12004051 #SUP: 22 #CONF: 2 #LIFT: 1,756.88387
100000202 100020576 ==> 12004051 #SUP: 20 #CONF: 1.66667 #LIFT: 1,464.06989
100000202 100035746 ==> 12004051 #SUP: 26 #CONF: 2 #LIFT: 1,756.88387
100000202 100036059 ==> 12004051 #SUP: 34 #CONF: 1.88889 #LIFT: 1,659.27921
100000202 101002473 ==> 12004051 #SUP: 22 #CONF: 2 #LIFT: 1,756.88387
100000202 101032733 ==> 12004051 #SUP: 18 #CONF: 1.63636 #LIFT: 1,437.45044
100000202 101033139 ==> 12004051 #SUP: 30 #CONF: 1.875 #LIFT: 1,647.07863
100000204 100015013 ==> 12004051 #SUP: 40 #CONF: 2 #LIFT: 1,756.88387
100000204 100015069 ==> 12004051 #SUP: 51 #CONF: 1.96154 #LIFT: 1,723.09764
100000204 100030207 ==> 12004051 #SUP: 22 #CONF: 2 #LIFT: 1,756.88387
100000204 101005478 ==> 12004051 #SUP: 22 #CONF: 2 #LIFT: 1,756.88387
100000206 100015013 ==> 12004051 #SUP: 57 #CONF: 1.83871 #LIFT: 1,615.19969
100000206 100015069 ==> 12004051 #SUP: 41 #CONF: 1.95238 #LIFT: 1,715.0533
100000206 100020576 ==> 12004051 #SUP: 39 #CONF: 1.77273 #LIFT: 1,557.23798
100000206 100030207 ==> 12004051 #SUP: 23 #CONF: 1.91667 #LIFT: 1,683.68038
100000206 101005478 ==> 12004051 #SUP: 23 #CONF: 1.91667 #LIFT: 1,683.68038
100000206 101032733 ==> 12004051 #SUP: 32 #CONF: 1.77778 #LIFT: 1,561.67455
100000207 100015013 ==> 12004051 #SUP: 39 #CONF: 1.69565 #LIFT: 1,489.53198
100000207 100015069 ==> 12004051 #SUP: 30 #CONF: 1.76471 #LIFT: 1,550.19165
100000208 100015013 ==> 12004051 #SUP: 75 #CONF: 1.875 #LIFT: 1,647.07863
100000208 100015069 ==> 12004051 #SUP: 29 #CONF: 1.8125 #LIFT: 1,592.17601
100000208 100020576 ==> 12004051 #SUP: 33 #CONF: 1.73684 #LIFT: 1,525.71494
100000208 100042357 ==> 12004051 #SUP: 42 #CONF: 2 #LIFT: 1,756.88387
100000208 101002992 ==> 12004051 #SUP: 28 #CONF: 2 #LIFT: 1,756.88387

------------------------------------
These are some of the rules I got using SPMF. On manually checking the file, I found that the issue is not with the confidence but with the support value, which makes the confidence > 1. I was using FP-Growth with lift.
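
One way to inspect output like the above is to parse each rule line and compute the antecedent support that the reported numbers would imply. The sketch below is not part of SPMF. It assumes that #SUP is the support count of the whole rule, that confidence is #SUP divided by the antecedent support, and that commas in the numbers are thousands separators; the helper names are hypothetical.

```python
import re

# Matches lines such as: "a b ==> c #SUP: 22 #CONF: 2 #LIFT: 1,756.88387"
RULE_RE = re.compile(
    r"^(?P<antecedent>.+?) ==> (?P<consequent>.+?) "
    r"#SUP: (?P<sup>[\d,]+) #CONF: (?P<conf>[\d.,]+) #LIFT: (?P<lift>[\d.,]+)\s*$"
)

def parse_rule(line):
    """Parse one rule line into a dict, or return None if it does not match."""
    m = RULE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "antecedent": m.group("antecedent").split(),
        "consequent": m.group("consequent").split(),
        # Assumption: commas are thousands separators, so drop them.
        "sup": int(m.group("sup").replace(",", "")),
        "conf": float(m.group("conf").replace(",", "")),
        "lift": float(m.group("lift").replace(",", "")),
    }

def suspicious_rules(lines):
    """Return (rule, implied antecedent support) for rules with confidence > 1."""
    flagged = []
    for line in lines:
        rule = parse_rule(line)
        if rule is not None and rule["conf"] > 1.0:
            flagged.append((rule, rule["sup"] / rule["conf"]))
    return flagged

sample = [
    "100000202 100015294 ==> 12004051 #SUP: 22 #CONF: 2 #LIFT: 1,756.88387",
    "100000202 100015013 ==> 12004051 #SUP: 61 #CONF: 1.79412 #LIFT: 1,576.02818",
]
flagged = suspicious_rules(sample)
for rule, implied in flagged:
    print(rule["antecedent"], "rule support:", rule["sup"],
          "implied antecedent support:", round(implied, 2))
```

Under these assumptions, a flagged rule implies an antecedent support smaller than the rule's own support, which is impossible for a consistent count and points to the support values rather than the division.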

Re: some cases with confidence>1
Date: September 12, 2014 04:08AM

OK. Another question:

Do any lines in your input file contain the same item more than once? If so, this might be the reason. For FPGrowth, an item should not appear more than once in a transaction.

Could you send me your input file to my e-mail and tell me the parameters so that I can check what is happening?

Thanks
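
For a file too large to check by hand, the duplicate-item question can be answered with a short script. This is a sketch only, assuming one transaction per line with items separated by whitespace, and assuming that lines starting with #, % or @ are comments or metadata to be skipped; the function name is hypothetical.

```python
from collections import Counter

def find_duplicate_items(lines):
    """Return (line_number, duplicated_items) for each transaction repeating an item."""
    problems = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        # Skip blank lines and (assumed) comment/metadata lines.
        if not line or line[0] in "#%@":
            continue
        counts = Counter(line.split())
        duplicates = sorted(item for item, c in counts.items() if c > 1)
        if duplicates:
            problems.append((line_no, duplicates))
    return problems

# Small made-up sample: lines 2 and 4 repeat an item.
sample = ["1 2 3", "4 5 4 6", "", "7 7 7 8", "9"]
result = find_duplicate_items(sample)
print(result)
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, so the whole file never has to be loaded into memory.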



Edited 1 time(s). Last edit at 09/12/2014 04:09AM by webmasterphilfv.

Re: some cases with confidence>1
Posted by: Ankit
Date: September 12, 2014 05:14AM

I've checked the input file. There were no repeated items in any 'transaction'.
Also, the input file is too big, so I'm unable to send it by email.

Thanks

Re: some cases with confidence>1
Date: September 12, 2014 05:57AM

Hi,

OK. I have tried the current version of SPMF on some big datasets such as Chess, Kosarak, and Accidents, which are available on the website, and I'm unable to find any rule with confidence > 1.

How big is your dataset?

If you cannot send it by e-mail, you could try this website, which lets you send files of up to 2 GB by e-mail without registration:

http://free.mailbigfile.com

This would help me to find the problem.

Thanks



Edited 1 time(s). Last edit at 09/12/2014 05:57AM by webmasterphilfv.



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).