The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
Difficulty understanding the example provided in the spmf documentation
Posted by: suranga
Date: March 02, 2013 09:23PM

Hi everyone,

I'm trying out the IGB Basis of association rules, and being a novice to association mining, I'm having some difficulty understanding the example provided here -

http://www.philippe-fournier-viger.com/spmf/index.php?link=documentation.php#example7

If you look at the results table provided for that example, you'll see that it looks like,

Rule                 Support  Confidence
1 ==> 2, 4, 5        0.50     0.75
4 ==> 1, 2, 5        0.50     0.75
3 ==> 2, 5           0.50     0.75
{} ==> 2, 3          0.66     0.66
{} ==> 1, 2, 5       0.66     0.66
{} ==> 2, 4          0.66     0.66
{} ==> 2, 5          0.83     0.83
{} ==> 2             1        1

My question is: what do the empty curly braces imply in the above results table? I can understand what 1 ==> 2, 4, 5 etc. implies, but I am baffled by the empty braces.

Re: Difficulty understanding the example provided in the spmf documentation
Date: March 02, 2013 10:16PM

Hi,

The empty braces represent the empty set.

So for example the rule
{} ==> 2 5

means that "nothing" implies items 2 and 5 together.

In the results table, this rule has a support and a confidence of 0.83, which means 83 %. This means that items 2 and 5 appear together in 83 % of the lines of the input file (support) and that, since "nothing" appears in every line, items 2 and 5 also occur together in 83 % of the lines where "nothing" appears (confidence).

If you want more details about the IGB, you can read the article describing IGB. It is in the "Algorithms" section of the website. In the article, they explain why they want to discover rules with empty sets.

Best,

Philippe

Re: Difficulty understanding the example provided in the spmf documentation
Posted by: suranga
Date: March 03, 2013 05:34PM

Hi Philippe,

Thank you for your response.
I went over to the 'algorithms' section, but unfortunately, the section on IGB points to a research paper that appears to be in French or Spanish :-)

Regarding the {} ==> 2 5 example we discussed above, I have just one more question -
If I understood you correctly, does it mean that items 2 and 5 turn up together this often in any set of test data?
So basically, an 'empty ruleset' means that, irrespective of any rules, these values have such a chance of turning up each time?

Re: Difficulty understanding the example provided in the spmf documentation
Date: March 03, 2013 06:51PM

Hi,

Yes, it should be a French paper. Sorry about that. I did not remember that it was French.

Let me explain what confidence and support mean.

An association rule, in general, is a relationship between two sets of items, X --> Y.

The support of an association rule, sup(X --> Y), is defined as the number of transactions that contain X ∪ Y divided by the total number of transactions in the database.

The confidence of an association rule, conf(X --> Y), is defined as the number of transactions that contain X ∪ Y divided by the number of transactions that contain X.
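
To make the two definitions concrete, here is a minimal Python sketch. The six-transaction database is an assumption: it was constructed so that the numbers agree with the results table quoted at the top of this thread, and the actual input file of the example may differ. The function names (count, support, confidence) are hypothetical helpers, not part of SPMF.

```python
# Hypothetical six-transaction database, chosen so that the numbers match
# the results table quoted in this thread; the real input file may differ.
DATABASE = [
    {1, 2, 4, 5},
    {2, 3, 5},
    {1, 2, 4, 5},
    {1, 2, 3, 5},
    {1, 2, 3, 4, 5},
    {2, 3, 4},
]


def count(itemset, database):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for transaction in database if itemset <= transaction)


def support(x, y, database):
    """sup(X --> Y): transactions containing X ∪ Y over all transactions."""
    return count(x | y, database) / len(database)


def confidence(x, y, database):
    """conf(X --> Y): transactions containing X ∪ Y over those containing X."""
    return count(x | y, database) / count(x, database)


# Rule 1 ==> 2, 4, 5 from the results table.
rule_x, rule_y = {1}, {2, 4, 5}
print(support(rule_x, rule_y, DATABASE))     # 0.5
print(confidence(rule_x, rule_y, DATABASE))  # 0.75

# With an empty antecedent, both measures coincide, because the empty set
# is a subset of every transaction.
empty = set()
print(support(empty, {2, 5}, DATABASE) == confidence(empty, {2, 5}, DATABASE))  # True
```

Under this assumed database, the rule 1 ==> 2, 4, 5 gets the 0.50 and 0.75 shown in the table.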

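Therefore, if you have a rule with an empty antecedent like {} ==> 2 5, then the support and the confidence are the same value: the number of transactions in which items 2 and 5 appear together divided by the total number of transactions, because every transaction contains the empty set.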
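
This can be written out from the two definitions. The notation is introduced here only for illustration: $|D|$ is the total number of transactions and $\mathrm{count}(Z)$ is the number of transactions that contain the itemset $Z$.

```latex
\[
\mathrm{sup}(\emptyset \rightarrow Y)
  = \frac{\mathrm{count}(\emptyset \cup Y)}{|D|}
  = \frac{\mathrm{count}(Y)}{|D|},
\qquad
\mathrm{conf}(\emptyset \rightarrow Y)
  = \frac{\mathrm{count}(\emptyset \cup Y)}{\mathrm{count}(\emptyset)}
  = \frac{\mathrm{count}(Y)}{|D|}
\]
```

The last step uses $\mathrm{count}(\emptyset) = |D|$, since the empty set is contained in every transaction. This is why each rule with an empty antecedent in the results table shows the same value in both columns.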

In the article describing IGB, the authors want to find a minimal set of rules that contains the information about ALL the rules. They also want to be able to regenerate all the rules from this minimal set, which they call IGB. Their definition uses two types of rules: rules with an empty antecedent and normal rules. To be able to regenerate all the rules, it is necessary to include the rules with an empty antecedent. That is the basic idea of IGB.

Here are some ENGLISH articles about IGB if you are interested in the details:
- http://lisp.vse.cz/challenge/ecmlpkdd2005/proceedings/Chall05-proc.pdf#page=88
- http://www.cril.univ-artois.fr/spip/publications/IDA384_with_modifications.pdf
- http://ftp.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-110/paper8.pdf

Hope this helps,

Philippe



Edited 1 time(s). Last edit at 03/03/2013 06:53PM by webmasterphilfv.

Re: Difficulty understanding the example provided in the spmf documentation
Posted by: suranga
Date: March 08, 2013 02:06AM

Hi Philippe,

Thank you for pointing me to these resources. They were exactly what I was looking for, and helped me bring my task to a satisfactory conclusion.

Thank you !


Best regards,
Suranga



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).