Re: Mining Top K sequential rules
Date: May 13, 2014 02:29PM
Hi,
Here is the link to download the PDF version of the PowerPoint presentation:
http://www.philippe-fournier-viger.com/spmf/TOPSEQRULES_sequential_rules_pdf
If it still does not work, you may send me an e-mail at philippe.fv AT gmail.com and I can send it directly to your e-mail.
Note, however, that the PowerPoint is not very detailed. It only gives the main points about the algorithm.
Yes, the rule {a, b, c}⇒{e, f, g} could be written as {b, a, c}⇒{g, f, e} and it would have the same meaning, because the items within each side of a rule are unordered. But to make it easier to read, in the examples provided in the paper the letters are always alphabetically ordered. In the implementations, the letters are alphabetically ordered as well. The reason is that it allows some optimizations (without any loss of generality).
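To illustrate why a fixed order is convenient, here is a minimal Python sketch (not the SPMF code; the function name canonical is made up for this example). It stores each side of a rule as a sorted tuple, so the two ways of writing the rule above become the same object:

```python
def canonical(antecedent, consequent):
    """Return a rule as a pair of alphabetically sorted tuples.

    Hypothetical helper for illustration only: it shows that once items
    are sorted, two spellings of the same rule compare as equal.
    """
    return (tuple(sorted(set(antecedent))), tuple(sorted(set(consequent))))


r1 = canonical(["a", "b", "c"], ["e", "f", "g"])
r2 = canonical(["b", "a", "c"], ["g", "f", "e"])
print(r1 == r2)  # both spellings map to the same representation
```

With a single representation per rule, a program can for example detect duplicates with a simple equality test. Whether this is one of the optimizations used in the actual implementations is an assumption on the part of this sketch.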
What is the utility? Well, sequential rules can be used to identify which items are followed by which other items in a sequence with a high confidence. This may be used, for example, to perform prediction. If you have some sequences of webpages visited by users, you may discover rules such as page1, page2 --> page3, meaning that people who have visited page1 and page2 (in any order) will then visit page3 with a given confidence. I have used this application as an example in this paper:
http://www.philippe-fournier-viger.com/sequential_rules_prediction_2012.pdf
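To make the webpage example concrete, here is a small Python sketch of how the support and confidence of a rule such as page1, page2 --> page3 could be computed. It is a simplified illustration, not the SPMF implementation: it assumes each sequence is a plain list of single pages, that the antecedent is not empty, and that confidence is the number of sequences where the rule occurs divided by the number of sequences containing the antecedent. The sessions data and all function names are invented for the example.

```python
def occurs(sequence, antecedent, consequent):
    """True if all antecedent items appear (in any order) before
    all consequent items somewhere in the sequence."""
    first, last = {}, {}
    for i, item in enumerate(sequence):
        first.setdefault(item, i)  # earliest position of each item
        last[item] = i             # latest position of each item
    if any(x not in first for x in antecedent):
        return False
    if any(y not in last for y in consequent):
        return False
    # Earliest point at which every antecedent item has been seen.
    split = max(first[x] for x in antecedent)
    # Every consequent item must still occur after that point.
    return all(last[y] > split for y in consequent)


def support_and_confidence(sequences, antecedent, consequent):
    rule_count = sum(occurs(s, antecedent, consequent) for s in sequences)
    ante_count = sum(set(antecedent) <= set(s) for s in sequences)
    support = rule_count / len(sequences)
    confidence = rule_count / ante_count if ante_count else 0.0
    return support, confidence


# Made-up browsing sessions, one list of pages per user.
sessions = [
    ["page1", "page2", "page3"],
    ["page2", "page1", "page3"],
    ["page1", "page2", "page4"],
    ["page3", "page1", "page2"],
]
sup, conf = support_and_confidence(sessions, {"page1", "page2"}, {"page3"})
print(sup, conf)
```

In this toy data the rule occurs in the first two sessions only: the third never reaches page3, and in the fourth page3 comes before page1 and page2.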
Now, why mine the top-k rules? The reason is that it can be hard to decide what the appropriate minsup threshold is (it depends on your data). With a top-k algorithm, you can instead say that you want to find only the k most frequent rules, for example k=1000.
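As a rough sketch of what "the k most frequent rules" means, the Python snippet below picks the k rules with the highest support from a list of already-computed candidates. This only illustrates the output of a top-k approach, not how the algorithm searches for rules. The candidate rules, their numbers, the function name top_k_rules and the optional minconf filter are all assumptions made for the example.

```python
import heapq


def top_k_rules(rules, k, minconf=0.0):
    """Keep the k rules with the highest support among those whose
    confidence is at least minconf.

    Each rule is a tuple: (antecedent, consequent, support, confidence).
    """
    eligible = [r for r in rules if r[3] >= minconf]
    return heapq.nlargest(k, eligible, key=lambda r: r[2])


# Invented candidate rules with invented support/confidence values.
candidates = [
    (("a",), ("b",), 0.60, 0.90),
    (("a", "b"), ("c",), 0.40, 0.80),
    (("c",), ("d",), 0.75, 0.30),
    (("b",), ("d",), 0.50, 0.70),
]
best = top_k_rules(candidates, k=2, minconf=0.5)
for antecedent, consequent, support, confidence in best:
    print(antecedent, "-->", consequent, support, confidence)
```

The user only chooses k (and here a confidence threshold); no minsup value has to be guessed in advance.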
This is the main idea.
Best,