Re: differences among Sequential pattern mining and continuous association rule mining algorithm (Carma) and sequence algorithm?
Date: August 12, 2012 04:03PM
Hi Naser,
I will try to answer your question by giving you the main idea, without going deep into the details.
The input of an association rule mining algorithm is a transaction database and two thresholds called minsup and minconf. A transaction database is a set of transactions. A transaction is a set of items. For example, this is a transaction database containing four transactions:
transaction 1: bread, milk, beer
transaction 2: chips, fish, beer
transaction 3: bread, potatoes, milk, beer
transaction 4: bread, milk
There is no time in a transaction database.
The goal of association rule mining is to find associations between items in the transactions. The user needs to set the two thresholds: minsup (the minimum support) and minconf (the minimum confidence). The result of applying an association rule mining algorithm is the set of all association rules that respect these thresholds in the transaction database. For example, in the previous database, we could get the following association rule:
bread --> beer
This rule has a support of 50 % because bread and beer appear together in 2 transactions out of 4. Moreover, it has a confidence of about 67 % because bread appears in three transactions but appears together with beer in only two of them, so 2 / 3 ≈ 67 %.
This is just the main idea.
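To make the support and confidence arithmetic concrete, here is a minimal Python sketch that recomputes both values for the rule bread --> beer on the four example transactions. The function names `support` and `confidence` are my own and are only for illustration. They do not come from any particular library, and this is not a mining algorithm, only the measure computation for one rule.

```python
# The four example transactions, each stored as a set of items.
transactions = [
    {"bread", "milk", "beer"},
    {"chips", "fish", "beer"},
    {"bread", "potatoes", "milk", "beer"},
    {"bread", "milk"},
]


def support(itemset, database):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in database if itemset <= t) / len(database)


def confidence(antecedent, consequent, database):
    """Support of antecedent and consequent together, divided by the
    support of the antecedent alone."""
    both = support(antecedent | consequent, database)
    return both / support(antecedent, database)


# Rule: bread --> beer
rule_support = support({"bread", "beer"}, transactions)
rule_confidence = confidence({"bread"}, {"beer"}, transactions)
print(f"support = {rule_support:.0%}, confidence = {rule_confidence:.0%}")
```

A rule is kept by the algorithm only if its support is at least minsup and its confidence is at least minconf.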
Now let's talk about sequential pattern mining. There are two inputs: a sequence database and a threshold named minsup.
A sequence database is different from a transaction database, because it also considers time.
A sequence database is a set of sequences. A sequence is a time-ordered list of transactions.
For example, here is a sequence database containing four sequences.
sequence 1: {bread, butter}, {beer}, {bread}, {milk, fish}
sequence 2: {bread, butter, fish}, {apple, sugar}, {flour}, {fish}
sequence 3: {butter}, {bread}, {milk, fish}, {beer}, {milk, fish}
sequence 4: {apple}, {orange}
For example, sequence 1 means that "bread" and "butter" appeared together. They are followed by "beer", which is followed by "bread", which is followed by "milk" and "fish" at the same time.
If you apply a sequential pattern mining algorithm on this sequence database, you will find all subsequences that appear in at least minsup sequences. For example, if you set minsup to 50 %, you may find several patterns including the following pattern:
{bread}, {beer}, {milk, fish}
This sequential pattern has a support of 50 % because it appears in 2 sequences out of 4 in the sequence database (sequences 1 and 3). This pattern means that bread appears before beer, which is followed by milk and fish at the same time.
For sequential patterns, there is no confidence.
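The support of a sequential pattern can be checked in the same spirit. The sketch below is only an illustration: the helper names `contains` and `sequence_support` are hypothetical, and it only counts the support of one given pattern rather than discovering patterns. It assumes that a pattern appears in a sequence when its itemsets can be matched, in order but not necessarily consecutively, to itemsets of the sequence that contain them.

```python
# The four example sequences; each sequence is a time-ordered list of itemsets.
sequences = [
    [{"bread", "butter"}, {"beer"}, {"bread"}, {"milk", "fish"}],
    [{"bread", "butter", "fish"}, {"apple", "sugar"}, {"flour"}, {"fish"}],
    [{"butter"}, {"bread"}, {"milk", "fish"}, {"beer"}, {"milk", "fish"}],
    [{"apple"}, {"orange"}],
]


def contains(sequence, pattern):
    """True if the itemsets of `pattern` occur in `sequence` in the same
    order, each one included in a later itemset than the previous one."""
    position = 0  # index of the next pattern itemset still to be matched
    for itemset in sequence:
        if position < len(pattern) and pattern[position] <= itemset:
            position += 1
    return position == len(pattern)


def sequence_support(pattern, database):
    """Fraction of sequences in `database` that contain `pattern`."""
    return sum(1 for s in database if contains(s, pattern)) / len(database)


pattern = [{"bread"}, {"beer"}, {"milk", "fish"}]
pattern_support = sequence_support(pattern, sequences)
print(f"support = {pattern_support:.0%}")
```

Matching each pattern itemset at the earliest possible position is enough here, because an earlier match never prevents a later itemset from being matched.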
So to summarize, there are some major differences between association rule mining and sequential pattern mining. The input is not the same (transaction database vs sequence database). Also the thresholds that are set by the user are not the same (minsup + minconf vs minsup). The result is also different. An association rule does not consider time. It just means that some items often appear together. A sequential pattern is a sequence that is time-ordered.
I have explained the difference between association rules and sequential patterns. For continuous association rules, it is basically the same thing. The difference between continuous association rules and association rules is that the continuous rules are continuously generated (from what I understand).
By the way, you also mention "sequence algorithm". I don't know what you mean by that.
Hope this helps,
Best,
Philippe