Sequential pattern mining question

Posted by: **leetcat**

Date: September 18, 2014 12:40PM

Are these rules supposed to be omitted?

1 ==> 1 #SUP:** #CONF:**

For example, if I have the following transactions:

1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2

1 4 -1 1 3 -1 2 3 -1 1 5 -1 -2

1 5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2

I would expect that a rule would be found:

1 ==> 1 #SUP:** #CONF:**

Thanks!

Edited 2 time(s). Last edit at 09/18/2014 12:43PM by leetcat.

Posted by: **webmasterphilfv**

Date: September 18, 2014 03:54PM

Hi,

For sequential rule mining algorithms such as **RuleGrowth, TRuleGrowth, CMRules, CMDeo, TNS, TopSeqRules and ERMiner**, rules such as 1 ==> 1 are omitted. The definition of a sequential rule used by these algorithms does not allow an item to appear on both the left side and the right side of a rule.
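As a small illustration of that constraint, here is a sketch of the check in Java. The class and method names are made up for this example and are not part of SPMF; the only thing it shows is that a rule qualifies when its two sides share no item.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical helper, not an SPMF class.
class RuleSidesCheck {

    /** Returns true when no item occurs on both sides of the rule. */
    static boolean hasDisjointSides(Set<Integer> antecedent, Set<Integer> consequent) {
        return Collections.disjoint(antecedent, consequent);
    }

    public static void main(String[] args) {
        // 1 ==> 1 has item 1 on both sides, so it is rejected.
        System.out.println(hasDisjointSides(Set.of(1), Set.of(1)));    // false
        // 1 2 ==> 3 shares no item between the two sides.
        System.out.println(hasDisjointSides(Set.of(1, 2), Set.of(3))); // true
    }
}
```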

If you instead use a sequential pattern mining algorithm such as **PrefixSpan, CM-SPAM, CM-SPADE, ClaSP, BIDE, SPAM, SPADE or GSP**, you may get patterns that contain the same item several times. But they will be patterns, not rules.
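To make this concrete, the sketch below counts how many of the three sequences from the question contain item 1 in two different itemsets, that is, the sequential pattern (1)(1). It is not how any of the algorithms above work internally; it is a naive count with made-up class and method names, relying on the SPMF input convention that -1 ends an itemset and -2 ends a sequence.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not an SPMF class.
class RepeatedItemPattern {

    /** Parses one line such as "1 -1 1 2 3 -1 -2" into a list of itemsets. */
    static List<Set<Integer>> parse(String line) {
        List<Set<Integer>> sequence = new ArrayList<>();
        Set<Integer> current = new LinkedHashSet<>();
        for (String token : line.trim().split("\\s+")) {
            int value = Integer.parseInt(token);
            if (value == -2) {
                break;                    // end of the sequence
            } else if (value == -1) {
                sequence.add(current);    // end of the current itemset
                current = new LinkedHashSet<>();
            } else {
                current.add(value);
            }
        }
        return sequence;
    }

    /** True if the pattern's itemsets occur, in order, in distinct itemsets of the sequence. */
    static boolean contains(List<Set<Integer>> sequence, List<Set<Integer>> pattern) {
        int next = 0; // index of the next pattern itemset to match
        for (Set<Integer> itemset : sequence) {
            if (next < pattern.size() && itemset.containsAll(pattern.get(next))) {
                next++;
            }
        }
        return next == pattern.size();
    }

    /** Number of sequences of the database that contain the pattern. */
    static int support(List<String> database, List<Set<Integer>> pattern) {
        int count = 0;
        for (String line : database) {
            if (contains(parse(line), pattern)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> database = List.of(
            "1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2",
            "1 4 -1 1 3 -1 2 3 -1 1 5 -1 -2",
            "1 5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2");
        List<Set<Integer>> pattern = List.of(Set.of(1), Set.of(1));
        System.out.println(support(database, pattern)); // 3
    }
}
```

All three sequences have item 1 in their first and second itemsets, so the pattern has a support of 3 here.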

Lastly, if you use the **RuleGen** algorithm for sequential rule mining, you could also get the rule 1 ==> 1 because it uses a different definition. But RuleGen is quite slow.

Hope this helps!

Best,

Philippe

Edited 2 time(s). Last edit at 09/18/2014 03:56PM by webmasterphilfv.

Posted by: **malsoru**

Date: September 24, 2014 11:04AM

Hi,

What is the difference between "sequential pattern mining" and "sequential rule mining"?

Posted by: **webmasterphilfv**

Date: September 24, 2014 01:10PM

Hi,

I will explain the difference briefly. You may look at the papers for a formal definition.

A **sequential pattern** is a subsequence that appears frequently in a set of sequences.

For example, consider three sequences:

A,B,C,A

A,C,B,A

A,B,C,D

For this database of three sequences, A,C,A is a sequential pattern. It is said to have a "support" (a.k.a. frequency) of 2 because it appears in two sequences (the first and the second; the items do not have to be contiguous).
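Here is a minimal Java sketch of that support count, assuming sequences of single items as in the example. The class and method names are only for illustration.

```java
import java.util.List;

// Hypothetical helper, not an SPMF class.
class PatternSupport {

    /** True if the pattern's items occur in the sequence in the same order (gaps allowed). */
    static boolean isSubsequence(List<String> pattern, List<String> sequence) {
        int next = 0; // index of the next pattern item to match
        for (String item : sequence) {
            if (next < pattern.size() && item.equals(pattern.get(next))) {
                next++;
            }
        }
        return next == pattern.size();
    }

    /** Number of sequences of the database that contain the pattern. */
    static int support(List<String> pattern, List<List<String>> database) {
        int count = 0;
        for (List<String> sequence : database) {
            if (isSubsequence(pattern, sequence)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<List<String>> database = List.of(
            List.of("A", "B", "C", "A"),
            List.of("A", "C", "B", "A"),
            List.of("A", "B", "C", "D"));
        System.out.println(support(List.of("A", "C", "A"), database)); // 2
    }
}
```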

Although sequential patterns are useful, they do not come with a measure of confidence or probability that something will occur.

A **sequential rule** can be seen as a kind of sequential pattern that has a confidence, or probability, that something will happen. It is useful, for example, for making predictions.

For example, for the three sequences above, the rule A B -> C has a support of 2 because it appears in two sequences (the first and the third), and a confidence of 2/3 (about 67%) because A B is followed by C in two of the three sequences where A B appears. As you can see, the confidence is helpful: we may want not only patterns that appear often in a database, but also a measure of how reliable these patterns are.
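The same numbers can be reproduced with a short Java sketch. It assumes that the left side of a rule is an unordered set of items that must all occur before every item of the right side, which is my reading of the definition used by RuleGrowth-style algorithms (please check the papers for the formal version). The class and method names are made up for this example.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not an SPMF class.
class RuleConfidence {

    /** Index of the first position by which all items have been seen, or -1 if some item is missing. */
    static int earliestCompletion(Set<String> items, List<String> sequence) {
        Set<String> missing = new HashSet<>(items);
        for (int i = 0; i < sequence.size(); i++) {
            missing.remove(sequence.get(i));
            if (missing.isEmpty()) {
                return i;
            }
        }
        return -1;
    }

    /** True if all antecedent items occur, and all consequent items occur after them. */
    static boolean matchesRule(Set<String> antecedent, Set<String> consequent, List<String> sequence) {
        int end = earliestCompletion(antecedent, sequence);
        if (end < 0) {
            return false;
        }
        List<String> rest = sequence.subList(end + 1, sequence.size());
        return earliestCompletion(consequent, rest) >= 0;
    }

    /** Number of sequences in which the rule occurs. */
    static int ruleSupport(Set<String> antecedent, Set<String> consequent, List<List<String>> database) {
        int count = 0;
        for (List<String> sequence : database) {
            if (matchesRule(antecedent, consequent, sequence)) count++;
        }
        return count;
    }

    /** Number of sequences that contain every antecedent item. */
    static int antecedentSupport(Set<String> antecedent, List<List<String>> database) {
        int count = 0;
        for (List<String> sequence : database) {
            if (earliestCompletion(antecedent, sequence) >= 0) count++;
        }
        return count;
    }

    /** Confidence = rule support divided by antecedent support. */
    static double confidence(Set<String> antecedent, Set<String> consequent, List<List<String>> database) {
        return (double) ruleSupport(antecedent, consequent, database)
                / antecedentSupport(antecedent, database);
    }

    public static void main(String[] args) {
        List<List<String>> database = List.of(
            List.of("A", "B", "C", "A"),
            List.of("A", "C", "B", "A"),
            List.of("A", "B", "C", "D"));
        Set<String> left = Set.of("A", "B");
        Set<String> right = Set.of("C");
        System.out.println(ruleSupport(left, right, database) + " / "
                + antecedentSupport(left, database)); // 2 / 3
    }
}
```

In the second sequence, C occurs before B, so A and B are not both followed by C. That is why the rule is counted in only two of the three sequences.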

If you are using the SPMF data mining library, you may also check the examples in the documentation and the articles for more details.

Hope this helps,

Edited 1 time(s). Last edit at 09/24/2014 04:58PM by webmasterphilfv.

Posted by: **Hkim**

Date: October 13, 2014 12:11AM

Hello,

I find sequential data mining an interesting technique for extracting patterns from time series datasets. However, data mining is still new to me, and I'm glad that I found the SPMF tool. I tried one of the algorithms, SeqDim_(PrefixSpan+Apriori), with the test data. I followed the instructions, but the program gives no result, even though I have changed minsup to different values. I am not sure why the tool doesn't give any result. What should I do to run this algorithm properly?

Thanks

Hkim

Posted by: **webmasterphilfv**

Date: October 13, 2014 07:24AM

Hi,

Thanks for reporting this problem. It was a bug.

I have fixed the JAR file and the source code of SPMF. You can download it again from the website and it will work.

The problem was in the class:

ca.pfv.spmf.patterns.itemset_array_integers_with_tids_bitset.Itemset

I have replaced this:

    /**
     * Constructor of an empty itemset
     */
    public Itemset(){
        transactionsIds = new BitSet();
    }

by this:

    /**
     * Constructor of an empty itemset
     */
    public Itemset(){
        transactionsIds = new BitSet();
        itemset = new int[0];
    }
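To show why the extra line can matter, here is a simplified, hypothetical stand-in for the two versions of the constructor. It is not the actual SPMF class, and the `size()` method is only an assumed way the array might be used. It illustrates one way an unassigned array field can cause trouble: the field stays `null`, so any code that reads its length fails.

```java
import java.util.BitSet;

// Hypothetical demo, not the SPMF Itemset class.
class EmptyItemsetDemo {

    /** Stand-in for the version before the fix: the array is never assigned. */
    static class BeforeFix {
        int[] itemset;                          // stays null
        BitSet transactionsIds = new BitSet();

        int size() {
            return itemset.length;              // NullPointerException when itemset is null
        }
    }

    /** Stand-in for the version after the fix: the array is empty but not null. */
    static class AfterFix {
        int[] itemset = new int[0];
        BitSet transactionsIds = new BitSet();

        int size() {
            return itemset.length;
        }
    }

    /** Returns true if asking the unfixed version for its size throws. */
    static boolean sizeThrowsBeforeFix() {
        try {
            new BeforeFix().size();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeThrowsBeforeFix());   // true
        System.out.println(new AfterFix().size());   // 0
    }
}
```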

Best,

Philippe
