Hi,
Thanks for using SPMF and posting on the forum.
Here are the answers to your questions.
ANSWER TO QUESTION 1
Yes, and it is normal, because VMSP is an algorithm for discovering maximal sequential patterns. A maximal sequential pattern is a sequential pattern that is frequent (its support is no less than "minsup") and that is not included in any larger frequent sequential pattern. Thus, as you noticed, some patterns are missing, but that is because they are not maximal.
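To make the definition concrete, here is a minimal Python sketch of the inclusion test and of the "keep only maximal patterns" filter. It is only an illustration of the definition, not how VMSP is implemented in SPMF. The function names (parse, is_included, maximal_only) and the small list of patterns are made up for this example; the input uses the same notation as the SPMF output, where -1 ends an itemset.

```python
def parse(line):
    """Parse a pattern such as '1 2 -1 6 -1' into a list of itemsets."""
    itemsets, current = [], set()
    for token in line.split():
        if token == "-1":              # -1 closes the current itemset
            itemsets.append(frozenset(current))
            current = set()
        else:
            current.add(int(token))
    return itemsets


def is_included(small, large):
    """True if each itemset of `small` maps, in order, to a superset in `large`."""
    pos = 0
    for itemset in small:
        # Greedily look for the earliest itemset of `large` that contains it.
        while pos < len(large) and not itemset <= large[pos]:
            pos += 1
        if pos == len(large):
            return False
        pos += 1
    return True


def maximal_only(lines):
    """Keep the patterns that are not included in another pattern of the list."""
    parsed = [parse(line) for line in lines]
    return [
        line
        for line, p in zip(lines, parsed)
        if not any(p != q and is_included(p, q) for q in parsed)
    ]


# Hypothetical set of frequent patterns, for illustration only.
frequent = ["1 -1", "1 -1 3 -1", "1 -1 3 -1 2 -1", "2 -1 3 -1"]
print(maximal_only(frequent))
```

Here "1 -1" and "1 -1 3 -1" are dropped because both are included in "1 -1 3 -1 2 -1", while "2 -1 3 -1" is kept because 2 never appears before 3 in any other pattern.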
Let me explain that in more detail using the example from the documentation.
If we set minsup = 75 %, we obtain the following patterns with VMSP:
6 -1 SUP: 3
5 -1 SUP: 3
4 -1 3 -1 SUP: 3
2 -1 3 -1 SUP: 3
1 -1 3 -1 3 -1 SUP: 3
1 -1 3 -1 2 -1 SUP: 3
Now, if we set minsup = 50%, we obtain the following patterns with VMSP:
6 -1 2 -1 3 -1 SUP: 2
5 -1 2 -1 3 -1 SUP: 2
4 -1 3 -1 2 -1 SUP: 2
1 2 -1 6 -1 SUP: 2
1 -1 3 -1 3 -1 SUP: 3
1 -1 2 -1 3 -1 SUP: 2
5 -1 6 -1 3 -1 2 -1 SUP: 2
5 -1 1 -1 3 -1 2 -1 SUP: 2
1 2 -1 4 -1 3 -1 SUP: 2
1 -1 2 3 -1 1 -1 SUP: 2
Note that most of the patterns found for minsup = 75 % are not found for minsup = 50 % because they are included in the patterns having a support of 50 % and thus are not maximal anymore when we set minsup = 50%.
But you can still notice that the pattern 1 -1 3 -1 3 -1 is there with a support of 3.
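If you want to check this yourself, the following Python sketch compares the two result lists given above. It assumes the patterns are copied by hand from the VMSP output without the "SUP:" part; the helper names are hypothetical and this is not part of SPMF.

```python
def parse(line):
    """Parse a pattern such as '1 2 -1 6 -1' into a list of itemsets."""
    itemsets, current = [], set()
    for token in line.split():
        if token == "-1":              # -1 closes the current itemset
            itemsets.append(frozenset(current))
            current = set()
        else:
            current.add(int(token))
    return itemsets


def is_included(small, large):
    """True if each itemset of `small` maps, in order, to a superset in `large`."""
    pos = 0
    for itemset in small:
        while pos < len(large) and not itemset <= large[pos]:
            pos += 1
        if pos == len(large):
            return False
        pos += 1
    return True


# Patterns reported by VMSP for minsup = 75 % and minsup = 50 % (from this post).
minsup_75 = ["6 -1", "5 -1", "4 -1 3 -1", "2 -1 3 -1",
             "1 -1 3 -1 3 -1", "1 -1 3 -1 2 -1"]
minsup_50 = ["6 -1 2 -1 3 -1", "5 -1 2 -1 3 -1", "4 -1 3 -1 2 -1",
             "1 2 -1 6 -1", "1 -1 3 -1 3 -1", "1 -1 2 -1 3 -1",
             "5 -1 6 -1 3 -1 2 -1", "5 -1 1 -1 3 -1 2 -1",
             "1 2 -1 4 -1 3 -1", "1 -1 2 3 -1 1 -1"]

larger = [parse(line) for line in minsup_50]

# A 75 % pattern stays maximal only if no different 50 % pattern includes it.
still_maximal = [
    line for line in minsup_75
    if not any(parse(line) != q and is_included(parse(line), q) for q in larger)
]
print(still_maximal)  # ['1 -1 3 -1 3 -1']
```

For instance, "1 -1 3 -1 2 -1" disappears because it is included in "5 -1 1 -1 3 -1 2 -1", whereas no pattern found for minsup = 50 % contains 1, then 3, then 3 again other than "1 -1 3 -1 3 -1" itself.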
Now, if we set minsup = 25%, we find only the following four patterns, each with a support of 1, because they are the maximal patterns. Other patterns are not found because they are all included in these four patterns.
5 -1 7 -1 1 6 -1 3 -1 2 -1 3 -1 SUP: 1
1 4 -1 3 -1 2 3 -1 1 5 -1 SUP: 1
5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 SUP: 1
1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 SUP: 1
Thus, to answer your question, this behavior is normal by the definition of a maximal sequential pattern.
If you want "all patterns", you may consider using an algorithm such as CM-SPAM, which finds all frequent sequential patterns. You may also consider using ClaSP for "closed sequential patterns". Maximal sequential patterns are a subset of closed sequential patterns, which are in turn a subset of all frequent sequential patterns.
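The relationship between the three kinds of patterns can be illustrated with a small Python sketch. The data is a hand-built toy example, not the documentation example: I assume a database of four sequences (1,3,2), (1,3,2), (1,3) and (3), with one item per itemset to keep the code short, and minsup = 2 sequences. A pattern is closed if no larger pattern has the same support, and maximal if no larger pattern is frequent at all.

```python
def is_subseq(small, large):
    """True if the items of `small` appear in `large` in the same order."""
    it = iter(large)
    return all(item in it for item in small)


# All frequent patterns of the toy database with their supports (minsup = 2).
frequent = {
    (1,): 3, (3,): 4, (2,): 2,
    (1, 3): 3, (1, 2): 2, (3, 2): 2,
    (1, 3, 2): 2,
}

# Closed: no strictly larger frequent pattern has the same support.
closed = {
    p for p, sup in frequent.items()
    if not any(q != p and is_subseq(p, q) and frequent[q] == sup for q in frequent)
}

# Maximal: no strictly larger frequent pattern at all.
maximal = {
    p for p in frequent
    if not any(q != p and is_subseq(p, q) for q in frequent)
}

print(sorted(closed))   # [(1, 3), (1, 3, 2), (3,)]
print(sorted(maximal))  # [(1, 3, 2)]
print(maximal <= closed <= set(frequent))  # True
```

So in this toy case there are 7 frequent patterns, 3 closed patterns and 1 maximal pattern. Closed patterns let you recover the support of every frequent pattern, while maximal patterns give the most compact summary but lose the supports of the patterns they include.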
ANSWER TO QUESTION 2
You are right, this was an error in the documentation, because that feature was not implemented for VMSP. I have just updated the code on the website so that it can now show sequence identifiers. I have added the feature to VMSP, TKS, VGEN and SPAM. Thanks for reporting this issue. I will also update the documentation later.
Best regards,
Philippe
Edited 3 time(s). Last edit at 08/24/2015 08:40PM by webmasterphilfv.