Issue in My Dataset for HUSPM
Posted by: P N RAMESH
Date: March 29, 2020 06:12AM

I am working with a high-utility sequential pattern mining algorithm. I get the following error with my dataset:

Index 5 out of bounds for length 5
at SPFM/ca.pfv.spmf.algorithms.sequentialpatterns.uspan.AlgoUSpan.runAlgorithm(AlgoUSpan.java:395)


My dataset is:

1[9] 2[7] -1 -2 SUtility:16
1[8] 3[8] 4[8] 5[8] -1 -2 SUtility:32
6[9] 7[8] 8[9] -1 -2 SUtility:26
9[8] -1 -2 SUtility:8
10[8] -1 -2 SUtility:8
11[8] 3[8] -1 12[6] 4[8] 5[8] -1 -2 SUtility:38
10[8] -1 -2 SUtility:8
6[6] 8[6] 13[8] -1 -2 SUtility:20
14[8] 15[8] 16[8] -1 -2 SUtility:24


If I remove the 6th sequence, it works.

Is there anything wrong with the 6th sequence?

Thanks in advance.

Re: Issue in My Dataset for HUSPM
Date: March 29, 2020 09:39AM

Good evening!

Thanks for using SPMF! The problem is the following:

It may not be explained clearly in the documentation, but the algorithm assumes that the items within an itemset are sorted in ascending order (e.g. 1, 2, 3, 4...). If that order is not respected, the algorithm may produce incorrect results or fail, as it did here.

So this sequence:
11[8] 3[8] -1 12[6] 4[8] 5[8] -1 -2 SUtility:38

should be replaced by:

3[8] 11[8] -1 4[8] 5[8] 12[6] -1 -2 SUtility:38

so that the items within each itemset are in ascending order.

I will explain this more clearly in the documentation. Why this order? Because it makes some optimizations possible.

With this change, it works.
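
If the dataset is large, the reordering can be done by a small program instead of by hand. The following is only a minimal sketch and is not part of SPMF: the class name ItemsetSorter and its methods are hypothetical, and it assumes the input format shown in this thread (tokens of the form item[utility], each itemset terminated by -1, and the sequence ended by -2 followed by SUtility:...).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of SPMF.
class ItemsetSorter {

    /**
     * Rewrites one sequence line so that the items inside each itemset
     * are sorted by ascending item id. Assumes every itemset is
     * terminated by -1, as in the dataset above.
     */
    static String sortLine(String line) {
        StringBuilder out = new StringBuilder();
        List<String> itemset = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (token.equals("-1")) {
                // End of an itemset: emit its items in ascending id order.
                itemset.sort(Comparator.comparingInt(ItemsetSorter::itemId));
                for (String item : itemset) {
                    out.append(item).append(' ');
                }
                itemset.clear();
                out.append("-1 ");
            } else if (token.equals("-2") || token.startsWith("SUtility:")) {
                // End-of-sequence marker and sequence utility are copied as-is.
                out.append(token).append(' ');
            } else {
                // A regular token such as 11[8].
                itemset.add(token);
            }
        }
        return out.toString().trim();
    }

    /** Extracts the item id from a token such as 11[8]. */
    static int itemId(String token) {
        return Integer.parseInt(token.substring(0, token.indexOf('[')));
    }

    public static void main(String[] args) {
        String line = "11[8] 3[8] -1 12[6] 4[8] 5[8] -1 -2 SUtility:38";
        System.out.println(sortLine(line));
    }
}
```

Applied to the 6th sequence, this yields the corrected line given above. Sequences whose itemsets are already sorted are left unchanged, so the method can be run on every line of the file.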
