The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
Problem using clustering algorithm on time series data
Posted by: Marta
Date: May 03, 2019 12:16AM

Hi,

I get error messages whenever I try to use any of the clustering algorithms on time series data. I structured my data file like the contextSAX.txt test file, but I still get various error messages with every clustering algorithm.

I am trying to find out whether groups of learners progressed through the task in similar ways, i.e., whether we can cluster students' processes into certain types of progress.

Below is an example of the data. I have 9 groups in total.

@NAME=GROUP1
300,300,300,300,200,200,300,300,300,300,200,200,200,200,150,100,100,100,100,100,150,100,300,300,300,300,100,100,150,100,300,300,300,300,350,300,200,200,200,200,300,300,300,300,300,300,300,300,300,300,300,300,100,100,100,100,100,100,200,200,350,300,350,300,350,300,300,300,150,100,350,300,300,300,300,300,300,300,300,300,300,300,300,300,300,300,300,300,350,300,300,300,300,300,300,300,300,300,300,300,275,300,300,300,200,200,400,400,300,300,200,200
@NAME=GROUP2
200,200,200,200,300,300,100,100,100,100,100,100,300,300,100,100,100,100,300,300,100,100,300,300,100,100,300,300,300,300,300,300,300,300,300,300,300,300,350,300,150,100,300,300,350,300,300,300,300,300,300,300,300,300,300,300,300,300,350,300,300,300,300,300,300,300,100,100,150,100,300,300,300,300,350,300,350,300,350,300,350,300,300,300,300,300,300,300,150,100,300,300,150,100,350,300,300,300,300,300,200,200

Any help would be appreciated!
Kind regards,
Marta

Re: Problem using clustering algorithm on time series data
Date: May 03, 2019 06:52AM

Hi,

Thanks for using SPMF!

I have checked your file. The problem is that all lines should have the same length. The first and second lines have 111 and 101 values, respectively. Thus, they cannot be compared, and this produces an error. If you remove some values so that both lines have 101 values, it will work.
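To check this before running an algorithm, here is a minimal Python sketch (not part of SPMF) that counts the values on each line and cuts every series down to the shortest length. It assumes the layout shown in the post above: a "@NAME=" line followed by one line of comma-separated values. The function names (parse_series, truncate_to_shortest, to_text) are made up for this illustration, and the sample data is a shortened toy example, not the real file.

```python
from typing import Dict, List


def parse_series(text: str, separator: str = ",") -> Dict[str, List[str]]:
    """Read '@NAME=' headers, each followed by one line of separated values.

    Values are kept as strings so they are written back unchanged.
    """
    series: Dict[str, List[str]] = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("@NAME="):
            name = line[len("@NAME="):]
            continue
        values = [v.strip() for v in line.split(separator) if v.strip()]
        # Fall back to a generated name if a data line has no header (assumption).
        key = name if name is not None else f"series{len(series) + 1}"
        series[key] = values
        name = None
    return series


def truncate_to_shortest(series: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Cut every series to the length of the shortest one (drops trailing values)."""
    shortest = min(len(values) for values in series.values())
    return {name: values[:shortest] for name, values in series.items()}


def to_text(series: Dict[str, List[str]], separator: str = ",") -> str:
    """Write the series back in the same '@NAME=' layout."""
    lines = []
    for name, values in series.items():
        lines.append(f"@NAME={name}")
        lines.append(separator.join(values))
    return "\n".join(lines) + "\n"


# Toy data with unequal lengths (5 and 3 values).
sample = "@NAME=GROUP1\n300,300,200,100,150\n@NAME=GROUP2\n200,200,300\n"

parsed = parse_series(sample)
for name, values in parsed.items():
    print(name, len(values))  # report the length of each series

fixed = truncate_to_shortest(parsed)
print(to_text(fixed))
```

Note that truncating throws away the values at the end of the longer series. Whether that is acceptable, or whether padding or resampling would be better, depends on what the values represent.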

Best regards,

Philippe

Re: Problem using clustering algorithm on time series data
Posted by: Marta
Date: May 05, 2019 11:16PM

Hi,

Thank you for your quick answer!

Kind regards,
Marta
