The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
mine sequential patterns in repeated items itemsets
Posted by: Veronica
Date: September 22, 2014 11:58AM

Hello, and apologies for this naive question. I'm a newbie in data mining, trying to find sequential patterns in sequences of itemsets that contain repeated items.

I've been trying to use these Sequential Pattern Mining algorithms: PrefixSpan, GSP, SPADE, SPAM...

But all of them assume that no item appears twice in the same itemset and that the items in an itemset are lexically ordered.
In my case sequences look like this one:
S1 = ((1 1 1 2 2 1 2 1 2 1 2 ), (1 2 3 1 2 2 1), (1 3 1 2 2 1 1 1 2)) or this one:
S2 = (1 1 1 2 1 1 1 1 1 3 3 3 1 1 1 1 2 1 1 1 1 1 1)

If I understand correctly, I can't use any of the Sequential Pattern Mining algorithms, because all of them assume that no item appears twice in the same itemset. Am I right? Which algorithm could I use?

Re: mine sequential patterns in repeated items itemsets
Date: September 22, 2014 01:49PM

Hi,

Welcome to the forum.

Yes, algorithms such as PrefixSpan, GSP, SPADE, etc. assume that no item can appear twice in the same itemset.

If you want to apply algorithms such as PrefixSpan, GSP, SPADE, etc. to your data, you would thus need to remove the duplicate items within each itemset. For example:

S1 = ((1 1 1 2 2 1 2 1 2 1 2 ), (1 2 3 1 2 2 1), (1 3 1 2 2 1 1 1 2))

would become:

S1 = ((1 2), (1 2 3), (1 2 3))
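
As a rough illustration of this first option, here is a minimal Python sketch that removes duplicate items inside each itemset and sorts what remains. The function names `dedupe_itemsets` and `to_spmf_line` are made up for this example and are not part of SPMF. The second function assumes the usual SPMF text format for sequence databases, where -1 ends an itemset and -2 ends a sequence.

```python
def dedupe_itemsets(sequence):
    """Return a copy of the sequence where each itemset keeps one copy
    of each item, in ascending order."""
    return [sorted(set(itemset)) for itemset in sequence]


def to_spmf_line(sequence):
    """Format one sequence as a single text line.

    Assumption: -1 closes an itemset and -2 closes the sequence.
    """
    parts = []
    for itemset in sequence:
        parts.extend(str(item) for item in itemset)
        parts.append("-1")
    parts.append("-2")
    return " ".join(parts)


# S1 from the question above.
s1 = [
    [1, 1, 1, 2, 2, 1, 2, 1, 2, 1, 2],
    [1, 2, 3, 1, 2, 2, 1],
    [1, 3, 1, 2, 2, 1, 1, 1, 2],
]

deduped = dedupe_itemsets(s1)
print(deduped)                # [[1, 2], [1, 2, 3], [1, 2, 3]]
print(to_spmf_line(deduped))  # 1 2 -1 1 2 3 -1 1 2 3 -1 -2
```

Note that this throws away both how often an item occurs and the order of items inside an itemset, so it only makes sense if that information is not needed.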

Another way may be to recode your dataset differently, for example:

S1 = (1) (1) (1) (2) (2) (1 2) (1 2) (1 2), (1 2) (1 3) (2) (2) ...

I don't know if it would make sense for your application, though.
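
One possible way to automate such a recoding is sketched below in Python. It is only one interpretation: it starts a new itemset whenever the next item is not strictly greater than the previous one, so every resulting itemset is duplicate-free and lexically ordered. It does not reproduce the exact grouping written above, and the names `split_itemset` and `recode_sequence` are hypothetical.

```python
def split_itemset(itemset):
    """Split one itemset into consecutive runs of strictly increasing items.

    Each run contains no duplicate and is already sorted, which is what
    the classical algorithms expect from an itemset.
    """
    runs, current = [], []
    for item in itemset:
        # A repeated or smaller item cannot join the current run.
        if current and item <= current[-1]:
            runs.append(current)
            current = []
        current.append(item)
    if current:
        runs.append(current)
    return runs


def recode_sequence(sequence):
    """Recode a whole sequence by splitting every itemset in turn."""
    recoded = []
    for itemset in sequence:
        recoded.extend(split_itemset(itemset))
    return recoded


first_itemset = [1, 1, 1, 2, 2, 1, 2, 1, 2, 1, 2]
print(split_itemset(first_itemset))
# [[1], [1], [1, 2], [2], [1, 2], [1, 2], [1, 2]]
```

Unlike removing duplicates, this keeps every occurrence and the original order of the items, at the cost of turning items that were in the same itemset into items of consecutive itemsets.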

Another possibility is to look at "quantitative sequential pattern mining algorithms" such as SQUIRE ( http://www.cs.ubc.ca/~rng/psdepository/shim2007.pdf ), which is not offered in SPMF. This kind of algorithm allows items to have quantities in itemsets. For example, the item 1 may appear with a quantity of 2 in an itemset.
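
To give an idea of what "items with quantities" could look like for this data, here is a small Python sketch that counts how many times each item occurs in each itemset. It only illustrates the representation. It is not the input format of SQUIRE or of any particular tool, and `to_quantities` is a name invented for this example.

```python
from collections import Counter


def to_quantities(sequence):
    """Turn each itemset into a mapping {item: number of occurrences},
    with items listed in ascending order."""
    return [dict(sorted(Counter(itemset).items())) for itemset in sequence]


# S1 from the question above.
s1 = [
    [1, 1, 1, 2, 2, 1, 2, 1, 2, 1, 2],
    [1, 2, 3, 1, 2, 2, 1],
    [1, 3, 1, 2, 2, 1, 1, 1, 2],
]

print(to_quantities(s1))
# [{1: 6, 2: 5}, {1: 3, 2: 3, 3: 1}, {1: 5, 2: 3, 3: 1}]
```

This keeps the frequency of each item but, like removing duplicates, it loses the order of items inside an itemset.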

Best,



Edited 2 time(s). Last edit at 09/22/2014 01:50PM by webmasterphilfv.

Re: mine sequential patterns in repeated items itemsets
Posted by: Veronica
Date: September 24, 2014 09:07AM

Thanks so much, Philippe. I'll treat repeated items as different itemsets, as you suggest; that should work for my dataset.

Thanks!

Re: mine sequential patterns in repeated items itemsets
Date: September 24, 2014 10:25AM

OK, glad it helps. I hope that you will get some good results.

Best,

Philippe


