mine sequential patterns in repeated items itemsets

The Data Mining Forum




Posted by: Veronica

Date: September 22, 2014 11:58AM

Hello, apologies for this naive question. I'm a newbie in data mining, trying to find sequential patterns in sequences whose itemsets contain repeated items.

I've been trying to use these sequential pattern mining algorithms: PrefixSpan, GSP, SPADE, SPAM, etc.

But all of them assume that no item appears twice in the same itemset and that the items in an itemset are lexically ordered.
In my case sequences look like this one:
S1 = ((1 1 1 2 2 1 2 1 2 1 2 ), (1 2 3 1 2 2 1), (1 3 1 2 2 1 1 1 2)) or this one:
S2 = (1 1 1 2 1 1 1 1 1 3 3 3 1 1 1 1 2 1 1 1 1 1 1)

If I understand correctly, I can't use any of the sequential pattern mining algorithms, because all of them assume that no item appears twice in the same itemset. Am I right? Which algorithm could I use?


Re: mine sequential patterns in repeated items itemsets

Posted by: webmasterphilfv

Date: September 22, 2014 01:49PM

Hi,

Welcome to the forum.

Yes, algorithms such as PrefixSpan, GSP, SPADE, etc. assume that no item can appear twice in the same itemset.

If you want to apply these algorithms to your data, you would thus need to remove the duplicate items in each itemset. For example:

S1 = ((1 1 1 2 2 1 2 1 2 1 2 ), (1 2 3 1 2 2 1), (1 3 1 2 2 1 1 1 2))

would become:

S1 = ((1 2), (1 2 3), (1 2 3))
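
The duplicate removal above can be sketched in a few lines of Python. This is only an illustration, assuming a sequence is held as a list of lists of integers; the function name `dedupe_itemsets` is made up for this example and is not part of SPMF. Sorting each itemset also gives the lexical order that these algorithms expect.

```python
def dedupe_itemsets(sequence):
    """Keep one copy of each item per itemset, in sorted order.

    `sequence` is assumed to be a list of itemsets, each a list of ints.
    """
    return [sorted(set(itemset)) for itemset in sequence]


# S1 from the question, written as a list of itemsets
s1 = [
    [1, 1, 1, 2, 2, 1, 2, 1, 2, 1, 2],
    [1, 2, 3, 1, 2, 2, 1],
    [1, 3, 1, 2, 2, 1, 1, 1, 2],
]

print(dedupe_itemsets(s1))  # [[1, 2], [1, 2, 3], [1, 2, 3]]
```

Note that this throws away both the number of repetitions and the order of items inside each itemset, so it only makes sense if that information is not needed.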

Another way may be to recode your dataset differently, for example:

S1 = (1) (1) (1) (2) (2) (1 2) (1 2) (1 2), (1 2) (1 3) (2) (2) ...

I don't know if it would make sense for your application, though.
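
One simple reading of this recoding is to put every occurrence of an item into its own itemset, so that no itemset can contain a duplicate. The sketch below does exactly that; it is not identical to the grouping in the example above, and the names `split_into_singletons` and `to_spmf_line` are hypothetical helpers. The second helper assumes the SPMF sequence input format as I understand it from the SPMF documentation: items separated by spaces, `-1` closing each itemset and `-2` closing the sequence.

```python
def split_into_singletons(sequence):
    """Turn every item occurrence into its own one-item itemset,
    preserving the original order of occurrences."""
    return [[item] for itemset in sequence for item in itemset]


def to_spmf_line(sequence):
    """Format one sequence as a text line, assuming the SPMF convention
    of -1 after each itemset and -2 at the end of the sequence."""
    parts = []
    for itemset in sequence:
        parts.extend(str(item) for item in itemset)
        parts.append("-1")
    parts.append("-2")
    return " ".join(parts)


# S2 from the question: a single itemset with many repeated items
s2 = [[1, 1, 1, 2, 1, 1, 1, 1, 1, 3, 3, 3, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1]]

recoded = split_into_singletons(s2)
print(recoded[:5])                 # [[1], [1], [1], [2], [1]]
print(to_spmf_line(recoded[:3]))   # 1 -1 1 -1 1 -1 -2
```

With this recoding the order of repetitions is kept, but items that were originally in the same itemset are now treated as if they occurred one after another.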

Another possibility is to look at "quantitative sequential pattern mining algorithms" such as SQUIRE ( http://www.cs.ubc.ca/~rng/psdepository/shim2007.pdf ), which is not offered in SPMF. This algorithm allows items to have quantities in itemsets. For example, item 1 may appear with a quantity of 2 in an itemset.
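
To give an idea of what "items with quantities" means for data like S1, here is a small Python sketch that counts how many times each item occurs in each itemset. It only illustrates the data representation; it is not the SQUIRE algorithm, and the input format that a real quantitative mining implementation expects may be different. The name `to_quantities` is made up for this example.

```python
from collections import Counter


def to_quantities(sequence):
    """Replace each itemset by a dict mapping item -> number of occurrences,
    with keys in sorted order."""
    return [dict(sorted(Counter(itemset).items())) for itemset in sequence]


# S1 from the question, written as a list of itemsets
s1 = [
    [1, 1, 1, 2, 2, 1, 2, 1, 2, 1, 2],
    [1, 2, 3, 1, 2, 2, 1],
    [1, 3, 1, 2, 2, 1, 1, 1, 2],
]

for itemset in to_quantities(s1):
    print(itemset)
# {1: 6, 2: 5}
# {1: 3, 2: 3, 3: 1}
# {1: 5, 2: 3, 3: 1}
```

Here the repetition counts are kept, but the order of the items inside each itemset is lost.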

Best,



Re: mine sequential patterns in repeated items itemsets

Posted by: Veronica

Date: September 24, 2014 09:07AM

Thanks so much, Philippe. I'll treat repeated items as separate itemsets, as you suggest. That should work for my dataset.

Thanks!


Re: mine sequential patterns in repeated items itemsets

Posted by: webmasterphilfv

Date: September 24, 2014 10:25AM

OK, glad it helps. I hope that you will get some good results.

Best,

Philippe
