The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
BMSWebView1 dataset with GSP, SPADE, CLOSPAN, etc.
Posted by: sedighe
Date: April 14, 2014 12:19PM

Hello. On this page:
http://www.philippe-fournier-viger.com/spmf/index.php?link=datasets.php
it says that "BMSWebView1 (Gazelle) (KDD CUP 2000)" is for sequential pattern mining. I tested it in SPMF with GSP, SPADE, CloSpan, etc., but I do not get any output. Why? Please help me.
I want three large databases to run:
GSP, SPADE, PrefixSpan, CloSpan, BIDE, SeqDim
Which databases should I select? Note that I need at least two large databases.



Edited 2 time(s). Last edit at 04/14/2014 03:06PM by webmasterphilfv.

Re: select DB
Date: April 14, 2014 03:02PM

It depends on the value of the "minsup" parameter that you are using.

For BMSWebView1, you need to set minsup very low to get results.

For example, with minsup = 0.01 you will get 77 sequential patterns with GSP.

The lower you set minsup, the more patterns you will find.


For your experiments, you may use BMSWebView1, BIBLE, Kosarak, SIGN, LEVIATHAN, FIFA, etc.

They are all suitable for sequential pattern mining.



Edited 2 time(s). Last edit at 04/14/2014 03:08PM by webmasterphilfv.

Re: BMSWebView1 dataset with GSP, SPADE, CLOSPAN, etc.
Posted by: sedighe
Date: April 14, 2014 09:23PM

Thank you. I want two databases that work with many minsup values (low, medium and large), databases that are good and complete. Which ones are best?

Re: BMSWebView1 dataset with GSP, SPADE, CLOSPAN, etc.
Date: April 15, 2014 03:58AM

The lower you set minsup, the more patterns you will get, and the number of patterns can increase exponentially as minsup decreases. As the number of patterns increases, the algorithms get slower and consume more memory.

So if you set minsup too low, the algorithm may not terminate in a reasonable time or may run out of memory. And if you set minsup too high, you will get no patterns, or not enough.

Now, what is "too high" or "too low" depends on the dataset and the algorithm.
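
The effect of minsup on the number of patterns can be seen on a toy example. The sketch below is a simplified PrefixSpan-style miner written for illustration only: it handles sequences of single items (no itemsets), it is not the SPMF implementation, and the four-sequence database and the function name `mine` are made up.

```python
from math import ceil

def mine(db, minsup_rel):
    """Return {pattern: support} for all sequential patterns with
    relative support >= minsup_rel (sequences of single items only)."""
    min_count = max(1, ceil(minsup_rel * len(db) - 1e-9))
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, position to resume scanning from)
        extensions = {}
        for sid, start in projected:
            seen = set()
            for pos in range(start, len(db[sid])):
                item = db[sid][pos]
                if item not in seen:  # keep only the first occurrence per sequence
                    seen.add(item)
                    extensions.setdefault(item, []).append((sid, pos + 1))
        for item, proj in sorted(extensions.items()):
            if len(proj) >= min_count:  # one entry per sequence, so len(proj) is the support
                pattern = prefix + (item,)
                results[pattern] = len(proj)
                grow(pattern, proj)

    grow((), [(sid, 0) for sid in range(len(db))])
    return results

# Made-up toy database of four sequences.
toy_db = [["a", "b", "c"], ["a", "c"], ["a", "b"], ["b", "c"]]
for minsup in (1.0, 0.75, 0.5, 0.25):
    print(minsup, len(mine(toy_db, minsup)))
```

On this toy database the pattern count goes from 0 at minsup = 1.0 to 3, 6 and 7 as minsup is lowered to 0.75, 0.5 and 0.25. On real datasets with hundreds of items and long sequences, the same growth is what makes very low minsup values expensive.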

This picture from my PAKDD 2014 paper shows a comparison of the performance of the algorithms for different minsup values on the main datasets from the webpage:





Edited 1 time(s). Last edit at 04/15/2014 03:59AM by webmasterphilfv.

Re: BMSWebView1 dataset with GSP, SPADE, CLOSPAN, etc.
Posted by: sedighe
Date: April 15, 2014 01:21AM

For multi-dimensional (MD) sequence mining, what should I do? For example, for SeqDim, which database should I use?

Re: BMSWebView1 dataset with GSP, SPADE, CLOSPAN, etc.
Date: April 15, 2014 03:59AM

For SeqDIM, there is no public dataset.



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).