The Data Mining Forum
 
How to use custom separator for end of itemset and sequence in PrefixSpan?
Posted by: Huang
Date: May 25, 2020 11:22PM

How to use custom separator for end of itemset and sequence instead of -1 and -2 in PrefixSpan?

Re: How to use custom separator for end of itemset and sequence in PrefixSpan?
Date: May 26, 2020 01:57AM

Hi,

1) If your input file uses other separators instead of -1 and -2 and you want to use SPMF, you could simply open the file in a text editor and replace those separators with -1 and -2 using the "find and replace" function.

Another way is to write a small program using any programming language to convert your file.
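
For example, here is a minimal sketch of such a conversion program in Python. It assumes (this is only an illustration, your file may differ) that items and separators are separated by spaces, that "|" marks the end of an itemset and that ";" marks the end of a sequence. The function names convert_line and convert_file are made up for this example and are not part of SPMF.

```python
def convert_line(line, itemset_sep="|", sequence_end=";"):
    """Rewrite one sequence so that it uses the -1 / -2 separators expected by SPMF.

    itemset_sep and sequence_end are the (assumed) custom separators of the input.
    Replacement is done token by token, so item values are never modified.
    """
    mapping = {itemset_sep: "-1", sequence_end: "-2"}
    return " ".join(mapping.get(token, token) for token in line.split())


def convert_file(src_path, dst_path, itemset_sep="|", sequence_end=";"):
    """Convert every non-empty line of src_path and write the result to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():
                dst.write(convert_line(line, itemset_sep, sequence_end) + "\n")
```

With these assumed separators, convert_line("1 | 1 2 3 | ;") gives "1 -1 1 2 3 -1 -2". If your separators are different, pass them as the second and third arguments.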

2) Now, if you want to see a different type of separator in the output file, you could check the documentation of PrefixSpan:
http://www.philippe-fournier-viger.com/spmf/PrefixSpan.php

There is an example input file like this:

@CONVERTED_FROM_TEXT
@ITEM=1=apple
@ITEM=2=orange
@ITEM=3=tomato
@ITEM=4=milk
@ITEM=5=bread
@ITEM=6=noodle
@ITEM=7=rice
@ITEM=-1=|
1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2
1 4 -1 3 -1 2 3 -1 1 5 -1 -2
5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2
5 -1 7 -1 1 6 -1 3 -1 2 -1 3 -1 -2

and then the result file would look like this:

apple | #SUP: 4
orange | #SUP: 4
tomato | #SUP: 4
apple | orange | #SUP: 4

As you can see in the above results, the -1 separator has been replaced by | in the output file.

But this feature works in the GUI of SPMF.
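
If you are not using the GUI, a similar result could be obtained by post-processing the output file yourself. The following Python sketch is only an illustration: it assumes that a raw output line looks like "1 -1 2 -1 #SUP: 4" and that the names are declared with @ITEM=id=name lines as in the example above. The functions read_item_names and rename_pattern are hypothetical helpers written for this post, not part of SPMF.

```python
def read_item_names(lines):
    """Collect the @ITEM=<id>=<name> declarations into a dictionary."""
    names = {}
    for line in lines:
        if line.startswith("@ITEM="):
            # Split on the first two '=' only, so the name itself may contain '='.
            _, item_id, name = line.rstrip("\n").split("=", 2)
            names[item_id] = name
    return names


def rename_pattern(line, names):
    """Replace item ids and separators in a pattern, keeping the '#SUP: n' part as is."""
    pattern, marker, support = line.partition("#SUP:")
    tokens = [names.get(token, token) for token in pattern.split()]
    return (" ".join(tokens) + " " + marker + support).strip()


header = ["@CONVERTED_FROM_TEXT", "@ITEM=1=apple", "@ITEM=2=orange", "@ITEM=-1=|"]
names = read_item_names(header)
print(rename_pattern("1 -1 2 -1 #SUP: 4", names))
```

Because the separator is just another entry in the dictionary, changing the @ITEM=-1=| line (or adding an entry for -2) is enough to display a different symbol.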

3) If you want to modify the code of PrefixSpan and you are a bit familiar with Java, it is not very hard: you could find the -1 and -2 in the code of PrefixSpan and replace them with something else.

----
So that is the main idea. It could be a good idea to add a feature in SPMF to let the user choose a separator. This has not been implemented yet, but it could be done in the future.

Best regards,

Philippe


