The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
Item labels in GoKrimp not working?
Posted by: vrodriguezf
Date: January 11, 2019 10:11AM

Hi,

I cannot manage to reproduce the example given in the documentation for using the GoKrimp algorithm with item labels.
link to documentation
I am using a separate .lab file, just as the documentation says. Could anybody double-check whether this really works?

Best!



Edited 1 time(s). Last edit at 01/11/2019 10:11AM by vrodriguezf.

Re: Item labels in GoKrimp not working?
Date: January 11, 2019 09:39PM

Hi,

I have checked and it does not work with jmlr_*.dat and jmlr*.lab

But GoKrimp is working correctly with test_goKrimp.dat and test_goKrimp.lab.

I think that there is some error in the file format of jmlr*.dat and jmlr*.lab because it contains the item "0". You can ignore these files and just look at "test_goKrimp.dat"/ "test_goKrimp.lab" as example to see how it works.

Best,

Philippe

Re: Item labels in GoKrimp not working?
Posted by: vrodriguezf
Date: January 12, 2019 01:23AM

Hi,

Are you running it from the command line or from the source code? I am doing it from the command line, using the command "java -jar spmf.jar run GoKrimp test_goKrimp.dat output.txt test_goKrimp.lab", and I get the same results as the documentation in the output file, but no labels at all :-(

Re: Item labels in GoKrimp not working?
Date: January 12, 2019 04:58AM

Hi,

I see. There was indeed a bug in the command line/user interface. It was not working when no label file was provided. I have fixed it and uploaded a new .jar file and .zip file. You can download it again:

http://www.philippe-fournier-viger.com/spmf/spmf.jar

and it will work. If I type:

java -jar spmf.jar run GoKrimp test_goKrimp.dat output.txt

Then the result is:

519 323 2 #SUP: 1922.0805169745581
284 415 #SUP: 598.4779197322205
2 3 #SUP: 514.3636076964904
357 358 #SUP: 412.97477810794953
463 921 #SUP: 362.77990992658306
295 296 289 #SUP: 359.43121178622823
430 125 #SUP: 210.35737280279864
431 114 #SUP: 187.4191751710605
528 370 #SUP: 176.54564703558572
599 3 #SUP: 160.8764740091283
519 323 #SUP: 148.74998482584488
67 128 #SUP: 138.22594146616757
296 289 #SUP: 21.13237025786657

By the way, since you have found two problems in GoKrimp, and I think it is very useful, you can tell me your name and I will add it to the list of contributors of SPMF for reporting the bug. ;-)

Thanks!

Re: Item labels in GoKrimp not working?
Posted by: vrodriguezf
Date: January 12, 2019 05:49AM

Hi,

My name is Víctor Rodríguez-Fernández. Thank you for the offer to be added as a bug contributor ;-)

Regarding the problem with the labels in GoKrimp, actually I did not know that the code did not work without setting a .lab file. My main problem is that the .lab file is ignored, at least when calling GoKrimp from the command line.

Did you manage to reproduce this issue?

Best!

Re: Item labels in GoKrimp not working?
Date: January 12, 2019 05:59AM

Ok. Great. I will add your name when I update the website.

I tried:

java -jar spmf.jar run GoKrimp test_goKrimp.dat output.txt test_goKrimp.lab

I get the file output.txt as follows:

support vector machin #SUP: 1922.0710148279322
real world #SUP: 598.4753133154009
machin learn #SUP: 514.3586664227769
state art #SUP: 412.9730013575172
high dimension #SUP: 362.7776787300827
reproduc hilbert space #SUP: 359.42939766764175
neural network #SUP: 210.35608129308093
experiment result #SUP: 187.4169747827109
compon analysi #SUP: 176.54417917714454
supervis learn #SUP: 160.87427082075737
support vector #SUP: 148.74911007808987
well known #SUP: 138.22464635269716
hilbert space #SUP: 21.132125171017833

And if I type:

java -jar spmf.jar run GoKrimp test_goKrimp.dat output.txt

the result is:

519 323 2 #SUP: 1922.0805169745581
284 415 #SUP: 598.4779197322205
2 3 #SUP: 514.3636076964904
357 358 #SUP: 412.97477810794953
463 921 #SUP: 362.77990992658306
295 296 289 #SUP: 359.43121178622823
430 125 #SUP: 210.35737280279864
431 114 #SUP: 187.4191751710605
528 370 #SUP: 176.54564703558572
599 3 #SUP: 160.8764740091283
519 323 #SUP: 148.74998482584488
67 128 #SUP: 138.22594146616757
296 289 #SUP: 21.13237025786657

If you have a different result, maybe you can tell me more about how you have been using SPMF and what you get, so that I can try to see what is going on.



Edited 1 time(s). Last edit at 01/12/2019 05:59AM by webmasterphilfv.

Re: Item labels in GoKrimp not working?
Posted by: vrodriguezf
Date: January 12, 2019 06:29AM

Hi,

It may sound super-weird, but depending on the filepaths I use for setting the input and the lab file, the lab file is used or ignored.

For example, if I put all the files in the same folder as the spmf.jar file, and I run the following command:
java -jar spmf.jar run GoKrimp test_goKrimp.dat output.txt test_goKrimp.lab

then everything works as expected and the output is:

support vector machin #SUP: 1922.0710148279322
real world #SUP: 598.4753133154009
machin learn #SUP: 514.3586664227769
state art #SUP: 412.9730013575172
high dimension #SUP: 362.7776787300827
reproduc hilbert space #SUP: 359.42939766764175
neural network #SUP: 210.35608129308093
experiment result #SUP: 187.4169747827109
compon analysi #SUP: 176.54417917714454
supervis learn #SUP: 160.87427082075737
support vector #SUP: 148.74911007808987
well known #SUP: 138.22464635269716
hilbert space #SUP: 21.132125171017833



However, if I use absolute paths for the input and label files instead of relative paths (but using exactly the same files!), the command is:
java -jar spmf.jar run GoKrimp /home/victor/SPMF/test_goKrimp.dat output.txt /home/victor/SPMF/test_goKrimp.lab

and the results now are:

519 323 2 #SUP: 1922.0805169745581
284 415 #SUP: 598.4779197322205
2 3 #SUP: 514.3636076964904
357 358 #SUP: 412.97477810794953
463 921 #SUP: 362.77990992658306
295 296 289 #SUP: 359.43121178622823
430 125 #SUP: 210.35737280279864
431 114 #SUP: 187.4191751710605
528 370 #SUP: 176.54564703558572
599 3 #SUP: 160.8764740091283
519 323 #SUP: 148.74998482584488
67 128 #SUP: 138.22594146616757
296 289 #SUP: 21.13237025786657


After playing with this, I think that the problem could be something like a maximum command length that truncates the command, so that the lab file path, which is the last argument, does not reach the source code intact and ends up being ignored.

Does that make any sense? I hope you can reproduce the issue now.

Best!

Re: Item labels in GoKrimp not working?
Date: January 12, 2019 06:58AM

Thanks a lot for the information.

Yes, in the code, there is an assumption that the data and label files must be in the same folder. Thus, if you use the following command, I think it will work:

java -jar spmf.jar run GoKrimp /home/victor/SPMF/test_goKrimp.dat output.txt test_goKrimp.lab

(here: I have removed the path for the label file because it is assumed that they are in the same folder)

Actually, the problem is in my opinion that due to the assumption that the files are in the same folder, the software was concatenating the path of the data file to the path of the label file and the result was like that:
/home/victor/SPMF/home/victor/SPMF/test_goKrimp.lab which is wrong. So by removing the path for the label file, it should work.

If you try it and it works, I will explain this more clearly in the documentation on the website. Actually, it is a "feature". There is the assumption that the files must be in the same folder. But I should make it more clear in the documentation.
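
To make the concatenation described above concrete, here is a minimal sketch in Java. It is not the actual SPMF code: the class name LabelPathSketch and the methods folderOf, naiveLabelPath and fileNameOnlyLabelPath are hypothetical names chosen for illustration, and the sketch only assumes '/'-separated paths as in the commands quoted in this thread. The first method mimics the assumed behaviour (the label argument is always appended to the folder of the data file); the second is one possible, more tolerant variant that keeps only the file name of the label argument.

```java
public class LabelPathSketch {

    // Folder part of a '/'-separated path, or "" when the path has no folder.
    // Files located directly under "/" are not handled in this sketch.
    static String folderOf(String path) {
        int cut = path.lastIndexOf('/');
        return cut < 0 ? "" : path.substring(0, cut);
    }

    // Assumed behaviour (hypothetical reconstruction): the label argument is
    // always appended to the folder of the data file, joined by a single '/'.
    static String naiveLabelPath(String dataFile, String labelArg) {
        String folder = folderOf(dataFile);
        if (folder.isEmpty()) {
            return labelArg; // data file has no folder part: nothing to prepend
        }
        String separator = labelArg.startsWith("/") ? "" : "/";
        return folder + separator + labelArg;
    }

    // Hypothetical tolerant variant: drop any folder given for the label file
    // and keep only its file name before joining it to the data file's folder.
    static String fileNameOnlyLabelPath(String dataFile, String labelArg) {
        String name = labelArg.substring(labelArg.lastIndexOf('/') + 1);
        return naiveLabelPath(dataFile, name);
    }

    public static void main(String[] args) {
        String data = "/home/victor/SPMF/test_goKrimp.dat";
        // Bare file name: resolves next to the data file.
        System.out.println(naiveLabelPath(data, "test_goKrimp.lab"));
        // Absolute label path: the folder is duplicated, so the file is not found.
        System.out.println(naiveLabelPath(data, "/home/victor/SPMF/test_goKrimp.lab"));
        // Tolerant variant: same absolute argument, folder part ignored.
        System.out.println(fileNameOnlyLabelPath(data, "/home/victor/SPMF/test_goKrimp.lab"));
    }
}
```

With an absolute label path, the naive version produces /home/victor/SPMF/home/victor/SPMF/test_goKrimp.lab, which matches the wrong path quoted above. The tolerant variant still assumes that both files are in the same folder; it simply ignores whatever folder is typed for the label file.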



Edited 3 time(s). Last edit at 01/12/2019 07:02AM by webmasterphilfv.

Re: Item labels in GoKrimp not working?
Posted by: vrodriguezf
Date: January 12, 2019 07:27AM

Hi,

Worked like a charm, thank you!! I did not know about that assumption.

I think that the behaviour that you describe is correct. The label file path is computed by concatenating the folder of the input file and the label argument.

Best!

Re: Item labels in GoKrimp not working?
Date: January 12, 2019 08:42PM

Glad it works.

I will update the documentation soon to make this more clear ;-)



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).