The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
CPT predictions
Posted by: david
Date: January 16, 2019 06:21AM

Hello

May I ask a question about the CPT algorithm, please?

I am using the code from ca.pfv.spmf.test.MainTestCPT on the "contextCPT.txt" data, but I am unable to get predictions when a sequence of length one is added (i.e. only one item).

Apologies if I misunderstand.

David


// example code

package ca.pfv.spmf.test;

import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.URL;

import ca.pfv.spmf.algorithms.sequenceprediction.ipredict.database.Item;
import ca.pfv.spmf.algorithms.sequenceprediction.ipredict.database.Sequence;
import ca.pfv.spmf.algorithms.sequenceprediction.ipredict.database.SequenceDatabase;
import ca.pfv.spmf.algorithms.sequenceprediction.ipredict.predictor.CPT.CPT.CPTPredictor;

public class testforweb {

    public static void main(String[] arg) throws IOException {

        // Load the training sequences from the example file
        String inputPath = fileToPath("contextCPT.txt");
        SequenceDatabase trainingSet = new SequenceDatabase();
        trainingSet.loadFileSPMFFormat(inputPath, Integer.MAX_VALUE, 0, Integer.MAX_VALUE);

        // Train the CPT model
        String optionalParameters = "splitLength:6 splitMethod:0 recursiveDividerMin:1 recursiveDividerMax:5";
        CPTPredictor predictionModel = new CPTPredictor("CPT", optionalParameters);
        predictionModel.Train(trainingSet.getSequences());

        // Query with a sequence of length one (the single item 1)
        Sequence sequence = new Sequence(0);
        sequence.addItem(new Item(1));
        Sequence thePrediction = predictionModel.Predict(sequence);
        System.out.println("The prediction for the next symbol is: " + thePrediction);
    }

    // Resolve a resource file located next to MainTestCPT to a file system path
    public static String fileToPath(String filename) throws UnsupportedEncodingException {
        URL url = MainTestCPT.class.getResource(filename);
        return java.net.URLDecoder.decode(url.getPath(), "UTF-8");
    }
}



Edited 1 time(s). Last edit at 01/16/2019 07:16AM by david.

Re: CPT predictions
Date: January 16, 2019 07:57AM

Hi,

I tried it and can reproduce the issue. Thanks for reporting it.

A quick solution is the following:

1) In the file

ca.pfv.spmf.algorithms.sequenceprediction.ipredict.predictor.CPT.CPT.CPTPredictor.java

change the line:

if(size <= minSize) {

to:

if(size < minSize) {

2) In the file MainTestCPT.java

replace:

recursiveDividerMin:1

with:

recursiveDividerMin:0

With these two changes, CPT will be able to make a prediction for a sequence containing a single item.
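
To illustrate why the comparison operator matters, here is a minimal standalone sketch. It is not the SPMF code: the class GuardDemo and its two methods are hypothetical, and how CPT derives minSize from recursiveDividerMin is not shown in this thread. The sketch only assumes that a query is skipped when the guard condition is true.

```java
public class GuardDemo {

    // Hypothetical guard in the style of the original check:
    // the query is skipped when size <= minSize.
    static boolean skippedWithLessOrEqual(int size, int minSize) {
        return size <= minSize;
    }

    // Hypothetical guard in the style of the modified check:
    // the query is skipped only when size < minSize.
    static boolean skippedWithLess(int size, int minSize) {
        return size < minSize;
    }

    public static void main(String[] args) {
        int size = 1; // a query sequence holding a single item
        for (int minSize = 0; minSize <= 1; minSize++) {
            System.out.println("size=" + size + " minSize=" + minSize
                    + " skipped(<=)=" + skippedWithLessOrEqual(size, minSize)
                    + " skipped(<)=" + skippedWithLess(size, minSize));
        }
    }
}
```

Under that assumption, a query of size 1 with a minimum size of 1 is skipped by the `<=` guard but not by the `<` guard, which is the kind of off-by-one effect that can leave a single-item query without a prediction.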

However, I will first contact the author of CPT to confirm that this modification is correct and has no side effects. If everything is OK, I will update the code on the website and let you know.

Thanks



Edited 1 time(s). Last edit at 01/16/2019 08:08AM by webmasterphilfv.

Re: CPT predictions
Posted by: david
Date: January 16, 2019 08:08AM

Thank you for the quick response and fix, Philippe.

best, David

Re: CPT predictions
Date: January 26, 2019 03:41AM

Hi,

Actually, the solution above fixed the problem but increased the runtime for predictions on sequences with more than one item. After discussing it with the main author of CPT, I have found a better way to fix the problem. I will release a new version of SPMF with the fix today or tomorrow.

Best,

Philippe

Re: CPT predictions
Posted by: david_
Date: January 31, 2019 11:25AM

Great stuff, thanks Philippe


