The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
Problem with recreating results from MRCPPS paper for FIFA dataset
Posted by: Neni
Date: October 27, 2020 10:58AM

I need help recreating the results from the paper "Discovering Rare Correlated Periodic Patterns in Multiple Sequences" (MRCPPS algorithm) for the FIFA dataset. I grouped 3 itemsets per transaction and set maxsup=5. With minRa, minBond and maxStd all set to 0, I get 6,443 patterns, whereas Table 3 of the paper reports 21,639 patterns. Am I missing something in the database preprocessing, or something else?

Re: Problem with recreating results from MRCPPS paper for FIFA dataset
Date: October 27, 2020 05:11PM

Hi Neni,

Thanks for your message. I am not sure at the moment. The student who wrote this paper graduated a year ago. The FIFA dataset is on the SPMF website, as you know. Looking at the code, I think he used these two parameters to group transactions:

// whether convert the transaction database to a sequential database or not
boolean needGroup = false;

// if needGroup = true, how many transactions can be grouped to make a sequence
int groupNum = 0;

Have you tried setting needGroup = true and groupNum = 3 for FIFA? Is the number of patterns still different?
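
For readers who want to see what such a grouping step amounts to, here is a minimal sketch. It is not the MRCPPS source code: the class name TransactionGrouper, the method group, and the choice to keep a shorter trailing group are assumptions made for illustration. Only groupNum comes from the snippet above. The output uses the SPMF sequence format, where -1 ends an itemset and -2 ends a sequence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TransactionGrouper {

    /**
     * Groups consecutive transactions into sequences of groupNum transactions.
     * Assumption: a trailing group with fewer than groupNum transactions is
     * kept as a shorter sequence (the original code may handle it differently).
     */
    public static List<List<String>> group(List<String> transactions, int groupNum) {
        if (groupNum <= 0) {
            throw new IllegalArgumentException("groupNum must be positive");
        }
        List<List<String>> sequences = new ArrayList<>();
        for (int start = 0; start < transactions.size(); start += groupNum) {
            int end = Math.min(start + groupNum, transactions.size());
            // Copy the slice so each sequence is independent of the input list
            sequences.add(new ArrayList<>(transactions.subList(start, end)));
        }
        return sequences;
    }

    public static void main(String[] args) {
        // Hypothetical toy transactions, one string of items per transaction
        List<String> transactions = Arrays.asList("1 2", "2 3", "1", "4", "2 5");
        for (List<String> sequence : group(transactions, 3)) {
            // One line per sequence: itemsets separated by -1, terminated by -2
            System.out.println(String.join(" -1 ", sequence) + " -1 -2");
        }
    }
}
```

If the pattern count still differs after grouping, the handling of the last incomplete group is one detail worth comparing against the original code.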

Maybe the student also did some other preprocessing step that is not explained in the paper, or maybe he made a mistake when preparing the results... I could try to reach him to ask about it.


Best
Philippe

Re: Problem with recreating results from MRCPPS paper for FIFA dataset
Posted by: Neni
Date: October 28, 2020 09:18AM

Thank you very much,
This solved the problem. I had previously written a separate program to group transactions by 3, because his code used needGroup=false, and I couldn't get the same result. I am trying to recreate his results so that I can write a new paper based on his. Thank you very much again, I am very grateful.

Re: Problem with recreating results from MRCPPS paper for FIFA dataset
Date: October 28, 2020 11:27PM

Hi,

I see. Good that it works!

Re: Problem with recreating results from MRCPPS paper for FIFA dataset
Posted by: Neni
Date: December 26, 2020 11:59AM

Hi,
I have a question about a possible research topic based on the MRCPPS paper. The "Conclusion" section of the paper proposes adapting the algorithm to find periodic sequential rules in sequences. Does that mean an algorithm that finds rare correlated periodic sequential rules in sequences, or just periodic sequential rules? Are rare sequential rules important, or only top-k rules?

Re: Problem with recreating results from MRCPPS paper for FIFA dataset
Date: December 27, 2020 07:12AM

Hi,

Thanks for your question. I think you could start from the simplest case, which is frequent periodic sequential rules, and then think about the case of rare rules as an extension.

I think both could be interesting. The main steps would be to define the problem clearly, design an algorithm, and then try it on some data and look at the patterns that you have found to see what kind of patterns you can discover.

Best regards,

Re: Problem with recreating results from MRCPPS paper for FIFA dataset
Posted by: Neni
Date: December 27, 2020 10:43AM

Thank you for the directions. They are very helpful.

Re: Problem with recreating results from MRCPPS paper for FIFA dataset
Posted by: Neni
Date: January 22, 2021 11:11PM

Hi,
I started with the topic of periodic sequential rule mining and adapted the RuleGrowth algorithm for that purpose, but it differs from the original algorithm by only one method and one constraint. I was wondering whether it is a good idea to search for periodic sequential rules in each sequence individually, as is done for sequential patterns common to multiple sequences, but for sequential rules. Does that idea make sense for a paper?

Re: Problem with recreating results from MRCPPS paper for FIFA dataset
Date: January 25, 2021 04:53PM

Hi,

Good. I think it could make a paper. But the important thing is to write a good motivation for the problem that you are proposing. You should try to find some scenario, such as analyzing customer data or analyzing text, and explain why finding these periodic sequential rules is interesting. In the introduction of your paper, you can say that other algorithms like MPFPS can find periodic patterns in a sequence database, but they don't use the confidence... so they may find many patterns that are weakly correlated... something like that... so in this paper, you want to find rules that also take the confidence into account. Hence this is the reason for adapting RuleGrowth. And you can talk about why it has many potential applications, such as text, shopping, etc.

Then, in your experimental evaluation, you can do some basic experiments on runtime, memory, etc. But you can also show some patterns that you found in a real database and explain why these patterns are interesting. This will make your paper look better.

If you do this, and also write the rest of your paper well, it can make a good paper.

Best regards,

Philippe

Re: Problem with recreating results from MRCPPS paper for FIFA dataset
Posted by: Neni
Date: January 27, 2021 02:01AM

Hi,
Thank you very much, your answers have been most helpful to me.
From what you wrote, I understand that it is better to go with sequential rules that appear periodically inside each sequence, as opposed to appearing periodically in the whole dataset.
I'm writing this paper by myself and sometimes it can be confusing, so thank you also for the additional instructions, they really helped me a lot.

Re: Problem with recreating results from MRCPPS paper for FIFA dataset
Date: January 27, 2021 04:52AM

Hi,

Yes, I think it is more useful to find patterns that are periodic in each sequence, like in this paper:

http://www.philippe-fournier-viger.com/2019_IS_periodic%20patterns%20multiple%20sequences.pdf

If you look at Fig. 1, I think that the approach of Fig. 1 (b) is better than that of Fig. 1 (a). I think it would have more applications, such as shopping, where you have multiple customers and want to find rules that are periodic for several customers.
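
To make the idea of "periodic in each sequence" concrete, here is a rough sketch for a single item rather than a full rule. It is an illustration under stated assumptions and not the definition from the paper: the class PerSequencePeriodicity, its methods, and the maxPer threshold are hypothetical names, a period is taken as the gap between consecutive occurrences within one sequence, and the gaps before the first and after the last occurrence are counted as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerSequencePeriodicity {

    /**
     * Periods of an item inside ONE sequence, measured in itemset positions.
     * Assumption: the gap from the start of the sequence to the first
     * occurrence and from the last occurrence to the end are included.
     */
    public static List<Integer> periods(List<List<Integer>> sequence, int item) {
        List<Integer> result = new ArrayList<>();
        int previous = 0; // 1-based position of the previous occurrence, 0 = start
        for (int pos = 0; pos < sequence.size(); pos++) {
            if (sequence.get(pos).contains(item)) {
                result.add(pos + 1 - previous);
                previous = pos + 1;
            }
        }
        result.add(sequence.size() - previous); // trailing gap
        return result;
    }

    /** True if the item occurs at least once and no period exceeds maxPer. */
    public static boolean isPeriodicIn(List<List<Integer>> sequence, int item, int maxPer) {
        List<Integer> p = periods(sequence, item);
        if (p.size() < 2) {
            return false; // only the trailing gap: the item never occurs
        }
        for (int period : p) {
            if (period > maxPer) {
                return false;
            }
        }
        return true;
    }

    /** Number of sequences (e.g. customers) in which the item is periodic. */
    public static int countPeriodicSequences(List<List<List<Integer>>> database, int item, int maxPer) {
        int count = 0;
        for (List<List<Integer>> sequence : database) {
            if (isPeriodicIn(sequence, item, maxPer)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical toy database: three customers, each a sequence of itemsets
        List<List<List<Integer>>> database = Arrays.asList(
            Arrays.asList(Arrays.asList(1), Arrays.asList(2), Arrays.asList(1), Arrays.asList(1, 3)),
            Arrays.asList(Arrays.asList(1), Arrays.asList(2), Arrays.asList(3), Arrays.asList(2), Arrays.asList(1)),
            Arrays.asList(Arrays.asList(2), Arrays.asList(1), Arrays.asList(2), Arrays.asList(1)));
        System.out.println(countPeriodicSequences(database, 1, 2));
    }
}
```

Counting the sequences in which a pattern is periodic, instead of measuring periods over the whole database, is what lets a threshold such as "periodic for several customers" be expressed.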

Best regards,






This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).