The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
SPMF : Implementation of MSApriori with Hashing to increase speed of execution
Posted by: sjp12012a
Date: July 05, 2018 10:03AM

Hi,

I am working on a CS assignment that requires using the SPMF library. To speed up MSApriori, I used a hashing technique in the first pass over the transaction data.

I am aware that a few others have already used this technique. Please let me know how I can discuss this further.

I can submit my analysis for review here.

Regards,
Srini



Edited 1 time(s). Last edit at 07/05/2018 10:04AM by sjp12012a.

Re: SPMF : Implementation of MSApriori with Hashing to increase speed of execution
Date: July 06, 2018 05:35PM

Hello,

Sure, if you want to discuss this in the forum, you can share the details. Does it improve performance significantly? If so, your improvement could be integrated into the SPMF library and you could become a contributor.

Best regards,

Philippe

Re: SPMF : Implementation of MSApriori with Hashing to increase speed of execution
Posted by: Srini
Date: July 08, 2018 08:09PM

Hi,

It improved considerably: on the data set I was using, execution went from never terminating to producing a result in 15 seconds.

I think the key decision is to choose an appropriate hash function for an arbitrary number of integers. I used a Java BitSet object for this in my assignment. Please share your suggestions on this.
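For readers following the thread, here is a minimal sketch of what first-pass pair hashing with a BitSet can look like. It is an illustration under assumptions, not the code from the assignment or from SPMF: the class name PairHashFilter, the hash formula, and the single minCount cutoff are all hypothetical. With multiple minimum supports, a safe cutoff would presumably be the smallest MIS count among the items, since a pair can never need less support than that.

```java
import java.util.BitSet;

/** Hypothetical sketch of a first-pass pair-hashing filter; not SPMF code. */
final class PairHashFilter {
    private final int numBuckets;
    private final int[] bucketCounts;

    PairHashFilter(int numBuckets) {
        this.numBuckets = numBuckets;
        this.bucketCounts = new int[numBuckets];
    }

    /** Maps an unordered pair of item ids to a bucket (illustrative hash, an assumption). */
    int bucketOf(int a, int b) {
        int lo = Math.min(a, b);
        int hi = Math.max(a, b);
        return Math.floorMod(31 * lo + hi, numBuckets);
    }

    /** During the first scan, hash every pair of items in the transaction into a counter. */
    void addTransaction(int[] items) {
        for (int i = 0; i < items.length; i++) {
            for (int j = i + 1; j < items.length; j++) {
                bucketCounts[bucketOf(items[i], items[j])]++;
            }
        }
    }

    /** After the scan, mark each bucket whose count reaches minCount. */
    BitSet frequentBuckets(int minCount) {
        BitSet bits = new BitSet(numBuckets);
        for (int b = 0; b < numBuckets; b++) {
            if (bucketCounts[b] >= minCount) {
                bits.set(b);
            }
        }
        return bits;
    }

    /** A pair whose bucket is not marked cannot reach minCount, so it can be skipped. */
    boolean mayBeFrequent(BitSet bits, int a, int b) {
        return bits.get(bucketOf(a, b));
    }
}
```

A bucket count is an upper bound on the support of every pair that hashes into it, so the filter never discards a pair that is actually frequent. Collisions can only let extra candidates through, which is why the choice of hash function and bucket count matters.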

I am giving the URL here where I uploaded the assignment with analysis:

https://drive.google.com/drive/folders/1GIju0UZsQGjaDqRA_HUvrHA4CebJgVNq?usp=sharing

If approved, I could work on making this generic, and I would like to contribute it to the SPMF code.

Regards,
Srini.

Re: SPMF : Implementation of MSApriori with Hashing to increase speed of execution
Posted by: Srinivas
Date: June 10, 2019 04:14AM

Dear Prof. Philippe,

I request that you reconsider this change to MSApriori:

Please consider this improved implementation of the MSApriori algorithm for cases (e.g. when LS = 0) where the number of itemsets in C2, C3, ... up to Ck is very large. The existing implementation, taken from the Web Mining textbook (by Bing Liu et al.), is inefficient in time even though it is useful for space optimization.

Additional details, along with the modified Java source code, are available here:
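To illustrate why LS = 0 is a hard case, here is a small sketch. It assumes the usual MSApriori assignment MIS(i) = max(beta * f(i), LS), with supports expressed as relative frequencies; the class MisSketch and its method names are hypothetical and are not part of SPMF.

```java
import java.util.Map;

/** Hypothetical illustration of MIS assignment; not SPMF code. */
final class MisSketch {

    /** Minimum item support, assuming MIS(i) = max(beta * f(i), LS). */
    static double mis(double beta, double ls, double support) {
        return Math.max(beta * support, ls);
    }

    /** Counts the items whose own support reaches their own MIS. */
    static int countItemsMeetingOwnMis(Map<Integer, Double> supports,
                                       double beta, double ls) {
        int count = 0;
        for (double support : supports.values()) {
            if (support >= mis(beta, ls, support)) {
                count++;
            }
        }
        return count;
    }
}
```

Under this assumption, with LS = 0 and beta at most 1, every item that occurs at all meets its own MIS, so nothing is pruned at level 1 and the candidate sets from C2 onward can grow very large.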

https://github.com/cs17emds11029/MSApiori

Total modified files in SPMF: 3

I would be very thankful if you can accept this change and make me a contributor to the SPMF library.

The above optimization technique has not yet been published in any conference or journal.

Regards,
Srinivas.

Re: SPMF : Implementation of MSApriori with Hashing to increase speed of execution
Date: June 10, 2019 05:57AM

Hi,

Sorry for the long delay. I have been on vacation as it was a national holiday last week in China. I also received your e-mail today. I will definitely add your code to SPMF to replace the old code and add you as a contributor. Thanks for your work.

Of course, I will first test the code ;-) My plan is to release the next version of SPMF this week. There are two or three new algorithms that I will add to SPMF at the same time. Then, I will include your code and name on the website at the same time.

If you want to contribute other improvements to the code in the future, they are always welcome, and we can also discuss them by e-mail.

Best regards,
Philippe



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).