The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
Survey: what is your favorite data mining algorithm?
Date: March 18, 2012 05:10PM

Hello everyone,

I would like to ask you: What is your favorite data mining algorithm?

Personally, I have a few favorites:
- the PrefixSpan algorithm for sequential pattern mining. This algorithm is relatively simple. There are also some good ideas in its design, such as pseudo-projection and pattern growth, that make it very efficient.
- the Apriori algorithm for frequent itemset mining. It is not very efficient, but it is simple, it has inspired hundreds of researchers, and many algorithms are still based on it.
- the K-Nearest Neighbor algorithm. This algorithm can be implemented using a KD-Tree data structure. I think that the KD-Tree data structure is very interesting because it makes it possible to find the nearest neighbor very efficiently.
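
To illustrate the KD-Tree point, here is a minimal Python sketch of a nearest-neighbor search (the k = 1 case). It is only an illustration, not code from any particular library; the names `build_kdtree` and `nearest` and the sample points are made up for the example. The key idea is that a whole branch can be skipped when the splitting plane is farther away than the best match found so far.

```python
import math
from collections import namedtuple

# One tree node: the stored point, the axis it splits on, and two subtrees.
Node = namedtuple("Node", ["point", "axis", "left", "right"])

def build_kdtree(points, depth=0):
    """Build a KD-tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(points[mid], axis,
                build_kdtree(points[:mid], depth + 1),
                build_kdtree(points[mid + 1:], depth + 1))

def nearest(node, target, best=None):
    """Return (distance, point) for the stored point closest to target."""
    if node is None:
        return best
    d = math.dist(node.point, target)
    if best is None or d < best[0]:
        best = (d, node.point)
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if abs(diff) < best[0]:
        best = nearest(far, target, best)
    return best

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
dist, point = nearest(tree, (9, 2))
print(point)  # (8, 1)
```

A full k-nearest-neighbor search would keep the k best candidates (for example in a heap) instead of a single best match, but the pruning test stays the same.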

What are your favorite data mining algorithms?

Phil



Edited 1 time(s). Last edit at 03/18/2012 05:12PM by webmasterphilfv.

Re: Survey: what is your favorite data mining algorithm?
Posted by: Dvijesh88
Date: March 18, 2012 09:43PM

PrefixSpan - I think it still has many shortcomings, but it is a really good algorithm, and I think it is the best algorithm for sequential pattern mining.

Apriori - because it is a really basic algorithm that helps in understanding any area of data mining.
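
For readers who have not seen Apriori yet, the level-wise idea can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions (the function name `apriori`, an absolute support count as threshold, and a toy database), not an optimized or reference implementation. It shows the three steps of each level: join, prune, and count.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        candidates = set()
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count step: one pass over the database per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result

db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent_itemsets = apriori(db, min_support=3)
print(frequent_itemsets[frozenset({"a", "b"})])  # 3
```

The repeated database pass at every level is the main reason Apriori is considered slow, and it is what many later algorithms try to avoid.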

Re: Survey: what is your favorite data mining algorithm?
Posted by: tisonet
Date: March 19, 2012 09:53AM

Hi, I love the idea of the PrefixSpan algorithm, too.
I don't know why it is not in the top 10 data mining algorithms.

Re: Survey: what is your favorite data mining algorithm?
Date: March 21, 2012 09:38AM

tisonet Wrote:
-------------------------------------------------------
> Hi, I love the idea of the PrefixSpan algorithm, too.
>
> I don't know why it is not in the top 10 data mining
> algorithms.


I think that it is because PrefixSpan is not as famous as some other algorithms.

The list of top 10 algorithms was established in 2006. If the same list were made again today, it would probably be different. Maybe some new algorithms would be included as well!



Edited 2 time(s). Last edit at 03/21/2012 09:40AM by webmasterphilfv.

Re: Survey: what is your favorite data mining algorithm?
Posted by: Dvijesh88
Date: March 22, 2012 03:44AM

Yes, you are right.
PrefixSpan is not very popular compared to some of the basic algorithms.

I love PrefixSpan because this algorithm has everything:
----> Simple level-by-level projection (easy for a beginner to understand)
----> Bi-level projection with the S-matrix (easy yet tricky, for those who really work in this field)
----> Pseudo-projection, usable with both of the above (advanced level)


It can still be improved, but that is just a matter of research.
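
As a rough illustration of pattern growth with pseudo-projection, here is a small Python sketch. It makes a strong simplifying assumption: each sequence is a list of single items, whereas the real PrefixSpan also handles itemsets inside a sequence. The function name `prefixspan` and the toy database are made up for the example, and this is not the SPMF implementation. The point to notice is that a projected database is stored only as (sequence index, offset) pointers into the original data, never as copies of the suffixes.

```python
def prefixspan(sequences, min_support):
    """Mine frequent sequential patterns from sequences of single items.

    Returns a list of (pattern, support) pairs.
    """
    results = []

    def grow(prefix, projected):
        # item -> pointers (seq_id, offset just past the item's first
        # occurrence in the suffix). Each sequence contributes at most one
        # pointer per item, so len(pointers) is the support of prefix + [item].
        occurrences = {}
        for seq_id, start in projected:
            seq = sequences[seq_id]
            seen = set()
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:
                    seen.add(item)
                    occurrences.setdefault(item, []).append((seq_id, pos + 1))
        for item in sorted(occurrences):
            pointers = occurrences[item]
            if len(pointers) >= min_support:
                pattern = prefix + [item]
                results.append((pattern, len(pointers)))
                # Recurse on the pseudo-projected database of the new prefix.
                grow(pattern, pointers)

    # The initial projection is every sequence starting at offset 0.
    grow([], [(i, 0) for i in range(len(sequences))])
    return results

db = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
patterns = prefixspan(db, min_support=2)
for pattern, support in patterns:
    print(pattern, support)
```

On this toy database the pattern a, b, c is not reported because only one sequence contains it.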

Re: Survey: what is your favorite data mining algorithm?
Date: March 22, 2012 08:30AM

Good analysis. I agree with you.

Yes. There are many possibilities for improving it and adding more features.

By the way, I have read your email. I will answer you later today or tomorrow. :)

Re: Survey: what is your favorite data mining algorithm?
Posted by: Dvijesh88
Date: March 22, 2012 08:59PM

OK sir.
No problem, I can wait.

Re: Survey: what is your favorite data mining algorithm?
Posted by: Baxter
Date: March 19, 2012 02:18PM

My favorite algorithm is definitely DBSCAN (for density-based clustering).

It is very smart to cluster data by density instead of using other approaches. That paper is very good.

http://en.wikipedia.org/wiki/DBSCAN
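
For anyone curious, the density idea can be sketched in Python as follows. This is a minimal version under my own assumptions: neighbors are found by brute force (no spatial index), the distance is Euclidean, and the names `dbscan`, `eps`, and `min_pts` as well as the sample points are chosen for the example. A point with at least `min_pts` points within distance `eps` (itself included) is a core point, clusters grow outward from core points, and anything unreachable is labeled as noise.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force region query; the result includes point i itself.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

Unlike k-means, the number of clusters is not given in advance; it follows from `eps` and `min_pts`.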

Re: Survey: what is your favorite data mining algorithm?
Posted by: vivek basati
Date: March 22, 2012 04:19AM

Hi,

My favorite is PrefixSpan too, but in the SPMF framework it is implemented only for small files with integers and strings.

Can you upload source code for large databases such as transactional data?

Re: Survey: what is your favorite data mining algorithm?
Date: March 22, 2012 08:26AM

vivek basati Wrote:
-------------------------------------------------------
> Hi,
>
> My favorite is PrefixSpan too, but in the SPMF
> framework it is implemented only for small files
> with integers and strings.
>
> Can you upload source code for large databases
> such as transactional data?


Hello,

What is the problem with large files? Did you run out of memory?

If the problem is memory, you should try to increase the amount of memory that the Java virtual machine can use. By default the maximum is 256 megabytes, but you can use more!

Please check the instructions in the FAQ about how to increase the memory for the algorithms:

http://www.philippe-fournier-viger.com/spmf/index.php?link=FAQ.php#memory
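
As a rough sketch of what increasing the memory looks like in practice: the JVM reads its maximum heap size from the standard `-Xmx` option, which has to appear before `-jar`. The small Python helper below only builds such a command line. The helper name `java_command`, the jar name, the algorithm arguments, and the file names are placeholders assumed for illustration, so please check the FAQ for the exact command for your version. Also note that the actual default limit depends on the JVM version and on the machine.

```python
def java_command(jar_path, max_heap_mb, program_args):
    """Build a `java` command line that raises the JVM maximum heap size."""
    # -Xmx must come before -jar; anything after the jar path is passed
    # to the program itself and is not read by the JVM.
    return ["java", f"-Xmx{max_heap_mb}m", "-jar", jar_path, *program_args]

# Hypothetical invocation: the jar name, arguments, and file names below
# are placeholders, not a confirmed command line.
cmd = java_command("spmf.jar", 1024,
                   ["run", "PrefixSpan", "input.txt", "output.txt", "50%"])
print(" ".join(cmd))

# To actually launch it (requires Java and the jar to be present):
# import subprocess
# subprocess.run(cmd, check=True)
```

Here 1024 means a maximum heap of 1024 megabytes; choose a value that fits the physical memory of your machine.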

If this does not solve the problem, then give me more details about what the problem is and I will try to help you.

Thanks,
Philippe

Re: Survey: what is your favorite data mining algorithm?
Posted by: Dvijesh88
Date: March 22, 2012 08:18PM

I think it is probably a memory problem.

Can you tell me the configuration of your system?

Second, which dataset are you using? The IBM synthetic data generator?
I would like to know the parameters:

- how many customers are in the dataset
- how many transactions per customer
- how many items per transaction

Actually, I was facing the same problem, but I found an alternative way to deal with it.

One thing I want to say is that the program will take more time to compute the sequences than what is described in the research paper.



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).