Show all posts by user

The Data Mining Forum

IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php

Current Page: 64 of 67

Results 1891 - 1920 of 2010

12 years ago

webmasterphilfv

1891. Re: Data format for Sequential Patterns with time-series

By the way, there are some algorithms that are specialised for a single sequence, like WINEPI or MINEPI, for what is called "episode mining". But this is a different problem from sequential pattern mining.
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1892. Re: Data format for Sequential Patterns with time-series

Hello Yogi, Welcome to the forum. Sequential pattern mining algorithms like GSP take as input: (1) a sequence database and (2) a parameter called "minsup". A sequence database is a set of sequences. The goal of sequential pattern mining is to find subsequences that are common to several sequences. In your case, I think that you have only a single sequence. I will give you an e
Forum: The Data Mining / Big Data Forum
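
To make the idea in post 1892 above more concrete, here is a minimal sketch of how the support of a candidate subsequence could be counted in a sequence database. It is only an illustration and not the GSP algorithm itself. It assumes that every itemset holds a single item (so a sequence is just a list of strings) and that minsup is given as an absolute count. The class name SupportSketch and both method names are made up for this example. It needs Java 9 or later because of List.of.

```java
import java.util.List;

public class SupportSketch {

    // Returns true if 'pattern' occurs in 'sequence' as a subsequence:
    // same items in the same order, with gaps allowed between them.
    public static boolean isSubsequence(List<String> pattern, List<String> sequence) {
        int matched = 0; // number of pattern items matched so far
        for (String item : sequence) {
            if (matched < pattern.size() && pattern.get(matched).equals(item)) {
                matched++;
            }
        }
        return matched == pattern.size();
    }

    // Support = number of sequences of the database that contain the pattern.
    public static int support(List<String> pattern, List<List<String>> database) {
        int count = 0;
        for (List<String> sequence : database) {
            if (isSubsequence(pattern, sequence)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // A toy sequence database with three sequences (hypothetical data).
        List<List<String>> database = List.of(
                List.of("a", "b", "c"),
                List.of("a", "c"),
                List.of("b", "a"));
        int minsup = 2; // absolute minimum support (assumption)

        List<String> pattern = List.of("a", "c");
        int sup = support(pattern, database);
        System.out.println(pattern + " has support " + sup
                + (sup >= minsup ? " (frequent)" : " (not frequent)"));
    }
}
```

With a single input sequence, the support of any pattern can only be 0 or 1, which is why a single time series does not fit this setting directly.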

12 years ago

webmasterphilfv

1893. Re: Ph.D project in data mining

^^ These are some good topics! Graph mining is a very good topic in my opinion. It may need a little more mathematics than some other topics, but it should be interesting. In particular, social network mining is very popular now. Multimedia is also very promising. For example, mining video streams from cameras to detect abnormal behavior, clustering similar music, discovering pattern
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1894. Re: [JAVA] how to calculate the maximum memory usage of a data mining algorithm

Hello Dvijesh, >>>> this memory show that how much RAM used by JVM? The method checkMemory() is for checking how much memory is used by the program running inside the JVM at a particular moment. >>> And if it is RAM than How it show that max memory is used by program because if program execute all the steps then RAM will be realesed..... To know what is the max mem
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1895. Re: SPAM vs PrefixSpan

Hi Dvijesh, In my opinion, it would not make sense to use only a vertical database with PrefixSpan because PrefixSpan needs to scan the sequences, which cannot be done efficiently with a vertical database. Vertical databases are better for candidate generation like in ECLAT, SPAM or AprioriTID. But perhaps it would be possible to do something like FPGrowth. FPGrowth uses a horizontal d
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1896. Re: Apriori algorithm in C# or Java

Hi Priyanka, Please check this page for Java source code for Apriori. Philippe
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1897. Re: SPAM vs PrefixSpan

Hello Dvijesh, Here are my quick observations about SPAM and PrefixSpan. Maybe I forgot some elements. What is good about SPAM: - it uses a bitmap representation, which is memory efficient. There are several optimizations that are possible with bitmaps. - it uses a vertical representation of the database, so the database only needs to be scanned once to create the vertical representation.
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1898. Big Data, the next Big bubble?

Hello, Many people in the data mining industry are talking about a new buzzword, "Big Data", which is just a new name for the scalability problem of data mining algorithms: how to handle and store huge amounts of data. Is this new buzzword a big bubble? That is what this person seems to think: http://mlcwideangle.exbdblogs.com/2011/10/05/bursting-th
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1899. Re: Analysis of mining Frequent sequential pattern algorithm's

Hello Vivek, I tried to follow the paper for implementing SPAM. In my implementation there are a few differences due to optimizations and some design decisions. But the main idea of the S-step is the same. I will explain to you the createNewBitmapSStep method in the Bitmap class, which performs the main part of the S-step. The S-step is shown in figure 4 of the paper. It consists of doin
Forum: The Data Mining / Big Data Forum
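
For readers who do not have the paper at hand, here is a rough sketch of the S-step as it is usually described for SPAM, restricted to a single sequence. It is not the createNewBitmapSStep method mentioned in post 1899 above, whose explanation is cut off in this listing. The sketch assumes one bit per itemset position, stored in a java.util.BitSet. The class name SStepSketch, the method name sStep and the example data are made up for illustration. The idea: find the first 1 bit of the prefix bitmap, set every bit after it to 1 and all other bits to 0, then AND the result with the bitmap of the item to append.

```java
import java.util.BitSet;

public class SStepSketch {

    // S-step for one sequence of 'length' itemsets.
    // 'prefix' marks the positions where the current pattern ends,
    // 'item' marks the positions where the item to append occurs.
    public static BitSet sStep(BitSet prefix, BitSet item, int length) {
        BitSet result = new BitSet(length);
        int first = prefix.nextSetBit(0);       // first position where the prefix ends
        if (first < 0 || first + 1 >= length) {
            return result;                      // prefix absent, or nothing after it
        }
        result.set(first + 1, length);          // transformed bitmap: all positions after 'first'
        result.and(item);                       // keep only positions where the item occurs
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sequence with 4 itemsets: {a}, {a, b}, {b}, {a}
        BitSet a = new BitSet(4);
        a.set(0); a.set(1); a.set(3);
        BitSet b = new BitSet(4);
        b.set(1); b.set(2);

        // Positions where the pattern <{a}, {b}> ends in this sequence.
        BitSet ab = sStep(a, b, 4);
        System.out.println("<a, b> ends at positions " + ab);
        System.out.println("sequence supports <a, b>: " + !ab.isEmpty());
    }
}
```

In a full implementation the same operation would be applied to the section of the bitmap belonging to each sequence, and the support of the new pattern would be the number of sections that still contain at least one 1 bit.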

12 years ago

webmasterphilfv

1900. Re: Analysis of mining Frequent sequential pattern algorithm's

Hello Vivek, I tried in NetBeans and it works on my computer. I created a new project and then I copied all the files of spmf081.zip into the src folder of the project. Then, in NetBeans, I right-clicked on MainWindow to launch the GUI, and I selected SPAM and the contextPrefixSpan.txt file. It worked. Maybe you did not copy all the files into NetBeans. The "no such m
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1901. Re: Analysis of mining Frequent sequential pattern algorithm's

Hello Vivek, Using an absolute minsup for SPAM was just an implementation decision that I made. The reason why I did that was to avoid an extra database scan and thus to make the algorithm faster. But in the latest version of SPMF that you can download from the website, I have improved the SPAM implementation and it now uses a relative minsup just like the PrefixSpan implementation. Hop
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1902. Re: SPMF 0.81 is released! - Improved SPAM implementation

Hello Derek, I don't have these algorithms and I do not plan to implement them anytime soon. GSP and AprioriAll are actually slower than newer algorithms like PrefixSpan and SPAM. Therefore, I recommend using PrefixSpan or SPAM instead. As for VB, I'm not interested either. I don't think that it is the best language for data mining. Best, Philippe
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1903. PAKDD 2012 Data Mining Competition announced!

The 16th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) is pleased to organize a data mining competition. The competition is divided into two categories: 1. Open category (open to both academia, industry and students). The tasks for this category involve predicting customer churn and win-back for a large telecommunication company. The prizes for the competition are spons
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1904. VLDB/Cloud-I 2012 - 1st International Workshop on Cloud Intelligence (Cloud-I 2012)

*************************************************************** 1st International Workshop on Cloud Intelligence (Cloud-I 2012) *************************************************************** In conjunction with VLDB 2012, Istanbul, Turkey, August 31, 2012 http://eric.univ-lyon2.fr/cloud-i/ Contact email: cloud-i@eric.univ-lyon2.fr *** NEW *** The best accepted papers will be invited
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1905. Call for Springer Book Chapters: Security and Privacy preserving in Social Networks (extended chapter description proposal deadline)

########################################################################################################################## Book on Security and Privacy preserving in Social Networks To be published by Springer Verlag (Lecture Notes in Social Networks) http://dbconf.u-bourgogne.fr/Springer/ Description and Objectives -----------
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1906. Re: Top conferences in data mining (ranking)

The top data mining conferences and journals according to Google (automatically generated): http://scholar.google.com/citations?hl=en&view_op=search_venues&vq=data+mining
1. ACM SIGKDD International Conference on Knowledge discovery and data mining 40 56
2. IEEE International Conference on Data Mining (ICDM) 33 43
3. ACM International Conference on Web Search and Data Mining 26 43
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1907. Re: Analysis of mining Frequent sequenctial pattern algorithm's

Hi Tisonet. Very interesting. Thanks for sharing your results. We can also observe that BIDE is always faster than PrefixSpan and that LAPIN-SPAM is always faster than SPAM in your experiment. Philippe
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1908. Re: Analysis of mining Frequent sequenctial pattern algorithm's

I have posted a new version of my SPAM implementation (SPMF v0.81). I have removed the BIT_PER_SECTION variable. Now, the number of bits per sequence is variable. The implementation is therefore more memory efficient and faster.
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1909. SPMF 0.81 is released! - Improved SPAM implementation

Hello everyone, This is just to let you know that I have published a new version of my SPMF open source data mining tool (0.81). In this version, I have improved the SPAM implementation. Previously, my SPAM implementation used a fixed number of bits for each sequence. I have modified the code so that it now uses a variable number of bits for each sequence. This is the main difference. The i
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1910. [JAVA] [C#] how to calculate the maximum memory usage of a data mining algorithm

Hello, Today, I received a question about how to compute the maximum memory usage of a data mining algorithm like FPGrowth in Java. It is very simple. I do it like this. I add a variable: double maxMemory = 0; Then I copy and paste this method: private void checkMemory() { double currentMemory = ((double) ((double) (Runtime.getRuntime() .totalMemory
Forum: The Data Mining / Big Data Forum
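
Since the method in post 1910 above is cut off in this listing, here is a minimal sketch of the same idea rather than the author's exact code. It assumes that the memory in use is estimated as Runtime.totalMemory() minus Runtime.freeMemory(), converted to megabytes, and that checkMemory() is called at regular points while the algorithm runs so that the largest value seen can be kept. The class name MemoryTracker and the getter are made up for this example.

```java
public class MemoryTracker {

    // Largest memory usage observed so far, in megabytes.
    private double maxMemory = 0;

    // Estimate the memory currently used inside the JVM and
    // remember it if it is the largest value seen so far.
    public void checkMemory() {
        Runtime runtime = Runtime.getRuntime();
        double currentMemory =
                (runtime.totalMemory() - runtime.freeMemory()) / 1024d / 1024d;
        if (currentMemory > maxMemory) {
            maxMemory = currentMemory;
        }
    }

    public double getMaxMemory() {
        return maxMemory;
    }

    public static void main(String[] args) {
        MemoryTracker tracker = new MemoryTracker();
        tracker.checkMemory();

        // Stand-in for the real algorithm: allocate some arrays and
        // check the memory after each step (hypothetical workload).
        int[][] work = new int[100][];
        for (int i = 0; i < work.length; i++) {
            work[i] = new int[10_000];
            tracker.checkMemory();
        }
        System.out.println("Max memory (MB): " + tracker.getMaxMemory());
    }
}
```

The value is only an approximation: it depends on when the garbage collector last ran, and a peak that occurs between two calls to checkMemory() is missed, so calling it more often gives a better estimate at a small runtime cost.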

12 years ago

webmasterphilfv

1911. Six computer science courses for free online. Has to register before April 16th 2012 at Udacity...

Hello everyone, I found an interesting website called Udacity.com. It offers some free computer science courses that you can take online and that start April 16. They take 6 weeks and there are some homework assignments and an exam. The courses are: - CS 101 : How to build a search engine (for beginners, starts April 16 - 7 weeks) - CS 212 : Design of computer programs (starts April 16 by Peter Norvig
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1912. Re: SPMF 0.80 is released!

Thanks!
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1913. Re: 2012 [French] - call for papers

Call for Papers EGC-M 2012: The 3rd International Conference on the Extraction and Management of Knowledge - Maghreb EGC-M 2012 November 12, 2012 November 15, 2012 Hammamet, Tunisia http://egcm.uae.ma/index.php/en/ Objectives: EGC-M is an annual leading International Conference on Extraction and Management of Knowledge. The purpose of the conference is to bring together rese
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1914. SPMF 0.80 is released!

Hello everyone, I have released a new version of the SPMF data mining software. This new version provides the following main changes: - I have improved the user interface thanks to N. Kali ! - I have cleaned the code of several files to remove some unused methods. - I have changed the license. The source code is now under the GNU GPL v3 open-source license. This will help to make the soft
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1915. Re: 2012 [French] - call for papers

6th International Conference on Implicative Statistical Analysis, A.S.I. 6 http://sites.univ-lyon2.fr/asi6 Caen (France) 7-10 November 2012 Université de Caen Call for papers Submission deadline: 15 April 2012 Jean-Claude Régnier, Chair of the Scientific and Programme Committee (UMR5191 - ICAR - Université de Lyon) jean-claude.regnier@univ-lyo
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1916. 2012 [French] - call for papers

5th Thematic Days on "Machine Learning & Data Mining" ("Apprentissage Artificiel & Fouille de Données"), AAFD'12, 28 and 29 June 2012, Université Paris 13, Institut Galilée. LIPN and L2TI, with the support of the Data Mining and Learning group of the SFdS (Société Française de Statistique), are organizing, on 28 and 29 June 2012, the fifth edition of the conference
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1917. Re: How to use IBM_Quest_data_generator to genetater web logs

Hi Tisonet, Thanks for sharing that! Philippe
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1918. Top data mining mistakes

Hello, I have found this great article describing some "top data mining mistakes". http://www.sas.com/news/sascom/2010q3/column_tech.html Philippe
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1919. Re: Top 10 Data mining problems

Thanks for sharing. Very interesting article.
Forum: The Data Mining / Big Data Forum

12 years ago

webmasterphilfv

1920. Re: Most popular programming langages

Hi Dvijesh, You made some good observations! I agree about Logo. I also thought that this language was dead. I went to high school in the 1990s and I thought that it had already disappeared by then. It is probably still used in some schools, I guess... Yes. R seems very popular. It is popular in data mining. I'm not familiar with R. But I think that it is useful for people interested i
Forum: The Data Mining / Big Data Forum
