The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 

Current Page: 64 of 67
Results 1891 - 1920 of 2010
12 years ago
webmasterphilfv
By the way, there are some algorithms that are specialized for a single sequence, like WINEPI or MINEPI, for what is called "episode mining". But this is a different problem from sequential pattern mining.
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hello Yogi, Welcome to the forum. Sequential pattern mining algorithms like GSP take as input: (1) a sequence database and (2) a parameter called "minsup". A sequence database is a set of sequences. The goal of sequential pattern mining is to find subsequences that are common to several sequences. In your case, I think that you have only a single sequence. I will give you an e
Forum: The Data Mining / Big Data Forum
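To make the two inputs described in the post above concrete, here is a minimal sketch of how the support of a candidate subsequence could be counted in a sequence database and compared with minsup. It is not the GSP algorithm and not code from SPMF; the class and method names are hypothetical, and each sequence is simplified to a list of single items rather than a list of itemsets.

```java
import java.util.List;

public class SupportCountSketch {

    // True if 'pattern' occurs in 'sequence' in the same order (gaps allowed).
    // Assumption: a sequence is a plain list of single items.
    public static boolean isSubsequence(List<String> pattern, List<String> sequence) {
        int matched = 0;
        for (String item : sequence) {
            if (matched < pattern.size() && pattern.get(matched).equals(item)) {
                matched++;
            }
        }
        return matched == pattern.size();
    }

    // Support = number of sequences of the database that contain the pattern.
    public static int support(List<String> pattern, List<List<String>> database) {
        int count = 0;
        for (List<String> sequence : database) {
            if (isSubsequence(pattern, sequence)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // A toy sequence database with three sequences.
        List<List<String>> database = List.of(
                List.of("a", "b", "c"),
                List.of("a", "c"),
                List.of("b", "a"));
        int minsup = 2; // absolute threshold: at least 2 sequences
        List<String> pattern = List.of("a", "c");
        int sup = support(pattern, database);
        System.out.println(pattern + " support=" + sup + " frequent=" + (sup >= minsup));
    }
}
```

With a single input sequence, the support of any pattern can only be 0 or 1, which is why the problem in the post is a different one.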
12 years ago
webmasterphilfv
^^ These are some good topics! Graph mining is a very good topic in my opinion. It may need a little bit more mathematics than some other topics. But it should be interesting. In particular, social network mining is very popular now. Multimedia is also very promising. For example, mining video streams from cameras to detect abnormal behavior, clustering similar music, discovering pattern
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hello Dvijesh, >>>> this memory show that how much RAM used by JVM? The method checkMemory() checks how much memory is used by the program running inside the JVM at a particular moment. >>> And if it is RAM then How it show that max memory is used by program because if program execute all the steps then RAM will be released..... To know what is the max mem
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hi Dvijesh, In my opinion, it would not make sense to use only a vertical database with PrefixSpan because PrefixSpan needs to scan the sequences, which cannot be done efficiently with a vertical database. Vertical databases are better for candidate generation like in ECLAT, SPAM or APRIORITID. But perhaps it would be possible to do something like FPGrowth. FPGrowth uses a horizontal d
Forum: The Data Mining / Big Data Forum
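As a rough illustration of the horizontal versus vertical distinction made in the post above, the sketch below converts a small horizontal database (one list of items per record) into a vertical one (one list of record ids per item), and then computes a joint support by intersecting id lists, in the style of ECLAT. The names are hypothetical, the records are plain item lists rather than sequences of itemsets, and this is not code from SPMF.

```java
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class VerticalDatabaseSketch {

    // Horizontal -> vertical: for each item, collect the ids of the records containing it.
    public static Map<String, SortedSet<Integer>> toVertical(List<List<String>> horizontal) {
        Map<String, SortedSet<Integer>> vertical = new TreeMap<>();
        for (int id = 0; id < horizontal.size(); id++) {
            for (String item : horizontal.get(id)) {
                vertical.computeIfAbsent(item, k -> new TreeSet<Integer>()).add(id);
            }
        }
        return vertical;
    }

    // Number of records containing both items, obtained by intersecting the two id lists.
    // No scan of the original records is needed at this point.
    public static int jointSupport(Map<String, SortedSet<Integer>> vertical, String x, String y) {
        SortedSet<Integer> ids = new TreeSet<Integer>(vertical.getOrDefault(x, new TreeSet<Integer>()));
        ids.retainAll(vertical.getOrDefault(y, new TreeSet<Integer>()));
        return ids.size();
    }

    public static void main(String[] args) {
        List<List<String>> horizontal = List.of(
                List.of("a", "b"),
                List.of("a", "c"),
                List.of("a", "b", "c"));
        Map<String, SortedSet<Integer>> vertical = toVertical(horizontal);
        System.out.println(vertical);
        System.out.println("support(a, b) = " + jointSupport(vertical, "a", "b"));
    }
}
```

The vertical form makes candidate support counting cheap, but the order of items inside each record is no longer directly available, which is the kind of information a projection-based algorithm such as PrefixSpan relies on.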
12 years ago
webmasterphilfv
Hi Priyanka, Please check this page for Java source code for Apriori. Philippe
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hello Dvijesh, Here are my quick observations about SPAM and PrefixSpan. Maybe I forgot some elements. What is good about SPAM: - it uses a bitmap representation, which is memory efficient. There are several optimizations that are possible with bitmaps. - it uses a vertical representation of the database, so the database only needs to be scanned once to create the vertical representation.
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hello, Many people in the data mining industry are talking about a new buzzword, "Big Data", which is just a new term for the scalability problem of data mining algorithms handling huge amounts of data, and for the storage of that data. Is this new buzzword a big bubble? That is what this person seems to think: http://mlcwideangle.exbdblogs.com/2011/10/05/bursting-th
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hello Vivek, I tried to follow the paper for implementing SPAM. In my implementation there are a few differences due to optimizations and some design decisions. But the main idea of the S-step is the same. I will explain to you the createNewBitmapSStep method in the Bitmap class, which performs the main part of the S-step. The S-step is shown in figure 4 of the paper. It consists of doin
Forum: The Data Mining / Big Data Forum
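For readers who do not have the paper at hand, the S-step as described in the SPAM paper can be sketched for one sequence as follows: find the first set bit of the prefix bitmap, set every later bit to 1 and every earlier bit (and that bit itself) to 0, then AND the result with the bitmap of the appended item. The sketch below is only an illustration using java.util.BitSet. It is not the createNewBitmapSStep method of SPMF, the class and method names are hypothetical, and it handles a single sequence, whereas a real bitmap covers all sequences of the database.

```java
import java.util.BitSet;

public class SStepSketch {

    // prefix: bit i is set if the prefix pattern ends at itemset i of the sequence.
    // item:   bit i is set if the appended item appears in itemset i.
    // length: number of itemsets in this sequence.
    public static BitSet sStep(BitSet prefix, BitSet item, int length) {
        BitSet result = new BitSet(length);
        int first = prefix.nextSetBit(0);
        if (first < 0 || first >= length) {
            return result; // the prefix does not occur in this sequence
        }
        // Transformed prefix bitmap: every position strictly after the first occurrence.
        result.set(first + 1, length);
        // Keep only the positions where the appended item actually appears.
        result.and(item);
        return result;
    }

    public static void main(String[] args) {
        int length = 4;                 // a sequence with 4 itemsets
        BitSet a = new BitSet(length);  // item a appears in itemsets 0 and 2
        a.set(0);
        a.set(2);
        BitSet b = new BitSet(length);  // item b appears in itemsets 1 and 3
        b.set(1);
        b.set(3);
        // Positions where the pattern "a followed later by b" can end.
        System.out.println(sStep(a, b, length)); // prints {1, 3}
    }
}
```

A non-empty result means that the sequence supports the extended pattern, so the support over the database is the number of sequences whose section of the bitmap is non-empty.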
12 years ago
webmasterphilfv
Hello Vivek, I tried in NetBeans and it works on my computer. I created a new project and then I copied all the files of spmf081.zip into the src folder of the project. Then, in NetBeans, I right-clicked on MainWindow to launch the GUI and I selected SPAM and the contextPrefixSpan.txt file. It worked. Maybe you did not copy all the files into NetBeans. The "no such m
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hello Vivek, Using an absolute minsup for SPAM was just an implementation decision that I made. The reason why I did that was to avoid an extra database scan and thus to make the algorithm faster. But in the latest version of SPMF that you can download from the website, I have improved the SPAM implementation and it now uses a relative minsup just like the PrefixSpan implementation. Hop
Forum: The Data Mining / Big Data Forum
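The difference between the two kinds of threshold can be illustrated with a short sketch. A relative minsup is a fraction of the number of sequences, so converting it to an absolute count requires knowing the database size first, which is where the extra scan mentioned above comes from. The class and method names are hypothetical, and rounding up is an assumption made for this example, not necessarily the convention used in SPMF.

```java
public class MinsupSketch {

    // Converts a relative threshold (e.g. 0.5 for 50 %) into a number of sequences.
    // Assumption: round up, so that a pattern must appear in at least that share of sequences.
    public static int toAbsolute(double relativeMinsup, int sequenceCount) {
        return (int) Math.ceil(relativeMinsup * sequenceCount);
    }

    public static void main(String[] args) {
        int sequenceCount = 5; // must be known beforehand, e.g. from a first pass over the file
        System.out.println(toAbsolute(0.5, sequenceCount)); // 2.5 rounded up to 3
    }
}
```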
12 years ago
webmasterphilfv
Hello Derek, I don't have these algorithms and I do not plan to implement them anytime soon. GSP and AprioriAll are actually slower than newer algorithms like PrefixSpan and SPAM. Therefore, I recommend using these newer algorithms instead. As for VB, I'm not interested either. I don't think that it is the best language for data mining. Best, Philippe
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
The 16th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) is pleased to organize a data mining competition. The competition is divided into two categories: 1. Open category (open to academia, industry and students). The tasks for this category involve predicting customer churn and win-back for a large telecommunication company. The prizes for the competition are spons
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
*************************************************************** 1st International Workshop on Cloud Intelligence (Cloud-I 2012) *************************************************************** In conjunction with VLDB 2012, Istanbul, Turkey, August 31, 2012 http://eric.univ-lyon2.fr/cloud-i/ Contact email: cloud-i@eric.univ-lyon2.fr *** NEW *** The best accepted papers will be invited
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Book on Security and Privacy preserving in Social Networks To be published by Springer Verlag (Lecture Notes in Social Networks) http://dbconf.u-bourgogne.fr/Springer/ Description and Objectives -----------
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
The top data mining conferences and journals according to Google (automatically generated): http://scholar.google.com/citations?hl=en&view_op=search_venues&vq=data+mining
1. ACM SIGKDD International Conference on Knowledge discovery and data mining 40 56
2. IEEE International Conference on Data Mining (ICDM) 33 43
3. ACM International Conference on Web Search and Data Mining 26 43
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hi Tisonet. Very interesting. Thanks for sharing your results. We can also observe that BIDE is always faster than PrefixSpan and that LAPIN-SPAM is always faster than SPAM in your experiment. Philippe
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
I have posted a new version of my SPAM implementation (SPMF v0.81). I have removed the BIT_PER_SECTION variable. Now, the number of bits per sequence is variable. The implementation is therefore more memory efficient and faster.
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hello everyone, This is just to let you know that I have published a new version of my SPMF open source data mining tool (0.81). In this version, I have improved the SPAM implementation. Previously, my SPAM implementation used a fixed number of bits for each sequence. I have modified the code so that it now uses a variable number of bits for each sequence. This is the main difference. The i
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hello, Today, I received a question about how to compute the maximum memory usage of a data mining algorithm like FPGrowth in Java. It is very simple. I do it like this. I add a variable: double maxMemory = 0; Then I copy and paste this method: private void checkMemory() { double currentMemory = ((double) ((double) (Runtime.getRuntime() .totalMemory
Forum: The Data Mining / Big Data Forum
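Since the method in the post above is cut off in this listing, the following is only a sketch of the general idea, not the author's exact code: sample the memory currently used by the JVM (total memory minus free memory) at several points during the algorithm and keep the largest value seen. The class name, the conversion to megabytes and the accessor are assumptions made for this example.

```java
public class MemoryCheckSketch {

    // Largest memory usage observed so far, in megabytes.
    private double maxMemory = 0;

    // Call this at regular points in the algorithm, e.g. once per recursive call.
    public void checkMemory() {
        Runtime runtime = Runtime.getRuntime();
        // Memory used right now = memory reserved by the JVM minus its free part.
        double currentMemory = (runtime.totalMemory() - runtime.freeMemory()) / 1024d / 1024d;
        if (currentMemory > maxMemory) {
            maxMemory = currentMemory;
        }
    }

    public double getMaxMemory() {
        return maxMemory;
    }

    public static void main(String[] args) {
        MemoryCheckSketch logger = new MemoryCheckSketch();
        int[][] data = new int[50][];
        for (int i = 0; i < data.length; i++) {
            data[i] = new int[100_000]; // stand-in for the work done by a mining algorithm
            logger.checkMemory();
        }
        System.out.println("Max memory observed (MB): " + logger.getMaxMemory());
    }
}
```

The value is a peak over the sampled moments only, so memory that is allocated and released between two calls to checkMemory() is not seen, and the figure also depends on when the garbage collector runs.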
12 years ago
webmasterphilfv
Hello everyone, I found an interesting website called Udacity.com. It offers some free computer science courses that you can take online and that start April 16. They take 6 weeks and there is some homework and an exam. The courses are: - CS 101 : How to build a search engine (for beginners, starts April 16 - 7 weeks) - CS 212 : Design of computer programs (starts April 16 by Peter Norvig
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Call for Papers EGC-M 2012: The 3rd International Conference on the Extraction and Management of Knowledge - Maghreb EGC-M 2012 November 12, 2012 – November 15, 2012 Hammamet, Tunisia http://egcm.uae.ma/index.php/en/ Objectives: EGC-M is an annual leading International Conference on Extraction and Management of Knowledge. The purpose of the conference is to bring together rese
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hello everyone, I have released a new version of the SPMF data mining software. This new version provides the following main changes: - I have improved the user interface thanks to N. Kali! - I have cleaned the code of several files to remove some unused methods. - I have changed the license. The source code is now under the GNU GPL v3 open-source license. This will help to make the soft
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
6th International Conference on Implicative Statistical Analysis A.S.I. 6 http://sites.univ-lyon2.fr/asi6 Caen (France) November 7-10, 2012 Université de Caen Call for papers Submission deadline: April 15, 2012 Jean-Claude Régnier, Chair of the scientific and program committee (UMR5191 - ICAR - Université de Lyon) jean-claude.regnier@univ-lyo
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
5th Thematic Days "Machine Learning & Data Mining" (Apprentissage Artificiel & Fouille de Données) AAFD'12 June 28 and 29, 2012 Université Paris 13, Institut Galilée ________________________________________ The LIPN and the L2TI, with the support of the Data Mining and Learning group of the SFdS (Société Française de Statistique), are organizing on June 28 and 29, 2012 the fifth edition of the conference
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hi Tisonet, Thanks for sharing that! Philippe
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hello, I have found this great article describing some "top data mining mistakes". http://www.sas.com/news/sascom/2010q3/column_tech.html Philippe
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Thanks for sharing. Very interesting article.
Forum: The Data Mining / Big Data Forum
12 years ago
webmasterphilfv
Hi Dvijesh, You made some good observations! I agree about Logo. I also thought that this language was dead. I went to high school in the 1990s and I thought that it had already disappeared at that time. It is probably still used in some schools, I guess... Yes. R seems very popular. It is popular in data mining. I'm not familiar with R. But I think that it is useful for people interested i
Forum: The Data Mining / Big Data Forum

This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).
Terms of use.