Show all posts by user

The Data Mining Forum

IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php


Current Page: 3 of 67

Results 61 - 90 of 2010

2 years ago

webmasterphilfv

61. Re: Analyze text in other idioms

Hi, I think I have already answered that question somewhere else on the blog or by e-mail. But just for the record: yes. The library can be used with languages that have English-like punctuation, such as French and Spanish.
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

62. Re: My father has died. Data mining 1 online test in 5 days. I pay well

Sorry about the bad news... I wish you well in these difficult times.
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

63. Re: The problem has been solved.

Hi, I am glad to hear that! Best regards
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

64. Re: MLiSE 2021 workshop - Machine Learning in Software Engineering @ PKDD 2021 (DEADLINE EXTENDED TO 15th JULY!)

Only one week left to submit your paper!
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

65. Re: Evaluation of the generated patterns

Hi, I see. There are many measures for association rules. I have seen survey papers listing over 15 different measures. Which one is the best? I think it depends on your application and data. In some books like "Introduction to Data Mining" by Tan & Kumar, there is a chapter about association rule mining that talks about the evaluation of patterns, and they
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

66. Re: Evaluation of the generated patterns

Hi, Thanks for using the software. In pattern mining, an algorithm will find the patterns that you ask for. This means that if you use an algorithm like Apriori to find the frequent patterns, then Apriori will give you exactly that. There are different measures that can be used to find patterns like the support, lift, confidence, etc. Each measure has some advantages and disadvantages.
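
As a rough illustration of the three measures named above, here is a minimal Python sketch for a single rule X -> Y. The toy transactions and the function names are assumptions made for this example; they are not taken from SPMF, which is a Java library.

```python
from typing import FrozenSet, Iterable, List

def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    wanted = frozenset(itemset)
    return sum(1 for t in transactions if wanted <= t) / len(transactions)

def confidence(transactions, antecedent, consequent) -> float:
    """support(X and Y together) / support(X), an estimate of P(Y | X)."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, both) / support(transactions, antecedent)

def lift(transactions, antecedent, consequent) -> float:
    """confidence(X -> Y) / support(Y); 1.0 means X and Y look independent."""
    return confidence(transactions, antecedent, consequent) / support(transactions, consequent)

# Hypothetical toy data, only for illustration.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
rule_lift = lift(transactions, {"bread"}, {"milk"})
```

On this toy data the rule bread -> milk has a fairly high confidence but a lift below 1, which shows how two measures can rank the same rule differently. The sketch assumes the antecedent and the consequent each occur at least once; otherwise the divisions fail.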
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

67. Re: PPDM algorithms and Unable to download PPSF(PRIVACY-PRESERVING AND SECURITY FRAMEWORK)

Hi, That is great. Thanks for sharing. So if anyone needs it, they can find it. Best, Philippe
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

68. Re: PPDM algorithms and Unable to download PPSF(PRIVACY-PRESERVING AND SECURITY FRAMEWORK)

Hi, I have not integrated these algorithms in SPMF. And indeed, I see that the download link on the PPSF website is down. I spent about 30 minutes checking my phone and computer to see if I still have a copy of the ppsf.jar file, because I downloaded it before, but I cannot find it. Maybe on some USB drive somewhere? I will have a look. If you find it, you can also let me know here. The
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

69. SPMF 2.48 - Two new algorithms!

Hi all, This is to let you know that SPMF 2.48 is out, with two new algorithms:
- the HUIM-SPSO algorithm for mining high utility itemsets using Set-based Particle Swarm Optimization (thanks to Wei Song and Junya Li for providing the original code)
- the NEclatClosed algorithm for mining frequent closed itemsets (thanks to Nader Aryabarzan)
Philippe
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

70. Re: MLiSE 2021 workshop - Machine Learning in Software Engineering @ PKDD 2021 (DEADLINE EXTENDED TO 15th JULY!)

Looking forward to your papers. If you have any questions about the workshop, feel free to contact me. For example, if you are not sure whether your paper fits the scope of the workshop, we can discuss it. :-) Best regards, Philippe
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

71. MLiSE 2021 workshop - Machine Learning in Software Engineering @ PKDD 2021 (DEADLINE EXTENDED TO 15th JULY!)

==== Call for papers MLiSE 2021 workshop @ PKDD 2021 =======
** 1st International Workshop on Machine Learning in Software Engineering **
in conjunction with ECML PKDD 2021
13th September, virtual conference

Scope
=====
Software engineering (SE) is about methodologies and techniques for building high quality software systems. However, modern software systems are becoming larger and m
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

72. New website: The Pattern Mining Wiki (alpha)

Hi all, This is to let you know that I have created a wiki on pattern mining, named The Pattern Mining Wiki. Currently, The Pattern Mining Wiki is quite simple, as I add content only when I have some free time... and these days I am quite busy. Hence, you may consider this to be in an "alpha" phase where much remains to be done. There is already a lot of content about pattern mini
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

73. Re: Interestingness of patterns.

Dear Deniz, About statistical testing in pattern mining, there are some papers about this. The main idea is the following. Let's say that we want to find frequent patterns in the data. A pattern could be: {drink_tea, liver_cancer, drink_alcool} But this pattern may appear frequently just by chance, not because there is a strong correlation between these items. As a soluti
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

74. Re: HUIM Datasets

Waheed Wrote:
-------------------------------------------------------
> Thanks for valued response.
>
> Actually, In the following paper I have found the
> cutoff utility concept which is the product of
> minSup and external utility. Also, authors
> justified the external and internal utility values
> like this "The internal and external utilities of
> the i
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

75. Re: Interestingness of patterns.

Good evening, Yes, there are indeed many pattern types (itemsets, sequential patterns, episodes, etc.), and also many different measures (support, utility, etc.) that can be used to select patterns. How to know if a pattern is interesting? Some criteria can be subjective (more like an opinion). For example, I discover that {bread,milk} is purchased by many customers but I know this alread
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

76. Re: HUIM Datasets

Hi, Yes, the format is:
<item1_id> <item2_id> ... <item_n_id>:<transaction utility>:<item1_utility> <item2_utility> ... <item_n_utility>
So, yes, the internal and external utility values are not explicitly represented in these datasets. The file just indicates the utility values, which are the products of the internal and external utility values.
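
To make the three colon-separated fields concrete, here is a minimal Python sketch that parses one line in this format. The class name, the function name, and the sample line are hypothetical, and the sketch assumes that item ids and utilities are integers; it is not the parser used by SPMF.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class UtilityTransaction:
    items: List[int]            # item ids, in file order
    transaction_utility: int    # total utility of the transaction
    item_utilities: List[int]   # utility of each item, aligned with `items`

def parse_utility_line(line: str) -> UtilityTransaction:
    """Parse '<items>:<transaction utility>:<item utilities>' into a record."""
    parts = line.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected 3 ':'-separated fields, got {len(parts)}")
    items = [int(token) for token in parts[0].split()]
    transaction_utility = int(parts[1])
    item_utilities = [int(token) for token in parts[2].split()]
    if len(items) != len(item_utilities):
        raise ValueError("number of items and number of utilities differ")
    return UtilityTransaction(items, transaction_utility, item_utilities)

# Hypothetical sample line: items 1, 2, 3 with utilities 2, 3, 4 (total 9).
example = parse_utility_line("1 2 3:9:2 3 4")
```

Each stored item utility is already the product of the internal utility (for example, a quantity) and the external utility (for example, a unit profit), so the two factors cannot be recovered from the file alone.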
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

77. CFP - DASFAA 2022 - 27th International Conference on Database Systems for Advanced Applications

DASFAA2022: Call for Papers and Proposals The 27th International Conference on Database Systems for Advanced Applications (DASFAA-2022), April 11-14, 2022, Hyderabad, India. https://www.dasfaa2022.org/ Hosted by IIIT Hyderabad, India (Online Conference) The DASFAA is one of the longest established and leading international conferences and provides a leading international forum for discussin
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

78. SPMF 2.47 is out

Hi all, This is to let you know that I have released a new version of SPMF this morning!!! There are 8 new algorithms: HUIM-HC and HUIM-SA for mining high utility itemsets using Hill Climbing and Simulated Annealing (see this paper for details) LPP-Growth, LPPM_Breadth and LPPM_depth for mining local periodic patterns (more details in the research paper and powerpoint) Algorithm
Forum: The Data Mining / Big Data Forum

2 years ago

webmasterphilfv

79. Re: Undirected graph dataset name or source

I do not know. But I am curious about what this format means. Is it: Vertex ID, vertex label, vertex ID, vertex label, direction? Best regards,
Forum: The Data Mining / Big Data Forum

3 years ago

webmasterphilfv

80. HUIM-HC and HUIM-SA: very efficient algorithms for approximate high utility itemset mining

Hi all, This is to let you know that we have just proposed two new algorithms for approximate high utility itemset mining:
- HUIM-HC: using Hill-Climbing
- HUIM-SA: using Simulated Annealing
The paper will appear in the ACM TMIS journal. The PDF is here. In the paper we show that HUIM-SA is generally faster than previous approximate algorithms (HUIM-GA, HUIM-PSO,
Forum: The Data Mining / Big Data Forum

3 years ago

webmasterphilfv

81. Re: MLiSE 2021 workshop on Machine Learning in Software Engineering

I am glad to announce that all the papers will be published by Springer in a special book, "Machine Learning in Software Engineering".
Forum: The Data Mining / Big Data Forum

3 years ago

webmasterphilfv

82. Re: Standard known solution for Utility transaction dataset

Hi, If you want to know the number of HUIs in a dataset, you can run one of the exact algorithms such as EFIM, FHM, and HUI-Miner. All of these algorithms are complete, which means that they always find ALL the high utility itemsets. So if you want to know how many HUIs there are for a given minutil value, you can just use one of those algorithms and you will know. In SPMF, there are also some approximate algorit
Forum: The Data Mining / Big Data Forum

3 years ago

webmasterphilfv

83. Re: Issues in using SPMF

Hi, Sorry for the late reply. I was busy and did not check the forum for a few days. I did not check the code, but this is the original implementation of HUIM-ABC. Thus, this is how HUIM-ABC is. But it is still possible that something can be optimized. If you find some optimizations or a bug, you may let me know. Or you can also try contacting the authors of HUIM-ABC if you have some question
Forum: The Data Mining / Big Data Forum

3 years ago

webmasterphilfv

84. Re: Difference between equivalence class and sequence prefix tree

Glad it is clear. Best regards, Philippe
Forum: The Data Mining / Big Data Forum

3 years ago

webmasterphilfv

85. Re: Issues in using SPMF

Hi, I see. It is just that the algorithm is slow. Generally, when minutil is set to a small value, the algorithms become slower, and when minutil is set to a high value, they are faster. For example, on that dataset, if I use a higher value, the algorithm will terminate. But if the value is too low, it will take a long time to terminate. There is not much to do about this...
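
One way to see why a low minutil is costly is to count how many itemsets qualify at different thresholds. The following is a brute-force Python sketch on a made-up four-transaction database; it is only an illustration of the threshold's effect and is not how HUIM-ABC or any other SPMF algorithm works.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical toy database: each transaction maps an item to the
# utility of that item in the transaction.
database: List[Dict[str, int]] = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 2},
    {"a": 5, "d": 8},
]

def utility(itemset: Tuple[str, ...], db: List[Dict[str, int]]) -> int:
    """Sum the utilities of `itemset` over the transactions containing all of it."""
    total = 0
    for transaction in db:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] for item in itemset)
    return total

def count_high_utility_itemsets(db: List[Dict[str, int]], minutil: int) -> int:
    """Enumerate every non-empty itemset and count those with utility >= minutil."""
    items = sorted({item for transaction in db for item in transaction})
    count = 0
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            if utility(itemset, db) >= minutil:
                count += 1
    return count

# Number of qualifying itemsets for a high, a medium and a low threshold.
counts = {minutil: count_high_utility_itemsets(database, minutil) for minutil in (20, 10, 5)}
```

On this toy data the count grows as minutil decreases. On real datasets the same effect means more itemsets to explore and output, and fewer candidates that can be discarded early.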
Forum: The Data Mining / Big Data Forum

3 years ago

webmasterphilfv

86. Re: Issues in using SPMF

Hi, How have you set the parameter? What is the error that you had? There is no such thing as a maximum number of lines for these algorithms. The search space depends on the file, yes, but also on how you set the parameter(s). If you set the minutil threshold to a very low value, maybe the algorithm will be slower. If you show the error that you got and tell me the parameters, then I co
Forum: The Data Mining / Big Data Forum

3 years ago

webmasterphilfv

87. Re: Difference between equivalence class and sequence prefix tree

Dear Martin, To design an algorithm for pattern mining, a key issue is to represent the search space in such a way that it can be explored without missing any patterns and without visiting the same pattern twice (that would waste time). SPAM and SPADE are similar, but they do have some differences. First, a similarity is that both SPAM and SPADE use a vertical database represent
Forum: The Data Mining / Big Data Forum

3 years ago

webmasterphilfv

88. UDML 2021: 4th Workshop on Utility Driven Mining and Learning @ICDM 2021

Hi all, I am also happy to announce that the UDML 2021 workshop is back this year again at ICDM! That workshop is about the concept of utility in data mining and machine learning. It is a good workshop to submit your papers related to pattern mining (itemset mining, sequential pattern mining, process mining and other topics), as well as machine learning papers. The workshop is EI indexed
Forum: The Data Mining / Big Data Forum

3 years ago

webmasterphilfv

89. MLiSE 2021 workshop on Machine Learning in Software Engineering

Dear all, This is to let you know that I am co-organizing a new workshop on data mining and machine learning in software engineering called MLiSE 2021, at the PKDD 2021 conference. The deadline for submitting papers is the 23rd June. Looking forward to seeing your papers. It is open to papers on quite a wide range of topics! See the MLiSE 2021 website for more details. Philippe
Forum: The Data Mining / Big Data Forum

3 years ago

webmasterphilfv

90. SPMF 2.46 - the NOSEP algorithm

Hi all, This is to let you know that a new version of SPMF has been released (2.46), which includes a Java version of the original NOSEP algorithm for discovering non-overlapping sequential patterns in sequences. The implementation was provided by the team of Prof. Youxi Wu. See more details on the SPMF website: http://www.philippe-fournier-viger.com/spmf/ Philippe
Forum: The Data Mining / Big Data Forum
