Evaluating k-means and hierarchical clustering without training and test datasets?
Posted by:
DRam
Date: November 16, 2017 12:33AM
Hi,
I’m a data mining research scholar. I have implemented k-means and hierarchical clustering in Java, without training and test data. The implementation just computes similarity measures to group similar combinations of words. My input data consists of three-word combinations (for example: new game changer, biggest game changer, new game version, …), nearly 2,000 items in total, which were extracted using an information retrieval approach. The problem is that I don’t know how to evaluate the results. In general, we can use evaluation by human experts, but I’m not sure whether that is used for clustering.
In particular, I don’t know how to evaluate the hierarchical clustering, because it produces nested groups of more than two items (for example, {2,4,3} => {{2,3},4}). Please help me clarify the following doubts:
1. Can we cluster data without training and test data?
2. How can I give these word combinations as input to data mining tools such as Weka (as an ARFF file)?
3. How can I convert these word combinations into training and test data? (Or do I need to process this data for training or testing? Please give an example of the training and test format using these word combinations.)
4. How do I measure observed and expected values for cluster evaluation?
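To make the nested grouping in the hierarchical example above concrete, here is a minimal Java sketch. It is not the implementation described in this post. It assumes the numbers in the example are one-dimensional points, uses single linkage with absolute difference as the distance, and breaks ties by taking the first closest pair found. The names NestedMergeSketch and cluster are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: agglomerative clustering of 1-D integer points.
// Assumptions: single linkage, absolute difference as distance,
// ties broken by the first closest pair encountered.
class NestedMergeSketch {

    // A cluster keeps its member points and a label showing how it was built.
    static final class Cluster {
        final List<Integer> points = new ArrayList<>();
        final String label;

        Cluster(String label) {
            this.label = label;
        }
    }

    // Single-linkage distance: the smallest gap between any two members.
    static int distance(Cluster a, Cluster b) {
        int best = Integer.MAX_VALUE;
        for (int x : a.points) {
            for (int y : b.points) {
                best = Math.min(best, Math.abs(x - y));
            }
        }
        return best;
    }

    // Repeatedly merges the two closest clusters and returns the nested label.
    static String cluster(int[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("need at least one value");
        }
        List<Cluster> active = new ArrayList<>();
        for (int v : values) {
            Cluster c = new Cluster(String.valueOf(v));
            c.points.add(v);
            active.add(c);
        }
        while (active.size() > 1) {
            int bi = 0, bj = 1;
            int best = Integer.MAX_VALUE;
            for (int i = 0; i < active.size(); i++) {
                for (int j = i + 1; j < active.size(); j++) {
                    int d = distance(active.get(i), active.get(j));
                    if (d < best) { // strict '<' keeps the first closest pair
                        best = d;
                        bi = i;
                        bj = j;
                    }
                }
            }
            Cluster a = active.get(bi);
            Cluster b = active.get(bj);
            Cluster merged = new Cluster("{" + a.label + "," + b.label + "}");
            merged.points.addAll(a.points);
            merged.points.addAll(b.points);
            active.set(bi, merged); // merged cluster takes the earlier slot
            active.remove(bj);      // bj > bi, so index bi is unaffected
        }
        return active.get(0).label;
    }

    public static void main(String[] args) {
        System.out.println(cluster(new int[] {2, 4, 3})); // prints {{2,3},4}
    }
}
```

Under these assumptions, 2 and 3 are merged first and 4 joins afterwards, which reproduces the nested form {{2,3},4}. A different linkage or distance measure could produce a different nesting.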
Any help would be appreciated. Thanks in advance.