Hi all,
I have a two-class classification problem on nominal data, and I want to apply SMOTE-N to deal with the class imbalance. However, it is not clear to me how to use SMOTE-N to generate N synthetic feature vectors for each feature vector in the minority class. SMOTE-N uses a modified version of the value difference metric (VDM) to find the k nearest neighbors of each feature vector in the minority class. A new minority-class feature vector is then generated by taking, for each feature, the majority vote among the feature vector under consideration and its k nearest neighbors. But how is this process repeated to generate multiple synthetic feature vectors for each feature vector in the minority class? The way the algorithm is stated, it seems that one feature vector from the minority class can generate only one synthetic feature vector (using its k nearest neighbors).
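To make my reading of the procedure concrete, here is a minimal sketch of the majority-vote step as I understand it. It is not a reference implementation: the function names and the toy data are made up for illustration, and a plain overlap (mismatch-count) distance stands in for the modified VDM to keep the example short.

```python
from collections import Counter


def overlap_distance(a, b):
    # ASSUMPTION: simple stand-in for the modified VDM.
    # Counts the features on which the two vectors differ.
    return sum(x != y for x, y in zip(a, b))


def synthesize_one(index, minority, k):
    """Build one synthetic vector from minority[index] and its k nearest neighbors."""
    base = minority[index]
    others = [v for i, v in enumerate(minority) if i != index]
    # sorted() is stable, so equally distant neighbors keep their original order.
    neighbors = sorted(others, key=lambda v: overlap_distance(base, v))[:k]
    group = [base] + neighbors
    # Per-feature majority vote over the vector and its k neighbors.
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*group))


# Hypothetical minority-class samples with three nominal features.
minority = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
    ("red", "small", "square"),
]

first = synthesize_one(0, minority, k=2)
second = synthesize_one(0, minority, k=2)
print(first, second)
```

Written this way, the step is deterministic: calling it twice for the same feature vector returns the same synthetic vector, which is exactly why I do not see where N distinct synthetic vectors would come from.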
Thank you in advance,
Kalia