The Data Mining Forum
SMOTE-N algorithm to handle imbalanced data
Posted by: Kalia
Date: June 02, 2014 08:10AM

Hi all,

I have a two-class classification problem with nominal data, and I want to apply SMOTE-N to deal with the class imbalance. However, it is not clear to me how to use SMOTE-N to generate N synthetic samples for each feature vector in the minority class.

As I understand it, SMOTE-N uses a modified version of the value difference metric (VDM) to find the k nearest neighbors (k-NN) of each feature vector in the minority class. The new minority-class feature vector is then generated by creating a new set of feature values, taking the majority vote, feature by feature, over the feature vector under consideration and its k nearest neighbors.

But how is this process repeated to generate multiple synthetic feature vectors for each feature vector in the minority class? The way the algorithm is stated, it seems that one feature vector from the minority class can generate only one synthetic feature vector (using its k-NN).
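The procedure described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not a reference implementation of SMOTE-N. The function names (vdm_tables, vdm_distance, smote_n_one) and the toy data are hypothetical, the VDM exponents are fixed at k=1 for value differences and r=2 for combining features, and the instance weights of the modified VDM are taken as 1.

```python
from collections import Counter, defaultdict

def vdm_tables(X, y):
    """For each feature, map every value to its per-class relative frequencies."""
    classes = sorted(set(y))
    tables = []
    for f in range(len(X[0])):
        counts = defaultdict(Counter)
        for row, label in zip(X, y):
            counts[row[f]][label] += 1
        table = {}
        for value, per_class in counts.items():
            total = sum(per_class.values())
            table[value] = [per_class[c] / total for c in classes]
        tables.append(table)
    return tables

def vdm_distance(a, b, tables):
    """Simplified VDM: sum over features of (per-value class-profile difference)^2.
    Assumption: exponents k=1 and r=2, instance weights set to 1."""
    total = 0.0
    for f, (va, vb) in enumerate(zip(a, b)):
        delta = sum(abs(p - q) for p, q in zip(tables[f][va], tables[f][vb]))
        total += delta ** 2
    return total

def smote_n_one(i, minority, tables, k):
    """Build one synthetic vector from minority[i] and its k nearest minority neighbors."""
    x = minority[i]
    others = [m for j, m in enumerate(minority) if j != i]
    neighbors = sorted(others, key=lambda m: vdm_distance(x, m, tables))[:k]
    pool = [x] + neighbors
    # Majority vote per feature; ties go to the value seen first in the pool.
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*pool))

# Hypothetical toy data: two nominal features, labels "min" and "maj".
X = [("red", "small"), ("red", "large"), ("blue", "small"),
     ("blue", "large"), ("red", "small"), ("green", "large")]
y = ["min", "min", "min", "maj", "maj", "maj"]

tables = vdm_tables(X, y)
minority = [row for row, label in zip(X, y) if label == "min"]

first = smote_n_one(0, minority, tables, k=2)
second = smote_n_one(0, minority, tables, k=2)
print(first, second, first == second)
```

With a fixed k and a deterministic vote, calling smote_n_one twice on the same minority vector returns the same synthetic vector, which is exactly the situation the question is about.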

Thank you in advance,

Kalia
