The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
Software for Data Selection
Posted by: louissalome
Date: October 05, 2020 05:55AM

I am facing a data selection issue.
I have access to a database on SAP BW with a very large number of variables (hundreds of columns), and I have very little documentation for it. I want to go through each variable, discard the empty ones, and identify the useful ones.
So far I have done this very inefficiently: I load my data into Power BI Desktop and check each variable one at a time. This way I’m sure I make no mistakes, but it takes too much time. I really need a first cleaning pass.
I’m looking for software that could help me select the interesting variables from a large dataset. At a minimum it should detect empty variables, and ideally it would also identify duplicates or correlations.
I’d really like to know your tips and good practices for this kind of data selection!
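
To make the request concrete, here is a rough sketch of the kind of first-pass screening described above, written in plain Python. It is only an illustration under stated assumptions, not a recommendation of a specific tool: it assumes the data has already been exported from SAP BW into memory as a dictionary of columns, and the names `screen_columns`, `is_missing`, `pearson`, the 0.95 correlation threshold, and the sample values are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def is_missing(value):
    """Treat None and blank strings as missing (an assumption about the export)."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists, or None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:  # a constant column has no defined correlation
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def screen_columns(table, corr_threshold=0.95):
    """Screen a table given as {column name: list of values, all the same length}."""
    # Columns where every value is missing.
    empty = [c for c, vals in table.items() if all(is_missing(v) for v in vals)]
    kept = [c for c in table if c not in empty]

    # Columns with a single distinct non-missing value carry no information.
    constant = [
        c for c in kept
        if len({v for v in table[c] if not is_missing(v)}) == 1
    ]

    # Pairs of columns holding exactly the same values row by row.
    duplicates = [(a, b) for a, b in combinations(kept, 2) if table[a] == table[b]]

    # Fully numeric columns (no missing values) are candidates for correlation.
    numeric = [
        c for c in kept
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in table[c])
    ]
    correlated = []
    for a, b in combinations(numeric, 2):
        r = pearson(table[a], table[b])
        if r is not None and abs(r) >= corr_threshold and (a, b) not in duplicates:
            correlated.append((a, b, round(r, 3)))

    return {
        "empty": empty,
        "constant": constant,
        "duplicates": duplicates,
        "correlated": correlated,
    }


# Hypothetical sample standing in for a few exported columns.
sample = {
    "plant": ["A", "B", "A", "C"],
    "unused": [None, "", None, " "],
    "qty": [1, 2, 3, 4],
    "qty_copy": [1, 2, 3, 4],
    "weight_kg": [2.0, 4.1, 5.9, 8.0],
    "region": ["EU", "EU", "EU", "EU"],
}
report = screen_columns(sample)
for check, findings in report.items():
    print(check, findings)
```

With hundreds of columns the pairwise comparisons remain manageable, but on many rows a dataframe library would likely be more practical; pandas, for instance, provides `DataFrame.isna`, `DataFrame.nunique`, and `DataFrame.corr` for the same checks.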



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).