Know your data

What’s my problem?

I don’t think data science is an art; at its core it is a science. Yet for a long time I could not find much guidance on practical principles for EDA (exploratory data analysis). There are plenty of helpful code snippets and statistics books to consult, but on real projects they were hard to apply, or offered too many choices.

A temporary solution

Since I have not found a single best solution, I combined several approaches into something that looks like a “practical” one. Here it is:

Reading from top to bottom, the second layer shows the six steps to complete before modeling. The layers below it list some possible methods for each step. I didn’t list everything, since too much information would only make things complicated. For feature selection, there is a useful tool (linked in the references). However, coding it yourself is not hard either.
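To give a feel for what hand-coded feature selection can look like, here is a minimal sketch using pandas. It is not the API of the tool linked in the references. The function name `find_features_to_drop`, the two thresholds, and the toy data are all my own assumptions, chosen only for illustration. It flags columns that are mostly missing, columns that hold a single value, and numeric columns that are almost perfectly correlated with an earlier column.

```python
import numpy as np
import pandas as pd


def find_features_to_drop(df, missing_threshold=0.6, correlation_threshold=0.95):
    """Return candidate columns to drop, grouped by reason.

    Hypothetical helper: the name and default thresholds are illustrative
    assumptions, not taken from any particular library.
    """
    # Columns whose fraction of missing values exceeds the threshold.
    missing_fraction = df.isna().mean()
    missing = missing_fraction[missing_fraction > missing_threshold].index.tolist()

    # Columns with at most one distinct non-missing value.
    unique_counts = df.nunique(dropna=True)
    single_value = unique_counts[unique_counts <= 1].index.tolist()

    # Absolute pairwise correlations between numeric columns. Keeping only
    # the upper triangle means each pair is inspected once, and the later
    # column of a highly correlated pair is the one that gets flagged.
    corr = df.select_dtypes(include="number").corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    collinear = [
        col for col in upper.columns
        if (upper[col] > correlation_threshold).any()
    ]

    return {"missing": missing, "single_value": single_value, "collinear": collinear}


# Toy data, made up purely to exercise the three checks.
toy = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10.1],                    # nearly a multiple of "a"
    "c": [1, 1, 1, 1, 1],                       # constant
    "d": [np.nan, np.nan, np.nan, np.nan, 1.0],  # mostly missing
    "e": [5, 3, 8, 1, 2],
})
result = find_features_to_drop(toy)
print(result)
```

The function only reports candidates. Whether to actually drop them is still a judgment call that depends on the data and the model.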

For reference, here are the sources I drew on for this plot:

Comprehensive data exploration with Python: https://www.kaggle.com/pmarcelino/comprehensive-data-exploration-with-python

A Feature Selection Tool for Machine Learning in Python: https://towardsdatascience.com/a-feature-selection-tool-for-machine-learning-in-python-b64dd23710f0

Data Mining: Concepts and Techniques, 3rd ed.: http://hanj.cs.illinois.edu/bk3/
