Automated Feature Engineering in Python

Automated Feature Engineering in Python

5 years ago
Anonymous $2WKDXfy9lA

https://towardsdatascience.com/automated-feature-engineering-in-python-99baf11cc219

Machine learning is increasingly moving from hand-designed models to automatically optimized pipelines using tools such as H20, TPOT, and auto-sklearn. These libraries, along with methods such as random search, aim to simplify the model selection and tuning parts of machine learning by finding the best model for a dataset with little to no manual intervention. However, feature engineering, an arguably more valuable aspect of the machine learning pipeline, remains almost entirely a human labor.

Feature engineering, also known as feature creation, is the process of constructing new features from existing data to train a machine learning model. This step can be more important than the actual model used because a machine learning algorithm only learns from the data we give it, and creating features that are relevant to a task is absolutely crucial (see the excellent paper “A Few Useful Things to Know about Machine Learning”).