Today I'm going to share a powerful Python library with you: AutoGluon.
https://github.com/autogluon/autogluon
AutoGluon is an AutoML toolkit for deep learning that automates end-to-end machine learning tasks, letting you achieve strong predictive performance in your applications with just a few lines of code.
Getting started
Installing AutoGluon
We can install it directly with pip.
pip install autogluon
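If you only need the tabular functionality used in this article, AutoGluon can also be installed module by module. The exact extras available may vary by AutoGluon version, so treat this as an optional variant rather than the required command:
# Full installation (all modules)
pip install autogluon
# Tabular-only installation (smaller footprint); the [all] extra may differ between versions
pip install autogluon.tabular[all]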
Loading datasets
You can load datasets using TabularDataset.
from autogluon.tabular import TabularDataset, TabularPredictor
train_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')
test_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv')
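A TabularDataset behaves like a pandas DataFrame, so the usual pandas methods work for a quick sanity check. A minimal sketch (the "class" column name comes from the Inc dataset loaded above):
train_data.head()                      # preview the first few rows
train_data['class'].value_counts()     # check the label distribution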
Model building
To build the model, you first specify the evaluation metric, the target (label) column, and the directory where results will be stored.
In the following example, we use f1 as the evaluation metric, "class" as the label column, and store the trained models in the "output_models" folder.
evaluation_metric = "f1"
data_label = "class"
save_path = "output_models"

# Create the predictor
predictor = TabularPredictor(label=data_label, path=save_path, eval_metric=evaluation_metric)

# Fit models; time_limit (in seconds) is optional, e.g. time_limit=120 stops training after 2 minutes
predictor.fit(train_data, time_limit=120)

# Compare all trained models on the test set
leaderboard = predictor.leaderboard(test_data, silent=True)
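After fitting, the predictor can make predictions on new data and report its scores on the labeled test set. A minimal sketch using the objects defined above (dropping the label column before predicting is only for illustration, it is not strictly required):
# Predict the label for the test set
y_pred = predictor.predict(test_data.drop(columns=[data_label]))
print(y_pred.head())

# Score the predictor on the labeled test data
print(predictor.evaluate(test_data))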
The leaderboard shown in the following figure lets you see every model AutoGluon tried and the score each one achieved.
Now let's take a look at feature importance.
X = train_data
predictor.feature_importance(X)
All trained models are stored in the output folder "output_models".
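Since everything is persisted to that folder, the trained predictor can be reloaded in a later session without refitting. A minimal sketch, assuming the same save_path as above:
# Reload the trained predictor from disk
predictor = TabularPredictor.load(save_path)
predictor.predict(test_data)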