AzureML Studio : Automated ML

In this notebook, we use AzureML Studio's Automated ML (AutoML) to automatically select the best model within given time and compute constraints.

The AutoML process trains many candidate pipelines (featurization + model) and ranks them by the chosen primary metric, within the configured time and compute budget.

The experiment is visible in the AzureML Studio : oc-p7-automated-ml

We will compare the resulting pre-trained models, evaluated locally, to the baseline model from 1_baseline.ipynb.

AutoML model : max 1h training on CPU

In this run, we did not include DNN models in the AutoML process because they require GPU resources.
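
For reference, here is a minimal sketch of how such a run can be configured with the AzureML Python SDK (v1). The compute target, dataset and label column names are assumptions for illustration, not the exact values used in this project:

```python
from azureml.core import Workspace, Experiment, Dataset
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()                       # uses the local config.json
experiment = Experiment(ws, "oc-p7-automated-ml")  # experiment shown in the Studio

# Hypothetical registered dataset and label column names.
train_dataset = Dataset.get_by_name(ws, "tweets_train")

automl_config = AutoMLConfig(
    task="classification",
    primary_metric="AUC_weighted",
    training_data=train_dataset,
    label_column_name="label",
    compute_target="cpu-cluster",      # hypothetical CPU compute target
    experiment_timeout_hours=1,        # max 1h of training
    enable_dnn=False,                  # no DNN models on CPU
    enable_early_stopping=True,
)

run = experiment.submit(automl_config, show_output=True)
```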

This AutoML run is available in the AzureML Studio : automl_1h-cpu

Here are the models that were trained in the AutoML process :

[Figure: AzureML Automated ML, 1h on CPU, list of trained models]

Best model

The best model is an XGBoostClassifier (a wrapper around XGBClassifier) with MaxAbsScaler.

[Figure: best model details]
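
The best pipeline can be retrieved and inspected locally with the SDK; a sketch, assuming `run` is the completed AutoML run from above:

```python
# Retrieve the best child run and its fitted scikit-learn pipeline.
best_run, fitted_model = run.get_output()

print(best_run.get_metrics())   # metrics of the best run (AUC, AP, ...)
print(fitted_model.steps)       # e.g. MaxAbsScaler followed by XGBoostClassifier

# Optionally persist the pipeline for local comparison with the baseline model.
import joblib
joblib.dump(fitted_model, "automl_1h_cpu_best_model.pkl")
```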

Results

[Figures: Confusion Matrix, Precision-Recall Curve (AP = 0.79), ROC Curve (AUC = 0.80)]

The performance on the dataset is markedly better than that of our baseline model:

Unlike our baseline model, this model is fairly balanced, only slightly biased towards the POSITIVE class: it predicted 9% more POSITIVE (78403) messages than NEGATIVE (65597), compared to 35% for the baseline, a 74% reduction in bias.
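
These figures can be reproduced locally from the fitted pipeline; a sketch, assuming `X_test` / `y_test` are the same held-out split as in 1_baseline.ipynb and that the POSITIVE class is encoded as 1:

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score, confusion_matrix

y_proba = fitted_model.predict_proba(X_test)[:, 1]
y_pred = fitted_model.predict(X_test)

print("AP :", average_precision_score(y_test, y_proba))
print("AUC:", roc_auc_score(y_test, y_proba))
print(confusion_matrix(y_test, y_pred))

# Class-balance check: difference between POSITIVE and NEGATIVE predictions,
# expressed as a share of all predictions (the bias quoted above).
n_pos = int(np.sum(y_pred == 1))
n_neg = int(np.sum(y_pred == 0))
print(f"POSITIVE: {n_pos}  NEGATIVE: {n_neg}  bias: {(n_pos - n_neg) / len(y_pred):.1%}")
```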

AutoML model : max 10h training on GPU

This AutoML run is available in the AzureML Studio : automl_10h-gpu

Here are the models that were trained in the AutoML process :

[Figure: AzureML Automated ML, 10h on GPU, list of trained models]

Best model

The best model is a LightGBM classifier with MaxAbsScaler.

This pipeline adds a pre-processing step, PretrainedTextDNNTransformer, which fine-tunes a pre-trained BERT model and uses it to featurize the text before the actual classification model is trained.
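
A sketch of the corresponding configuration, again with hypothetical compute and dataset names; setting `enable_dnn=True` is what allows AutoML to include the BERT-based text featurization:

```python
from azureml.train.automl import AutoMLConfig

automl_config_dnn = AutoMLConfig(
    task="classification",
    primary_metric="AUC_weighted",
    training_data=train_dataset,        # same registered dataset as before
    label_column_name="label",
    compute_target="gpu-cluster",       # hypothetical GPU compute target
    experiment_timeout_hours=10,        # max 10h of training
    enable_dnn=True,                    # allow DNN-based text featurization (BERT)
    enable_early_stopping=True,
)

run_dnn = experiment.submit(automl_config_dnn, show_output=True)
```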

[Figure: best model details]

Results

[Figures: Confusion Matrix, Precision-Recall Curve (AP = 0.942), ROC Curve (AUC = 0.942)]

The performance on the dataset is markedly better than that of our previous model:

Like our previous model, this model is well balanced, this time very slightly biased towards the NEGATIVE class: it predicted only 1.4% more NEGATIVE (128909) messages than POSITIVE (127091), compared to 35% for the baseline, a 96% reduction in bias.