Applying AutoML (Part 2) with MLBox

Advantages

Disadvantages

Preprocessing:

mlbox.preprocessing.Reader(sep=None, header=0, to_hdf5=False, to_path='save', verbose=True)

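The Reader also provides two methods:

clean(path, drop_duplicate=False) reads and cleans a single file.
train_test_split(Lpath, target_name) reads and cleans the files listed in Lpath and builds the train and test sets from them, using target_name as the target column.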

Drift Thresholding:

dft = Drift_thresholder()
df = dft.fit_transform(df)

Encoding:

→ Missing values

mlbox.encoding.NA_encoder(numerical_strategy='mean', categorical_strategy='<NULL>')

→ Categorical features

mlbox.encoding.Categorical_encoder(strategy='label_encoding', verbose=False)

Encodes categorical features.
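To make the two encoding steps concrete, here is a minimal pure-Python sketch of mean imputation and label encoding. It is not MLBox's implementation: the helper names impute_mean and label_encode and the sample values are assumptions made for illustration, and MLBox may assign integer labels in a different order.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def label_encode(values):
    """Map each distinct category to an integer, in order of first appearance."""
    mapping = {}
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping)
    return [mapping[v] for v in values], mapping


# Hypothetical sample columns, loosely modelled on a Titanic-style dataset.
ages = [22.0, None, 26.0, 36.0]
ports = ["S", "C", "S", "Q"]

imputed_ages = impute_mean(ages)                   # missing age filled with 28.0
encoded_ports, port_mapping = label_encode(ports)  # S -> 0, C -> 1, Q -> 2
```

In MLBox these steps are configured through the numerical_strategy and strategy arguments shown above rather than written by hand.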

Model

→ Classification

mlbox.model.classification.Clf_feature_selector(strategy='l1', threshold=0.3)
mlbox.model.classification.Classifier(**params)
mlbox.model.classification.StackingClassifier()

→ Regression

mlbox.model.regression.Reg_feature_selector(strategy='l1', threshold=0.3)
mlbox.model.regression.Regressor(**params)

Optimization

mlbox.optimisation.Optimiser(scoring=None, n_folds=2, random_state=1, to_path='save', verbose=True)

Prediction

mlbox.prediction.Predictor(to_path='save', verbose=True)

Python implementation of MLBox

1. Installing the Package:

!pip install mlbox

2. Importing the modules from MLBox

from mlbox.preprocessing import *
from mlbox.optimisation import *
from mlbox.prediction import *

3. Specifying the path and target name

paths = ["/contents/train.csv","/contents/test.csv"]
target_name = "Survived"

4. Preprocessing and splitting our data

rd = Reader(sep=",")
df = rd.train_test_split(paths, target_name)

5. Drift thresholding our data to drop features whose distribution differs between the train and test sets

dft = Drift_thresholder()
df = dft.fit_transform(df)
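The idea behind the drift thresholding step can be sketched in plain Python: score each feature by how well its values alone separate train rows from test rows, then drop the features that score too high. This is an illustrative approximation, not MLBox's code. The function names, the AUC-based score, the sample columns, and the 0.6 threshold used here are all assumptions.

```python
def auc(train_values, test_values):
    """Probability that a random test value exceeds a random train value.

    Ties count as half a win, so identical distributions give 0.5.
    """
    wins = 0.0
    for t in test_values:
        for r in train_values:
            if t > r:
                wins += 1.0
            elif t == r:
                wins += 0.5
    return wins / (len(train_values) * len(test_values))


def drift_score(train_values, test_values):
    """Rescale the AUC to [0, 1]: 0 means no drift, 1 means full separation."""
    return 2.0 * abs(auc(train_values, test_values) - 0.5)


def keep_stable_features(train, test, threshold):
    """Return the names of features whose drift score does not exceed threshold."""
    return [
        name for name in train
        if drift_score(train[name], test[name]) <= threshold
    ]


# Hypothetical columns: "Age" looks the same in both sets, "Id" does not.
train = {"Age": [1, 2, 3, 4], "Id": [1, 2, 3, 4]}
test = {"Age": [1, 2, 3, 4], "Id": [10, 11, 12, 13]}

kept = keep_stable_features(train, test, threshold=0.6)  # illustrative threshold
```

An identifier column like "Id" separates train from test perfectly, so it is the kind of feature this step is meant to discard.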

6. Initializing the optimizer

# Hyperparameter tuning
opt = Optimiser(scoring="accuracy", n_folds=5)
space = {
    'est__strategy': {"search": "choice", "space": ["LightGBM"]},
    'est__n_estimators': {"search": "choice", "space": [150]},
    'est__colsample_bytree': {"search": "uniform", "space": [0.8, 0.95]},
    'est__subsample': {"search": "uniform", "space": [0.8, 0.95]},
    'est__max_depth': {"search": "choice", "space": [5, 6, 7, 8, 9]},
    'est__learning_rate': {"search": "choice", "space": [0.07]}
}
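To clarify what the "choice" and "uniform" entries mean, the sketch below draws one candidate configuration from a space written in the same format. It only illustrates how such a dictionary can be read and is not how MLBox performs its search. The sample_space helper and the trimmed example_space are assumptions made for this example.

```python
import random


def sample_space(space, rng):
    """Draw one value per hyperparameter from a search-space dictionary."""
    params = {}
    for name, spec in space.items():
        if spec["search"] == "choice":
            # Pick one of the listed values.
            params[name] = rng.choice(spec["space"])
        elif spec["search"] == "uniform":
            # Draw a float between the two bounds.
            low, high = spec["space"]
            params[name] = rng.uniform(low, high)
        else:
            raise ValueError("unsupported search type: " + spec["search"])
    return params


# A trimmed copy of the space used in this article.
example_space = {
    "est__strategy": {"search": "choice", "space": ["LightGBM"]},
    "est__subsample": {"search": "uniform", "space": [0.8, 0.95]},
    "est__max_depth": {"search": "choice", "space": [5, 6, 7, 8, 9]},
}

rng = random.Random(0)  # fixed seed so the draw is repeatable
candidate = sample_space(example_space, rng)
```

A "choice" entry with a single value, such as est__strategy here, pins that hyperparameter, so only the remaining entries actually vary during tuning.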

7. Optimizing

# The third argument is the maximum number of evaluations (max_evals)
params = opt.optimise(space, df, 15)

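8. Prediction

# Fit the tuned pipeline on the train set and predict on the test set;
# outputs are written to the to_path directory ('save' by default)
prd = Predictor()
prd.fit_predict(params, df)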

Visit us on https://www.insaid.co/

INSAID

One of India’s leading institutions providing world-class Data Science & AI programs for working professionals with a mission to groom Data leaders of tomorrow!