5 Tuning your model
In the last chapter, we mentioned that models have some bells and whistles you can tune to improve the learning process. These tunable parts of a model are called hyperparameters in the machine learning literature. They differ from the model's parameters in that parameters are learned from the data and represent patterns in it, whereas hyperparameters are set by the machine learning developer and control the model's architecture and learning process. Since the developer sets the hyperparameters, the values that maximize model performance are often found by trial and error. However, running a model several times with different combinations of hyperparameters and tracking the results manually gets tedious and error-prone, so we use semi-automated methods for finding good hyperparameters.
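To make the distinction concrete, here is a tiny sketch (a toy gradient-descent fit, not the book's model): the weight is a parameter because it is learned from the data, while the learning rate and the number of steps are hyperparameters because we choose them by hand.

```julia
# Toy example: fit y = w * x by gradient descent on squared error.
xs = [1.0, 2.0, 3.0]
ys = 2.0 .* xs              # data generated with true weight w = 2
lr, steps = 0.05, 200       # hyperparameters: chosen by the developer
w = 0.0                     # parameter: learned from the data below
for _ in 1:steps
    grad = sum(2 .* (w .* xs .- ys) .* xs) / length(xs)
    global w -= lr * grad
end
w  # converges to ≈ 2.0
```

Changing lr or steps changes how (and whether) w is learned, but w itself always comes from the data.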
5.1 Grid Search
One of the simplest and most widely used semi-automated methods for finding good hyperparameter values is Grid Search. In a grid search, we pass a list of candidate values for each hyperparameter. The grid search algorithm then runs our model with every combination of the given values and stores all the results. It also keeps the combination that gave the best result separately, as the best model (estimator).
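Conceptually, a grid search is just nested loops over the candidate values. A minimal sketch with a made-up scoring function (toy_score is a hypothetical stand-in for "train the model and return a validation score"):

```julia
grid = Dict(:solver => ["newton-cg", "lbfgs", "liblinear"],
            :C => [0.01, 0.1, 0.5, 0.9])

toy_score(solver, C) = length(solver) + C   # hypothetical stand-in scorer

best_score, best_params = -Inf, nothing
for solver in grid[:solver], C in grid[:C]  # 3 × 4 = 12 combinations
    s = toy_score(solver, C)
    if s > best_score
        global best_score, best_params = s, (solver = solver, C = C)
    end
end
best_params  # the highest-scoring combination is kept as the "best"
```

Note that the grid above has 3 × 4 = 12 combinations, which is why the results table later in this chapter has 12 rows.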
Finding available hyperparameters
The list of hyperparameters for your model can be found on Scikit-learn's documentation page. A simple Google search of "your model name + scikit-learn" will take you to the correct documentation page. For our simple logistic model, you can search logistic classifier scikit-learn and it will take you to this page. On the model's documentation page, all arguments listed under the Parameters section are hyperparameters of the model that you can play around with.
Once we identify the hyperparameters we are interested in and the values we want to check, we call the GridSearchCV function from the GridSearch module in scikit-learn.

using ScikitLearn.GridSearch: GridSearchCV
@sk_import linear_model: LogisticRegression;

gridsearch = GridSearchCV(LogisticRegression(),
                          Dict(:solver => ["newton-cg", "lbfgs", "liblinear"],
                               :C => [0.01, 0.1, 0.5, 0.9]));

- The hyperparameters and their values are passed as a dictionary. The :solver hyperparameter selects the learning algorithm, and :C is a regularization constant.
Once we have initialized the grid search model object with the values we are interested in, we can call the fit! function to start the training process.
fit!(gridsearch, features_train, target_train);
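GridSearchCV evaluates each hyperparameter combination with cross-validation, which is why each row of the results below carries three cv_validation_scores. Note that mean_validation_score is not quite the plain average of the three fold scores; with unequal fold sizes the reported mean is weighted by fold size. A sketch using the first row of the results table (the fold sizes 78/77/77 are an assumption that happens to reproduce the reported 0.806034):

```julia
using Statistics
fold_scores = [0.782051, 0.844156, 0.792208]   # first row of the results below
simple_mean = mean(fold_scores)                # ≈ 0.806138, not the reported value
fold_sizes  = [78, 77, 77]                     # assumed (unequal) fold sizes
weighted    = sum(fold_scores .* fold_sizes) / sum(fold_sizes)  # ≈ 0.806034
```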
The results of the grid search are stored in the grid_scores_ field of the gridsearch model object.
search_results = DataFrame(gridsearch.grid_scores_)
hcat(DataFrame(search_results.parameters),
     search_results)[!, Not(:parameters)]
12 rows × 4 columns

| | solver | C | mean_validation_score | cv_validation_scores |
|---|---|---|---|---|
| | String | Float64 | Float64 | Array… |
| 1 | newton-cg | 0.01 | 0.806034 | [0.782051, 0.844156, 0.792208] |
| 2 | lbfgs | 0.01 | 0.806034 | [0.782051, 0.844156, 0.792208] |
| 3 | liblinear | 0.01 | 0.75 | [0.730769, 0.766234, 0.753247] |
| 4 | newton-cg | 0.1 | 0.814655 | [0.807692, 0.831169, 0.805195] |
| 5 | lbfgs | 0.1 | 0.814655 | [0.807692, 0.831169, 0.805195] |
| 6 | liblinear | 0.1 | 0.767241 | [0.717949, 0.779221, 0.805195] |
| 7 | newton-cg | 0.5 | 0.814655 | [0.807692, 0.831169, 0.805195] |
| 8 | lbfgs | 0.5 | 0.814655 | [0.807692, 0.831169, 0.805195] |
| 9 | liblinear | 0.5 | 0.775862 | [0.730769, 0.805195, 0.792208] |
| 10 | newton-cg | 0.9 | 0.810345 | [0.807692, 0.818182, 0.805195] |
| 11 | lbfgs | 0.9 | 0.810345 | [0.807692, 0.818182, 0.805195] |
| 12 | liblinear | 0.9 | 0.784483 | [0.75641, 0.805195, 0.792208] |
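Rather than scanning the table by eye, you can also rank the combinations by score. A minimal plain-Julia sketch using a few rows copied from the table above (with the actual dataframe you could instead use sort with the :mean_validation_score column):

```julia
# Three (solver, C, score) rows taken from the results table above.
rows = [(solver = "newton-cg", C = 0.1,  score = 0.814655),
        (solver = "liblinear", C = 0.01, score = 0.75),
        (solver = "lbfgs",     C = 0.9,  score = 0.810345)]
ranked = sort(rows; by = r -> r.score, rev = true)  # best score first
ranked[1]  # (solver = "newton-cg", C = 0.1, score = 0.814655)
```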
- The first line converts the grid search results into a dataframe, and the second line reshapes the dataframe into a more readable form.
The best model is stored in the best_estimator_ field of the gridsearch model object.

best_model = gridsearch.best_estimator_
PyObject LogisticRegression(C=0.1, solver='newton-cg')
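Besides best_estimator_, the fitted grid search object also records the winning hyperparameter combination and its score. The field names below come from ScikitLearn.jl's GridSearchCV (mirroring scikit-learn's attributes); the example values shown are read off the results table above:

```julia
# Assuming the fitted `gridsearch` object from above:
gridsearch.best_params_  # e.g. Dict(:solver => "newton-cg", :C => 0.1)
gridsearch.best_score_   # e.g. 0.814655
```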
We can now use the best_model object the way we used simplelogistic for predictions and other tasks.

best_model_predictions = predict(best_model, features_train);
first(best_model_predictions, 4)
4-element Vector{Any}:
"No"
"Yes"
"No"
"No"
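To judge how good these predictions are, compare them with the true labels. A minimal sketch with made-up labels (in the real workflow you would compare best_model_predictions against target_train):

```julia
using Statistics
predictions = ["No", "Yes", "No", "No"]   # the four predictions shown above
truth       = ["No", "No",  "No", "No"]   # hypothetical true labels
accuracy = mean(predictions .== truth)    # fraction of correct predictions
```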
Code Summary for Chapter 5
using ScikitLearn.GridSearch: GridSearchCV
@sk_import linear_model: LogisticRegression;
gridsearch = GridSearchCV(LogisticRegression(),
                          Dict(:solver => ["newton-cg", "lbfgs", "liblinear"],
                               :C => [0.01, 0.1, 0.5, 0.9]));
# Training the model with candidate hyperparameters
fit!(gridsearch, features_train, target_train);
# Cleaning the grid search results and printing
# them as dataframes
search_results = DataFrame(gridsearch.grid_scores_)
hcat(DataFrame(search_results.parameters),
     search_results)[!, Not(:parameters)]
# Extracting the best model from grid search
best_model = gridsearch.best_estimator_
# Making predictions using the best model
# Making predictions using the best model
best_model_predictions = predict(best_model, features_train);
first(best_model_predictions, 4)