```
using ScikitLearn.GridSearch: GridSearchCV
@sk_import linear_model: LogisticRegression;
gridsearch = GridSearchCV(LogisticRegression(),
                          Dict(:solver => ["newton-cg", "lbfgs", "liblinear"],
                               :C => [0.01, 0.1, 0.5, 0.9]));
```

# 5 Tuning your model

In the last chapter, we mentioned that models have some bells and whistles that you can tune to improve a model’s learning process. These tunable parts of the model are called **hyperparameters** in machine learning literature. They differ from the parameters of the model: model parameters are learned from the data and represent patterns in the data, whereas hyperparameters are set by the machine learning developer and control the model’s architecture and learning process. Since the developer sets the hyperparameters, the values that maximize model performance are often found by trial and error. However, running a model several times with different combinations of hyperparameters and tracking the results manually gets tedious and error-prone, so we use semi-automated methods of finding good hyperparameter values.
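To make the distinction concrete, here is a toy sketch in plain Julia (independent of the chapter’s Scikit-learn model): fitting a line by gradient descent. The slope and intercept are *parameters* learned from the data, while `learning_rate` and `n_iterations` are *hyperparameters* we pick before training ever starts.

```
function fit_line(xs, ys; learning_rate = 0.05, n_iterations = 5_000)
    # Parameters: learned from the data by the training loop
    slope, intercept = 0.0, 0.0
    for _ in 1:n_iterations
        errs = slope .* xs .+ intercept .- ys
        # Gradient descent step on the mean squared error
        slope     -= learning_rate * 2 * sum(errs .* xs) / length(xs)
        intercept -= learning_rate * 2 * sum(errs)       / length(xs)
    end
    return slope, intercept
end

# Toy data generated from y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = 2 .* xs .+ 1

# Hyperparameters are chosen by us, as keyword arguments;
# the recovered slope and intercept come from the data
slope, intercept = fit_line(xs, ys; learning_rate = 0.05, n_iterations = 5_000)
```

Change the data and the learned slope and intercept change with it; change `learning_rate` or `n_iterations` and you change how the learning itself behaves, which is exactly what tuning is about.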

## 5.1 Grid Search

One of the simplest and most often used semi-automated methods for finding optimal hyperparameter values is the Grid Search method. In a grid search, we pass lists of values that we think are good candidates for each hyperparameter. The grid search algorithm then runs our model with every combination of the given values and stores all the results. The algorithm also stores the combination that gave the best result separately as the best model (estimator).
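The mechanics can be sketched in a few lines of plain Julia. This is a toy version, independent of ScikitLearn: the `score` function here is made up and simply stands in for "train the model with these values and return a validation score".

```
# Candidate values for two hyperparameters, as in the chapter's example
grid = Dict(:solver => ["newton-cg", "lbfgs", "liblinear"],
            :C      => [0.01, 0.1, 0.5, 0.9])

# Stand-in for training + validation; a real grid search would
# fit the model and cross-validate it here
score(solver, C) = (solver == "newton-cg" ? 0.8 : 0.75) - abs(C - 0.1)

# Try every combination and record its score
results = []
for solver in grid[:solver], C in grid[:C]
    push!(results, (solver = solver, C = C, score = score(solver, C)))
end

# Keep the combination with the best score as the "best model"
best = results[argmax([r.score for r in results])]
```

This is all `GridSearchCV` does conceptually, plus cross-validation at each combination; note that the number of runs is the product of the list lengths (here 3 × 4 = 12), so large grids get expensive quickly.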

## Finding available hyperparameters

The list of hyperparameters for your model can be found on Scikit-learn’s documentation pages. A simple Google search of “your model name + scikit-learn” will take you to the correct documentation page. For our simple logistic model, you can search for “logistic classifier scikit-learn” and it will take you to this page. On the model’s documentation page, all arguments/variables listed under the “Parameters” section are hyperparameters of the model that you can play around with.

Once we identify the hyperparameters we are interested in and the values we want to check, we call the `GridSearchCV` function from the `GridSearch` module in `scikit-learn`.

- The hyperparameters and their values are passed as a dictionary. `:solver` corresponds to the different learning algorithms, and the `:C` hyperparameter is a regularization constant.

Once we have initialized the grid search model object with the values we are interested in, we can call the `fit!` function to start the training process.

```
fit!(gridsearch, features_train, target_train);
```

The results of the grid search are stored in the `grid_scores_` field of the `gridsearch` model object.

```
search_results = DataFrame(gridsearch.grid_scores_)
hcat(DataFrame(search_results.parameters),
     search_results[!, Not(:parameters)])
```

12 rows × 4 columns

|    | solver    | C       | mean_validation_score | cv_validation_scores           |
|----|-----------|---------|-----------------------|--------------------------------|
|    | String    | Float64 | Float64               | Array…                         |
| 1  | newton-cg | 0.01    | 0.806034              | [0.782051, 0.844156, 0.792208] |
| 2  | lbfgs     | 0.01    | 0.806034              | [0.782051, 0.844156, 0.792208] |
| 3  | liblinear | 0.01    | 0.75                  | [0.730769, 0.766234, 0.753247] |
| 4  | newton-cg | 0.1     | 0.814655              | [0.807692, 0.831169, 0.805195] |
| 5  | lbfgs     | 0.1     | 0.814655              | [0.807692, 0.831169, 0.805195] |
| 6  | liblinear | 0.1     | 0.767241              | [0.717949, 0.779221, 0.805195] |
| 7  | newton-cg | 0.5     | 0.814655              | [0.807692, 0.831169, 0.805195] |
| 8  | lbfgs     | 0.5     | 0.814655              | [0.807692, 0.831169, 0.805195] |
| 9  | liblinear | 0.5     | 0.775862              | [0.730769, 0.805195, 0.792208] |
| 10 | newton-cg | 0.9     | 0.810345              | [0.807692, 0.818182, 0.805195] |
| 11 | lbfgs     | 0.9     | 0.810345              | [0.807692, 0.818182, 0.805195] |
| 12 | liblinear | 0.9     | 0.784483              | [0.75641, 0.805195, 0.792208]  |

- The first line converts the grid search results into a dataframe and the second line cleans the dataframe into a more readable form.

The best model is stored in the `best_estimator_` field of the `gridsearch` model object.

```
best_model = gridsearch.best_estimator_
```

```
PyObject LogisticRegression(C=0.1, solver='newton-cg')
```

We can now use the `best_model` object the way we used `simplelogistic` for predictions and other tasks.

```
best_model_predictions = predict(best_model, features_train);
first(best_model_predictions, 4)
```

```
4-element Vector{Any}:
"No"
"Yes"
"No"
"No"
```

### Code Summary for Chapter 5

```
using ScikitLearn.GridSearch: GridSearchCV
@sk_import linear_model: LogisticRegression;
gridsearch = GridSearchCV(LogisticRegression(),
                          Dict(:solver => ["newton-cg", "lbfgs", "liblinear"],
                               :C => [0.01, 0.1, 0.5, 0.9]));
# Training the model with candidate hyperparameters
fit!(gridsearch, features_train, target_train);
# Cleaning the grid search results and printing
# them as dataframes
search_results = DataFrame(gridsearch.grid_scores_)
hcat(DataFrame(search_results.parameters),
     search_results[!, Not(:parameters)])
# Extracting the best model from grid search
best_model = gridsearch.best_estimator_
# Making predictions using the best model
best_model_predictions = predict(best_model, features_train);
first(best_model_predictions, 4)
```