There are many hyperparameters that, when adjusted, can change the accuracy of the model. The following tool is a Base Estimator wrapper for the NIML model. Three search types are available to aid in selecting a good model: Randomized Search Cross-Validation, Grid Search Cross-Validation, and Bayes Search Cross-Validation. When running one of these, the user defines the scoring, the number of cross-validation jobs, the number of parallel jobs, the number of searches, and whether or not to include the training scores. A fit is then performed on the Base Estimator wrapper, given the training and test data, with the parameters selected by the search. The command-line output displays the best estimator's fit score and the parameters that led to that score, and a CSV is saved containing the parameters of the NPU for each fit, the time taken to perform the fit, and the resulting scores.
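As a rough sketch of this flow, the example below uses scikit-learn's RandomizedSearchCV with RandomForestClassifier standing in for the NIML Base Estimator wrapper (whose actual class is not shown here); the data, hyperparameter values, and output file name are illustrative only.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier  # stand-in for the NIML wrapper
from sklearn.model_selection import RandomizedSearchCV

# Toy data standing in for the user's training set.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# User-defined settings: scoring, cross-validation, parallel jobs,
# number of searches, and whether to include training scores.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": [50, 100, 200], "max_depth": [2, 4, 8]},
    scoring="accuracy",
    cv=5,
    n_jobs=-1,
    n_iter=5,
    return_train_score=True,
    random_state=0,
)
search.fit(X, y)

# Command-line output: the best estimator's fit score and its parameters.
print(f"Best score: {search.best_score_:.3f}")
print(f"Best parameters: {search.best_params_}")

# CSV output: per-candidate parameters, fit times, and scores.
pd.DataFrame(search.cv_results_).to_csv("search_results.csv", index=False)
```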
Table of Contents
Suggested Running Order
Randomized Search Cross-Validation
Grid Search Cross-Validation
Bayes Search Cross-Validation
Suggested Running Order
Since each of these search methods takes a varying amount of time and computational power, it is suggested that users run them in an order that finds a model most efficiently. The suggested order is to perform a Randomized Search first. This will yield a good base model; the user can then work to improve on that model by running a Grid Search over a finite set of hyperparameters, or attempt to optimize the model by using a Bayes Search.
Randomized Search Cross-Validation
See more documentation at https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html
This search method takes in a dictionary of hyperparameters and performs a randomized search over them without replacement. It is suggested to perform this search first in order to get a good base model. A demo for running this can be found at http://localhost:8891/notebooks/Skiml_RandomizedCV.ipynb
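A minimal sketch of setting up such a randomized search (again with a stand-in estimator, since the NIML wrapper is not shown here); values in the dictionary may be lists of candidates or distributions to sample from:

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier  # stand-in for the NIML wrapper
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Values may be lists of candidates or distributions; when every entry
# is a list, scikit-learn samples the combinations without replacement.
param_distributions = {
    "n_estimators": randint(50, 300),  # sampled from a distribution
    "max_depth": [2, 4, 8, None],      # sampled from a finite list
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=20,  # number of randomly chosen parameter settings
    cv=5,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```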
Grid Search Cross-Validation
See more documentation at https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html
This search method differs from Randomized Search Cross-Validation in that it is exhaustive: it takes the dictionary of hyperparameters provided by the user and evaluates every possible combination of their values. It is suggested to run this after establishing a good base model. A demo for running this can be found at http://localhost:8891/notebooks/Skiml_GridCV.ipynb
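A minimal sketch of the exhaustive grid (stand-in estimator as above); every combination of the listed values is cross-validated:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier  # stand-in for the NIML wrapper
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Every combination is evaluated: 3 x 3 = 9 candidates, each over 5 folds.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [2, 4, 8],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid=param_grid,
    cv=5,
    n_jobs=-1,
)
search.fit(X, y)
print(search.best_params_)
```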
Bayes Search Cross-Validation
See more documentation at https://scikit-optimize.github.io/stable/modules/generated/skopt.BayesSearchCV.html
This search method is used to optimize a model by performing a Bayes Search on it. Unlike Grid Search, only a fixed number of parameter settings is tried from the given dictionary of hyperparameters, with each new candidate chosen based on the results of previous ones. It is suggested that users perform this optimization search after finding a good model. There is no representative Jupyter notebook demo of this search method due to an import error regarding scikit-optimize.
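In lieu of a notebook demo, the sketch below shows what a Bayes Search might look like, assuming a working scikit-optimize installation (the import error mentioned above may prevent this from running) and a stand-in estimator for the NIML wrapper:

```python
from skopt import BayesSearchCV
from skopt.space import Categorical, Integer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier  # stand-in for the NIML wrapper

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Only n_iter parameter settings are tried; each new candidate is chosen
# based on the scores observed for previous ones.
search = BayesSearchCV(
    RandomForestClassifier(random_state=0),
    search_spaces={
        "n_estimators": Integer(50, 300),
        "max_depth": Categorical([2, 4, 8]),
    },
    n_iter=20,
    cv=5,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```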