Set up and optimize your machine learning pipeline in 5 lines of code.
We are using a very basic pipeline here:
We repeat the hyperparameter search three times using a KFold split in the outer cross-validation loop,
and we evaluate each configuration ten times with a KFold split in the inner cross-validation loop.
Hyperparameter search is done with our good old friend grid search. For each configuration, we are interested in the
accuracy, recall, and precision of the learning model. After all configurations have
been tested, we pick the best configuration as the one with the highest accuracy.
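As a rough sketch, this setup might look as follows with PHOTON's Hyperpipe; the class name, the import path photonai.base, and the constructor arguments optimizer, metrics, best_config_metric, outer_cv and inner_cv are assumptions here, since the text above does not spell them out:

```python
from sklearn.model_selection import KFold
from photonai.base import Hyperpipe  # assumed import path

# Three outer folds for the generalization estimate, ten inner folds for
# the hyperparameter search; grid search picks the configuration with the
# highest accuracy, while precision and recall are also recorded.
pipeline = Hyperpipe('basic_svm_pipe',
                     optimizer='grid_search',
                     metrics=['accuracy', 'precision', 'recall'],
                     best_config_metric='accuracy',
                     outer_cv=KFold(n_splits=3),
                     inner_cv=KFold(n_splits=10))
```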
Our pipeline has two preprocessing steps: we add a standard scaler for normalization and a
PCA for reducing the dimensionality of the feature space.
For both of them, we specify some hyperparameter values we want to test. Finally, we add a support vector machine
for classification, again specifying a few hyperparameter options to explore.
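Continuing the sketch above, the two preprocessing steps and the classifier could be appended as PipelineElements; the concrete hyperparameter values shown here (PCA components, SVM kernel and C) are illustrative assumptions, not the values used above:

```python
from photonai.base import PipelineElement               # assumed import path
from photonai.optimization import Categorical, IntegerRange

# pipeline is the Hyperpipe created in the previous sketch.
pipeline += PipelineElement('StandardScaler')           # normalization
pipeline += PipelineElement('PCA',                      # feature space reduction
                            hyperparameters={'n_components': IntegerRange(5, 20)})
pipeline += PipelineElement('SVC',                      # support vector classifier
                            hyperparameters={'kernel': Categorical(['rbf', 'linear']),
                                             'C': Categorical([0.5, 1.0, 2.0])})
```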
With pipeline.fit(data, targets) you start the hyperparameter search, including the nested cross-validation. In the end,
PHOTON delivers the best-configured model for your data as well as an estimate of its generalization performance.
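A minimal end-to-end call might then look like this; load_breast_cancer is only an assumed stand-in for your own data and targets:

```python
from sklearn.datasets import load_breast_cancer

data, targets = load_breast_cancer(return_X_y=True)

# Runs grid search inside the nested cross-validation and keeps the
# best configuration found for the given data.
pipeline.fit(data, targets)
```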