Grid search for Stacked Ensemble

Description

This is the initial design for grid search over Stacked Ensembles. The search mostly covers the metalearner hyperparameters, but it could also include other Stacked Ensemble parameters such as metalearner_nfolds.

Python example:

from h2o.estimators.stackedensemble import H2OStackedEnsembleEstimator
from h2o.grid.grid_search import H2OGridSearch

# Proposed design, not an implemented API.
# Assumes x, y, train, my_gbm and my_rf are already defined.

metalearner_grid_params_gbm = {'max_depth': [2, 3, 4],
                               'col_sample_rate': [0.2, 0.5, 0.7]}

metalearner_grid_params_rf = {'ntrees': [200, 300, 400],
                              'col_sample_rate': [0.2, 0.5, 0.7]}

# set up SE grid, use hyper_params to pass a new value called metalearner_grid_params
grid = H2OGridSearch(model=H2OStackedEnsembleEstimator,
                     hyper_params={'metalearner_grid_params': [
                         {'algorithm': "GBM", 'params': metalearner_grid_params_gbm},
                         {'algorithm': "DRF", 'params': metalearner_grid_params_rf}]},
                     seed=1,  # this is the grid seed
                     search_criteria={'strategy': 'RandomDiscrete', 'max_models': 36})

grid.train(x=x, y=y, training_frame=train,
           seed=1,  # this is SE seed (not grid seed)
           base_models=[my_gbm, my_rf])  # pass along fixed SE params like base_models

# Single model (for comparison)
metalearner_gbm_params = {'max_depth': 2, 'col_sample_rate': 0.3}

ensemble = H2OStackedEnsembleEstimator(base_models=[my_gbm, my_rf],
                                       metalearner_algorithm="GBM",
                                       metalearner_params=metalearner_gbm_params)
ensemble.train(x=x, y=y, training_frame=train)
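To make the proposed semantics concrete, here is a minimal sketch in plain Python (no H2O calls) of how a nested metalearner_grid_params value might expand into individual Stacked Ensemble configurations. The helper names expand_metalearner_grid and sample_random_discrete are hypothetical and used only for illustration. Mapping each candidate onto metalearner_algorithm and metalearner_params is an assumption based on the single-model example above, not a description of how H2O implements grid search.

```python
import itertools
import random


def expand_metalearner_grid(grid_specs):
    """Expand a list of {'algorithm', 'params'} specs into flat candidates.

    Hypothetical helper: each candidate pairs one algorithm with one
    concrete combination of that algorithm's hyperparameter values.
    """
    candidates = []
    for spec in grid_specs:
        names = sorted(spec["params"])  # fixed order for reproducibility
        value_lists = [spec["params"][name] for name in names]
        # Cartesian product of the value lists for this algorithm only
        for combo in itertools.product(*value_lists):
            candidates.append({
                "metalearner_algorithm": spec["algorithm"],
                "metalearner_params": dict(zip(names, combo)),
            })
    return candidates


def sample_random_discrete(candidates, max_models, seed):
    """Pick at most max_models candidates without replacement (assumed
    to mirror a RandomDiscrete strategy with a grid-level seed)."""
    rng = random.Random(seed)
    k = min(max_models, len(candidates))
    return rng.sample(candidates, k)


grid_specs = [
    {"algorithm": "GBM",
     "params": {"max_depth": [2, 3, 4], "col_sample_rate": [0.2, 0.5, 0.7]}},
    {"algorithm": "DRF",
     "params": {"ntrees": [200, 300, 400], "col_sample_rate": [0.2, 0.5, 0.7]}},
]

candidates = expand_metalearner_grid(grid_specs)
selected = sample_random_discrete(candidates, max_models=36, seed=1)
print(len(candidates), len(selected))  # prints "18 18"
```

Under this reading, the two per-algorithm grids contribute 3 x 3 = 9 combinations each, so the example grid holds 18 candidates in total and max_models=36 would not actually limit the search.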

Assignee

Michal Kurka

Reporter

Erin LeDell

CustomerVisible

No

Priority

Major