H2O grid search should be applicable to pipeline estimators, as in sklearn

Description

Currently, H2O grid search tunes the hyperparameters of a single H2O estimator only. In practice, we often also need to tune hyperparameters of the data preprocessing steps.
In sklearn, we can build a pipeline from several transformers and an estimator, and grid search can then tune the transformers' hyperparameters together with the estimator's. It would be really good if H2O could do the same.
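For reference, the sklearn behavior meant here can be sketched as follows. This is a minimal illustration using only scikit-learn; the dataset, the step names (`scale`, `pca`, `clf`), and the grid values are assumptions chosen for the example and do not come from a real project.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used only to make the sketch runnable.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Two preprocessing steps followed by the final estimator.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# "<step>__<param>" addresses a parameter of a pipeline step, so the
# transformer (pca) and the estimator (clf) are tuned in the same search.
param_grid = {
    "pca__n_components": [2, 5],
    "clf__C": [0.1, 1.0],
}

# n_jobs=1 keeps the sketch simple; with pure scikit-learn steps it can be raised.
search = GridSearchCV(pipeline, param_grid, cv=3, n_jobs=1)
search.fit(X, y)

print(search.best_params_)
```

The request is for H2O grid search to accept an equivalent combined grid, so that preprocessing and model hyperparameters are searched together.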
I have tried a workaround: the H2O estimator and custom transformers can be added to an sklearn pipeline if the cv and scorer are rebuilt. However, the search cannot be run in parallel (n_jobs can only be 1), so training is very slow.
I hope you have an idea for how to implement this.

Assignee

New H2O Bugs

Fix versions

None

Reporter

zhongtian xiao

Support ticket URL

None

Labels

Affected Spark version

None

Customer Request Type

None

Task progress

None

CustomerVisible

No

Components

Affects versions

Priority

Major