Add build_tree_one_node to XGBoost

Description

DRF and GBM have an option build_tree_one_node: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/build_tree_one_node.html

We could add this option to XGBoost as well, which would allow users to run a small XGBoost model with the exact tree method (tree_method=exact) even on a distributed cluster. The easiest way is to rebalance the Frame and bring it into the memory of the local node before calling https://github.com/h2oai/h2o-3/blob/master/h2o-extensions/xgboost/src/main/java/hex/tree/xgboost/XGBoost.java#L323. This could be done the same way DeepLearning does it (for a different purpose): https://github.com/h2oai/h2o-3/blob/master/h2o-algos/src/main/java/hex/deeplearning/DeepLearning.java#L179 - simply rebalance the (small) dataset to a single chunk.
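To make the idea concrete, here is a minimal toy sketch of "rebalance to a single chunk", not H2O code. It assumes a frame can be modelled as a list of chunks, each owned by one node. The names Chunk, rebalance_to_single_chunk and can_train_exact are hypothetical and exist only for illustration; the real change would go through the rebalancing code linked above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """A contiguous block of rows stored on one cluster node (toy model)."""
    home_node: int
    rows: List[list]


def rebalance_to_single_chunk(chunks: List[Chunk], local_node: int) -> List[Chunk]:
    """Gather all rows, in their original order, into one chunk on local_node."""
    all_rows = [row for chunk in chunks for row in chunk.rows]
    return [Chunk(home_node=local_node, rows=all_rows)]


def can_train_exact(chunks: List[Chunk]) -> bool:
    """Toy guard: allow exact only when every chunk lives on a single node."""
    return len({chunk.home_node for chunk in chunks}) <= 1


# A small frame spread over three nodes (hypothetical data).
frame = [
    Chunk(0, [[1.0, 0], [2.0, 1]]),
    Chunk(1, [[3.0, 0]]),
    Chunk(2, [[4.0, 1], [5.0, 0]]),
]

build_tree_one_node = True  # the option proposed in this ticket
if build_tree_one_node:
    frame = rebalance_to_single_chunk(frame, local_node=0)
```

After the rebalance the whole dataset sits in one chunk on the local node, so the single-node precondition for exact holds even though the cluster itself has several nodes. This only makes sense for datasets small enough to fit in one node's memory.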

It is not always practical for our users to start a single-node cluster, so imho we should provide a way to run XGBoost with exact on a multi-node cluster. This will become more important once the parallel per-segment model building is implemented.

Assignee

Jan Sterba

Fix versions

Reporter

Jan Sterba

Support ticket URL

None

Labels

None

Affected Spark version

None

Customer Request Type

None

Task progress

None

CustomerVisible

No

Components

Priority

Major