GBM grid failing on some datasets

Description

GBM grid is failing with some datasets. Logs show it has to do with a deleted vec. This does not happen every time (the example from the user guide works fine), but I found a dataset that triggers the bug. The dataset is somewhat wide (~1200 columns), so I am not sure whether that has something to do with it.

The problem happens when you run AutoML (all GBMs are skipped, leading to a huge loss of performance), and it also happens when you execute a GBM grid manually.

Example in R:

Leaderboard is missing GBMs:

Logfile is attached and here's the error that we see:

Example of running grid manually:

Here's the error:

Assignee

Michal Kurka

Fix versions

Reporter

Erin LeDell

Support ticket URL

None

Labels

None

Affected Spark version

None

Customer Request Type

None

Task progress

None

CustomerVisible

No

Components

Affects versions

Priority

Blocker