AutoML: stopping_rounds isn't being propagated to some of the models, leading to overfitting

Description

When running AutoML on the Homesite dataset I noticed that some of the models on the leaderboard were severely overfitting the training data: their training AUC is much higher than their cross-validation (xval) AUC.
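
A minimal sketch of how the disparity shows up, assuming a generic binary-classification setup in place of Homesite (the file path and target column name below are hypothetical, not from this ticket):

{code:python}
import h2o
from h2o.automl import H2OAutoML

h2o.init()
train = h2o.import_file("homesite_train.csv")  # hypothetical path
y = "QuoteConversion_Flag"                     # hypothetical target column
x = [c for c in train.columns if c != y]
train[y] = train[y].asfactor()

# No leaderboard_frame: models are ranked on cross-validation metrics.
aml = H2OAutoML(max_models=10, stopping_rounds=3, seed=1)
aml.train(x=x, y=y, training_frame=train)

# Compare training AUC against xval AUC for each leaderboard model;
# a large gap flags the models where early stopping apparently
# never kicked in.
for mid in aml.leaderboard["model_id"].as_data_frame()["model_id"]:
    m = h2o.get_model(mid)
    print(mid, "train AUC:", m.auc(train=True), "xval AUC:", m.auc(xval=True))
{code}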

I suspect nobody noticed this before because we normally run with a holdout leaderboard_frame, whereas in this case I'm not splitting the training data into holdouts. With a holdout the disparity is more subtle and harder to spot.
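
One hedged way to confirm whether stopping_rounds actually reached each model is to inspect the parameter the model really used, via its params dict (this reuses the aml object from the sketch above; stacked ensembles have no stopping_rounds, which the .get() handles):

{code:python}
# For each leaderboard model, compare the actual stopping_rounds value
# against the default; a value stuck at the default suggests the
# AutoML-level setting never made it down to that model.
for mid in aml.leaderboard["model_id"].as_data_frame()["model_id"]:
    m = h2o.get_model(mid)
    sr = m.params.get("stopping_rounds", {})
    print(mid, "actual:", sr.get("actual"), "default:", sr.get("default"))
{code}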

Assignee

Raymond Peck

Fix versions

None

Reporter

Raymond Peck

Support ticket URL

None

Labels

None

Affected Spark version

None

Customer Request Type

None

Task progress

None

CustomerVisible

Yes

Priority

Blocker