Performance metrics when using balance_classes

Description

Motivation

When balance_classes is enabled for an H2O model, performance metrics calculated on the training data frame after training will not match the training metrics reported with the model.

An example is shown below:

Solution

When balance_classes is enabled, H2O builds the model on a balanced version of the training data frame, and the training metrics it reports are computed on that balanced version. Metrics calculated afterwards on the original, unbalanced training data frame will therefore differ.
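
The effect can be illustrated without H2O. The sketch below is a minimal, hypothetical example: the row counts, the fixed predictions, and the balancing by plain replication of minority rows are all assumptions chosen for illustration, not H2O's actual sampling procedure. It shows only that the same predictions yield a different precision once the class distribution of the evaluation data changes.

```python
def precision(rows):
    """Precision = TP / (TP + FP) over (label, prediction) pairs."""
    tp = sum(1 for label, pred in rows if label == 1 and pred == 1)
    fp = sum(1 for label, pred in rows if label == 0 and pred == 1)
    return tp / (tp + fp)


# Hypothetical unbalanced data: 100 positives and 900 negatives.
# The classifier flags 80 of the positives and 90 of the negatives.
positives = [(1, 1)] * 80 + [(1, 0)] * 20
negatives = [(0, 1)] * 90 + [(0, 0)] * 810
unbalanced = positives + negatives

# Assumed balancing scheme: replicate the minority class until both
# classes have the same number of rows (900 each).
factor = len(negatives) // len(positives)
balanced = positives * factor + negatives

precision_unbalanced = precision(unbalanced)
precision_balanced = precision(balanced)

print(f"precision on unbalanced data: {precision_unbalanced:.3f}")  # 0.471
print(f"precision on balanced data:   {precision_balanced:.3f}")    # 0.889
```

The per-row predictions are identical in both cases; only the class mix of the rows being scored differs.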

To determine the performance metrics on the unbalanced training data frame, use the following:
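
The original snippet is not preserved in this ticket, so what follows is a hedged sketch using the H2O Python API rather than the author's exact code. It assumes a binomial model, uses AUC as the metric, and picks GBM as the algorithm; the helper names compare_training_metrics and example_usage, the path and response arguments, and the seed are hypothetical. The idea is to pass the original frame to model_performance so the metrics are recomputed on the unbalanced data instead of being read from the values stored during training.

```python
def compare_training_metrics(model, train_frame):
    """Return (AUC reported during training, AUC recomputed on train_frame)."""
    # Metrics stored on the model at training time. With balance_classes=True
    # these reflect the balanced version of the training frame.
    reported = model.model_performance(train=True)
    # Metrics recomputed by scoring the original, unbalanced frame.
    recomputed = model.model_performance(test_data=train_frame)
    return reported.auc(), recomputed.auc()


def example_usage(path, response):
    """Hypothetical end-to-end usage; requires h2o and a running cluster."""
    import h2o
    from h2o.estimators import H2OGradientBoostingEstimator

    h2o.init()
    train = h2o.import_file(path)
    # The response must be categorical for a classification model.
    train[response] = train[response].asfactor()

    model = H2OGradientBoostingEstimator(balance_classes=True, seed=1234)
    model.train(y=response, training_frame=train)
    return compare_training_metrics(model, train)
```

Calling example_usage with the path to a training file and the name of its response column returns both AUC values, so the gap between the stored and recomputed metrics can be inspected directly.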

Assignee

Megan Kurka

Reporter

Megan Kurka

Priority

Major