Probability calibration does not work in the Sparkling Water DataFrame API

Description

Originally posted as a GitHub issue.

When calibration is enabled, EasyPredictModelWrapper returns a BinomialModelPrediction object that contains both raw and calibrated probabilities. However, the implicit conversion defined here

sparkling-water/ml/src/main/scala/org/apache/spark/ml/h2o/models/H2OMOJOModel.scala, lines 81 to 83 in 9968342:

    implicit def toBinomialPrediction(pred: AbstractPrediction) = BinomialPrediction(
      pred.asInstanceOf[BinomialModelPrediction].classProbabilities(0),
      pred.asInstanceOf[BinomialModelPrediction].classProbabilities(1))

transforms the BinomialModelPrediction object into a BinomialPrediction, which contains only the raw probabilities (p0 and p1); the calibrated probabilities are NOT returned. Both raw and calibrated probabilities should be returned to the user.
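
One possible shape for a conversion that keeps both sets of values is sketched below. This is only an illustration, not the actual fix: `BinomialModelPredictionStub` is a stand-in for H2O's `BinomialModelPrediction` so that the snippet compiles without H2O on the classpath, and `CalibratedBinomialPrediction`, `CalibratedConversion`, and `toCalibratedPrediction` are hypothetical names. It also assumes that `calibratedClassProbabilities` is left as `null` when the model was not trained with calibration.

```scala
// Stand-in for hex.genmodel.easy.prediction.BinomialModelPrediction, reduced to
// the two fields discussed in this issue (hypothetical, for illustration only).
final case class BinomialModelPredictionStub(
    classProbabilities: Array[Double],
    calibratedClassProbabilities: Array[Double]) // assumed null when calibration is off

// Hypothetical result type: the existing p0/p1 plus optional calibrated values.
final case class CalibratedBinomialPrediction(
    p0: Double,
    p1: Double,
    p0Calibrated: Option[Double],
    p1Calibrated: Option[Double])

object CalibratedConversion {
  // Explicit conversion that carries the calibrated probabilities through
  // instead of dropping them.
  def toCalibratedPrediction(
      pred: BinomialModelPredictionStub): CalibratedBinomialPrediction = {
    // Option(null) is None, so an uncalibrated model yields None for both fields.
    val calibrated =
      Option(pred.calibratedClassProbabilities).filter(_.length >= 2)
    CalibratedBinomialPrediction(
      pred.classProbabilities(0),
      pred.classProbabilities(1),
      calibrated.map(_(0)),
      calibrated.map(_(1)))
  }
}
```

Using `Option` for the calibrated values means callers with uncalibrated models keep working, while calibrated models expose the extra values without a separate code path.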

Assignee

Jakub Hava

Reporter

Lauren DiPerna

CustomerVisible

No

Priority

Major