Exception when there is a column with BOOLEAN type in dataset during H2OMOJOModel transformation

Description

When I try to get predictions for a Spark dataset that contains a column of BOOLEAN type, an exception is thrown:
Caused by: hex.genmodel.easy.exception.PredictUnknownCategoricalLevelException: Unknown categorical level (IsArrDelayed,1)
at hex.genmodel.easy.EasyPredictModelWrapper.fillRawData(EasyPredictModelWrapper.java:868)
at hex.genmodel.easy.EasyPredictModelWrapper.predict(EasyPredictModelWrapper.java:890)
at hex.genmodel.easy.EasyPredictModelWrapper.preamble(EasyPredictModelWrapper.java:756)
at hex.genmodel.easy.EasyPredictModelWrapper.predictBinomial(EasyPredictModelWrapper.java:501)
at hex.genmodel.easy.EasyPredictModelWrapper.predictBinomial(EasyPredictModelWrapper.java:489)
at hex.genmodel.easy.EasyPredictModelWrapper.predict(EasyPredictModelWrapper.java:300)

I investigated the issue, and the problem appears to lie in the following code blocks:

H2OMOJOModel.rowToRowData
case BooleanType =>
  if (row.getBoolean(idxRow)) put(f.name, 1.toString) else put(f.name, 0.toString)

We see that the original boolean value is converted to the string "0" or "1".
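The conversion can be illustrated with a small standalone sketch. It does not use Sparkling Water or H2O classes; the class name `BooleanEncodingSketch` and both helper methods are hypothetical. `encodeAsDigit` mirrors the behaviour described above, while `encodeAsLevelName` is only one possible alternative, assuming the model's categorical domain for the column holds the literal levels "true" and "false".

```java
import java.util.HashMap;
import java.util.Map;

public class BooleanEncodingSketch {

    // Mirrors the behaviour described for the BooleanType branch:
    // true becomes "1", false becomes "0".
    static String encodeAsDigit(boolean value) {
        return value ? "1" : "0";
    }

    // Hypothetical alternative (an assumption, not the project's actual fix):
    // keep the literal level names "true" / "false".
    static String encodeAsLevelName(boolean value) {
        return Boolean.toString(value);
    }

    public static void main(String[] args) {
        // Stand-in for the row data map that is handed to the MOJO wrapper.
        Map<String, Object> rowData = new HashMap<>();

        rowData.put("IsArrDelayed", encodeAsDigit(true));
        System.out.println("digit encoding:      " + rowData.get("IsArrDelayed"));

        rowData.put("IsArrDelayed", encodeAsLevelName(true));
        System.out.println("level-name encoding: " + rowData.get("IsArrDelayed"));
    }
}
```

Whether the level-name encoding is appropriate depends on how the column was represented when the model was trained.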

However, here:

EasyPredictModelWrapper.fillRawData

else {
    // Column has categorical value.
    Object o = data.get(dataColumnName);
    double value;
    if (o instanceof String) {
        String levelName = (String) o;
        HashMap<String, Integer> columnDomainMap = domainMap.get(index);
        Integer levelIndex = columnDomainMap.get(levelName);
        if (levelIndex == null) {
            levelIndex = columnDomainMap.get(dataColumnName + "." + levelName);
        }
        ...

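When this line is executed:
Integer levelIndex = columnDomainMap.get(levelName);
levelIndex becomes null, because the keys in columnDomainMap are the values from the original dataset ("true" and "false"), not the transformed values (true -> "1", false -> "0"). The fallback lookup with the key dataColumnName + "." + levelName (here "IsArrDelayed.1") finds nothing either, so the level is reported as unknown.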

Some debug info:
levelName = "1"
columnDomainMap = {HashMap@10059} size = 2
0 = {HashMap$Node@10079} "false" -> "0"
1 = {HashMap$Node@10080} "true" -> "1"
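The failed lookup can be reproduced in isolation with a plain HashMap. This is a minimal sketch, not the H2O implementation: the class `DomainLookupSketch` and its methods are hypothetical, and the map contents are taken from the debug info above.

```java
import java.util.HashMap;
import java.util.Map;

public class DomainLookupSketch {

    // Same two-step lookup as in the fillRawData excerpt:
    // first the bare level name, then "<column>.<level>".
    static Integer lookupLevel(Map<String, Integer> columnDomainMap,
                               String dataColumnName,
                               String levelName) {
        Integer levelIndex = columnDomainMap.get(levelName);
        if (levelIndex == null) {
            levelIndex = columnDomainMap.get(dataColumnName + "." + levelName);
        }
        return levelIndex;
    }

    // Domain map as shown in the debug info: "false" -> 0, "true" -> 1.
    static Map<String, Integer> isArrDelayedDomain() {
        Map<String, Integer> domain = new HashMap<>();
        domain.put("false", 0);
        domain.put("true", 1);
        return domain;
    }

    public static void main(String[] args) {
        Map<String, Integer> domain = isArrDelayedDomain();

        // The value produced by the boolean conversion is not a known level.
        System.out.println("lookup of \"1\":    " + lookupLevel(domain, "IsArrDelayed", "1"));

        // The original level name is found.
        System.out.println("lookup of \"true\": " + lookupLevel(domain, "IsArrDelayed", "true"));
    }
}
```

In this sketch only the original level names resolve to an index; "1" matches neither the bare key nor the prefixed key "IsArrDelayed.1".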

Status

Assignee

Jakub Hava

Reporter

Alex Denisenko

CustomerVisible

No

Priority

Critical