Target Encoder is not invariant to the order of encoded columns

Description

When the columns passed to the TargetEncoder constructor are given in a different order, the values in the encoded columns differ substantially: the difference is already visible in the first decimal place.

Is this expected behavior?

TargetEncoder tec = new TargetEncoder(new String[]{"embarked", "home.dest"});

gives a different result than

TargetEncoder tec = new TargetEncoder(new String[]{"home.dest", "embarked"});

The encoding map is the same in both cases; the difference arises in applyTargetEncoding. This behavior has been observed with the KFold leakage handling strategy enabled and blending disabled.
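For reference, the expected property can be stated as a small standalone check. The sketch below is not H2O's TargetEncoder and does not use its API: the class KFoldTargetEncodingSketch, its methods, the "_te" column suffix, and the toy data are all hypothetical. It assumes a plain K-fold scheme without blending, where each row is encoded with the target mean of the rows that share its category but lie in other folds, falling back to the global mean when no such rows exist. Under that assumption each column is encoded independently, so the column order cannot change the result.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical reference sketch; NOT H2O's TargetEncoder implementation.
final class KFoldTargetEncodingSketch {

    // Encodes one categorical column: for each row, the mean target over rows
    // with the same category in OTHER folds (global mean if there are none).
    static double[] encodeColumn(String[] categories, double[] target, int[] fold) {
        Map<String, double[]> total = new HashMap<>();   // category -> {sum, count}
        Map<String, double[]> perFold = new HashMap<>(); // category + fold -> {sum, count}
        double globalSum = 0;
        for (int i = 0; i < target.length; i++) {
            accumulate(total, categories[i], target[i]);
            accumulate(perFold, categories[i] + "\u0000" + fold[i], target[i]);
            globalSum += target[i];
        }
        double prior = globalSum / target.length;
        double[] encoded = new double[target.length];
        for (int i = 0; i < target.length; i++) {
            double[] all = total.get(categories[i]);
            double[] own = perFold.get(categories[i] + "\u0000" + fold[i]);
            double count = all[1] - own[1];
            encoded[i] = count == 0 ? prior : (all[0] - own[0]) / count;
        }
        return encoded;
    }

    static void accumulate(Map<String, double[]> sums, String key, double y) {
        double[] acc = sums.computeIfAbsent(key, k -> new double[2]);
        acc[0] += y;
        acc[1] += 1;
    }

    // Encodes the given columns in the given order; each column is independent.
    static Map<String, double[]> encode(String[] columnOrder, Map<String, String[]> data,
                                        double[] target, int[] fold) {
        Map<String, double[]> result = new LinkedHashMap<>();
        for (String column : columnOrder) {
            result.put(column + "_te", encodeColumn(data.get(column), target, fold));
        }
        return result;
    }

    // True when both results hold the same encoded columns with identical values.
    static boolean sameEncodings(Map<String, double[]> a, Map<String, double[]> b) {
        if (!a.keySet().equals(b.keySet())) return false;
        for (String key : a.keySet()) {
            if (!Arrays.equals(a.get(key), b.get(key))) return false;
        }
        return true;
    }

    // Made-up toy data; only the column names come from the report.
    static Map<String, String[]> toyData() {
        Map<String, String[]> data = new HashMap<>();
        data.put("embarked", new String[]{"S", "S", "S", "C", "C", "Q"});
        data.put("home.dest", new String[]{"NY", "NY", "LDN", "LDN", "NY", "PAR"});
        return data;
    }

    public static void main(String[] args) {
        double[] target = {1, 0, 1, 1, 0, 0};
        int[] fold = {0, 1, 2, 0, 1, 2};
        Map<String, double[]> first =
                encode(new String[]{"embarked", "home.dest"}, toyData(), target, fold);
        Map<String, double[]> second =
                encode(new String[]{"home.dest", "embarked"}, toyData(), target, fold);
        System.out.println("order invariant: " + sameEncodings(first, second));
    }
}
```

A comparable check against the real encoder (same frame, same fold column, same seed, the two column orders) would make a suitable regression test once the cause in applyTargetEncoding is found.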

Environment

None

Status

Assignee

New H2O Bugs

Fix versions

None

Reporter

Pavel Pscheidl

CustomerVisible

No

Priority

Blocker