  Public H2O 3 / PUBDEV-3865

H2O GBM: predictions for an unseen categorical level differ between scoring in H2O and scoring with the POJO/MOJO

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.10.2.2
    • Fix Version/s: 3.10.3.1
    • Component/s: None
    • Labels:
      None
    • CustomerVisible:
      No

      Description

      Scoring a row with an unseen categorical level gives different predictions in H2O than with the POJO/MOJO.
      Reproduced on 3.10.0.7 and on the latest 3.10.2.1 (screenshots attached).
      The problem can be reproduced on the Airlines dataset.
      Code to create the model:

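      library(h2o)
      h2o.init()  # connect to (or start) a local H2O cluster

      # Load the Airlines sample data
      df = h2o.importFile("/Users/nidhimehta/steam-automl/smalldata/allyears2k_headers.zip")
      # Train a small GBM whose only predictor is the categorical column Origin
      model2 = h2o.gbm(
        model_id = "model2",
        training_frame = df,
        x = c("Origin"),
        y = "IsDepDelayed",
        max_depth = 15,
        seed = 1234,
        min_rows = 1,
        ntrees = 5
      )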
      

      Now predict on an unseen categorical level, for example Origin = "SANTA".
      The predictions from H2O and from the POJO differ (screenshots attached).
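
      To make the comparison numeric rather than visual, the two sets of predicted probabilities can be diffed row by row. The sketch below is self-contained and does not call H2O. The class name PredictionDiff, the tolerance, and the arrays in main are hypothetical placeholders, not the values from this issue.

```java
import java.util.ArrayList;
import java.util.List;

final class PredictionDiff {
    /** Returns the row indices where the two scorers disagree by more than tol. */
    static List<Integer> mismatchedRows(double[] inH2o, double[] pojo, double tol) {
        if (inH2o.length != pojo.length) {
            throw new IllegalArgumentException("prediction vectors differ in length");
        }
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < inH2o.length; i++) {
            boolean bothNaN = Double.isNaN(inH2o[i]) && Double.isNaN(pojo[i]);
            // !(diff <= tol) is also true when exactly one side is NaN,
            // so a NaN on one side only is reported as a mismatch.
            if (!bothNaN && !(Math.abs(inH2o[i] - pojo[i]) <= tol)) {
                rows.add(i);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical placeholder probabilities, NOT the values from this issue.
        double[] inH2o = {0.25, 0.50, 0.75};
        double[] pojo  = {0.25, 0.50, 0.50};
        System.out.println("Mismatched rows: " + mismatchedRows(inH2o, pojo, 1e-6));
    }
}
```

      Feeding it the probability column exported from H2O and the one printed by the POJO driver would list exactly the rows to attach to the report.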

      Data file for predicting with H2O: pojo_na_test.txt (attached).
      Files for predicting with the POJO: main.java and the model Java files (attached).
      Commands to compile and run POJO scoring:

      javac -cp h2o-genmodel.jar -J-Xmx2g -J-XX:MaxPermSize=128m model2.java main.java
      java -cp .:h2o-genmodel.jar main
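
      The raw POJO scoring method takes a row as a double[], with each categorical value given as its index in the training domain, so a hand-written driver has to choose what to pass for a level such as "SANTA" that has no index. A minimal sketch of one possible choice, mapping the unseen level to NaN (missing), follows. LevelEncoder, encode, and the three-level domain are hypothetical. The issue does not say how the attached main.java encodes the value, and this sketch makes no claim about the cause of the discrepancy.

```java
import java.util.Arrays;

final class LevelEncoder {
    /** Index of level in domain, or NaN (treated as missing) when the level is unseen. */
    static double encode(String[] domain, String level) {
        int idx = Arrays.asList(domain).indexOf(level);
        return idx >= 0 ? idx : Double.NaN;
    }

    public static void main(String[] args) {
        // Hypothetical three-level domain; the real Origin domain comes from the trained model.
        String[] originDomain = {"ABQ", "SFO", "SJC"};
        System.out.println(encode(originDomain, "SFO"));   // prints 1.0
        System.out.println(encode(originDomain, "SANTA")); // prints NaN
    }
}
```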
      

        Attachments

        1. descrepency_3.10.0.7.png
          454 kB
          Nidhi Mehta
        2. descrepency_3.10.2.1.png
          427 kB
          Nidhi Mehta
        3. main_mojo.java
          2 kB
          Nidhi Mehta
        4. main.java
          1 kB
          Nidhi Mehta
        5. model2_3.10.0.7.java
          250 kB
          Nidhi Mehta
        6. model2_3.10.2.1.java
          187 kB
          Nidhi Mehta
        7. model2.png
          757 kB
          Nick Karpov
        8. pojo_na_test.txt
          0.0 kB
          Nidhi Mehta

            People

            • Assignee: Arno Candel
            • Reporter: Nidhi Mehta
            • Votes: 0
            • Watchers: 3
