Public H2O 3 / PUBDEV-3883

Negative indexing for H2OFrame is buggy in R API

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.10.2.2
    • Fix Version/s: 3.10.3.1
    • Component/s: R
    • Labels: None
    • CustomerVisible: No

      Description

      Negative indexing on an H2OFrame in the R API does not match base R. In an R data.frame, a negative column index removes that column, and a negative row index removes that row. Note: R data.frames are indexed starting at 1.

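      • In H2O, combining a single negative column index with a single row index returns an incorrect result: a single element instead of the row with that column removed.
      • Negative row indexing is also broken: it returns a short vector instead of the remaining rows (see the last example below).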
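
      As a point of reference, the base R rule can be stated as a complement: a negative column index keeps every column except the one named. The following is a minimal sketch in base R only (no H2O cluster involved); the variable names `neg`, `keep`, and `pos` are illustrative.

```r
# Base R only; this does not touch H2O.
data(iris)

# Dropping column 3 with a negative index ...
neg <- iris[1, -3]

# ... selects the same columns as keeping all the others by position.
keep <- setdiff(seq_len(ncol(iris)), 3)
pos <- iris[1, keep]

all.equal(neg, pos)
dim(neg)   # 1 row, 4 columns
```

      This is the result an H2OFrame would be expected to reproduce for the same indices.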

      R data.frame COLUMN indexing:

      > data(iris)
      > head(iris)
        Sepal.Length Sepal.Width Petal.Length Petal.Width Species
      1          5.1         3.5          1.4         0.2  setosa
      2          4.9         3.0          1.4         0.2  setosa
      3          4.7         3.2          1.3         0.2  setosa
      4          4.6         3.1          1.5         0.2  setosa
      5          5.0         3.6          1.4         0.2  setosa
      6          5.4         3.9          1.7         0.4  setosa
      
      # Keep first row; Remove first column
      > iris[1,-1]
        Sepal.Width Petal.Length Petal.Width Species
      1         3.5          1.4         0.2  setosa
      
      # Keep first row; Remove third column
      > iris[1,-3]
        Sepal.Length Sepal.Width Petal.Width Species
      1          5.1         3.5         0.2  setosa
      
      # Keep third row; Remove third column
      > iris[3,-3]
        Sepal.Length Sepal.Width Petal.Width Species
      3          4.7         3.2         0.2  setosa
      
      # Keep rows 3-5; Remove third column
      > iris[3:5,-3]
        Sepal.Length Sepal.Width Petal.Width Species
      3          4.7         3.2         0.2  setosa
      4          4.6         3.1         0.2  setosa
      5          5.0         3.6         0.2  setosa
      
      # Remove first column
      > iris[,-1]
          Sepal.Width Petal.Length Petal.Width    Species
      1           3.5          1.4         0.2     setosa
      2           3.0          1.4         0.2     setosa
      3           3.2          1.3         0.2     setosa
      4           3.1          1.5         0.2     setosa
      5           3.6          1.4         0.2     setosa
      ...
      
      # Keep first row; Remove third and fourth columns (two ways)
      > iris[1,c(-3,-4)]
        Sepal.Length Sepal.Width Species
      1          5.1         3.5  setosa
      
      > iris[1,seq(-3,-4,-1)]
        Sepal.Length Sepal.Width Species
      1          5.1         3.5  setosa
      
      # Keep rows 2-5; Remove 5th column
      > iris[2:5,-5]
        Sepal.Length Sepal.Width Petal.Length Petal.Width
      2          4.9         3.0          1.4         0.2
      3          4.7         3.2          1.3         0.2
      4          4.6         3.1          1.5         0.2
      5          5.0         3.6          1.4         0.2
      
      

      Now let's try negative column indexing on an H2OFrame in R:

      library(h2o)
      h2o.init()
      hf <- as.h2o(iris)
      
      # BROKEN
      # R: Keep first row; Remove first column
      # H2O: Returns element (1,2) 
      > hf[1,-1]
      [1] 3.5
      
      # BROKEN
      # R: Keep first row; Remove third column
      # H2O: Returns element (1,1)
      > hf[1,-3]
      [1] 5.1
      
      # BROKEN
      # R: Keep third row; Remove third column
      # H2O: Returns element (3,1)
      > hf[3,-3]
      [1] 4.7
      
      # OK
      # R: Keep rows 3-5; Remove third column
      # H2O: Keep rows 3-5; Remove third column
      > hf[3:5,-3]
        Sepal.Length Sepal.Width Petal.Width Species
      1          4.7         3.2         0.2  setosa
      2          4.6         3.1         0.2  setosa
      3          5.0         3.6         0.2  setosa
      
      # OK
      # R: Remove first column
      # H2O: Remove first column 
      > hf[,-1]
        Sepal.Width Petal.Length Petal.Width Species
      1         3.5          1.4         0.2  setosa
      2         3.0          1.4         0.2  setosa
      3         3.2          1.3         0.2  setosa
      4         3.1          1.5         0.2  setosa
      5         3.6          1.4         0.2  setosa
      6         3.9          1.7         0.4  setosa
      
      [150 rows x 4 columns] 
      
      # OK
      # R: Keep first row; Remove third and fourth columns (two ways)
      # H2O: Keep first row; Remove third and fourth columns (two ways)
      > hf[1,c(-3,-4)]
        Sepal.Length Sepal.Width Species
      1          5.1         3.5  setosa
      
      [1 row x 3 columns] 
      
      > hf[1,seq(-3,-4,-1)]
        Sepal.Length Sepal.Width Species
      1          5.1         3.5  setosa
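
      One way to flag the BROKEN cases above automatically is to compare the shape of each result against the base R subset. This is only a sketch: `same_shape` is a hypothetical helper, not part of the h2o package, and the commented-out H2O lines assume that `dim()` on an H2OFrame returns rows and columns.

```r
# Hypothetical helper (not part of the h2o package): TRUE when `result`
# has the same dimensions as the base R `reference` subset.
same_shape <- function(result, reference) {
  identical(as.integer(dim(result)), dim(reference))
}

data(iris)
reference <- iris[1, -1]          # 1 row x 4 columns in base R

same_shape(reference, reference)  # TRUE
same_shape(3.5, reference)        # FALSE: a bare element has no dimensions

# With a running cluster, an H2OFrame subset could be checked the same way
# (assumption: dim() on an H2OFrame returns c(rows, cols)):
# hf <- as.h2o(iris)
# same_shape(hf[1, -1], reference)
```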
      
      

      Lastly, negative row indexing is broken as well:

      # combined with negative column indexing
      > dim(iris[-1,-3])
      [1] 149   4
      
      > dim(hf[-1,-3])
      NULL
      > hf[-1,-3]
       [1] 4.9 4.7 4.6 5.0 5.4 4.6 5.0 4.4 4.9 5.4
      
      # combined with positive column indexing
      > iris[-1,3]
        [1] 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 1.5 1.3 1.4 1.7 1.5 1.7 1.5 1.0
       [23] 1.7 1.9 1.6 1.6 1.5 1.4 1.6 1.6 1.5 1.5 1.4 1.5 1.2 1.3 1.4 1.3 1.5 1.3 1.3 1.3 1.6 1.9
       [45] 1.4 1.6 1.4 1.5 1.4 4.7 4.5 4.9 4.0 4.6 4.5 4.7 3.3 4.6 3.9 3.5 4.2 4.0 4.7 3.6 4.4 4.5
       [67] 4.1 4.5 3.9 4.8 4.0 4.9 4.7 4.3 4.4 4.8 5.0 4.5 3.5 3.8 3.7 3.9 5.1 4.5 4.5 4.7 4.4 4.1
       [89] 4.0 4.4 4.6 4.0 3.3 4.2 4.2 4.2 4.3 3.0 4.1 6.0 5.1 5.9 5.6 5.8 6.6 4.5 6.3 5.8 6.1 5.1
      [111] 5.3 5.5 5.0 5.1 5.3 5.5 6.7 6.9 5.0 5.7 4.9 6.7 4.9 5.7 6.0 4.8 4.9 5.6 5.8 6.1 6.4 5.6
      [133] 5.1 5.6 6.1 5.6 5.5 4.8 5.4 5.6 5.1 5.1 5.9 5.7 5.2 5.0 5.2 5.4 5.1
      
      > hf[-1,3]
       [1] 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5
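
      For comparison, the shapes base R produces for the two negative row indexing cases above can be checked directly. This is a base R sketch only; `sub`, `vec`, and `one_col` are illustrative names, and the `drop = FALSE` line is added here to show how base R keeps a single column as a data.frame.

```r
data(iris)

# Negative row index with a negative column index:
# drops row 1 and column 3, leaving a data.frame.
sub <- iris[-1, -3]
dim(sub)        # 149 4

# Negative row index with a single positive column index:
# base R simplifies the result to a plain vector.
vec <- iris[-1, 3]
length(vec)     # 149

# drop = FALSE keeps the single column as a data.frame.
one_col <- iris[-1, 3, drop = FALSE]
dim(one_col)    # 149 1
```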
      

            People

            • Assignee: Navdeep (navdeep)
            • Reporter: Erin LeDell (erin)