h2o.glm family = "gaussian" ignores link = "log" setting

Description

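The link argument appears to be ignored by h2o.glm: with family = "gaussian", setting link = "log" returns the same coefficients as the default identity link. Here is a reproducible example: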

library(h2o)
h2oHandle <- h2o.init()

STACKLOSS <- as.h2o(h2oHandle, stackloss, key = "STACKLOSS")

# H2O Results
h2o.glm(x = c("Air.Flow", "Water.Temp", "Acid.Conc."), y = "stack.loss", data = STACKLOSS, family = "gaussian", standardize = FALSE, lambda = 0)
h2o.glm(x = c("Air.Flow", "Water.Temp", "Acid.Conc."), y = "stack.loss", data = STACKLOSS, family = "gaussian", link = "log", standardize = FALSE, lambda = 0)

# R Results
glm(stack.loss ~ ., data = stackloss, family = gaussian)
glm(stack.loss ~ ., data = stackloss, family = gaussian(link = "log"))

sessionInfo()

## -- OUTPUT ---------------------------------------------------------------------
> h2o.glm(x = c("Air.Flow", "Water.Temp", "Acid.Conc."), y = "stack.loss", data = STACKLOSS, family = "gaussian", standardize = FALSE, lambda = 0)

=======================================================

100%
IP Address: 127.0.0.1

Port : 54321
Parsed Data Key: STACKLOSS

GLM2 Model Key: GLMModel__bdc995c6646a3fef45b128c269394aed

Coefficients:
  Air.Flow Water.Temp Acid.Conc.  Intercept
   0.71564    1.29529   -0.15212  -39.91967

Degrees of Freedom: 20 Total (i.e. Null); 17 Residual
Null Deviance: 2069.2
Residual Deviance: 178.8 AIC: 114.6
Deviance Explained: 0.91358
> h2o.glm(x = c("Air.Flow", "Water.Temp", "Acid.Conc."), y = "stack.loss", data = STACKLOSS, family = "gaussian", link = "log", standardize = FALSE, lambda = 0)

=======================================================

100%
IP Address: 127.0.0.1

Port : 54321
Parsed Data Key: STACKLOSS

GLM2 Model Key: GLMModel__8f5c720d0daa825c57fec98ef80e4ef7

Coefficients:
  Air.Flow Water.Temp Acid.Conc.  Intercept
   0.71564    1.29529   -0.15212  -39.91967

Degrees of Freedom: 20 Total (i.e. Null); 17 Residual
Null Deviance: 6582.6
Residual Deviance: 178.8 AIC: 114.6
Deviance Explained: 0.97283
>
> # R Results
> glm(stack.loss ~ ., data = stackloss, family = gaussian)

Call: glm(formula = stack.loss ~ ., family = gaussian, data = stackloss)

Coefficients:
(Intercept)     Air.Flow   Water.Temp   Acid.Conc.
   -39.9197       0.7156       1.2953      -0.1521

Degrees of Freedom: 20 Total (i.e. Null); 17 Residual
Null Deviance: 2069
Residual Deviance: 178.8 AIC: 114.6
> glm(stack.loss ~ ., data = stackloss, family = gaussian(link = "log"))

Call: glm(formula = stack.loss ~ ., family = gaussian(link = "log"),
data = stackloss)

Coefficients:
(Intercept)     Air.Flow   Water.Temp   Acid.Conc.
  -0.716555     0.025136     0.079023     0.003308

Degrees of Freedom: 20 Total (i.e. Null); 17 Residual
Null Deviance: 2069
Residual Deviance: 173 AIC: 113.9
>
> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] tools splines stats graphics grDevices utils
[7] datasets methods base

other attached packages:
[1] h2o_2.9.0.99999 survival_2.37-7 statmod_1.4.20
[4] rjson_0.2.15 RCurl_1.95-4.3 bitops_1.0-6
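
The discrepancy can also be checked programmatically. The following is a minimal sketch, not part of the original repro: it uses base R only, the H2O coefficient vectors are copied by hand from the output in this report, and `link_was_ignored` is a hypothetical helper name introduced here for illustration.

```r
# Coefficients copied by hand from the h2o.glm output in this report
# (order: Air.Flow, Water.Temp, Acid.Conc., Intercept).
h2o_identity <- c(0.71564, 1.29529, -0.15212, -39.91967)
h2o_log      <- c(0.71564, 1.29529, -0.15212, -39.91967)

# Reference fits from base R; stackloss ships with the datasets package.
fit_identity <- glm(stack.loss ~ ., data = stackloss, family = gaussian)
fit_log      <- glm(stack.loss ~ ., data = stackloss,
                    family = gaussian(link = "log"))

# Hypothetical helper: TRUE when two coefficient vectors agree within
# a tolerance, which is what we would see if the link were ignored.
link_was_ignored <- function(coef_a, coef_b, tol = 1e-4) {
  isTRUE(all.equal(unname(coef_a), unname(coef_b), tolerance = tol))
}

link_was_ignored(h2o_identity, h2o_log)              # H2O coefficients
link_was_ignored(coef(fit_identity), coef(fit_log))  # base R coefficients
```

With the values above, the first call reports that the two H2O fits agree, while the second reports that the base R fits differ, as expected when the link function is actually applied.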

Assignee

Tomas Nykodym

Reporter

Patrick Aboyoun

CustomerVisible

No

Priority

Major