Request for help running a CoxPH model on a large data set (6 GB, 300 columns)

Description

We are trying to run a CoxPH model using H2O and RSparkling on a large data set (6 GB, 300 columns). Whatever Spark configuration we use, we run into memory issues.

According to H2O, the cluster only needs about four times the data size in memory. We went as far as four worker nodes with 128 GB each plus a 128 GB master node, but it still raises memory issues.
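To make the sizing above concrete, here is a rough back-of-the-envelope check. It is only a sketch: the function names are hypothetical, the factor of 4 is the rule of thumb quoted above, and it assumes the 128 GB figure is per worker node and that all of it could be given to H2O, which may not hold in practice.

```python
# Rough sizing check for the numbers in this ticket.
# Assumptions (not confirmed): 128 GB is per worker node, and the
# "4x data size" rule of thumb applies to total worker memory.

def required_memory_gb(data_size_gb: float, factor: float = 4.0) -> float:
    """Memory suggested by the rule of thumb: data size times a factor."""
    return data_size_gb * factor


def total_worker_memory_gb(workers: int, memory_per_worker_gb: float) -> float:
    """Combined memory across all worker nodes (master node excluded)."""
    return workers * memory_per_worker_gb


required = required_memory_gb(6)            # 6 GB data set
available = total_worker_memory_gb(4, 128)  # 4 workers, 128 GB each
headroom = available / required             # how many times over the rule

print(f"required by rule of thumb: {required:.0f} GB")
print(f"total worker memory:       {available:.0f} GB")
print(f"headroom factor:           {headroom:.1f}x")
```

By this estimate the cluster has far more memory than the rule of thumb asks for, so raw node memory alone does not seem to explain the failures.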

Please help us choose the Spark configuration needed to run H2O on our current data set.

Status

Assignee

New H2O Bugs

Fix versions

None

Reporter

Divya Mereddy

Support ticket URL

None

Labels

None

Affected Spark version

None

Customer Request Type

None

Task progress

None

Priority

Major