Conversion of a sparse DataFrame to H2OFrame is slow

Description

Building a model on a sparse dataset (89x5000) takes a long time when the data is read in from Parquet format on a 5-executor SW cluster.

Status

Assignee

Jakub Hava

Reporter

Nidhi Mehta

Labels

None

CustomerVisible

No

testcase 1

None

testcase 2

None

testcase 3

None

h2ostream link

None

Affected Spark version

None

AffectedContact

None

AffectedCustomers

None

AffectedPilots

None

AffectedOpenSource

None

Support Assessment

None

Customer Request Type

None

Support ticket URL

End date

None

Baseline start date

None

Baseline end date

None

Task progress

None

Task mode

None

Fix versions

Priority

Major