Open issues

Integration Testing for different variations of Spark
SW-179
Add support for Spark Dynamic Allocation
SW-556
Test Sparkling Water on Standalone Cluster
SW-106
Sparkling shell: newly created classes are not visible by remote nodes
SW-36
Correctness Tests for Usage of 'offsetCol' with H2OGBM
SW-1717
[ClientSeparation] Migration guide needs to mention that RSparkling & PySparkling will use the client-less approach by default from the next major release
SW-1716
[ClientSeparation] Switch PySparkling & RSparkling in both backends to client-less approach by default
SW-1715
Fix propagation of internal security conf in Sparkling Water
SW-1711
Enable client mode in Sparkling Water (needs to be done explicitly)
SW-1710
Add sparkling water version info in the h2o logs
SW-1709
Put h2o-security package into sparkling water assembly
SW-1706
Create script to generate kubernetes docker images
SW-1690
Document how to use Sparkling Water with Kubernetes
SW-1689
Expose Offset Column in Supervised Algorithms
SW-1687
Expose offset_column in XGBoost
SW-1686
Expose disable flow option in SW
SW-1683
Create scoring package for pysparkling
SW-1679
Pass params to h2o_context
SW-1673
Structure Contributions in the 'detailed_prediction' Column Better
SW-1668
Revisit PySparkling Water Initializer - adding sparkling-water.jar to classpath has non-deterministic behavior
SW-1665
pysparkling numpy compatibility on Azure Databricks
SW-1663
as_h2o_frame is slow in external mode
SW-1657
[DynamicAlloc] Fix H2OConf test suite
SW-1654
Expose H2O SVM into Sparkling Water Algorithm API
SW-1652
Sparkling Water forms an H2O cluster with only one node on Azure
SW-1650
Separate client disconnect and client retry timeout to 2 different options
SW-1649
Convert PySpark H2OFrame to DataFrame without Client
SW-1648
Convert PySpark DataFrame to H2OFrame without Client
SW-1647
GridSearch Should Reference Hyper Parameters with SW Names
SW-1608
Test setting non-default values on PySparkling Algos via setters and via constructor
SW-1590
Deprecate Sparkling Water SVM in favor of H2O one
SW-1589
H2OGBM in pyspark
SW-1579
Sparkling water fails to detect newer version of colorama
SW-1569
Have automatic kubernetes tests
SW-1566
Upload docker images to docker hub as part of release
SW-1563
Support for Kubernetes
SW-1562
Cloud up of SW fails on EMR
SW-1559
Use argumentbuilder to build arguments for the external h2o backend
SW-1545
get-extended-h2o script does not work for nightlies
SW-1544
[Client Separation][PySparkling] Algorithm wrappers should use Rest API to train models
SW-1537
[PySparkling] Client Separation from Spark Driver
SW-1529
[PySparkling] Improve documentation on H2OXGBoost parameters
SW-1508
Check range of Double/Float parameters on Scala/PySparkling Algo API
SW-1507
Remove deprecated parameters from terraform templates
SW-1505
Expose getNTreeGroups() in SW MOJO
SW-1495
Fix pipeline tests
SW-1482
Create an estimator flattening dataframes in Spark pipeline
SW-1440
Re-enable and fix intermittent local-cluster failures
SW-1438
Add automated tests to test h2o native hive in sparkling water environment
SW-1416
Introduce back Jenkinsfile-internal-hadoop-smoke and Spark nightly tests
SW-1410

Integration Testing for different variations of Spark

Description

We need to test for:

  • HDP (different versions)

  • CDH 5.7, CDH 5.8

  • EMR

The tests have to

  • be automatic

  • be defined in a single place (see Jenkins Pipelines)
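
One way to read the "defined in one place" requirement is a single test matrix from which every job is derived. The Python sketch below is illustrative only and is not taken from the Sparkling Water build. The names `Platform`, `build_matrix` and `job_name`, and the `integ-test-` job prefix, are hypothetical. The HDP versions are passed in by the caller because the issue does not list them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Platform:
    """One target environment for the integration tests (hypothetical type)."""
    distribution: str
    version: str  # empty string when the issue names no version


def build_matrix(hdp_versions: List[str]) -> List[Platform]:
    """Return every platform to test, defined in exactly one place.

    The HDP versions are an input because the issue only says
    "different versions" without naming them.
    """
    platforms = [Platform("HDP", v) for v in hdp_versions]
    # CDH versions are the two named in the issue description.
    platforms += [Platform("CDH", "5.7"), Platform("CDH", "5.8")]
    # EMR is listed without a version.
    platforms.append(Platform("EMR", ""))
    return platforms


def job_name(platform: Platform) -> str:
    """Derive a CI job name from a platform (the naming scheme is an assumption)."""
    suffix = f"-{platform.version}" if platform.version else ""
    return f"integ-test-{platform.distribution.lower()}{suffix}"
```

A pipeline definition could then loop over `build_matrix(...)` and start one automated job per entry, so adding a new distribution means changing only the matrix.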

Environment

None

Status

Assignee

Michal Malohlava

Reporter

Michal Malohlava

Labels

None

Release Priority

None

CustomerVisible

No

testcase 1

None

testcase 2

None

testcase 3

None

h2ostream link

None

Affected Spark version

None

AffectedContact

None

AffectedCustomers

None

AffectedPilots

None

AffectedOpenSource

None

Support Assessment

None

Customer Request Type

None

Support ticket URL

None

End date

None

Baseline start date

None

Baseline end date

None

Task progress

None

Task mode

None

Components

Priority

Blocker