Revisit PySparkling Water Initializer - adding sparkling-water.jar to classpath has non-deterministic behavior

Description

Problem: when using PySparkling, I can see that the user-defined jars are added to the executor in a different order relative to `sparkling_water_assembly.jar` from run to run.

The code is here: https://github.com/h2oai/sparkling-water/blob/master/py/src/ai/h2o/sparkling/Initializer.py#L144-L160
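To show why the order can matter, here is a minimal toy model in Python. It relies on one known JVM fact: a `URLClassLoader` searches its URLs in the order they were added, so the first jar that contains a class wins. Everything else is an assumption made for illustration: the ticket does not say which classes (if any) the two jars share, and the class name `example.Shared` and the `resolve` helper are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical jar contents: jar name -> {class name: which copy it ships}.
# The overlapping class name is invented for this illustration only.
JARS: Dict[str, Dict[str, str]] = {
    "mojo2-runtime.jar": {"example.Shared": "copy from mojo2-runtime.jar"},
    "sparkling_water_assembly.jar": {
        "example.Shared": "copy from sparkling_water_assembly.jar"
    },
}


def resolve(class_name: str, load_order: List[str]) -> Optional[str]:
    """Return the copy of class_name from the first jar in load_order that has it."""
    for jar in load_order:
        classes = JARS.get(jar, {})
        if class_name in classes:
            return classes[class_name]
    return None


order_a = ["mojo2-runtime.jar", "sparkling_water_assembly.jar"]
order_b = list(reversed(order_a))

print(resolve("example.Shared", order_a))
print(resolve("example.Shared", order_b))
```

With the same two jars, the class that gets resolved depends only on which jar was added first, which is the kind of run-to-run difference described above.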

Test scenario:

  • Spark 2.4.0, Sparkling Water 3.26.6-2.4, mojo2-runtime 2.1.8-SNAPSHOT (d79b102494b31c55307a20eed3e975c8404ea43f)

  • CMD line:

/Users/michal/Bin/spark/spark-2.4.0-bin-hadoop2.7/bin/spark-submit --master local-cluster[6,2,2048] \
    --driver-memory=4g \
    --jars mojo2-runtime.jar,/usr/lib/hadoop-lzo/lib/hadoop-lzo.jar,license.sig \
    --py-files /Users/michal/Tmp/4tomk/sparkling-water-3.26.6-2.4/py/build/dist/h2o_pysparkling_2.4-3.26.6-2.4.zip \
    --driver-class-path=mojo2-runtime.jar \
    --conf spark.driver.extraClassPath=mojo2-runtime.jar \
    --conf spark.executor.extraClassPath=mojo2-runtime.jar example.py \
    --mojo mp1/pipeline.mojo --icsv mp1/example.csv

In a successful run, the executor log shows the jars being added in the following order:

19/10/06 20:31:33 INFO Executor: Fetching spark://192.168.1.73:64009/jars/mojo2-runtime.jar with timestamp 1570419077664
19/10/06 20:31:33 DEBUG TransportClient: Sending stream request for /jars/mojo2-runtime.jar to /192.168.1.73:64009
19/10/06 20:31:33 INFO Utils: Fetching spark://192.168.1.73:64009/jars/mojo2-runtime.jar to /private/var/folders/1f/gcjrygts5w9fq0l149_6dzgw0000gn/T/spark-706d8462-aced-4a6a-a547-8b270616e6dd/executor-f2c519d9-a286-4102-bad1-efb3eff5b30e/spark-3d609fec-8bbd-4c5a-8de7-2b23090e80f8/fetchFileTemp3309890348261265236.tmp
19/10/06 20:31:33 INFO Utils: Copying /private/var/folders/1f/gcjrygts5w9fq0l149_6dzgw0000gn/T/spark-706d8462-aced-4a6a-a547-8b270616e6dd/executor-f2c519d9-a286-4102-bad1-efb3eff5b30e/spark-3d609fec-8bbd-4c5a-8de7-2b23090e80f8/-17345170191570419077664_cache to /Users/michal/Bin/spark/spark-2.4.0-bin-hadoop2.7/work/app-20191006203118-0000/4/./mojo2-runtime.jar
19/10/06 20:31:33 INFO Executor: Adding file:/Users/michal/Bin/spark/spark-2.4.0-bin-hadoop2.7/work/app-20191006203118-0000/4/./mojo2-runtime.jar to class loader
19/10/06 20:31:33 INFO Executor: Fetching spark://192.168.1.73:64009/jars/sparkling_water_assembly.jar with timestamp 1570419086379
19/10/06 20:31:33 DEBUG TransportClient: Sending stream request for /jars/sparkling_water_assembly.jar to /192.168.1.73:64009
19/10/06 20:31:33 INFO Utils: Fetching spark://192.168.1.73:64009/jars/sparkling_water_assembly.jar to /private/var/folders/1f/gcjrygts5w9fq0l149_6dzgw0000gn/T/spark-706d8462-aced-4a6a-a547-8b270616e6dd/executor-f2c519d9-a286-4102-bad1-efb3eff5b30e/spark-3d609fec-8bbd-4c5a-8de7-2b23090e80f8/fetchFileTemp10465821312929586.tmp
19/10/06 20:31:33 INFO Utils: Copying /private/var/folders/1f/gcjrygts5w9fq0l149_6dzgw0000gn/T/spark-706d8462-aced-4a6a-a547-8b270616e6dd/executor-f2c519d9-a286-4102-bad1-efb3eff5b30e/spark-3d609fec-8bbd-4c5a-8de7-2b23090e80f8/-4211890191570419086379_cache to /Users/michal/Bin/spark/spark-2.4.0-bin-hadoop2.7/work/app-20191006203118-0000/4/./sparkling_water_assembly.jar
19/10/06 20:31:33 INFO Executor: Adding file:/Users/michal/Bin/spark/spark-2.4.0-bin-hadoop2.7/work/app-20191006203118-0000/4/./sparkling_water_assembly.jar to class loader

In failing runs, however, the order is reversed:

19/10/06 20:32:17 INFO Executor: Fetching spark://192.168.1.73:64095/jars/sparkling_water_assembly.jar with timestamp 1570419125109
19/10/06 20:32:17 DEBUG TransportClient: Sending stream request for /jars/sparkling_water_assembly.jar to /192.168.1.73:64095
19/10/06 20:32:17 INFO Utils: Fetching spark://192.168.1.73:64095/jars/sparkling_water_assembly.jar to /private/var/folders/1f/gcjrygts5w9fq0l149_6dzgw0000gn/T/spark-091a5bc5-1a16-4d1b-b5d9-320df4049284/executor-5ec6d2f1-0172-4ff3-9112-ff352828cd1d/spark-b096a45a-4fc3-472a-a4cf-8dfa6984c0b2/fetchFileTemp7358795635807408516.tmp
19/10/06 20:32:17 INFO Utils: Copying /private/var/folders/1f/gcjrygts5w9fq0l149_6dzgw0000gn/T/spark-091a5bc5-1a16-4d1b-b5d9-320df4049284/executor-5ec6d2f1-0172-4ff3-9112-ff352828cd1d/spark-b096a45a-4fc3-472a-a4cf-8dfa6984c0b2/16396550961570419125109_cache to /Users/michal/Bin/spark/spark-2.4.0-bin-hadoop2.7/work/app-20191006203157-0000/5/./sparkling_water_assembly.jar
19/10/06 20:32:18 INFO Executor: Adding file:/Users/michal/Bin/spark/spark-2.4.0-bin-hadoop2.7/work/app-20191006203157-0000/5/./sparkling_water_assembly.jar to class loader
19/10/06 20:32:18 INFO Executor: Fetching spark://192.168.1.73:64095/jars/mojo2-runtime.jar with timestamp 1570419116417
19/10/06 20:32:18 DEBUG TransportClient: Sending stream request for /jars/mojo2-runtime.jar to /192.168.1.73:64095
19/10/06 20:32:18 INFO Utils: Fetching spark://192.168.1.73:64095/jars/mojo2-runtime.jar to /private/var/folders/1f/gcjrygts5w9fq0l149_6dzgw0000gn/T/spark-091a5bc5-1a16-4d1b-b5d9-320df4049284/executor-5ec6d2f1-0172-4ff3-9112-ff352828cd1d/spark-b096a45a-4fc3-472a-a4cf-8dfa6984c0b2/fetchFileTemp4099205527266621175.tmp
19/10/06 20:32:18 INFO Utils: Copying /private/var/folders/1f/gcjrygts5w9fq0l149_6dzgw0000gn/T/spark-091a5bc5-1a16-4d1b-b5d9-320df4049284/executor-5ec6d2f1-0172-4ff3-9112-ff352828cd1d/spark-b096a45a-4fc3-472a-a4cf-8dfa6984c0b2/13134623861570419116417_cache to /Users/michal/Bin/spark/spark-2.4.0-bin-hadoop2.7/work/app-20191006203157-0000/5/./mojo2-runtime.jar
19/10/06 20:32:18 INFO Executor: Adding file:/Users/michal/Bin/spark/spark-2.4.0-bin-hadoop2.7/work/app-20191006203157-0000/5/./mojo2-runtime.jar to class loader
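Comparing the two logs by eye is tedious, so the following sketch extracts the class loader order from executor log lines. It only assumes the `INFO Executor: Adding file:... to class loader` message format visible in the excerpts above. The helper name `class_loader_order` and the shortened `/tmp/work/...` paths in the sample are hypothetical.

```python
import re
from typing import Iterable, List

# Matches the "Adding file:<path> to class loader" executor messages.
ADDING_RE = re.compile(r"INFO Executor: Adding file:(\S+?) to class loader")


def class_loader_order(log_lines: Iterable[str]) -> List[str]:
    """Return jar file names in the order they were added to the class loader."""
    order: List[str] = []
    for line in log_lines:
        # findall copes with several log entries flattened onto one line.
        for path in ADDING_RE.findall(line):
            order.append(path.rsplit("/", 1)[-1])
    return order


# Shortened, hypothetical sample mirroring the failing run.
sample = [
    "19/10/06 20:32:18 INFO Executor: Adding file:/tmp/work/5/./sparkling_water_assembly.jar to class loader",
    "19/10/06 20:32:18 DEBUG TransportClient: Sending stream request for /jars/mojo2-runtime.jar to /192.168.1.73:64095",
    "19/10/06 20:32:18 INFO Executor: Adding file:/tmp/work/5/./mojo2-runtime.jar to class loader",
]

print(class_loader_order(sample))
```

Running this over the executor logs of several submissions would make it easy to correlate the observed order with success or failure.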

Note: Spark adds the jars to the executor class loader in an undefined order:
https://github.com/apache/spark/blob/5a512e86e94593bc004a35101ad6497e20c13e0a/core/src/main/scala/org/apache/spark/executor/Executor.scala#L818-L826
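The log excerpts show that each jar is fetched with a timestamp, and that in both runs `mojo2-runtime.jar` carries the earlier one. As a rough illustration only (this is not what Spark does, and it is not a proposed patch), the sketch below shows how a stable order could be derived from those timestamps regardless of the order in which the entries are iterated. The function name `deterministic_order` is hypothetical; the timestamps are copied from the failing run.

```python
from typing import Dict, List


def deterministic_order(jars: Dict[str, int]) -> List[str]:
    """Order jar names by timestamp, breaking ties by name."""
    ordered = sorted(jars.items(), key=lambda item: (item[1], item[0]))
    return [name for name, _ in ordered]


# Jar name -> timestamp, as reported in the failing run's executor log.
failing_run = {
    "sparkling_water_assembly.jar": 1570419125109,
    "mojo2-runtime.jar": 1570419116417,
}

print(deterministic_order(failing_run))
# prints ['mojo2-runtime.jar', 'sparkling_water_assembly.jar']
```

Sorting on a value that is fixed on the driver side removes the dependence on map iteration order, which is the property the executor code linked above does not guarantee.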

Environment

None

Status

Assignee

Unassigned

Reporter

Michal Malohlava

CustomerVisible

No

Priority

Major