SW-357

PySparkling in Zeppelin environment using wrong class loader

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.9, 2.0.6, 2.1.1
    • Component/s: None
    • Labels: None

      Description

      Running the following PySparkling paragraph in Zeppelin:

      %pyspark
      sc.addPyFile("/home/kuba/sparkling-water-2.1.0/py/build/dist/h2o_pysparkling_2.1-2.1.0-py2.7.egg")
      from pysparkling import *
      from pyspark import SparkContext
      from pyspark.sql import SQLContext
      import h2o
      
      hc = H2OContext.getOrCreate(sc)
      

      produces the following error:

      Traceback (most recent call last):
        File "/tmp/zeppelin_pyspark-7490147146785239051.py", line 346, in <module>
          raise Exception(traceback.format_exc())
      Exception: Traceback (most recent call last):
        File "/tmp/zeppelin_pyspark-7490147146785239051.py", line 339, in <module>
          exec(code)
        File "<stdin>", line 6, in <module>
        File "build/bdist.linux-x86_64/egg/pysparkling/context.py", line 105, in getOrCreate
          h2o_context = H2OContext(spark_context)
        File "build/bdist.linux-x86_64/egg/pysparkling/context.py", line 76, in __init__
          Initializer.load_sparkling_jar(spark_context)
        File "build/bdist.linux-x86_64/egg/pysparkling/initializer.py", line 18, in load_sparkling_jar
          Initializer.__add_sparkling_jar_to_spark(spark_context)
        File "build/bdist.linux-x86_64/egg/pysparkling/initializer.py", line 35, in __add_sparkling_jar_to_spark
          cl.addURL(url)
        File "/home/kuba/spark-2.1.0-bin-hadoop2.4/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
          answer, self.gateway_client, self.target_id, self.name)
        File "/home/kuba/spark-2.1.0-bin-hadoop2.4/python/pyspark/sql/utils.py", line 63, in deco
          return f(*a, **kw)
        File "/home/kuba/spark-2.1.0-bin-hadoop2.4/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 323, in get_return_value
          format(target_id, ".", name, value))
      Py4JError: An error occurred while calling o96.addURL. Trace:
      py4j.Py4JException: Method addURL([class java.net.URL]) does not exist
      	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
      	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
      	at py4j.Gateway.invoke(Gateway.java:272)
      	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
      	at py4j.commands.CallCommand.execute(CallCommand.java:79)
      	at py4j.GatewayConnection.run(GatewayConnection.java:214)
      	at java.lang.Thread.run(Thread.java:745)
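
The trace suggests that the object PySparkling resolved as the class loader (o96) does not expose a public addURL(URL) method, so Py4J's reflective lookup fails. One possible guard, sketched below in plain Python, is to walk the getParent() chain until a loader known to expose addURL is found. This is only an illustration and not necessarily the fix shipped in 1.6.9, 2.0.6, or 2.1.1. FakeJavaClass and FakeLoader are hypothetical stand-ins for Py4J proxies, find_add_url_loader is a made-up helper, and it is an unverified assumption that Spark's org.apache.spark.util.MutableURLClassLoader is reachable as a parent of the loader seen inside Zeppelin.

```python
# Hypothetical stand-ins that mimic the few java.lang.ClassLoader calls
# used below, so the sketch runs without a JVM or Py4J gateway.
class FakeJavaClass:
    def __init__(self, name):
        self._name = name

    def getName(self):
        return self._name


class FakeLoader:
    def __init__(self, class_name, parent=None):
        self._cls = FakeJavaClass(class_name)
        self._parent = parent
        self.urls = []

    def getClass(self):
        return self._cls

    def getParent(self):
        return self._parent

    def addURL(self, url):
        self.urls.append(url)


# Assumption: class names of loaders that expose a public addURL(URL).
ADD_URL_CAPABLE = {"org.apache.spark.util.MutableURLClassLoader"}


def find_add_url_loader(loader, capable=ADD_URL_CAPABLE):
    """Walk up the parent chain and return the first loader whose class
    name is in `capable`, or None if the chain ends without a match."""
    while loader is not None:
        if loader.getClass().getName() in capable:
            return loader
        loader = loader.getParent()
    return None


# Simulated chain: an interpreter-specific loader (hypothetical name)
# whose parent is Spark's mutable loader.
spark_loader = FakeLoader("org.apache.spark.util.MutableURLClassLoader")
interpreter_loader = FakeLoader("example.InterpreterClassLoader", parent=spark_loader)

target = find_add_url_loader(interpreter_loader)
if target is not None:
    target.addURL("file:/path/to/sparkling-water-assembly.jar")  # placeholder path
```

Matching on the class name rather than using hasattr is deliberate: a Py4J proxy returns a member object for any attribute name, so a missing Java method only surfaces when the call is made, which is exactly the failure shown in the trace above.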
      

            People

            • Assignee: Michal Malohlava
            • Reporter: Michal Malohlava
