Questions tagged [py4j]

Py4J enables Python programs to dynamically access arbitrary Java objects

Py4J enables Python programs running in a Python interpreter to dynamically access Java objects in a Java Virtual Machine. Methods are called as if the Java objects resided in the Python interpreter and Java collections can be accessed through standard Python collection methods. Py4J also enables Java programs to call back Python objects. Py4J is distributed under the BSD license.

Here is a brief example of what you can do with Py4J. The following Python program creates a java.util.Random instance from a JVM and calls some of its methods. It also accesses a custom Java class, AdditionApplication, to add the generated numbers.

 >>> from py4j.java_gateway import JavaGateway
 >>> gateway = JavaGateway()                   # connect to the JVM
 >>> random = gateway.jvm.java.util.Random()   # create a java.util.Random instance
 >>> number1 = random.nextInt(10)              # call the Random.nextInt method
 >>> number2 = random.nextInt(10)
 >>> print(number1, number2)
 2 7
 >>> addition_app = gateway.entry_point        # get the AdditionApplication instance
 >>> addition_app.addition(number1, number2)   # call the addition method
 9
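The example above can also be wrapped in a small function, which makes it easier to handle the case where no JVM is listening. This is only a sketch: the names `add_random_numbers`, `main`, and `bound` are made up for illustration, and it assumes a Java gateway server is already running with an entry point that exposes `addition(int, int)`, as AdditionApplication does in the example.

```python
def add_random_numbers(gateway, bound=10):
    """Draw two ints from java.util.Random in the JVM and add them.

    `gateway` is assumed to be a py4j JavaGateway whose entry point
    exposes addition(int, int); the function name is hypothetical.
    """
    random = gateway.jvm.java.util.Random()   # Java object living in the JVM
    number1 = random.nextInt(bound)           # each call is forwarded to Java
    number2 = random.nextInt(bound)
    total = gateway.entry_point.addition(number1, number2)
    return number1, number2, total


def main():
    # Imported here so the helper above can be used without py4j installed.
    from py4j.java_gateway import JavaGateway
    from py4j.protocol import Py4JNetworkError

    gateway = JavaGateway()                   # default address and port
    try:
        print(add_random_numbers(gateway))
    except Py4JNetworkError as exc:
        # Raised when the Java side cannot be reached.
        print("Could not reach the JVM:", exc)
```

Calling `main()` while the Java side is running prints the two random numbers and their sum; if the gateway server is not started, the network error is reported instead of a traceback.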
235 questions
1
vote
1 answer

py4j.Py4JException: Method socketTextStream does not exist

I am new to Spark Streaming. Using PySpark in PyCharm, I am unable to get past the socketTextStream initialization. def start_streaming(self): sp = SparkContext('local[2]', 'streamingTest') stream = StreamingContext(sp, 1) **items =…
XSoviet
  • 11
  • 4
1
vote
1 answer

Pyspark wrapper for H2O POJO

I created a model using H2O's Sparkling Water, and now I'd like to apply it to a huge Spark DF (populated with sparse vectors). I use Python with pyspark and pysparkling. Basically, I need to run a map job with the model.predict() function inside. But copying data…
USER
  • 83
  • 1
  • 11
1
vote
0 answers

Py4j event listener needed

I've found stomp.py to fall behind when the volume is around 200k messages per hour, and Jython is not an option, so with a Java message listener I'm looking to have a Python script 'subscribe' to messages / events that would be generated from this…
1
vote
0 answers

Py4JJavaError: Job aborted due to stage failure from Spark on Windows

I've built the latest Spark from git (branch-1.4); however, I get an error when doing file IO: if not locals().get('sc'): try: import findspark findspark.init() except ImportError: pass import pyspark sc =…
A T
  • 13,008
  • 21
  • 97
  • 158
1
vote
1 answer

Using Py4J to invoke a method that takes a JavaSparkContext and return a JavaRDD

I am looking for some help or example code that illustrates pyspark calling user-written Java code outside of Spark itself that takes a Spark context from Python and then returns an RDD built in Java. For completeness, I'm using Py4J 0.81, Java 8,…
1
vote
2 answers

Python -> Py4j -> Spark -> Cassandra

I would like to test a simple Spark row count job on a test Cassandra table with only four rows just to verify that everything works. I can quickly get this working from Java: JavaSparkContext sc = new JavaSparkContext(conf); …
clay
  • 18,138
  • 28
  • 107
  • 192
1
vote
1 answer

Opening Eclipse editor from python with Py4J

I'm trying to open a file in the Eclipse editor from my Python program. Here is an example of how to do this with Java: import java.io.File; import org.eclipse.core.filesystem.EFS; import org.eclipse.core.filesystem.IFileStore; import…
Adam
  • 2,254
  • 3
  • 24
  • 42
1
vote
1 answer

py4j: dict to JAVA map

I'm currently working on accessing HBase using Python 3. The way I'm doing it is by using py4j to call Java APIs that I'm writing to access HBase. I have a question related to creating a Put object, which takes a qualifier and value. I want to pass a…
Mayank
  • 5,454
  • 9
  • 37
  • 60
1
vote
1 answer

TypeError when importing py4j module in a web2py controller

I'm currently having a problem when trying to use py4j on web2py. This is how I'm trying to import it on my web2py controller file: from py4j.java_gateway import JavaGateway When loading the page, this is the error I get:
Victor Girotto
  • 113
  • 1
  • 5
0
votes
0 answers

Py4JJavaError in Jupyter Notebook while using Spark RDD map function

While using Jupyter Notebook to create a Spark RDD, when I try to use the map() function in pyspark it gives me a Py4JJavaError. Here is the code that I tried to run: squared_rdd = rdd.map(lambda x: x**2) result_list = squared_rdd.collect()…
0
votes
0 answers

Sparkcontext error spark on k8s : You are trying to pass an insecure Py4j gateway to Spark. This is not allowed as it is a security risk

I'm running into an issue with my Spark application. When I try to initialize the SparkContext using the provided configuration, I'm encountering the following error: 'RuntimeError: Java gateway process exited before sending its port number.' You…
KUNAL DAS
  • 23
  • 6
0
votes
0 answers

Error in python: Py4JJavaError: An error occurred while calling o906.collectToPython

Error in python: Py4JJavaError: An error occurred while calling o906.collectToPython. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 771.0 failed 1 times, most recent failure: Lost task 0.0 in stage 771.0 (TID…
Aditya
  • 1
0
votes
1 answer

What could be causing 'Py4JError' when calling 'spark.createDataFrame' on PySpark SQL Session?

How to solve: Py4JError when using spark.createDataFrame? I have installed Apache Spark, Java and Python. I am using a Jupyter notebook. I have an error when creating a dataframe. I write: import sys from pyspark.sql import SparkSession from…
0
votes
0 answers

I wrote a program in jupyter with pyspark and I get py4j error, how can I solve this problem?

A MapReduce program in Spark that implements a simple "People You Might Know" social network friendship recommendation algorithm. The key idea is that if two people have a lot of mutual friends, then the system should recommend that they connect…
simin
  • 1
  • 1
0
votes
0 answers

py4J.protocol.Py4JError: org.apache.sql.jdbc.ClickhouseDialect._get_object_id does not exist in JVM

When I try to use PySpark to read a ClickHouse table, an array-type column raises 'Unspoort ARRAY TYPE'. I then tried to register the ClickHouseDialect to solve the issue, and got py4J.protocol.Py4JError:…
TurboCC
  • 11
  • 1
  • 3