Questions tagged [apache-spark-3.0]
27 questions
0
votes
0 answers
Migrating from Spark 2.4 to Spark 3. What's Spark 2.4's SharedSQLContext equivalent in Spark 3?
I'm fairly new to Java/Scala, and I can't find SharedSQLContext in the Spark 3 repo. How do we generally find a class's equivalent in newer versions? I couldn't find any documentation on this. Thank you!
Sample existing class:
class…

sojim2
- 1,245
- 2
- 15
- 38
0
votes
0 answers
Fail to read an HBase table with Java Spark 3.2.3
I have a Java Spark application, in which I need to read all the row keys from an HBase table.
Up until now I worked with Spark 2.4.7, and we migrated to Spark 3.2.3. I used newAPIHadoopRDD, but HBase is returning an empty result after the Spark…

Oded
- 336
- 1
- 3
- 17
0
votes
1 answer
Configure Spark 3 thrift server with Apache Ranger
I am trying to configure Spark 3.3.0 Thrift Server with Apache Ranger, but I cannot find any resources or information for this setup. Any suggestions on how to implement this? Thanks very much!
I already have an STS (Kerberos JDBC) turned on and…

adel mejri
- 13
- 5
0
votes
0 answers
What is the alternative to com.stratio.receiver.spark-rabbitmq for Spark 3?
I have a Spark Streaming application and want to upgrade it from Spark 2 to Spark 3.
It consumes messages from RabbitMQ using com.stratio.receiver.spark-rabbitmq version 0.5.1, but this library is not available for Spark 3. Are there any alternatives…

ZMI
- 1
- 1
0
votes
1 answer
Apache Spark: asc not working as expected
I have following code:
df.orderBy(expr("COUNTRY_NAME").desc, expr("count").asc).show()
I expect the count column to be arranged in ascending order for a given COUNTRY_NAME, but I see something like this:
Last value of 12 is not as per the…

Mandroid
- 6,200
- 12
- 64
- 134
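The excerpt above cuts off before the actual output, but multi-column ordering itself behaves as written: the second key only breaks ties within equal values of the first. A plain-Java sketch of the same two-key ordering follows (the sample data is made up). One hedged guess at the symptom: if the real count column were string-typed, "12" would sort before "3" lexicographically, and casting it to a numeric type before ordering would be one thing to try.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class OrderByDemo {
    // A row holding the two sort keys from the question.
    record Row(String country, long count) {}

    static List<Row> sorted(List<Row> rows) {
        List<Row> out = new ArrayList<>(rows);
        // COUNTRY_NAME descending, then count ascending -- the same ordering
        // that orderBy(col("COUNTRY_NAME").desc, col("count").asc) requests.
        out.sort(Comparator.comparing(Row::country).reversed()
                           .thenComparingLong(Row::count));
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
            new Row("India", 12), new Row("India", 3),
            new Row("Brazil", 7), new Row("Brazil", 1));
        // Counts ascend only within each country group.
        sorted(rows).forEach(r -> System.out.println(r.country() + " " + r.count()));
    }
}
```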
0
votes
0 answers
Spark 3.0: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO while writing to a table
I am trying to write a dataframe to a table:
spark.sql("CREATE DATABASE IF NOT EXISTS my_db")
spark.catalog.setCurrentDatabase("my_db")
dataFrame.write
  .format("csv")
  .mode(SaveMode.Overwrite)
  .bucketBy(5, "NAME", "DEPT")
  …

Mandroid
- 6,200
- 12
- 64
- 134
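`UnsatisfiedLinkError` on `NativeIO` during a write is, in many reports, a Windows-only symptom: Hadoop's native helpers (winutils.exe and hadoop.dll) are missing from the machine. A hedged sketch of the usual workaround, assuming Windows and with placeholder paths:

```bat
:: Place winutils.exe and hadoop.dll (built for the Hadoop version your Spark
:: distribution targets) under %HADOOP_HOME%\bin, then:
set HADOOP_HOME=C:\hadoop
set PATH=%HADOOP_HOME%\bin;%PATH%
```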
0
votes
0 answers
Cannot import commons-dbutils in sbt
I tried adding the commons-dbutils dependency to my project by adding the line below to the build.sbt file.
libraryDependencies += "commons-dbutils" % "commons-dbutils" % "1.6"
I didn't get any errors either. Looking at the dependency tree…

Vaisakh
- 1
- 1
- 2
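One way to confirm whether the artifact actually resolved onto the classpath is sbt's dependency tree. A sketch assuming sbt 1.4 or newer (older versions need the separate sbt-dependency-graph plugin instead):

```scala
// project/plugins.sbt — enables the full dependency-tree tasks on sbt 1.4+
addDependencyTreePlugin
```

Then run `sbt dependencyTree`, after a `reload` so that changes to build.sbt are picked up.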
0
votes
0 answers
Spark REST API to list running and stopped queries
I am exploring the Spark REST API for Structured Streaming.
I have looked at all the REST endpoints exposed in the link below:
https://spark.apache.org/docs/latest/monitoring.html
However, I could not figure out how to get the list of "Active Streaming…

Monu
- 2,092
- 3
- 13
- 26
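For context, a hedged sketch of the documented endpoints (host, port, and app-id are placeholders): the monitoring API's streaming endpoints cover the old DStream API, and as of the docs linked above Structured Streaming queries are not listed through REST, so in-process inspection via the SparkSession's `streams.active` is a common fallback.

```shell
# Placeholders throughout; the Spark UI normally serves REST under /api/v1.
curl http://driver-host:4040/api/v1/applications
# DStream-based streaming only:
curl http://driver-host:4040/api/v1/applications/<app-id>/streaming/statistics
```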
0
votes
1 answer
Spark can't connect to DB with built-in connection providers
I'm trying to connect to Postgres following this document.
The document mentions built-in connection providers. Can anyone help me resolve this, please?
There are built-in connection providers for the following databases:
DB2
MariaDB
MS…

MasterLuV
- 396
- 1
- 17
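A point that trips people up here (stated as an assumption about this setup, not a diagnosis): the built-in connection providers handle the authentication plumbing, but they do not bundle the JDBC driver itself, so the PostgreSQL driver still has to reach the driver and executor classpaths. A sketch with an illustrative version number:

```shell
# The driver jar is not shipped with Spark; the version below is illustrative.
spark-submit --packages org.postgresql:postgresql:42.6.0 my-job.jar
```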
0
votes
0 answers
Apache Phoenix - Count query returns more than 100k rows, but SELECT query does not return any row
Using Apache Spark 3, I manipulated some CSV data, stored in a dataframe, with the intention of sending it to HBase.
The data is successfully sent using JavaHBaseContext's bulkPut() method.
However, in Apache Phoenix, using a plain SELECT query, I…

Mohamed Ennahdi El Idrissi
- 2,931
- 2
- 19
- 32
0
votes
1 answer
ServiceConfigurationError running spark 3.2
I am trying to update code written with Spark 2.4 and am doing some tests with Spark 3.2. I am able to create a Spark session:
spark = (
SparkSession.builder
.config('spark.jars.packages',…

DatGuy
- 377
- 1
- 4
- 10
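A `ServiceConfigurationError` right after a 2.4-to-3.2 move is often a Scala-version mismatch: Spark 2.4 defaulted to Scala 2.11, while Spark 3.2 runs on 2.12 or 2.13, so any `spark.jars.packages` coordinate with a `_2.11` suffix loads incompatible classes. This is a guess at the cause, not something the truncated excerpt confirms; an illustrative coordinate for a Scala 2.12 build:

```
# spark-defaults.conf style; the artifact suffix must match the cluster's Scala version
spark.jars.packages  org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.3
```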
0
votes
0 answers
Why does Apache Spark perform some checks and raise exceptions at job runtime, but never threw them during unit tests?
There was a bug in my Scala code that formatted a timestamp's date, which was then concatenated as a String onto a non-timestamp column of the Spark Streaming job:
concat(date_format(col("timestamp"),"yyyy-MM-DD'T'HH:mm:ss.SSS'Z'")
So, during the…

Eljah
- 4,188
- 4
- 41
- 85
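Spark 3's datetime formatting is built on java.time patterns, where uppercase `D` means day-of-year, not day-of-month, so the `yyyy-MM-DD` pattern in the excerpt is the usual suspect for this class of runtime failure: Spark 3's stricter parser rejects or mangles such patterns only when the expression actually executes, which a unit test that never evaluates it won't catch. A plain-Java sketch of the two patterns (the date is chosen for illustration):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DatePatternDemo {
    static String format(String pattern) {
        // Feb 1, 2023 is day-of-year 32, which makes the bug visible.
        return LocalDate.of(2023, 2, 1).format(DateTimeFormatter.ofPattern(pattern));
    }

    public static void main(String[] args) {
        System.out.println(format("yyyy-MM-DD")); // "DD" = day-of-year -> 2023-02-32
        System.out.println(format("yyyy-MM-dd")); // "dd" = day-of-month -> 2023-02-01
    }
}
```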