Use for questions specific to Apache Spark 1.6. For general questions related to Apache Spark, use the tag [apache-spark].
Questions tagged [apache-spark-1.6]
111 questions
0
votes
0 answers
Unable to run Spark in yarn-cluster mode
I'm trying to run a Spark job on YARN in cluster deploy mode.
I tried running the simplest spark-submit command with only the jar path, the class parameter, and master yarn-cluster. However, I still get the same error, which actually tells me…

Tymek
- 37
- 8
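A minimal, self-contained sketch of the kind of job described above, assuming a placeholder class and jar name; the spark-submit invocation in the trailing comment follows the Spark 1.6 yarn-cluster syntax.

import org.apache.spark.{SparkConf, SparkContext}

// Minimal Spark 1.6 application; the class and jar names are placeholders.
object YarnClusterSmokeTest {
  def main(args: Array[String]): Unit = {
    // Don't hard-code a master here; let spark-submit's --master flag decide.
    val conf = new SparkConf().setAppName("YarnClusterSmokeTest")
    val sc = new SparkContext(conf)
    println(sc.parallelize(1 to 100).sum())
    sc.stop()
  }
}
// Submitted roughly as:
//   spark-submit --master yarn-cluster --class YarnClusterSmokeTest path/to/app.jar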
0
votes
1 answer
Spark Scala - Merge multiple rows into one
I have a DataFrame with the following schema:
|-- id: string (nullable = true)
|-- ddd: struct (nullable = true)
|    |-- aaa: string (nullable = true)
|    |-- bbb: long (nullable = true)
|    |-- ccc: string (nullable = true)
|    |-- eee: long (nullable = true)
I am getting output like…

gayathri
- 73
- 3
- 10
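A minimal sketch of one common way to merge rows sharing an id in Spark 1.6, assuming the goal is to collect values per id; collect_list on DataFrames requires a HiveContext in 1.6, and the input source and column names below are placeholders drawn from the question's schema.

import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.functions.collect_list

// sc is an existing SparkContext; the input source is a placeholder.
val hiveCtx = new HiveContext(sc)
val df = hiveCtx.read.json("records.json")

// Collapse all rows with the same id into one row, collecting the nested field values.
val merged = df.groupBy("id").agg(collect_list("ddd.aaa").as("aaa_values"))
merged.printSchema()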
0
votes
2 answers
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/internal/Logging
My Spark consumer is failing with a "logging" error. While browsing, I found that the error is due to an incompatibility of jars.
I am using Spark 1.6.3, and all the dependencies used in pom.xml are 1.6.3. Still, I am getting the same error. Below is my…

user3837415
- 41
- 3
- 15
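The class org.apache.spark.internal.Logging only exists in Spark 2.x (1.x has org.apache.spark.Logging), so this error usually means a Spark 2.x artifact, often a Kafka integration jar, ended up on the classpath of the 1.6.3 job. A hedged, sbt-style sketch of consistent 1.6.3 coordinates (the question uses Maven, but the artifact coordinates are the same):

// build.sbt sketch: keep every Spark artifact at 1.6.3 and use the
// 1.x Kafka integration (spark-streaming-kafka), not the 2.x-only
// spark-streaming-kafka-0-10 artifact.
scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"            % "1.6.3" % "provided",
  "org.apache.spark" %% "spark-streaming"       % "1.6.3" % "provided",
  "org.apache.spark" %% "spark-streaming-kafka" % "1.6.3"
)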
0
votes
1 answer
CassandraSourceRelation not serializable when joining two dataframes
I have a setup of DataFrames with spark-cassandra-connector 1.6.2.
I am trying to perform some transformations with Cassandra. The DataStax Enterprise version is 5.0.5.
DataFrame df1 = sparkContext
…

chocopie_dono
- 3
- 4
0
votes
1 answer
Spark - Convert RDD[Vector] to DataFrame with variable columns
What's the best way to generalize the conversion from RDD[Vector] to DataFrame with Scala/Spark 1.6?
The inputs are different RDD[Vector]s.
The number of columns in the Vector can range from 1 to n for different RDDs.
I tried using the shapeless library, but…

Arturo Gatto
- 53
- 2
- 9
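A minimal sketch of one way to do this without extra libraries, assuming mllib Vectors and that every vector in a given RDD has the same length: build the schema at runtime from the vector size, then create the DataFrame from an RDD[Row]. The column names c0..c(n-1) are placeholders.

import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SQLContext}
import org.apache.spark.sql.types.{DoubleType, StructField, StructType}

// Works for any vector length n; assumes a non-empty RDD of equal-length vectors.
def vectorsToDF(sqlContext: SQLContext, vectors: RDD[Vector]): DataFrame = {
  val n = vectors.first().size
  val schema = StructType((0 until n).map(i => StructField(s"c$i", DoubleType, nullable = false)))
  val rows = vectors.map(v => Row.fromSeq(v.toArray.toSeq))
  sqlContext.createDataFrame(rows, schema)
}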
0
votes
1 answer
'Broadcast' object has no attribute 'destroy'?
In my PySpark script, I declare a Broadcast variable. At the end, I want to destroy this variable, but I get:
AttributeError: 'Broadcast' object has no attribute 'destroy'
My code looks like this:
br =…

Xylon Zhang
- 23
- 5
0
votes
1 answer
Apache Spark WHERE clause not working
I'm running Apache Spark 1.6.1 on a small YARN cluster. I'm attempting to pull data from a Hive table, using a query like:
df = hiveCtx.sql("""
SELECT *
FROM hive_database.gigantic_table
WHERE loaddate = '20170502'
""")
However, the…

m_wynn
- 1
- 1
- 2
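A small sketch, using the table and column names from the question, of the same filter expressed through the DataFrame API on a HiveContext; explain() makes it easier to check whether the loaddate predicate is actually applied. Everything other than the names taken from the question is a placeholder.

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.hive.HiveContext

// sc is an existing SparkContext.
val hiveCtx = new HiveContext(sc)

val df = hiveCtx.table("hive_database.gigantic_table")
  .where(col("loaddate") === "20170502")

df.explain()        // inspect whether the partition filter is pushed down
println(df.count())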
0
votes
1 answer
Why does Spark Streaming not read from Kafka topic?
Spark Streaming 1.6.0
Apache Kafka 10.0.1
I use Spark Streaming to read from a sample topic. The code runs with no errors or exceptions, but I don't get any data on the console via the print() method.
I checked to see if there are messages in the…

Sahil
- 23
- 4
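A minimal direct-stream sketch for Spark Streaming 1.6 with the 0.8-style Kafka integration; broker and topic names are placeholders. A frequent reason print() shows nothing is that only messages produced after the job starts are read unless auto.offset.reset is set to "smallest".

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setAppName("KafkaConsoleCheck")
val ssc  = new StreamingContext(conf, Seconds(5))

val kafkaParams = Map(
  "metadata.broker.list" -> "broker1:9092",   // placeholder broker
  "auto.offset.reset"    -> "smallest"        // read the topic from the beginning
)

val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Set("sample"))

stream.map(_._2).print()

ssc.start()
ssc.awaitTermination()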
0
votes
0 answers
Why does spark-shell --master yarn-client fail with "UnknownHostException: Invalid host name"?
This is Spark 1.6.1.
When I run the following from spark/bin:
$ ./spark-shell --master yarn-client
I get the following error.
I checked the hostname in /etc/hosts and also in Hadoop, and they are set to the same hostname. Any ideas?

Judy Kim
- 13
- 2
0
votes
1 answer
Not able to load Hive table into Spark
I am trying to load data from a Hive table using Spark SQL. However, it doesn't return anything. I executed the same query in Hive and it prints the result. Below is the code I am trying to execute in…

prateek
- 1
- 1
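A small sketch for Spark 1.6, where Hive tables are only visible through a HiveContext (not a plain SQLContext) and hive-site.xml has to be on the classpath; the database and table names are placeholders.

import org.apache.spark.sql.hive.HiveContext

// sc is an existing SparkContext; hive-site.xml must be visible to the driver.
val hiveContext = new HiveContext(sc)

val df = hiveContext.sql("SELECT * FROM my_db.my_table LIMIT 10")   // placeholder table
df.show()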
0
votes
1 answer
Method unknown error on cluster, works locally - both Spark versions are identical
I'm having a problem using spark.ml.util.SchemaUtils on Spark v1.6.0. I get the following error:
Exception in thread "main" java.lang.NoSuchMethodError:…

datasock
- 1
- 2
0
votes
1 answer
Spark - write a file inside the worker process
I have a Spark job that generates a set of results with statistics. The number of work items is greater than the number of slaves, so each slave processes more than one item.
I cache the results after generating the RDD objects so that I can reuse them as I…

mert
- 1,942
- 2
- 23
- 43
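A hedged sketch of writing results from inside the worker with foreachPartition, so each partition writes its own file; results is assumed to be a cached RDD[String], and the output path is a placeholder on the worker's local filesystem.

import java.io.{File, PrintWriter}
import org.apache.spark.TaskContext

// Each partition writes its own file on the worker that processes it.
results.foreachPartition { iter =>
  val partId = TaskContext.get.partitionId()
  val out = new PrintWriter(new File(s"/tmp/stats-part-$partId.txt"))
  try iter.foreach(line => out.println(line))
  finally out.close()
}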
0
votes
1 answer
Which jar has org.apache.spark.sql.types?
I am on Spark 1.x and attempting to read CSV files. If I need to specify some data types, then, as per the documentation, I need to import the types defined in the package org.apache.spark.sql.types.
import…

sudheeshix
- 1,541
- 2
- 17
- 28
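As a hedged pointer: the org.apache.spark.sql.types classes ship with the Spark SQL module, so depending on spark-sql (which pulls in spark-catalyst, where the classes actually live) is normally enough; reading CSV on Spark 1.x additionally needs the separate spark-csv package. An sbt-style sketch with a placeholder schema and path:

// build.sbt sketch (versions are illustrative).
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql" % "1.6.3" % "provided",
  "com.databricks"   %% "spark-csv" % "1.5.0"
)

// Usage: define a schema with the imported types and read a CSV file.
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val schema = StructType(Seq(
  StructField("id",   IntegerType, nullable = true),
  StructField("name", StringType,  nullable = true)))

val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .schema(schema)
  .load("data.csv")   // placeholder path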
0
votes
1 answer
Window functions / Scala / Spark 1.6
I would like to use a window function in Scala.
I have a CSV file like the following:
id;date;value1
1;63111600000;100
1;63111700000;200
1;63154800000;300
When I try to apply a window function over this DataFrame, sometimes it works and…

Gaëlle Hisler
- 21
- 2
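A minimal sketch of a window over (id, date) in Spark 1.6, using the column names from the question; the input path and everything else are placeholders. In 1.6, window functions only work with a HiveContext, which is a common reason they appear to work only sometimes.

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, lag}
import org.apache.spark.sql.hive.HiveContext

// sc is an existing SparkContext; window functions need a HiveContext in 1.6.
val hiveCtx = new HiveContext(sc)

val df = hiveCtx.read
  .format("com.databricks.spark.csv")   // separate spark-csv package on 1.x
  .option("header", "true")
  .option("delimiter", ";")
  .load("input.csv")                    // placeholder path

val w = Window.partitionBy("id").orderBy("date")
val withPrev = df.withColumn("prev_value", lag(col("value1"), 1).over(w))
withPrev.show()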
0
votes
1 answer
How to foreachRDD over records from Kafka in Spark Streaming?
I'd like to run a Spark Streaming application with Kafka as the data source. It works fine in local mode but fails on the cluster. I'm using Spark 1.6.2 and Scala 2.10.6.
Here are the source code and the stack trace.
DevMain.scala
object DevMain extends App…

user2359997
- 561
- 1
- 16
- 40
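A short sketch of the usual foreachRDD pattern, assuming stream is a DStream[(String, String)] obtained from KafkaUtils; doing the work inside foreachPartition means non-serializable resources (writers, clients) are created on the executors, which is a common difference between local and cluster runs.

// stream: DStream[(String, String)] from KafkaUtils.createDirectStream
stream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // create connections/writers here, once per partition, on the executor
    records.foreach { case (_, value) =>
      println(value)   // placeholder for the real per-record processing
    }
  }
}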