Questions tagged [apache-spark-1.6]

Use for questions specific to Apache Spark 1.6. For general questions related to Apache Spark use the tag [apache-spark].

111 questions
0
votes
0 answers

Unable to run Spark in yarn-cluster mode

I'm trying to run a Spark job on YARN in cluster deploy mode. I tried running the simplest spark-submit command, with only the jar path, the class parameter, and master yarn-cluster. However, I still get the same error, which actually tells me…
Tymek
  • 37
  • 8
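The bare-bones submission the question describes can be sketched by assembling the spark-submit argument list; the jar path and class name below are placeholders, not taken from the question, and the command is only built, not executed:

```python
# Sketch: assemble a minimal spark-submit invocation for yarn-cluster mode.
# Jar path and main class are illustrative placeholders.
def build_spark_submit(jar_path, main_class, master="yarn-cluster"):
    """Return the argv list for a bare-bones spark-submit call."""
    return [
        "spark-submit",
        "--master", master,      # Spark 1.6 accepts 'yarn-cluster' directly
        "--class", main_class,   # fully qualified entry-point class
        jar_path,                # application jar goes last
    ]

cmd = build_spark_submit("app.jar", "com.example.Main")
print(" ".join(cmd))
```

In 1.6, `--master yarn-cluster` is shorthand for `--master yarn --deploy-mode cluster`; either spelling should behave the same.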
0
votes
1 answer

spark scala - Merge multiple rows into one

I have a dataframe with this schema:
 |-- id: string (nullable = true)
 |-- ddd: struct (nullable = true)
 |    |-- aaa: string (nullable = true)
 |    |-- bbb: long (nullable = true)
 |    |-- ccc: string (nullable = true)
 |    |-- eee: long (nullable = true)
I am getting output like…
gayathri
  • 73
  • 3
  • 10
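Since Spark itself is not assumed available here, the row-merging the question asks about can be illustrated with a plain-Python analogue of a groupBy/collect-style merge, keyed on `id` and using the `aaa`/`bbb` field names from the question's schema (the sample values are invented):

```python
from itertools import groupby
from operator import itemgetter

# Plain-Python stand-in for grouping rows by id and collecting the struct
# fields into one record per id (what a Spark groupBy + collect_list does).
rows = [
    {"id": "1", "aaa": "x", "bbb": 10},
    {"id": "1", "aaa": "y", "bbb": 20},
    {"id": "2", "aaa": "z", "bbb": 30},
]

rows.sort(key=itemgetter("id"))          # groupby needs sorted input
merged = {
    key: [{"aaa": r["aaa"], "bbb": r["bbb"]} for r in group]
    for key, group in groupby(rows, key=itemgetter("id"))
}
print(merged["1"])
```

The same shape in Spark 1.6 would typically be a `groupBy("id")` followed by an aggregation, but the exact API depends on the desired output schema.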
0
votes
2 answers

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/internal/Logging

My Spark consumer is failing with a "Logging" error. While browsing, I found that the error is due to an incompatibility between jars. I am using Spark 1.6.3, and all the dependencies used in pom.xml are 1.6.3. Still I get the same error. Below is my…
0
votes
1 answer

CassandraSourceRelation not serializable when joining two dataframes

I have a setup of dataframes with spark-cassandra-connector 1.6.2. I am trying to perform some transformations with Cassandra. The DataStax Enterprise version is 5.0.5. DataFrame df1 = sparkContext …
0
votes
1 answer

Spark - Convert RDD[Vector] to DataFrame with variable columns

What's the best solution to generalize the conversion from RDD[Vector] to DataFrame with Scala/Spark 1.6? The inputs are different RDD[Vector]s. The number of columns in a Vector could be anything from 1 to n for different RDDs. I tried using the shapeless library, but…
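The core of such a conversion is deriving a column list whose length matches the vector width. A plain-Python sketch of that step (Spark is not assumed available; the `c0..c(n-1)` naming convention is an illustrative assumption, not from the question):

```python
# Sketch: derive column names from the width of variable-length vectors,
# then zip each vector into a named row - the schema-building half of an
# RDD[Vector] -> DataFrame conversion.
vectors = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # invented sample data

width = len(vectors[0])
assert all(len(v) == width for v in vectors), "vectors must share one width"

columns = ["c%d" % i for i in range(width)]     # c0, c1, ..., c(n-1)
table = [dict(zip(columns, v)) for v in vectors]
print(columns)
```

In Spark 1.6 the equivalent would be building a `StructType` of `DoubleType` fields from the measured width before calling `createDataFrame`.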
0
votes
1 answer

'Broadcast' object has no attribute 'destroy'?

In my pyspark script, I declare a Broadcast variable. At the end, I want to destroy this variable, but I get AttributeError: 'Broadcast' object has no attribute 'destroy'. My code looks like this: br =…
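The error suggests that PySpark 1.6's Broadcast class does not expose destroy(), though it does provide unpersist(). A guarded fallback can be sketched with a stand-in class, since Spark itself is not assumed available here:

```python
# Stand-in for pyspark.broadcast.Broadcast as the 1.6 error implies it
# behaves: unpersist() exists, destroy() does not.
class FakeBroadcast(object):
    def __init__(self):
        self.persisted = True
    def unpersist(self, blocking=False):
        self.persisted = False

br = FakeBroadcast()
if hasattr(br, "destroy"):
    br.destroy()                    # available in later Spark versions
else:
    br.unpersist(blocking=True)     # 1.6 fallback: release executor copies

print(br.persisted)  # False
```

unpersist() only removes the cached copies on executors; unlike destroy(), the variable can still be re-broadcast afterwards.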
0
votes
1 answer

Apache Spark WHERE clause not working

I'm running Apache Spark 1.6.1 on a small YARN cluster. I'm attempting to pull data from a Hive table, using a query like: df = hiveCtx.sql(""" SELECT * FROM hive_database.gigantic_table WHERE loaddate = '20170502' """) However, the…
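One possible (speculative, not confirmed by the question) cause of an empty result with a filter like this is a type mismatch between the partition column and the literal. A plain-Python sketch of why `'20170502'` never matches an int-typed `loaddate`:

```python
# Sketch: a string literal never equals an int value, which is one way a
# filter like loaddate = '20170502' can silently match nothing.
rows = [{"loaddate": 20170502, "x": 1}]   # column stored as int (hypothetical)

matched_str = [r for r in rows if r["loaddate"] == "20170502"]
matched_int = [r for r in rows if r["loaddate"] == 20170502]

print(len(matched_str), len(matched_int))  # 0 1
```

SQL engines often coerce such comparisons, but checking the partition column's declared type against the literal is a cheap first diagnostic.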
0
votes
1 answer

Why does Spark Streaming not read from Kafka topic?

Spark Streaming 1.6.0, Apache Kafka 10.0.1. I use Spark Streaming to read from a sample topic. The code runs with no errors or exceptions, but I don't get any data on the console via the print() method. I checked to see if there are messages in the…
0
votes
0 answers

Why does spark-shell --master yarn-client fail with "UnknownHostException: Invalid host name"?

This is Spark 1.6.1. When I run the following from spark/bin: $ ./spark-shell --master yarn-client I get the following error. I checked the hostname in /etc/hosts and also in Hadoop, and the same hostname is assigned in both. Any idea?
Judy Kim
  • 13
  • 2
0
votes
1 answer

Not able to load Hive table into Spark

I am trying to load data from a Hive table using spark-sql. However, it doesn't return anything. I tried executing the same query in Hive and it prints out the result. Below is my code, which I am trying to execute in…
prateek
  • 1
  • 1
0
votes
1 answer

Method unknown error on cluster, works locally - both spark versions are identical

I'm having a problem using spark.ml.util.SchemaUtils on Spark v1.6.0. I get the following error: Exception in thread "main" java.lang.NoSuchMethodError:…
datasock
  • 1
  • 2
0
votes
1 answer

Spark: writing files inside the worker process

I have a Spark job that generates a set of results with statistics. The number of work items is greater than the slave count, so I am processing more than one item per slave. I cache results after generating the RDD objects to be able to reuse them, as I…
mert
  • 1,942
  • 2
  • 23
  • 43
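The per-worker write pattern the question describes can be sketched without Spark: each partition writes its own uniquely named file, mirroring what a `foreachPartition` body would do on a slave. The `part-NNNNN` naming and the sample data are illustrative assumptions:

```python
import os
import tempfile

# Sketch: each partition writes its own output file, the way a Spark
# foreachPartition body would on a worker. Unique names avoid collisions
# when several partitions run on the same slave.
def write_partition(partition_id, records, out_dir):
    path = os.path.join(out_dir, "part-%05d" % partition_id)
    with open(path, "w") as f:
        for rec in records:
            f.write("%s\n" % rec)
    return path

out_dir = tempfile.mkdtemp()
partitions = [["a", "b"], ["c"]]          # invented sample partitions
paths = [
    write_partition(i, part, out_dir)
    for i, part in enumerate(partitions)
]
print(paths)
```

On a real cluster the target directory must be visible to (or shared by) the workers, e.g. HDFS or a mounted filesystem, since each file is written where the task runs.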
0
votes
1 answer

Which jar has org.apache.spark.sql.types?

I am on Spark 1.x and attempting to read CSV files. If I need to specify some data types, then, as per the documentation, I need to import the types defined in the package org.apache.spark.sql.types. import…
0
votes
1 answer

Window functions / scala / spark 1.6

I would like to use a window function in Scala. I have a CSV file with the following content:
id;date;value1
1;63111600000;100
1;63111700000;200
1;63154800000;300
When I try to apply a window function over this data frame, sometimes it works and…
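A plain-Python analogue of a `lag(value1)` window over the question's sample rows, partitioned by `id` and ordered by `date`, shows the intended semantics (Spark is not assumed available here; note that in Spark 1.6 window functions require a HiveContext rather than a plain SQLContext):

```python
# Stand-in for Window.partitionBy("id").orderBy("date") with lag("value1"):
# sort by the partition and ordering keys, then carry the previous row's
# value within each partition.
rows = [
    {"id": 1, "date": 63111600000, "value1": 100},
    {"id": 1, "date": 63111700000, "value1": 200},
    {"id": 1, "date": 63154800000, "value1": 300},
]

rows.sort(key=lambda r: (r["id"], r["date"]))
prev = None
for r in rows:
    r["lag_value1"] = prev["value1"] if prev and prev["id"] == r["id"] else None
    prev = r

print([r["lag_value1"] for r in rows])  # [None, 100, 200]
```

The "sometimes it works" symptom in the question may hinge on which context is in scope, but that is speculation from the 1.6 HiveContext requirement, not from the question itself.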
0
votes
1 answer

How to foreachRDD over records from Kafka in Spark Streaming?

I'd like to run a Spark Streaming application with Kafka as the data source. It works fine locally but fails on the cluster. I'm using Spark 1.6.2 and Scala 2.10.6. Here are the source code and the stack trace. DevMain.scala object DevMain extends App…
user2359997
  • 561
  • 1
  • 16
  • 40
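The foreachRDD contract itself is simple and can be sketched without Spark: the streaming engine hands each micro-batch to a user-supplied function on the driver. The loop below is a plain-Python stand-in for a DStream, with invented batch data:

```python
# Plain-Python stand-in for DStream.foreachRDD: the engine invokes the
# handler once per micro-batch; anything the handler closes over must be
# serializable when it ships work to executors in real Spark.
def foreach_rdd(batches, handler):
    for batch in batches:
        handler(batch)   # in Spark this runs on the driver, per interval

seen = []
foreach_rdd([["k1"], ["k2", "k3"]], lambda rdd: seen.extend(rdd))
print(seen)  # ['k1', 'k2', 'k3']
```

A common reason for "works locally, fails on cluster" with this pattern is a non-serializable object captured by the handler's closure, though the question's stack trace would be needed to confirm that here.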