Highest Voted 'apache-spark-1.6' Questions

0 votes, 1 answer

How to perform dynamic partitioning based on row count in a DataFrame for a column value

I am trying to partition input files based on accountId, but the partitioning has to be done only if the DataFrame contains more than 1000 records. The accountId is a dynamic integer that may be unknown. Consider the following code: val ssc =…

asked Jul 29 '16 at 12:46

Achaius

5,904
21
65
122

0 votes, 1 answer

How to know which is the RDD type inferred by Spark using Scala

I was trying the following example: val lista = List(("a", 3), ("a", 1), ("b", 7), ("a", 5)) val rdd = sc.parallelize(lista) Then in the shell I get the following: rdd: org.apache.spark.rdd.RDD[(String, Int)] = ParallelCollectionRDD[40] at parallelize…

scala shell apache-spark rdd apache-spark-1.6

asked Jul 20 '16 at 07:16

Joseratts

97
1
9

0 votes, 0 answers

Spark only uses 1 CPU when 2x4 CPUs are available on reduce()

I have 3 machines: 1x master with 4x CPU and 8G RAM; 2x executors with 4x CPU and 16G RAM. The master is in standalone mode (no YARN), and I'm using pyspark. Even if it is not a huge infrastructure, I would still expect some performance out of it. When running a…

apache-spark pyspark apache-spark-1.6

asked Jun 03 '16 at 08:50

pltrdy

2,069
1
11
29

-1 votes, 1 answer

Delete Unicode value in output of Spark 1.6 using Scala

The file generated from the API contains data like the following: col1,col2,col3 503004,(d$üíõ$F|'.h*Ë!øì=(.î; ,.¡|®!®3-2-704 When I read it in Spark, it appears like this. I am using a case class to read from the RDD and then convert it to a DataFrame using…

scala apache-spark apache-spark-1.6

asked Sep 20 '19 at 14:45

Sophie Dinka

73
1
8

-1 votes, 1 answer

Reading encoded value in Spark 1.6 throws an error

I am receiving a file from an API which has encoded (non-ASCII) character values in 3 columns. When I read the file using a DataFrame in Spark 1.6: val CleanData= sqlContext.sql("""SELECT COL1 …

scala apache-spark apache-spark-sql apache-spark-1.6

asked Sep 20 '19 at 10:11

Sophie Dinka

73
1
8

-1 votes, 1 answer

Read Impala table with SparkSQL

I was trying to execute a query that uses functions like lead .. over .. partition and Union. The query works well when I run it on Impala but fails on Hive. I need to write a Spark job that performs this query. It is failing as well in…

hive pyspark impala apache-spark-1.6 apache-kudu

asked Aug 28 '17 at 19:47

New Coder

499
4
22
