Questions tagged [spark-shell]

The spark-shell is the interactive Scala REPL bundled with Apache Spark. More information can be found in the official documentation.

135 questions
0
votes
0 answers

Running spark-shell gives a "Connection refused"

I am trying to run Spark on Hadoop (YARN). When I try to run spark-shell it throws a ConnectionRefused exception. The log looks like this: ERROR cluster.YarnClientSchedulerBackend: The YARN application has already ended! It might have been killed or the…
abolfazl-sh
  • 25
  • 1
  • 11
0
votes
2 answers

Where is toDF in spark-shell, and how do I use it with Vector, Seq or other collections?

I tried some basic data types: val x = Vector("John Smith", 10, "Illinois") val x = Seq("John Smith", 10, "Illinois") val x = Array("John Smith", 10, "Illinois") val x = ... val x = Seq( Vector("John Smith",10,"Illinois"), Vector("Foo",2,"Bar")) but…
Peter Krauss
  • 13,174
  • 24
  • 167
  • 304
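A minimal sketch for the question above: in spark-shell, toDF becomes available on a Seq of tuples (or case classes) after importing spark.implicits._; a Vector or Seq of mixed String/Int values has no encoder, so the rows need a uniform tuple shape. Column names here are illustrative.

// In spark-shell, `spark` is the pre-built SparkSession.
import spark.implicits._

// A Seq of tuples has one uniform element type, so Spark can derive a schema;
// a Seq[Any] (mixed String/Int elements) has no encoder and toDF will not compile.
val df = Seq(
  ("John Smith", 10, "Illinois"),
  ("Foo", 2, "Bar")
).toDF("name", "age", "state")

df.printSchema()
df.show()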
0
votes
1 answer

spark-shell: avoid typing spark.sql(""" query """)

I use spark-shell a lot, often to run SQL queries against a database, and the only way to run SQL queries is by wrapping them in spark.sql(""" query """). Is there a way to switch to SQL directly and avoid the wrapper code? E.g. when using…
hsenpaws
  • 21
  • 1
  • 8
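One workaround sometimes used inside spark-shell is a tiny helper so only the query text has to be typed; a sketch (the helper name sql is purely illustrative):

// Define once per spark-shell session; `spark` is the built-in SparkSession.
def sql(query: String): Unit = spark.sql(query).show(truncate = false)

// Queries then run without the spark.sql(""" ... """) wrapper:
sql("show databases")
sql("select count(*) from my_db.my_table")   // table name is a placeholder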
0
votes
1 answer

Scala: Can't run gcloud compute ssh

I am trying to run a Hive query using gcloud compute ssh from Scala. First, here is what I tried: scala> import sys.process._ scala> val results = Seq("hive", "-e", "show databases;").!! asd zxc qwe scala> which works. Now, I want to run the same…
AbtPst
  • 7,778
  • 17
  • 91
  • 172
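A hedged sketch of the usual sys.process pattern: each token of the command is its own Seq element, and the whole remote command goes into a single --command argument (the instance name and zone below are placeholders):

import sys.process._

// Local command, as in the question: works because each argument is its own token.
val local = Seq("hive", "-e", "show databases;").!!

// Remote variant over gcloud compute ssh; "my-instance" and "us-central1-a" are placeholders.
val remote = Seq(
  "gcloud", "compute", "ssh", "my-instance",
  "--zone", "us-central1-a",
  "--command", "hive -e 'show databases;'"
).!!

println(remote)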
0
votes
3 answers

Unable to run Spark SQL through a shell script

I am unable to query a table in Spark through a shell script, but when I run the commands directly from the command line I get the result. The problem arises when I put those commands into a shell script and try to run it. I created a shell script: vi test.sh Inserted…
Abhinandan
  • 7
  • 1
  • 5
0
votes
1 answer

Getting partition logs while running through spark-shell

I am running my code using spark-shell on an EMR cluster. A sample invocation: [hadoop@ ~]$ spark-shell --jars --num-executors 72 --executor-cores 5 --executor-memory 16g --conf spark.default.parallelism=360 ... scala> val args =…
user811602
  • 1,314
  • 2
  • 17
  • 47
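If the goal is to see which partition handled which records, one common trick (not necessarily what the asker used) is to tag rows with TaskContext.getPartitionId() inside mapPartitions; a sketch over made-up data:

import org.apache.spark.TaskContext

// Made-up data; spark.default.parallelism controls the partition count here.
val rdd = spark.sparkContext.parallelize(1 to 100)

// Tag every element with the id of the partition that processed it.
val tagged = rdd.mapPartitions { iter =>
  val pid = TaskContext.getPartitionId()
  iter.map(x => (pid, x))
}

tagged.countByKey().toSeq.sortBy(_._1).foreach {
  case (pid, n) => println(s"partition $pid -> $n records")
}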
0
votes
1 answer

Write data as JSON from a DataFrame to Azure Blob Storage

I have some data in a DataFrame which I have to convert to JSON and store in Azure Blob Storage. Is there any way to achieve this? Below are the steps I have tried, from spark-shell. val df = spark.sql("select * from…
Antony
  • 970
  • 3
  • 20
  • 46
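A minimal sketch of one way this is often done, assuming the hadoop-azure (WASB) connector is on the classpath; the storage account, container, key, and table names are placeholders:

// Credentials for the storage account (all values are placeholders).
spark.sparkContext.hadoopConfiguration.set(
  "fs.azure.account.key.mystorageaccount.blob.core.windows.net",
  "<storage-account-key>")

val df = spark.sql("select * from my_db.my_table")

// Each partition is written as one JSON-lines file under the target folder.
df.write
  .mode("overwrite")
  .json("wasbs://mycontainer@mystorageaccount.blob.core.windows.net/output/json")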
0
votes
1 answer

Resource management for Spark jobs on YARN and spark-shell jobs

Our company has a 9-node Cloudera cluster. We have 41 long-running Spark Streaming jobs [YARN + cluster mode] and some regular spark-shell jobs scheduled to run daily at 1pm. All jobs are currently submitted under the user A role [with root…
user2778168
  • 193
  • 9
0
votes
1 answer

Out-of-memory exception or worker node lost during a Spark Scala job

I am executing a Spark Scala job using spark-shell. The problem I am facing: at the end of the final stage, the final mapper (say in stage 5) allocates 50 tasks and completes 49 very quickly, but the 50th takes 5 minutes and then says out of…
GRK
  • 91
  • 1
  • 3
  • 18
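When one final task out of fifty runs far longer than the rest and then fails with OOM, the usual suspect is a skewed key; a hedged sketch of checking for skew and spreading the work, with illustrative table and column names:

import org.apache.spark.sql.functions._

val df = spark.table("my_db.events")           // placeholder table

// 1. Check whether a handful of keys dominate the data.
df.groupBy("user_id")                          // placeholder key column
  .count()
  .orderBy(desc("count"))
  .show(20)

// 2. If so, repartitioning on the key plus a random component spreads the heavy key
//    across many tasks instead of one straggler that eventually runs out of memory.
val rebalanced = df.repartition(400, col("user_id"), rand())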
0
votes
1 answer

Internals of reduce function in spark-shell

The input file contains 20 lines. I am trying to count the total number of records using the reduce function. Can anyone please explain why there is a difference in the results? Here the value of y is nothing but 1. Default number of partitions:…
Pratik Garg
  • 747
  • 2
  • 9
  • 21
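The difference usually comes from reducing the raw values instead of counting; a sketch of the two behaviours on a small text file (the path is a placeholder):

// Placeholder path standing in for the 20-line input file from the question.
val lines = spark.sparkContext.textFile("/tmp/input.txt")

// Counting records: map every line to 1, then sum the ones.
// This yields 20 regardless of how many partitions the file is split into.
val count = lines.map(_ => 1).reduce(_ + _)

// Reducing the lines themselves combines *values*, and the per-partition combine
// order is not deterministic, so any order-sensitive result can vary between runs.
val combined = lines.reduce((x, y) => x + y)

println(s"count = $count")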
0
votes
1 answer

spark-shell: not able to access Java functions in jars

I started exploring Spark two days ago, so I am pretty new to it. My use case is calling a Java function, present in an external jar, from Scala code that I am writing in spark-shell, but I think I am not loading my jar properly. Here…
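The pattern that usually works is to put the jar on the driver and executor classpath at launch and then import the Java class; a sketch in which the jar path, class, and method are hypothetical:

// Launch the REPL with the jar on the classpath, e.g.:
//   spark-shell --jars /path/to/my-utils.jar
// (path and jar name are placeholders)

// The Java class can then be imported and called like any Scala class:
import com.example.MyJavaUtils                      // hypothetical class in the jar

val cleaned = MyJavaUtils.normalize("Some Input")   // hypothetical static method
println(cleaned)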
0
votes
0 answers

Hive select query throws an exception on an external ORC table created by Spark

I created a sample table from spark-shell, writing a DataFrame to an external table in ORC format, by partition. It works fine within spark-shell, both reading and writing. But when I try to execute the same select query from the Hive shell, it throws an exception.…
Sudhir
  • 1
  • 1
  • 4
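A sketch of writing a partitioned external ORC table from spark-shell, with illustrative data, table name, and location; when the Hive side fails, comparing DESCRIBE FORMATTED output from both Hive and Spark often shows the mismatch (for example a Spark-specific serde or schema property):

import spark.implicits._

// Illustrative data with a partition column `dt`.
val df = Seq(
  ("a", 1, "2020-01-01"),
  ("b", 2, "2020-01-02")
).toDF("name", "value", "dt")

// Providing an explicit path makes the saved table external.
df.write
  .format("orc")
  .partitionBy("dt")
  .option("path", "hdfs:///warehouse/external/sample_orc")
  .saveAsTable("my_db.sample_orc")

// Reading back in spark-shell works; the question is about the Hive side.
spark.table("my_db.sample_orc").show()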
0
votes
2 answers

Select data using UTF-8 character encoding from Hive

I am selecting data from my Hive table/view, but the character encoding is not picked up by spark-shell or beeline. It works when I select the same data from Ambari (directly through Hive), but the Hive command line has been disabled for security…
GRK
  • 91
  • 1
  • 3
  • 18
0
votes
0 answers

Unable to access Hive using Spark

I'm trying to access Hive through spark-shell on Windows 8. Hive version: 2.1.1, Spark version: 2.4.0, Hadoop version: 2.7.7. To begin with, I entered the following code in spark-shell: import org.apache.spark.sql.hive.HiveContext …
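In Spark 2.x, HiveContext is deprecated; Hive support is enabled on the SparkSession itself (the pre-built spark in spark-shell already has it when Spark is compiled with Hive). A minimal sketch, assuming a reachable metastore and an illustrative table name:

import org.apache.spark.sql.SparkSession

// Outside spark-shell you build the session yourself;
// inside spark-shell, getOrCreate() returns the existing `spark` session.
val session = SparkSession.builder()
  .appName("hive-access")
  .enableHiveSupport()
  .getOrCreate()

session.sql("show databases").show()
session.sql("select * from my_db.my_table limit 10").show()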
0
votes
1 answer

Unable to parse from String to Int within a case class

Can someone help me find what exactly I am missing in this code? I am unable to parse the phone from String to Integer. case class contactNew(id:Long,name:String,phone:Int,email:String) val contactNewData =…
Avinash
  • 393
  • 1
  • 4
  • 9
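The failure is typically a NumberFormatException from .toInt, or an overflow because 10-digit phone numbers do not fit in an Int; a sketch with made-up input showing the Long fix:

// Phone numbers exceed Int's range (about 2.1 billion), so Long (or String) is safer.
case class ContactNew(id: Long, name: String, phone: Long, email: String)

// Made-up raw lines standing in for the question's input data.
val raw = Seq(
  "1,John,9876543210,john@example.com",
  "2,Jane,1234567890,jane@example.com"
)

val contacts = raw.map { line =>
  val Array(id, name, phone, email) = line.split(",")
  // .toLong still throws on non-numeric input; wrap in scala.util.Try for dirty data.
  ContactNew(id.trim.toLong, name.trim, phone.trim.toLong, email.trim)
}

contacts.foreach(println)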