More information can be found in the official documentation.
Questions tagged [spark-shell]
135 questions
0 votes · 0 answers
Running spark-shell gives a "connection refused" error
I am trying to run Spark over Hadoop (YARN). When I run spark-shell it throws a ConnectionRefused exception. The log looks like this:
ERROR cluster.YarnClientSchedulerBackend: The YARN application has already ended! It might have been
killed or the…

abolfazl-sh
- 25
- 1
- 11
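A ConnectionRefused at spark-shell startup on YARN usually means either that the client cannot reach the ResourceManager, or that the YARN application died before the driver could connect to it. A few diagnostic commands worth running first (the application ID below is a placeholder):

```shell
# Confirm spark-shell is picking up the intended Hadoop configuration
echo $HADOOP_CONF_DIR

# Check that YARN itself is up and accepting applications
yarn node -list
yarn application -list

# Pull the logs of the failed application (ID is hypothetical)
yarn logs -applicationId application_1234567890123_0001
```

The application logs almost always name the real failure (bad queue, missing memory, wrong ResourceManager address) that the generic "connection refused" hides.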
0 votes · 2 answers
Where is toDF in spark-shell, and how do I use it with Vector, Seq, or other types?
I tried some basic data types,
val x = Vector("John Smith", 10, "Illinois")
val x = Seq("John Smith", 10, "Illinois")
val x = Array("John Smith", 10, "Illinois")
val x = ...
val x = Seq( Vector("John Smith",10,"Illinois"), Vector("Foo",2,"Bar"))
but…

Peter Krauss
- 13,174
- 24
- 167
- 304
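The collections in the excerpt mix String and Int, so Scala infers the element type as Any, and Spark has no encoder for Seq[Any] — which is why toDF fails on them. A sketch of two forms that do work in spark-shell (column and class names are illustrative):

```scala
// In spark-shell the session `spark` is already in scope;
// this import brings toDF into scope for Seqs of Products.
import spark.implicits._

// Option 1: a tuple gives every field a concrete type
val df1 = Seq(("John Smith", 10, "Illinois"), ("Foo", 2, "Bar"))
  .toDF("name", "age", "state")

// Option 2: a case class also names the columns automatically
case class Person(name: String, age: Int, state: String)
val df2 = Seq(Person("John Smith", 10, "Illinois"), Person("Foo", 2, "Bar")).toDF()
```

The key point is that each column must have a single concrete type; a heterogeneous Vector cannot express that.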
0 votes · 1 answer
spark-shell: avoid typing spark.sql(""" query """)
I use spark-shell a lot, and often it is to run SQL queries against a database. The only way to run SQL queries is by wrapping them in spark.sql(""" query """).
Is there a way to switch to Spark SQL directly and avoid the wrapper code? E.g. when using…

hsenpaws
- 21
- 1
- 8
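Spark ships a dedicated SQL REPL, bin/spark-sql, which uses the same session/metastore configuration and accepts bare SQL statements, so no spark.sql("""…""") wrapper is needed. A sketch (the table name is hypothetical):

```shell
# Interactive SQL shell instead of spark-shell
spark-sql

# Run a single statement non-interactively with -e
spark-sql -e "SELECT count(*) FROM my_db.my_table"

# Or a whole file of statements with -f
spark-sql -f queries.sql
```

It takes the same --master/--conf options as spark-shell, so the usual cluster settings carry over.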
0 votes · 1 answer
Scala: Can't run gcloud compute ssh
I am trying to run a Hive query using gcloud compute ssh via Scala.
First, here is what I tried:
scala> import sys.process._
scala> val results = Seq("hive", "-e", "show databases;").!!
asd
zxc
qwe
scala>
which is good. Now, I want to run the same…

AbtPst
- 7,778
- 17
- 91
- 172
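Following the sys.process pattern in the excerpt, the remote variant would pass the Hive command through gcloud compute ssh's --command flag; keeping every argument as a separate Seq element avoids the shell-quoting problems that typically break this. Instance and zone names here are hypothetical:

```scala
import sys.process._

// Run the same Hive query on a remote GCE/Dataproc instance.
// Each flag is its own Seq element so sys.process does not re-tokenize it.
val results = Seq(
  "gcloud", "compute", "ssh", "my-instance",
  "--zone", "us-central1-a",
  "--command", "hive -e 'show databases;'"
).!!
println(results)
```

If this hangs, it is usually gcloud prompting for SSH key setup, which needs to be done once interactively outside the Scala process.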
0 votes · 3 answers
Unable to run Spark SQL through a shell script
I am unable to query a table in Spark through a shell script, but when I run the same commands from the command line I get the result. The problem arises when I put those commands in a shell script and try to run it.
Created a shell script:
vi test.sh
Inserted…

Abhinandan
- 7
- 1
- 5
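A common cause when the same statement works interactively but fails from a script is quoting: the shell strips or expands characters inside the query before Spark ever sees it. One robust pattern is a quoted heredoc piped into spark-shell; the table name below is a placeholder:

```shell
#!/bin/sh
# test.sh - the quoted 'EOF' delimiter stops the shell from
# expanding $, backticks, or quotes inside the Scala/SQL text.
spark-shell --master yarn <<'EOF'
spark.sql("SELECT count(*) FROM my_db.my_table").show()
EOF
```

Running the script with `sh -x test.sh` shows exactly what survives shell expansion, which usually pinpoints the broken quoting.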
0 votes · 1 answer
Getting partition logs while running through spark-shell
I am running my code using spark-shell on an EMR cluster. A sample:
[hadoop@ ~]$ spark-shell --jars --num-executors 72 --executor-cores 5 --executor-memory 16g --conf spark.default.parallelism=360
...
scala> val args =…

user811602
- 1,314
- 2
- 17
- 47
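If the goal is to see which partition a task is working on, Spark exposes that through TaskContext. A sketch runnable from spark-shell, with the sample data standing in for the job's real input:

```scala
import org.apache.spark.TaskContext

val rdd = spark.sparkContext.parallelize(1 to 100, 4)

// Tag each record with the id of the partition that processed it.
val tagged = rdd.mapPartitions { it =>
  val pid = TaskContext.getPartitionId()
  // println inside a task goes to the executor's stdout log
  // (visible in the YARN/EMR container logs), not the driver console.
  println(s"processing partition $pid")
  it.map(x => (pid, x))
}
tagged.take(5).foreach(println)
```

On EMR the per-partition println output lands in the executor container logs, retrievable with `yarn logs -applicationId <id>`.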
0 votes · 1 answer
Write data as JSON from a DataFrame to Azure Blob Storage
I have some data in a DataFrame which I have to convert to JSON and store in Azure Blob Storage.
Is there any way to achieve this?
Below are the steps I have tried, from spark-shell.
val df = spark.sql("select * from…

Antony
- 970
- 3
- 20
- 46
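Assuming the cluster has the Azure Hadoop connector on its classpath, DataFrameWriter can write JSON straight to a wasbs:// path (or abfss:// for ADLS Gen2). Account, container, and key below are placeholders:

```scala
// Storage account credentials (placeholder names) - only needed if
// they are not already configured in core-site.xml.
spark.conf.set(
  "fs.azure.account.key.myaccount.blob.core.windows.net",
  "<storage-account-key>")

val df = spark.sql("select * from my_table") // as in the question

// Each partition becomes one JSON-lines file under the target folder.
df.write
  .mode("overwrite")
  .json("wasbs://mycontainer@myaccount.blob.core.windows.net/output/json")
```

`.coalesce(1)` before the write produces a single output file, at the cost of funneling all data through one task.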
0 votes · 1 answer
Resource management for Spark jobs on YARN and spark-shell jobs
Our company has a 9-node cluster on Cloudera.
We have 41 long-running Spark Streaming jobs [YARN + cluster mode] and some regular spark-shell jobs scheduled to run at 1 pm daily.
All jobs are currently submitted under user A's role [ with root…

user2778168
- 193
- 9
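The usual YARN answer here is to isolate the long-running streaming jobs and the daily spark-shell work into separate scheduler queues with capped capacity, then submit each workload to its own queue. Queue names below are illustrative:

```shell
# Long-running streaming jobs go to a dedicated queue...
spark-submit --master yarn --deploy-mode cluster \
  --queue streaming my-streaming-job.jar

# ...while scheduled spark-shell work is confined to another queue,
# so it cannot starve the streaming applications of containers.
spark-shell --master yarn --queue adhoc \
  --num-executors 4 --executor-memory 4g
```

On Cloudera the queues themselves and their capacity limits are defined in the YARN scheduler configuration (Dynamic Resource Pools in Cloudera Manager), not on the spark-submit command line.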
0 votes · 1 answer
Out of memory exception or worker node lost during a Spark Scala job
I am executing a Spark Scala job using spark-shell. The problem I am facing is that at the end of the final stage (e.g. stage 5, with 50 tasks), 49 tasks complete very quickly, but the 50th takes 5 minutes and then says out of…

GRK
- 91
- 1
- 3
- 18
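One straggling task at the end of a stage, followed by an OOM, usually points at data skew: a single partition holds far more data than the rest. A quick way to confirm this from spark-shell, with `myRdd` standing in for the job's actual data:

```scala
// Count records per partition; one huge number among small ones confirms skew.
val sizes = myRdd
  .mapPartitionsWithIndex { (i, it) => Iterator((i, it.size)) }
  .collect()
sizes.sortBy(-_._2).take(5).foreach(println)

// If skew is confirmed, redistributing before the expensive stage
// is often enough to keep any one task within memory limits.
val balanced = myRdd.repartition(200)
```

If the skew comes from a join or groupBy key, key salting or increasing spark.sql.shuffle.partitions are the usual next steps.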
0 votes · 1 answer
Internals of the reduce function in spark-shell
The input file contains 20 lines. I am trying to count the total number of records using the reduce function. Can anyone explain why there is a difference in the results? Here the value of y is nothing but 1.
Default number of partitions:…

Pratik Garg
- 747
- 2
- 9
- 21
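The behavior the question describes follows from how reduce works: Spark reduces within each partition first, then merges the per-partition partial results with the same function, so the function must be associative and commutative. A sketch of the difference, with the file path hypothetical:

```scala
val lines = spark.sparkContext.textFile("input.txt") // 20 lines

// Correct: summing ones is associative, so the partition count is irrelevant.
val count = lines.map(_ => 1).reduce(_ + _) // always 20

// Incorrect: x + 1 ignores y. Each partition returns its local size, but the
// merge step then adds just 1 per partition instead of adding the partial
// counts - so the result changes with the number of partitions.
val wrong = lines.map(_ => 1).reduce((x, y) => x + 1)
```

That is why the result differs across partition counts: the "y is only 1" intuition holds within one partition, but not when partial results are merged.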
0 votes · 1 answer
spark-shell: not able to access Java functions in jars
I started exploring Spark two days ago, so I am pretty new to it. My use case is calling a Java function from an external jar in the Scala code I am writing in spark-shell, but I think I am not loading my jar properly. Here…

Himanshu Srivastava
- 85
- 1
- 10
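The jar has to be on the classpath when the REPL starts; passing it with --jars at launch (comma-separated for several jars) puts it on both the driver and executor classpaths. Path and class names below are hypothetical:

```shell
# Launch with the external jar available to driver and executors
spark-shell --jars /path/to/mylib.jar
```

Inside the shell, `import com.example.MyJavaClass` should then resolve; if the import still fails, the path is wrong or the class is not actually in that jar (`jar tf /path/to/mylib.jar` lists its contents). `:require /path/to/mylib.jar` can add a jar to an already-running shell, but only on the driver side.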
0 votes · 0 answers
Hive select query throws exception on a Spark-created external table using the ORC format
I created a sample table via spark-shell and wrote a DataFrame to an external table using the ORC format, partitioned. Reading and writing both work fine within spark-shell, but when I try to execute the same select query from the Hive shell it throws an exception.…

Sudhir
- 1
- 1
- 4
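Tables created through Spark's native `USING orc` syntax carry Spark-specific metadata that some Hive versions cannot read; creating the table with Hive-compatible DDL usually avoids the exception. A hedged sketch from spark-shell, with table and path names hypothetical and `df` standing in for the question's DataFrame:

```scala
// Hive-compatible DDL: STORED AS ORC rather than Spark's USING orc
spark.sql("""
  CREATE EXTERNAL TABLE IF NOT EXISTS my_db.events (id BIGINT, name STRING)
  PARTITIONED BY (dt STRING)
  STORED AS ORC
  LOCATION '/data/events'
""")

// Write the partitioned ORC data to the table's location...
df.write.mode("overwrite").format("orc")
  .partitionBy("dt").save("/data/events")

// ...then register the new partitions with the metastore so Hive sees them.
spark.sql("MSCK REPAIR TABLE my_db.events")
```

Without the MSCK REPAIR (or explicit ADD PARTITION) step, Hive's metastore does not know the partitions exist even though the files are on disk.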
0 votes · 2 answers
Select data using UTF-8 character encoding from Hive
I am selecting data from my Hive table/view, but the character encoding is not picked up by spark-shell or beeline. Selecting the same data from Ambari (directly through Hive) works, but from the command line Hive has been disabled for security…

GRK
- 91
- 1
- 3
- 18
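When characters render correctly in one client (Ambari) but not another (spark-shell, beeline), the JVM's default charset, which follows the shell locale, is a common suspect. Forcing UTF-8 before launching the clients is a cheap first check:

```shell
# Make sure the client shell and the JVMs decode/encode as UTF-8
export LANG=en_US.UTF-8
spark-shell \
  --conf 'spark.driver.extraJavaOptions=-Dfile.encoding=UTF-8' \
  --conf 'spark.executor.extraJavaOptions=-Dfile.encoding=UTF-8'
```

If the data still looks wrong, the bytes in the underlying files may genuinely be in another encoding, which no client-side setting can fix.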
0 votes · 0 answers
Unable to access Hive using Spark
I'm trying to access Hive through spark-shell. I'm using Windows 8.
Hive version - 2.1.1
Spark version - 2.4.0
Hadoop version - 2.7.7
To begin with, I've entered the following code in spark-shell:
import org.apache.spark.sql.hive.HiveContext
…

shaheelkhan
- 3
- 3
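In Spark 2.4, HiveContext is deprecated; the supported entry point is a SparkSession built with enableHiveSupport() (and on Windows, a working winutils.exe with hadoop.home.dir set is usually also required before any Hive access works). A sketch:

```scala
import org.apache.spark.sql.SparkSession

// spark-shell already provides `spark` (with Hive support if Spark was
// built with it); a standalone application would construct it like this:
val spark = SparkSession.builder()
  .appName("hive-access")
  .enableHiveSupport() // needs Hive classes and hive-site.xml on the classpath
  .getOrCreate()

spark.sql("show databases").show()
```

If hive-site.xml is not on the classpath, Spark silently falls back to a local Derby metastore, which looks like "Hive is empty" rather than an error.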
0 votes · 1 answer
Unable to parse from String to Int within a case class
Can someone help me with what exactly I am missing in this code? I am unable to parse the phone field from String to Int.
case class contactNew(id:Long,name:String,phone:Int,email:String)
val contactNewData =…

Avinash
- 393
- 1
- 4
- 9
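If the incoming phone field contains non-digits (dashes, spaces, blanks), a plain .toInt throws; and since 10-digit phone numbers overflow Int anyway, widening to Long and parsing defensively into an Option is the usual fix. Field names follow the question's case class; the sample row is illustrative:

```scala
import scala.util.Try

case class ContactNew(id: Long, name: String, phone: Option[Long], email: String)

// Defensive parse: non-numeric input becomes None instead of an exception.
def parsePhone(s: String): Option[Long] = Try(s.trim.toLong).toOption

val row = "1,John,5550100,john@example.com".split(",")
val contact = ContactNew(row(0).toLong, row(1), parsePhone(row(2)), row(3))
```

Keeping phone as a String is equally valid if no arithmetic is ever done on it; the important part is not forcing free-form input through toInt.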