Use for questions specific to Apache Spark 2.0. For general questions related to Apache Spark use the tag [apache-spark].
Questions tagged [apache-spark-2.0]
464 questions
0
votes
1 answer
Trouble Submitting Apache Spark Application to Containerized Cluster
I am having trouble running a Spark application using both spark-submit and the internal REST API. The deployment scenario I would like to demonstrate is Spark running as a cluster on my local laptop.
To that end, I've created two Docker containers…

Michael Reynolds
- 96
- 1
- 5
0
votes
3 answers
Spark standalone cluster tuning
We have a Spark 2.1.0 standalone cluster running on a single node with 8 cores and 50GB of memory (single worker).
We run Spark applications in cluster mode with the following memory settings:
--driver-memory = 7GB (default - 1core is…

veerat
- 105
- 9
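Sizing questions like the one above usually reduce to arithmetic over the node's resources. A minimal plain-Python sketch using the 50GB node and 7GB driver figures from the question; the executor count and the 10% (min 384MB) overhead heuristic, which mirrors Spark's default memoryOverhead behavior, are assumptions for illustration:

```python
# Sketch: estimate per-executor heap on a single 50GB node after
# reserving driver memory. The 10% (min 384MB) overhead mirrors
# Spark's default memoryOverhead heuristic; num_executors is assumed.

def executor_heap_gb(node_mem_gb, driver_mem_gb, num_executors):
    """Per-executor heap (GB) once the driver and overhead are reserved."""
    remaining = node_mem_gb - driver_mem_gb
    per_executor = remaining / num_executors
    overhead = max(per_executor * 0.10, 0.384)
    return per_executor - overhead

# With the question's 50GB node and 7GB driver, four executors would
# each get roughly 9.7GB of heap.
executor_heap_gb(50, 7, 4)
```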
0
votes
1 answer
Apache Spark connector to read from Azure Queue service?
This may be more of a configuration question, but I could not find a specific answer to the problem I am trying to solve.
I am looking for a connector to read from the Azure Storage Queue service through Spark, though there are connectors available for…

jdk2588
- 782
- 1
- 9
- 23
0
votes
1 answer
Spark2 mongodb connector polymorphic schema
I have a collection col that contains
{
  '_id': ObjectId(...),
  'type': "a",
  'f1': data1
}
In the same collection I have
{
  '_id': ObjectId(...),
  'f2': 222.234,
  'type': "b"
}
The Spark MongoDB connector is not working correctly. It reorders the…

Yehuda
- 457
- 2
- 6
- 16
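Schema inference over a polymorphic collection like the one above has to merge the field sets of every document it samples. A rough plain-Python sketch of that merge (dicts stand in for BSON documents; the field values are illustrative):

```python
# Sketch: merge the schemas of heterogeneous documents, the way a
# connector's schema inference conceptually works. Fields appear in
# the merged schema in order of first appearance across documents.

def merge_schema(docs):
    """Map each field name to the set of Python type names it takes."""
    schema = {}
    for doc in docs:
        for field, value in doc.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema

docs = [
    {"_id": 1, "type": "a", "f1": "data1"},   # type "a" document
    {"_id": 2, "f2": 222.234, "type": "b"},   # type "b" document
]
merged = merge_schema(docs)
# A field missing from some documents simply becomes nullable in the
# merged schema; each document's own field order cannot be preserved.
```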
0
votes
1 answer
Ignite: System.out.print commands in code not logging to the log file
I started two Ignite server nodes with the following console command: /root/apache-ignite-fabric-2.3.0-bin/bin/ignite.sh -v
From a remote client, I run the ClusterGroup example program. I see the below type of logs (printed from System.out.print) in both…

Mahesh Renduchintala
- 672
- 7
- 18
0
votes
1 answer
Apache Spark -- Data Grouping and Execution in worker nodes
We are getting live machine data as JSON from RabbitMQ. Below is a sample of the JSON:
{"DeviceId":"MAC-1001","DeviceType":"Sim-1","TimeStamp":"05-12-2017…

Ramesh Kumar R
- 65
- 7
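Per-device grouping of streamed JSON, as described above, can be prototyped without Spark at all. A minimal plain-Python sketch (the first record's fields come from the question's sample; the second record is an invented example):

```python
import json
from collections import defaultdict

# Sketch: bucket incoming JSON messages by DeviceId before any
# per-device processing, mirroring what a keyed groupBy on the
# stream would do across worker nodes.

messages = [
    '{"DeviceId": "MAC-1001", "DeviceType": "Sim-1", "TimeStamp": "05-12-2017"}',
    '{"DeviceId": "MAC-1002", "DeviceType": "Sim-1", "TimeStamp": "05-12-2017"}',
]

by_device = defaultdict(list)
for raw in messages:
    record = json.loads(raw)
    by_device[record["DeviceId"]].append(record)
```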
0
votes
3 answers
Spark version mismatch using maven dependencies
I want to run a simple wordcount example using Apache Spark. Using the local jar files in $SPARK_HOME/jars it runs correctly, but using Maven dependencies it errors:
java.lang.NoSuchMethodError:…

Soheil Pourbafrani
- 3,249
- 3
- 32
- 69
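A NoSuchMethodError at runtime usually means the Maven dependency version differs from the jars Spark actually loads from $SPARK_HOME/jars. A plain-Python sketch of the kind of version check that catches this early (the version strings are illustrative, and restricting compatibility to the same major.minor line is a conservative assumption):

```python
# Sketch: flag a mismatch between the Spark version a job was compiled
# against (the Maven dependency) and the version installed under
# $SPARK_HOME/jars. NoSuchMethodError typically surfaces when the two
# disagree on the major.minor line.

def same_minor_line(compiled_against, installed):
    """Compare only the first two version components (e.g. '2.1')."""
    return compiled_against.split(".")[:2] == installed.split(".")[:2]

same_minor_line("2.1.0", "2.1.1")  # patch difference: usually fine
same_minor_line("2.2.0", "2.1.0")  # different line: likely to break
```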
0
votes
1 answer
Why is Spark2 running on only one node?
I am running Spark2 from Zeppelin (0.7 in HDP 2.6) and I am doing an IDF transformation which crashes after many hours. It runs on a cluster with a master and 3 datanodes: s1, s2 and s3. All nodes have a Spark2 client and each has 8 cores and 16GB…

schoon
- 2,858
- 3
- 46
- 78
0
votes
1 answer
Set current project in sbt - spark build issue
I am getting the error Set current project to spark-parent (in build file:/C:/cygwin64/spark-current/spark-2.1.1/) while trying to build spark. Is there an option "-Dcurrent" or some sbt switch that I can set to facilitate this or do I need to…

uh_big_mike_boi
- 3,350
- 4
- 33
- 64
0
votes
0 answers
Spark Standalone cluster only two workers utilized
In a Spark standalone cluster, only 2 of the 6 worker instances get utilized; the rest are idle. I used two VMs, both having 4 cores. 2 workers were on the local VM (where the master was started) and 4 workers were on the other VM. Only the local two got…

Ashwin Daswani
- 1
- 1
0
votes
1 answer
Spark 2.1 register UDF to functionRegistry
I want to register a UDF object that is already created. I'm using Spark 2.1, and the sparkSession.udf.register() function does not accept a UDF parameter, only a regular Scala function. It's easy to miss something in the large Spark API, so…

uh_big_mike_boi
- 3,350
- 4
- 33
- 64
0
votes
1 answer
Can I set a general-purpose (not spark.*) parameter when submitting a spark application?
A normal way to set a parameter in spark-submit is using --conf:
spark2-shell --conf "spark.nonexisting=true" --conf "failOnDataLoss=false"
Unfortunately, this only works for spark.* parameters, and I need to set other parameters which are simply…

Viacheslav Rodionov
- 2,335
- 21
- 22
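spark-submit forwards only keys with the spark. prefix to SparkConf and warns about the rest ("Ignoring non-spark config property"), which is why failOnDataLoss above is dropped. A plain-Python sketch of that filtering, with a commonly suggested workaround noted in the comments (the spark.myapp.* key is an invented example):

```python
# Sketch: mimic how spark-submit treats --conf key=value pairs. Only
# keys prefixed with "spark." reach SparkConf; anything else is dropped
# with a warning. A common workaround is to namespace custom settings
# under "spark." yourself (e.g. "spark.myapp.failOnDataLoss") and read
# them back from the SparkConf inside the application.

def split_conf(pairs):
    accepted, ignored = {}, []
    for pair in pairs:
        key, _, value = pair.partition("=")
        if key.startswith("spark."):
            accepted[key] = value
        else:
            ignored.append(key)  # spark-submit warns and drops these
    return accepted, ignored

accepted, ignored = split_conf(
    ["spark.nonexisting=true", "failOnDataLoss=false"]
)
```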
0
votes
1 answer
Running Dependent Queries with SparkSQL using Spark Session
We have 3 queries which are currently running on Hive.
Using Spark 2.1.0
We are trying to run them using Spark SQL via the SparkSession (wrapping them in Scala code, building a jar, and then submitting with spark-submit).
Now, for example, let's say…

AJm
- 993
- 2
- 20
- 39
0
votes
2 answers
Spark Streaming design questions
I don't have a specific query, just a design question. I am new to Spark/Streaming, so forgive me if I am asking a dumb question. Please delete it if the question is inappropriate for this forum.
So basically we have a requirement where we have to…

Rishi Saraf
- 1,644
- 2
- 14
- 27
0
votes
1 answer
Spark history server does not start on Ambari cluster
We start the Spark history server as follows:
/usr/hdp/2.6.0.3-8/spark2/sbin/start-history-server.sh
From the log
spark-root-org.apache.spark.deploy.history.HistoryServer-1-master01
we get
WARN AbstractLifeCycle: FAILED…

King David
- 500
- 1
- 7
- 20