Questions tagged [spark-streaming]

Spark Streaming is an extension of the core Apache Spark API that enables high-throughput, fault-tolerant stream processing of live data streams. As of version 1.3.0, it supports exactly-once processing semantics, even in the face of failures.

5565 questions
2
votes
1 answer

MapReduce: How to pass HashMap to mappers

I'm designing the new generation of an analysis system which needs to process many events from many sensors in near-real time. To do that, I want to use one of the Big Data analytics platforms such as Hadoop, Spark Streaming or Flink. In order to…
Gal Dreiman
  • 3,969
  • 2
  • 21
  • 40
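In Spark (as opposed to Hadoop MapReduce's job configuration or distributed cache), the usual way to make a read-only HashMap visible to every task is a broadcast variable. A minimal sketch, with a hypothetical calibration table:

```scala
import org.apache.spark.sql.SparkSession

object BroadcastMapExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("broadcast-map")
      .getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical lookup table built once on the driver.
    val sensorCalibration: Map[String, Double] = Map("s1" -> 0.98, "s2" -> 1.02)

    // broadcast() ships the map once per executor instead of once per task.
    val calibration = sc.broadcast(sensorCalibration)

    val events = sc.parallelize(Seq(("s1", 10.0), ("s2", 20.0)))
    val adjusted = events.map { case (id, value) =>
      (id, value * calibration.value.getOrElse(id, 1.0))
    }
    adjusted.collect().foreach(println)

    spark.stop()
  }
}
```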
2
votes
0 answers

How to close the SQLContext programmatically?

We have a Spark Streaming job; inside the DStream foreachRDD method, I am creating a SQLContext. The reason I am creating the SQLContext inside the foreachRDD method instead of outside is that, when I enable checkpointing, it says SQLContext is not…
Shankar
  • 8,529
  • 26
  • 90
  • 159
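The workaround the Spark Streaming programming guide itself uses for this is a lazily instantiated singleton session, so nothing non-serializable is captured in the checkpoint and the instance is recreated on recovery. A minimal sketch:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Lazily instantiated singleton: not serialized into the checkpoint,
// rebuilt on the first batch after recovery.
object SparkSessionSingleton {
  @transient private var instance: SparkSession = _
  def getInstance(conf: SparkConf): SparkSession = {
    if (instance == null) {
      instance = SparkSession.builder.config(conf).getOrCreate()
    }
    instance
  }
}

// Inside the streaming job (sketch):
// dstream.foreachRDD { rdd =>
//   val spark = SparkSessionSingleton.getInstance(rdd.sparkContext.getConf)
//   import spark.implicits._
//   rdd.toDF("line").createOrReplaceTempView("lines")
//   spark.sql("SELECT count(*) FROM lines").show()
// }
```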
2
votes
1 answer

Gradual increase in old-generation heap memory

I am facing a very strange issue in Spark Streaming. I am using Spark 2.0.2 with 3 nodes and 3 executors (1 receiver and 2 processors), 2 GB of memory per executor, and 1 core per executor. The batch interval is 10 seconds. My batch size is…
deenbandhu
  • 599
  • 5
  • 18
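For a long-running receiver-based job, a common first diagnostic step is to switch the executors to G1 and log GC activity; a sketch with illustrative settings (a way to observe the growth, not a fix for any specific leak):

```scala
import org.apache.spark.SparkConf

// Minimal sketch of GC-related settings often tried when old-gen usage
// creeps up in a long-running streaming job; flag values are illustrative.
val conf = new SparkConf()
  .setAppName("streaming-gc-tuning")
  // Use G1 on the executors and print GC details to diagnose the growth.
  .set("spark.executor.extraJavaOptions",
       "-XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
```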
2
votes
2 answers

Spark Streaming - TIMESTAMP field based processing

I'm pretty new to Spark Streaming and I need some basic clarification that I couldn't fully get from reading the documentation. The use case is that I have a set of files containing dumped EVENTS, and each event already contains a field…
Sokrates
  • 93
  • 1
  • 11
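With DStreams there is no built-in event-time handling, so the timestamp field has to be parsed and grouped manually; Structured Streaming, by contrast, can window on the event's own column. A sketch under an assumed schema and input path:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, window}
import org.apache.spark.sql.types._

val spark = SparkSession.builder.appName("event-time").getOrCreate()

// Assumed layout of the dumped events; only eventTime matters here.
val schema = StructType(Seq(
  StructField("sensorId", StringType),
  StructField("eventTime", TimestampType),
  StructField("value", DoubleType)))

val events = spark.readStream
  .schema(schema)
  .json("/path/to/event/dumps")          // assumed input directory

// Group by the event's own timestamp, tolerating 10 minutes of lateness.
val counts = events
  .withWatermark("eventTime", "10 minutes")
  .groupBy(window(col("eventTime"), "5 minutes"), col("sensorId"))
  .count()
```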
2
votes
1 answer

Spark Standalone: TransportRequestHandler: Error while invoking RpcHandler - when starting workers on different machines/VMs

I am totally new at this, so please pardon any obvious mistakes. Exact errors, at the slave: INFO TransportClientFactory: Successfully created connection to /10.2.10.128:7077 after 69 ms (0 ms spent in bootstraps) WARN Worker: Failed to connect…
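A frequent cause of this symptom is the master binding to an address or hostname the workers cannot reach back on. A sketch of the usual configuration check, with illustrative addresses:

```
# conf/spark-env.sh on the master (illustrative value)
SPARK_MASTER_HOST=10.2.10.128   # bind to an address routable from the workers

# on each worker machine, point the worker at that same address
./sbin/start-slave.sh spark://10.2.10.128:7077
```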
2
votes
3 answers

saveToCassandra: is there any ordering in which the rows are written?

This is the content of my RDD, which I am saving to a Cassandra table. It looks like the 2nd row is written first and then the first row overwrites it, so I end up with bad output. (494bce4f393b474980290b8d1b6ebef9, 2017-02-01, PT0H9M30S,…
shylas
  • 99
  • 4
  • 13
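Within a single saveToCassandra call there is no guaranteed write order; rows sharing a primary key simply overwrite one another (last write wins), so the usual fix is to reduce to one row per key before saving. A sketch, assuming the key columns and a "keep the larger duration" rule:

```scala
import com.datastax.spark.connector._
import org.apache.spark.rdd.RDD

case class Session(id: String, day: String, duration: String)

// Dedupe to exactly one row per Cassandra primary key before writing;
// keyspace, table, and key columns here are assumptions.
def saveDeduped(rdd: RDD[Session]): Unit =
  rdd
    .keyBy(s => (s.id, s.day))                                     // assumed primary key
    .reduceByKey((a, b) => if (a.duration >= b.duration) a else b) // keep one row per key
    .values
    .saveToCassandra("ks", "sessions", SomeColumns("id", "day", "duration"))
```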
2
votes
1 answer

Spark foreachpartition connection improvements

I have written a Spark job which does the following operations: reads data from HDFS text files, does a distinct() call to filter duplicates, does a mapToPair phase to generate a pairRDD, and does a reduceByKey call with the aggregation logic for grouped…
Sam
  • 1,333
  • 5
  • 23
  • 36
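The standard improvement here is the connection-per-partition pattern from the Spark Streaming guide: open one connection per partition instead of one per record. A sketch with a hypothetical client:

```scala
import org.apache.spark.rdd.RDD

// `Connection`, `createConnection`, and `send` are hypothetical stand-ins
// for whatever client the job actually uses.
trait Connection { def send(record: (String, Long)): Unit; def close(): Unit }
def createConnection(): Connection = ???   // hypothetical factory

def writeOut(pairRdd: RDD[(String, Long)]): Unit =
  pairRdd.foreachPartition { records =>
    val conn = createConnection()          // one connection per partition
    try records.foreach(conn.send)
    finally conn.close()                   // always release it
  }
```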
2
votes
2 answers

How to print PythonTransformedDStream

I'm trying to run a word count example integrating an AWS Kinesis stream and Apache Spark. Random lines are put into Kinesis at regular intervals: lines = KinesisUtils.createStream(...) When I submit my application with lines.pprint(), I don't see any values…
ArunDhaJ
  • 621
  • 6
  • 18
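For comparison, a minimal Scala version of the same pipeline; the app, stream, endpoint, and region names are placeholders. The key point is that print()/pprint() produces no output until the context is started:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kinesis.KinesisUtils
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream

val ssc = new StreamingContext(new SparkConf().setAppName("kinesis-wc"), Seconds(10))

val lines = KinesisUtils.createStream(
  ssc, "my-app", "my-stream",
  "https://kinesis.us-east-1.amazonaws.com", "us-east-1",
  InitialPositionInStream.LATEST, Seconds(10), StorageLevel.MEMORY_AND_DISK_2)

lines.map(bytes => new String(bytes)).print() // records arrive as Array[Byte]

ssc.start()            // no output appears before start()
ssc.awaitTermination() // keep the job alive so batches keep processing
```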
2
votes
1 answer

spark-redshift - Error on save using Spark 2.1.0

I'm using spark-redshift to load a Kafka stream that gets data events from a MySQL binlog. When I try to save the RDD into Redshift, an exception is thrown: command> ./bin/spark-submit --packages…
Carleto
  • 951
  • 1
  • 9
  • 17
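For reference, a hedged sketch of a spark-redshift write; the JDBC URL, table name, and S3 tempdir are placeholders, and the credential option shown is only one of the mechanisms the library supports:

```scala
import org.apache.spark.sql.DataFrame

def saveToRedshift(df: DataFrame): Unit =
  df.write
    .format("com.databricks.spark.redshift")
    .option("url", "jdbc:redshift://host:5439/db?user=USER&password=PASS")
    .option("dbtable", "events")
    .option("tempdir", "s3n://bucket/tmp")           // S3 staging area for COPY
    .option("forward_spark_s3_credentials", "true")  // one of the auth options
    .mode("error")
    .save()
```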
2
votes
1 answer

Issue putting Spark Streaming data into HBase

I am a beginner in this field, so I cannot get a sense of it... HBase version: 0.98.24-hadoop2, Spark version: 2.1.0. The following code tries to put data received from a Spark Streaming Kafka producer into HBase. The Kafka input data format is like this:…
Chris Joo
  • 577
  • 10
  • 24
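A common shape for this is one HBase connection per partition inside foreachRDD. The sketch below uses the HBase 1.x client API (0.98 would use HTable and put.add instead); table, column family, and qualifier names are assumptions:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.streaming.dstream.DStream

def writeToHBase(stream: DStream[(String, String)]): Unit =
  stream.foreachRDD { rdd =>
    rdd.foreachPartition { records =>
      // One connection per partition, not per record.
      val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
      val table = conn.getTable(TableName.valueOf("events"))
      try {
        records.foreach { case (key, value) =>
          val put = new Put(Bytes.toBytes(key))
          put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("v"), Bytes.toBytes(value))
          table.put(put)
        }
      } finally {
        table.close()
        conn.close()
      }
    }
  }
```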
2
votes
0 answers

How to unit-test Spark Streaming code in Java

The JavaStreamingContext.queueStream() Javadoc states: "Changes to the queue after the stream is created will not be recognized." Therefore, using a queue for testing window-based scenarios in Java is not an option, as opposed to Scala, because elements…
Daniel Nitzan
  • 1,582
  • 3
  • 19
  • 36
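One workaround is to keep the per-batch logic out of the DStream wiring entirely, as a plain RDD => RDD function, so it can be tested without a StreamingContext or queueStream at all; the streaming job then applies it via dstream.transform. A sketch:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Pure batch logic: testable on its own, reusable in the streaming job
// via dstream.transform(WordCountLogic.countWords).
object WordCountLogic {
  def countWords(lines: RDD[String]): RDD[(String, Int)] =
    lines.flatMap(_.split("\\s+")).map((_, 1)).reduceByKey(_ + _)
}

// In a test: build RDDs directly and assert on the result.
// val sc  = new SparkContext("local[2]", "test")
// val out = WordCountLogic.countWords(sc.parallelize(Seq("a a b"))).collectAsMap()
// assert(out == Map("a" -> 2, "b" -> 1))
```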
2
votes
2 answers

Can Spark streaming and Spark applications be run within the same YARN cluster?

Hello people and happy new year! I am building a lambda architecture with Apache Spark, HDFS and Elasticsearch. The following picture shows what I am trying to do. So far, I have written the source code in Java for my Spark Streaming and…
Yassir S
  • 1,032
  • 3
  • 21
  • 44
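Yes in principle: YARN can host both, typically by submitting each application to its own scheduler queue so the long-running streaming job cannot starve the batch layer. A minimal sketch with assumed queue names (the same value can be passed as --queue to spark-submit):

```scala
import org.apache.spark.SparkConf

// Long-running streaming (speed-layer) job in its own YARN queue.
val streamingConf = new SparkConf()
  .setAppName("speed-layer")
  .set("spark.yarn.queue", "streaming")

// Periodic batch-layer job in a separate queue.
val batchConf = new SparkConf()
  .setAppName("batch-layer")
  .set("spark.yarn.queue", "batch")
```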
2
votes
1 answer

Invoking an external utility inside a Spark Streaming job

I have a streaming job consuming from Kafka (using createDstream); it's a stream of "id"s [id1, id2, id3, ...]. I have a utility or an API which accepts an array of ids, does some external call, and receives back some info, say "t", for each id…
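A common pattern for this is to batch the ids with mapPartitions so the external API is called once per partition rather than once per id; `lookup` below is a hypothetical stand-in for the utility:

```scala
import org.apache.spark.streaming.dstream.DStream

// Hypothetical external API: one "t" back for each id in the batch.
def lookup(ids: Array[String]): Array[String] = ???

def enrich(idStream: DStream[String]): DStream[(String, String)] =
  idStream.mapPartitions { ids =>
    val batch = ids.toArray                      // collect the partition's ids
    batch.iterator.zip(lookup(batch).iterator)   // pair each id with its "t"
  }
```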
2
votes
2 answers

Scala/Spark: Caused by: java.lang.ClassNotFoundException: org.apache.spark.Logging

I am pretty new to Scala and Spark and am trying to fix my Spark/Scala development setup. I am confused by the versions and missing jars. I searched on Stack Overflow, but am still stuck on this issue. Maybe something is missing or misconfigured. Running…
BAE
  • 8,550
  • 22
  • 88
  • 171
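org.apache.spark.Logging was removed in Spark 2.0, so this error usually means a 1.x-era artifact (for example spark-streaming-kafka 1.6.x) is mixed into a 2.x build. A build.sbt sketch that keeps every Spark artifact on one version:

```scala
// build.sbt sketch: align all Spark artifacts on the same 2.x version.
scalaVersion := "2.11.8"

val sparkVersion = "2.1.0"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % sparkVersion % "provided",
  "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
  // 2.x replacement for the old 1.x spark-streaming-kafka artifact:
  "org.apache.spark" %% "spark-streaming-kafka-0-8" % sparkVersion
)
```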
2
votes
1 answer

Why do Spark DataFrames not change their schema and what to do about it?

I'm using Spark 2.1's Structured Streaming to read from a Kafka topic whose contents are binary Avro-encoded. Thus, after setting up the DataFrame: val messages = spark .readStream .format("kafka") .options(kafkaConf) .option("subscribe",…
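The Kafka source's schema is fixed (key and value arrive as binary columns), so it never changes to match the payload; the Avro content has to be decoded out of value explicitly. A sketch with a placeholder decoder:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, udf}

// `decodeAvro` is a placeholder: a real implementation would parse the bytes
// with an Avro DatumReader against the writer schema.
val decodeAvro = udf { (bytes: Array[Byte]) =>
  new String(bytes)   // placeholder decoding only
}

def decoded(messages: DataFrame): DataFrame =
  messages.select(decodeAvro(col("value")).as("event"))
```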