Highest Voted 'spark-avro' Questions

1

vote

1 answer

Creating objects from Primitive avro schema

Suppose I have an Avro schema like this: { "type" : "string" }. How should I create an object from this schema in Java?

asked Aug 22 '18 at 12:47

Satish Gupta

56
6

1

vote

0 answers

JsonDecoder parsing failing in spark streaming

I am trying to decode a message that arrives as part of an Avro message in my Spark 2.2 streaming job. I have a schema defined for this JSON, and whenever a JSON message arrives that does not conform to the schema, my JsonDecoder fails with the error below: Caused by:…

spark-streaming avro spark-avro

asked Aug 20 '18 at 19:13

D P

153
2
12

1

vote

1 answer

Schema Registry Kafka: how could I integrate it into a Java project

After going through several lectures on Schema Registry and looking into how it works, I am more confused than before. I would like to understand how I can include a schema registry in my Kafka project, where locally we have some producers and some…

apache-kafka kafka-consumer-api kafka-producer-api spark-avro

asked Jul 31 '18 at 09:18

SteVizzo

29
5

1

vote

0 answers

Spark 1.6 - Overwrite directory with avro files failing using dataframes

I have a directory in HDFS which contains Avro files. When I try to overwrite the directory with a DataFrame, it fails. Syntax: avroData_df.write.mode(SaveMode.Overwrite).format("com.databricks.spark.avro").save("") The error is: Caused by:…

apache-spark-sql spark-avro apache-spark-1.6

asked Jul 19 '18 at 09:47

Mnav505

13
3

1

vote

0 answers

How to convert spark streaming Dataset[String] to DataFrame[Row]

I have messages in a non-standard Kafka format, so the code looks like the following: val df:Dataset[String] = spark .readStream .format("kafka") .option("subscribe", topic) .options(kafkaParams) .load() .select($"value".as[Array[Byte]]) …

apache-spark spark-streaming spark-csv spark-avro

asked Jun 28 '18 at 19:33

Julias

5,752
17
59
84

1

vote

1 answer

Writing an array of multiple different Records to Avro format, into the same file

We have a legacy file format that I need to migrate to Avro storage. The tricky part is that the records basically have some common fields, a discriminator field and some unique fields, specific to the type selected by the…

avro spark-avro

asked Jun 20 '18 at 11:41

Peter G. Horvath

535
1
3
15

1

vote

1 answer

How to get the avro schema from StructType

I have a DataFrame: Dataset dataset = getSparkInstance().createDataFrame(newRDD, struct); dataset.schema() returns a StructType, but I want the actual schema so I can store it in a sample.avsc file. Basically, I want to convert StructType to Avro…

spark-avro

asked Mar 06 '18 at 11:27

Sumit G

436
8
21

1

vote

2 answers

avro json additional field

I have the following Avro schema: { "type":"record", "name":"test", "namespace":"test.name", "fields":[ {"name":"items","type": {"type":"array", "items": …

avro spark-avro

asked Jan 24 '18 at 22:25

ASe

535
5
15

1

vote

1 answer

How to read Avro Encoded kafka message in scala without knowing avro schema?

I need to write a Scala or Java client to read Kafka messages from a topic whose messages are Avro-encoded and whose schema changes dynamically. Please suggest a way to read these messages without writing them out as an Avro file.

java scala apache-kafka avro spark-avro

asked Dec 27 '17 at 06:47

Nagaraj Vittal

881
13
26

1

vote

1 answer

Hive on spark. Reading parquet file

I'm trying to read a Parquet file into Hive on Spark. I've found out that I should do something like this: CREATE TABLE avro_test ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS AVRO TBLPROPERTIES…

hadoop hive avro parquet spark-avro

asked Jul 21 '17 at 15:44

Marcel Mars

388
5
16

1

vote

3 answers

Convert org.apache.avro.generic.GenericRecord to org.apache.spark.sql.Row

I have a list of org.apache.avro.generic.GenericRecord and an Avro schema. Using these, we need to create a DataFrame with the SQLContext API; creating the DataFrame requires an RDD of org.apache.spark.sql.Row and the Avro schema. The prerequisite for creating the DF is that we…

apache-spark apache-spark-sql avro mapr spark-avro

asked Jun 13 '17 at 10:13

Sagar balai

479
6
13

1

vote

1 answer

Spark CodeGenerator failed to compile, got NPE, infrequently

I'm doing a simple Spark aggregation: reading data from an Avro file as a DataFrame, mapping the rows to case classes using the rdd.map method, and then doing some aggregation operations, like count etc. Most of the time it works just fine. But…

apache-spark-sql spark-avro

asked Mar 30 '17 at 07:12

Zer001

619
2
8
18

1

vote

0 answers

AvroTypeException: When writing in python3

My avsc file is as follows: {"type":"record", "namespace":"testing.avro", "name":"product", "aliases":["items","services","plans","deliverables"], "fields": [ {"name":"id", "type":"string"…

python-3.x avro spark-avro avro-tools

asked Mar 29 '17 at 18:48

Vishnu Prasad

11
2

1

vote

1 answer

IncompatibleSchemaException: Unexpected type VectorUDT when serializing in Avro format

I am using Spark MLlib to generate predictions for my data and then store them to HDFS in Avro format: val dataPredictions = myModel.transform(myData) val output = dataPredictions.select("is", "probability",…

scala apache-spark apache-spark-mllib avro spark-avro

asked Mar 16 '17 at 15:32

Marsellus Wallace

17,991
25
90
154

1

vote

1 answer

Avro tojson date format

I imported a table with selected columns using Sqoop into the Avro file format. When I use avro-tools tojson, the date appears in a strange (negative) format. How can I decode the date? {"first_name":{"string":"Mary"},"last_name": …

mysql sqoop avro spark-avro

asked Mar 09 '17 at 06:38

moron

69
9
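For the "Avro tojson date format" question above, the excerpt is cut off before the date value appears, so what follows is only a sketch under an assumption: that the negative number is a count of milliseconds (or, for Avro's date logical type, days) relative to the Unix epoch, in which case a negative value simply means a date before 1970-01-01. The function names are hypothetical and do not come from Sqoop or avro-tools.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch in UTC; both decoders count relative to this instant.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_epoch_millis(millis):
    """Assumption: the field holds milliseconds since the epoch (may be negative)."""
    return EPOCH + timedelta(milliseconds=millis)


def decode_epoch_days(days):
    """Assumption: the field holds days since the epoch (Avro 'date' logical type)."""
    return (EPOCH + timedelta(days=days)).date()


if __name__ == "__main__":
    # One day before the epoch, expressed both ways.
    print(decode_epoch_millis(-86_400_000).isoformat())
    print(decode_epoch_days(-1).isoformat())
```

Which of the two decoders applies, if either, depends on how the column was mapped during import, so it is worth checking the field's type in the generated Avro schema first.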
