Questions tagged [spark-avro]

A library for reading and writing Avro data from Spark SQL.

The GitHub page is here.

227 questions
0
votes
1 answer

How to write Avro objects to Parquet with partitions in Java? How to append data to the same Parquet file?

I am using Confluent's KafkaAvroDeserializer to deserialize Avro objects sent over Kafka. I want to write the received data to a Parquet file. I want to be able to append data to the same Parquet file and to create a Parquet file with partitions. I managed…
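
A minimal sketch (not the asker's code) of the usual Spark-side approach, assuming the deserialized Kafka records have already been turned into a DataFrame; the column name "event_date" and the output path are hypothetical:

    import org.apache.spark.sql.{DataFrame, SaveMode}

    def writePartitioned(df: DataFrame): Unit = {
      df.write
        .mode(SaveMode.Append)          // append to the existing Parquet dataset instead of overwriting it
        .partitionBy("event_date")      // creates event_date=.../ subdirectories under the target path
        .parquet("hdfs:///data/events") // same target path on every batch
    }
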
0
votes
1 answer

Port data from HDFS/S3 to local FS and load in Java

I have a Spark job running on an EMR cluster that writes out a DataFrame to HDFS (which is then s3-dist-cp-ed to S3). The data size isn't big (2 GB when saved as Parquet). The data in S3 is then copied to a local filesystem (EC2 instance running…
Nik
  • 5,515
  • 14
  • 49
  • 75
0
votes
1 answer

How to read Avro schema-typed events from Kafka and store them in a Hive table

My idea is to use Spark Streaming + Kafka to get the events from the Kafka bus. After retrieving a batch of Avro-encoded events I would like to transform them with Spark Avro into Spark SQL DataFrames and then write the DataFrames to a Hive table. Is…
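
A minimal sketch of the last step (not the asker's code), assuming eventsDf is a DataFrame already decoded from the Avro-encoded events of one batch; the table name "events" is hypothetical:

    import org.apache.spark.sql.{DataFrame, SaveMode}

    def writeToHive(eventsDf: DataFrame): Unit = {
      eventsDf.write
        .mode(SaveMode.Append)
        .saveAsTable("events")   // requires a Hive-enabled SparkSession / HiveContext
    }
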
0
votes
0 answers

Zero-byte Avro file exception

I am currently using Avro 1.8.2 to write log events. In certain very rare cases my DataFileWriter writes out a 0-byte file. As far as I understand, a valid Avro file should always have a header. The code snippet looks like…
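
For context, a minimal sketch of the usual DataFileWriter pattern, assuming schema, records and an output file are provided by the caller; the header and data blocks only reach disk once create()/flush()/close() have run, so a writer that is never closed (for example after a crash) can leave a 0-byte file behind:

    import java.io.File
    import org.apache.avro.Schema
    import org.apache.avro.file.DataFileWriter
    import org.apache.avro.generic.{GenericDatumWriter, GenericRecord}

    def writeAvro(schema: Schema, records: Seq[GenericRecord], out: File): Unit = {
      val writer = new DataFileWriter[GenericRecord](new GenericDatumWriter[GenericRecord](schema))
      try {
        writer.create(schema, out)              // writes the container header
        records.foreach(r => writer.append(r))  // appends data blocks
      } finally {
        writer.close()                          // flushes buffered blocks
      }
    }
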
0
votes
1 answer

Issue while loading an Avro dataset into Teradata with Spark Streaming

I am trying to load a dataset of Avro files into a Teradata table through Spark Streaming (JDBC). The configuration is set properly and the load succeeds to a certain extent (I can verify that rows of data have been inserted into the table), but halfway…
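
A minimal sketch of the shape of a DataFrame-to-JDBC write (not the asker's configuration); the connection details, driver class and table name below are placeholders:

    import java.util.Properties
    import org.apache.spark.sql.{DataFrame, SaveMode}

    def writeToTeradata(df: DataFrame): Unit = {
      val props = new Properties()
      props.setProperty("user", "dbuser")       // placeholder credentials
      props.setProperty("password", "secret")
      props.setProperty("driver", "com.teradata.jdbc.TeraDriver")
      df.write
        .mode(SaveMode.Append)
        .jdbc("jdbc:teradata://dbhost/DATABASE=mydb", "target_table", props)
    }
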
0
votes
0 answers

BigQuery load from Avro gives "cannot convert from long to int"

I am trying to load an Avro file from Google Cloud Storage into BigQuery tables but faced this issue. The steps I have followed are as below: create a DataFrame in Spark; store the data by writing it out as Avro with dataframe.write.avro("path"). Loaded these…
whoisthis
  • 33
  • 8
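
One way this kind of long/int mismatch is often avoided, sketched under the assumption that the offending column is called "id" (hypothetical) and that the databricks spark-avro implicits are on the classpath: cast the Spark Int column to Long before writing the Avro output, so the Avro type matches the 64-bit integer type of the target table.

    import com.databricks.spark.avro._
    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.col

    def writeAsLong(dataframe: DataFrame): Unit = {
      dataframe
        .withColumn("id", col("id").cast("long"))  // widen Int -> Long before serializing
        .write
        .avro("gs://my-bucket/avro-out")           // hypothetical output path
    }
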
0
votes
0 answers

error: not found: value SchemaConverters

I am using Databricks for my use case, where I have to convert an Avro schema to a StructType. When I searched, it said spark-avro has SchemaConverters to do that. However, I am using the spark-avro_2.11-4.0 library, and when I use SchemaConverters, I get…
NNN
  • 11
  • 1
  • 3
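
A hedged sketch of the usual shape of this conversion, assuming SchemaConverters is public in the spark-avro build on the classpath (its visibility has differed between releases), with a hypothetical helper name:

    import com.databricks.spark.avro.SchemaConverters
    import org.apache.avro.Schema
    import org.apache.spark.sql.types.StructType

    def toStructType(avroSchemaJson: String): StructType = {
      val avroSchema = new Schema.Parser().parse(avroSchemaJson)
      // toSqlType wraps the resulting Spark type together with its nullability
      SchemaConverters.toSqlType(avroSchema).dataType.asInstanceOf[StructType]
    }
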
0
votes
1 answer

Spark Avro throws: Caused by: java.lang.IllegalArgumentException: object is not an instance of declaring class

I am trying to create a DataFrame and write the result in Avro format. This gives the IllegalArgumentException mentioned in the subject. It works correctly if I save it as a text file, but fails while writing Avro. Using…
Tirthankar
  • 75
  • 1
  • 9
0
votes
2 answers

Saving data to Elasticsearch in a Spark task

While processing a stream of Avro messages through Kafka and Spark, I am saving the processed data as documents in an Elasticsearch index. Here's the code (simplified): directKafkaStream.foreachRDD(rdd -> { rdd.foreach(avroRecord -> { …
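
A minimal sketch of a commonly used alternative to opening a client per record (not the asker's code): hand each partition to the elasticsearch-hadoop connector and let it bulk-index the documents. The connector, the toFieldMap helper and the "processed-events/doc" target are all assumptions here:

    import org.apache.spark.streaming.dstream.DStream
    import org.elasticsearch.spark._   // elasticsearch-hadoop connector, assumed to be on the classpath

    def indexStream[A](stream: DStream[A], toFieldMap: A => Map[String, Any]): Unit =
      stream.foreachRDD { rdd =>
        // saveToEs serializes each Map as a JSON document and bulk-writes it per partition
        rdd.map(toFieldMap).saveToEs("processed-events/doc")
      }
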
0
votes
1 answer

Avro schema update with two schemas in one Avro file

I have one Avro file written with a first schema; I then updated the schema and appended to the same file, so now I have two schemas for one file. How does Avro handle this scenario? Will the new fields be added in the file, or will I lose any data while…
buckeyeosu
  • 45
  • 8
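
A small sketch to illustrate the underlying point rather than the asker's code: an Avro container file stores exactly one writer schema in its header, and appending with DataFileWriter.appendTo reuses that stored schema, so a single .avro file does not end up holding two schemas; a changed schema normally means a new file (or reader-schema resolution at read time).

    import java.io.File
    import org.apache.avro.file.DataFileWriter
    import org.apache.avro.generic.{GenericDatumWriter, GenericRecord}

    def appendRecords(existing: File, records: Seq[GenericRecord]): Unit = {
      val writer = new DataFileWriter[GenericRecord](new GenericDatumWriter[GenericRecord]())
      try {
        writer.appendTo(existing)               // picks up the schema already stored in the file header
        records.foreach(r => writer.append(r))  // appended records must match that schema
      } finally {
        writer.close()
      }
    }
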
0
votes
1 answer

How to convert a DataFrame to Avro using a schema?

How to convert a DataFrame into Avro format using a user-specified schema?
user3699367
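
A hedged sketch using the Avro data source that ships with Spark 2.4+ (not the external databricks library), where a user-specified writer schema can be passed as JSON through the "avroSchema" option; the output path is hypothetical:

    import org.apache.spark.sql.DataFrame

    def writeWithSchema(df: DataFrame, avroSchemaJson: String): Unit = {
      df.write
        .format("avro")
        .option("avroSchema", avroSchemaJson)  // user-specified Avro schema as a JSON string
        .save("/tmp/out-avro")
    }
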
0
votes
2 answers

Failure reading Avro from S3 using Spark on EMR

When executing my Spark job on AWS EMR, I got this error when trying to read an Avro file from an S3 bucket. It happens with versions emr-5.5.0 and emr-5.9.0. This is the code: val files = 0 until numOfDaysToFetch map { i => …
0
votes
0 answers

Spark read of Avro output from a previous write fails with "Not an avro data file" due to _SUCCESS file

I'm using the great Databricks connector to read/write Avro files. I have the following code: df.write.mode(SaveMode.Overwrite).avro(someDirectory). The problem is that when I try to read this directory using sqlContext.read.avro(someDirectory), it…
Hagai
  • 275
  • 3
  • 13
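
A hedged sketch of one commonly suggested workaround, assuming the spark-avro reader honors the standard Avro/Hadoop flag below: tell it to skip inputs that do not end in .avro, so the empty _SUCCESS marker written next to the data is ignored.

    import com.databricks.spark.avro._
    import org.apache.spark.sql.{DataFrame, SQLContext}

    def readAvroDir(sqlContext: SQLContext, someDirectory: String): DataFrame = {
      sqlContext.sparkContext.hadoopConfiguration
        .set("avro.mapred.ignore.inputs.without.extension", "true")  // skip _SUCCESS and other non-.avro files
      sqlContext.read.avro(someDirectory)
    }
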
0
votes
1 answer

Spark SQL: Handling schema evolution

I want to read two Avro files of the same data set but with schema evolution. The first Avro file's schema: {String, String, Int}. The second Avro file's schema after evolution: {String, String, Long} (the Int field has evolved to Long). I want to read these two…
jshweta14
  • 23
  • 4
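
A minimal sketch of one way to reconcile the two files on the Spark side (not necessarily the intended solution): read each file separately, cast the old Int column up to Long, and union the results. The column name "count" and the paths are hypothetical:

    import com.databricks.spark.avro._
    import org.apache.spark.sql.functions.col
    import org.apache.spark.sql.{DataFrame, SparkSession}

    def readEvolved(spark: SparkSession): DataFrame = {
      val oldDf = spark.read.avro("/data/v1").withColumn("count", col("count").cast("long"))
      val newDf = spark.read.avro("/data/v2")
      oldDf.union(newDf)   // schemas now line up: {String, String, Long}
    }
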
0
votes
1 answer

Databricks Avro schema cannot be converted to a Spark SQL StructType

We have the Kafka HDFS connector writing into HDFS in the default Avro format. A sample output: Obj^A^B^Vavro.schema"["null","string"]^@$ͳø{<9d>¾Ã^X:<8d>uV^K^H5^F°^F^B<8a>^B{"severity":"notice","message":"Test…
user2286963
  • 125
  • 2
  • 11