Questions tagged [kite-sdk]

Kite is a high-level data layer for Hadoop. It is an API and a set of tools that speed up development. You configure how Kite stores your data in Hadoop, instead of building and maintaining that infrastructure yourself.

9 questions
2
votes
2 answers

Dependency Resolution error

I am trying to convert a JSON file to Parquet format using the Kite SDK. I have the following pom.xml:
Shivkumar Mallesappa
  • 2,875
  • 7
  • 41
  • 68
2
votes
0 answers

Hadoop Parquet Datastorewriter bad writing performance

I'm writing Parquet files using the ParquetDatasetStoreWriter class, and the performance I get is really bad. Normally the flow is this: // First write dataStoreWriter.write(entity #1); dataStoreWriter.write(entity…
Victor
  • 2,450
  • 2
  • 23
  • 54
1
vote
1 answer

flume-kite-morphline: com.fasterxml.jackson.core.JsonParseException: Unexpected end-of-input: expected close marker for OBJECT

While working with Flume (1.6 & 1.7), I am experiencing the error below: 2016-12-02 00:57:11,634 (pool-3-thread-1) [WARN - org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:143)] Line length exceeds max (2048), truncating…
1
vote
2 answers

Apache NiFi, HDFS Parquet format

I am a newbie to NiFi. My use case is to read from a port and write to HDFS in Parquet format. My research says there is something called Kite SDK, with which I can save data in Parquet format. Am I right? Please advise. Any examples would help.
Bill
  • 363
  • 3
  • 14
1
vote
1 answer

Files remain in .avro.tmp state in a Spark job?

I have a Spark job that reads millions of records from HDFS, processes them, and writes them back to HDFS in Avro format. I observed that many of the written files remain in the .avro.tmp state. I am using Kite SDK to write the data in Avro format. The environment is…
Sudhanshu Umalkar
  • 4,174
  • 1
  • 23
  • 33
0
votes
1 answer

Unable to convert JSON String to Avro Schema using Kite-data-core

I am trying to convert a JSON String to Avro Schema using https://github.com/kite-sdk/kite/blob/master/kite-data/kite-data-core/src/main/java/org/kitesdk/data/spi/JsonUtil.java#L539 But for the below code - String json = "{\n" + " …
0
votes
2 answers

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/RecordReader

I am trying to convert my JSON file to Parquet format. The following is my pom file.
Shivkumar Mallesappa
  • 2,875
  • 7
  • 41
  • 68
0
votes
1 answer

Apache NiFi: InferAvroSchema infers signed values as string

I'm setting up a pipeline in NiFi where I get JSON records which I then use to make a request to an API. The response I get would have both numeric and textual data. I then have to write this data to Hive. I use InferAvroSchema to infer the schema.…
Sivaprasanna Sethuraman
  • 4,014
  • 5
  • 31
  • 60
0
votes
2 answers

KiteSdk 1.1.0 csv-import IOError

With HDP-2.5 on Ubuntu-14.04, I am running the command $ ./kite-dataset csv-import ./test.csv test_schema to import raw CSV data into Hive using KiteSdk ver. 1.1.0, and I get the following IOError: 1 job failure(s) occurred:…