Questions tagged [apache-kudu]

For questions related to Apache Kudu

From https://kudu.apache.org/docs/

About Kudu

Kudu is a columnar storage manager developed for the Hadoop platform. Kudu shares the common technical properties of Hadoop ecosystem applications: it runs on commodity hardware, is horizontally scalable, and supports highly available operation.

Kudu's design sets it apart. Some of Kudu's benefits include:

  • Fast processing of OLAP workloads.
  • Integration with MapReduce, Spark and other Hadoop ecosystem components.
  • Tight integration with Impala, making it a good, mutable alternative to using HDFS with Parquet.
  • Strong but flexible consistency model, allowing you to choose consistency requirements on a per-request basis, including the option for strict-serializable consistency.
  • Strong performance for running sequential and random workloads simultaneously.
  • Easy to administer and manage with Cloudera Manager.
  • High availability. Tablet Servers and Master use the Raft consensus algorithm, which ensures availability even if f replicas fail, given 2f+1 available replicas. Reads can be serviced by read-only follower tablets, even in the event of a leader tablet failure.
  • Structured data model.
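
The 2f+1 rule in the high-availability point above can be made concrete with a little arithmetic. The following is a minimal sketch, not Kudu code: the function names `majority` and `tolerated_failures` are hypothetical and only illustrate how many replica failures a majority-based consensus group such as Raft can survive for a given replica count.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Largest f such that replicas >= 2f + 1."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return (replicas - 1) // 2


if __name__ == "__main__":
    # Hypothetical replica counts, chosen only for illustration.
    for n in (1, 3, 5, 7):
        print(f"{n} replicas: majority {majority(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Under this arithmetic, an even replica count tolerates no more failures than the odd count just below it, which is why odd counts are the usual choice.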

By combining all of these properties, Kudu targets support for families of applications that are difficult or impossible to implement on current generation Hadoop storage technologies. A few examples of applications for which Kudu is a great solution are:

  • Reporting applications where newly-arrived data needs to be immediately available for end users
  • Time-series applications that must simultaneously support:
    • queries across large amounts of historic data
    • granular queries about an individual entity that must return very quickly
  • Applications that use predictive models to make real-time decisions with periodic refreshes of the predictive model based on all historic data
134 questions
0
votes
1 answer

How to get current kudu master or tserver flag value?

Master and tserver flags can be accessed from the Kudu web interfaces (by default http://127.0.0.1:8051/varz and http://127.0.0.1:8050/varz). But I couldn't find a way to get them from the command line. For example, how to get tserver_master_addrs from a…
ramazan polat
  • 7,111
  • 1
  • 48
  • 76
0
votes
1 answer

Hive Hbase JOIN performance & KUDU

Reading the Cloudera documentation using Impala to join a Hive table against HBase smaller tables as stated below, then in the absence of a Big Data appliance such as OBDA and a largish HBase dimension table that is mutable: If you have join…
thebluephantom
  • 16,458
  • 8
  • 40
  • 83
0
votes
1 answer

Multi-tenancy implementation with Apache Kudu

I am implementing a big data system using Apache Kudu. The preliminary requirements are as follows: support multi-tenancy; the front end will use Apache Impala JDBC drivers to access data; customers will write Spark jobs on Kudu for analytical use…
Sauchin
  • 333
  • 3
  • 11
0
votes
0 answers

Streamsets throws exception (MANUAL_FLUSH buffer) while using Kudu client

I'm a newbie in Streamsets and Kudu technologies and I'm trying several solutions to reach my goal: I've got a folder containing some Avro files and these files need to be processed and afterward sent to a Kudu…
0
votes
1 answer

Error While inserting rows into Kudu using Spark Shell

I am new to Apache Kudu. I installed it on my Ubuntu system and later created a table in it using the Apache Spark shell. Now I am trying to insert data into that table using insertRows(); for that I am using the command given below,…
Pavan Kumar
  • 205
  • 1
  • 14
0
votes
1 answer

Connecting Apache Drill to Kudu

Is there a way to connect Apache Drill to Kudu? I have seen that Drill 1.5 added experimental support for Kudu, and there is a drill-storage-kudu on GitHub, but I can't figure out how to make it work... Is this now less experimental? Thanks
Jice
  • 191
  • 4
  • 16
0
votes
1 answer

Installing Apache Kudu on my mac (Mac Os Sierra 10.12.1) fails to compile during "thirdparty/build-if-necessary.sh"

When I try to install Apache Kudu I get this error. I couldn't find any information to solve this problem; the only report I could find says that the problem was solved after installing Xcode, but I already have Xcode…
Orbar
  • 13
  • 7
0
votes
1 answer

how can I calculate how much storage existing kudu table actually uses

I would like to calculate how big (in GB) an existing Kudu table actually is. Does anybody know how to do this?
qwertz1123
  • 1,173
  • 10
  • 27
0
votes
1 answer

Cannot connect Impala-Kudu to Apache Kudu (without Cloudera Manager): Get TTransportException Error

I have successfully installed Kudu on Ubuntu (Trusty) as per the official Kudu documentation (see http://kudu.apache.org/docs/installation.html ). The setup has one node running master and tablet server and another node running the tablet server…
user1478046
-1
votes
1 answer

Configuring Nutch to write to Apache Kudu

I am trying to configure Apache Nutch to write to Apache Kudu, but I cannot find any information anywhere about how to do it. I know I can write to Cassandra and HBase, but there is nothing about Kudu. The Hadoop distribution that I am using is CDH…
Vitaly Olegovitch
  • 3,509
  • 6
  • 33
  • 49
-1
votes
1 answer

Read Impala table with SparkSQL

I was trying to execute a query that had functions like lead .. over .. partition and Union. This query works well when I try to run it on impala but fails on Hive. I need to write a Spark job that performs this query. It is failing as well in…
New Coder
  • 499
  • 4
  • 22
-1
votes
1 answer

Is there any good big data store for update and delete queries?

I am using Hive and HBase as back-end stores. Hive is really good for raw data storage. But you can't run update and delete queries if you want good performance. Currently I am using Phoenix on top of HBase. It is giving me good performance and sql…
sdk
  • 178
  • 2
  • 19
-2
votes
1 answer

Scala Play, Apache Spark and KuduContext incompatibilities

I don't know if this is happening because Scala is so version-restrictive or because all the libraries are deprecated and not updated. I have a little project in Scala Play with Apache Spark. I want to use the latest versions of the libraries, so I…
AlleXyS
  • 2,476
  • 2
  • 17
  • 37
-2
votes
2 answers

This row was already applied and cannot be modified

When I run my code in the test environment to test my new Kudu insert code, it reports: This row was already applied and cannot be modified. I have already tried to debug my code to see what the problem is, but it is…
Jelly
  • 1
  • 2