Questions tagged [apache-kudu]

For questions related to Apache Kudu

About Kudu

Kudu is a columnar storage manager developed for the Hadoop platform. Kudu shares the common technical properties of Hadoop ecosystem applications: it runs on commodity hardware, is horizontally scalable, and supports highly available operation.

Kudu's design sets it apart. Some of Kudu's benefits include:

Fast processing of OLAP workloads.

Integration with MapReduce, Spark and other Hadoop ecosystem components.

Tight integration with Impala, making it a good, mutable alternative to using HDFS with Parquet.

Strong but flexible consistency model, allowing you to choose consistency requirements on a per-request basis, including the option for strict-serializable consistency.

Strong performance for running sequential and random workloads simultaneously.

Easy to administer and manage with Cloudera Manager.

High availability. Tablet Servers and Masters use the Raft consensus algorithm, which keeps a tablet available even if f replicas fail, provided the tablet has 2f+1 replicas in total. Reads can be serviced by read-only follower tablets, even in the event of a leader tablet failure.

Structured data model.
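
The high-availability item above states the Raft bound as f tolerated failures out of 2f+1 replicas. A minimal sketch of that arithmetic follows; the function names are hypothetical illustrations and are not part of any Kudu API.

```python
def tolerated_failures(replicas: int) -> int:
    """Largest f such that replicas >= 2f + 1, so a majority still survives."""
    if replicas < 1:
        raise ValueError("replicas must be positive")
    return (replicas - 1) // 2


def has_majority(replicas: int, failed: int) -> bool:
    """True if the surviving replicas still form a strict majority."""
    return replicas - failed > replicas // 2


# Hypothetical replication factors, chosen only to illustrate the bound.
for n in (1, 3, 5, 7):
    print(f"{n} replicas tolerate {tolerated_failures(n)} failure(s)")
```

With 3 replicas one failure is tolerated, and with 5 replicas two are. An even replica count adds no extra tolerance over the next lower odd count, which is why odd replication factors are the usual choice.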

By combining all of these properties, Kudu targets support for families of applications that are difficult or impossible to implement on current generation Hadoop storage technologies. A few examples of applications for which Kudu is a great solution are:

Reporting applications where newly-arrived data needs to be immediately available for end users

Time-series applications that must simultaneously support queries across large amounts of historic data and granular queries about an individual entity that must return very quickly

Applications that use predictive models to make real-time decisions with periodic refreshes of the predictive model based on all historic data

134 questions

2 answers

Kudu table comments not showing up. What should I do?

This is my create statement for impala-shell: CREATE TABLE IF NOT EXISTS tmp.demo0011( uid Bigint, comment'用户uid' nick String, comment'昵称' primary key(uid) ) partition by hash(uid) partitions 128 stored as kudu tblproperties ( …

impala apache-kudu

asked Oct 24 '18 at 10:32

mrzhang

1 answer

Impala concurrent query delay

My cluster configuration is as follows: 3 Node cluster 128GB RAM per cluster node. Processor: 16 core HyperThreaded per cluster node. All 3 nodes have Kudu master and T-Server and Impala server, one of the node has Impala catalogue and Impala…

impala apache-kudu

asked Sep 21 '18 at 06:23

Prog_G

1 answer

How to test spring batch step which reads from database and writes into a file?

I would like to know what would be the best approach to test the below scenario in a Spring Batch job: A job consisting of two steps: 1) The first step reads from a database using an ItemReader (from apache kudu using impala) and writes into a…

java spring spring-batch impala apache-kudu

asked Aug 23 '18 at 10:14

Mohamed Said Benmousa

2 answers

kerberos authentication in Kudu for spark2 job

I am trying to put some data in kudu, but the worker cannot find the kerberos token, so I am not able to put some data into the kudu database. here you can see my spark2-submit statement spark2-submit --master yarn "spark.yarn.maxAppAttempts=1"…

kerberos cloudera apache-kudu apache-spark-2.2

asked Jun 08 '18 at 10:40

Lukas

1 answer

Kudu table column containing created timestamp

We are trying to create a kudu table that should contain a column holding the timestamp when the records are getting inserted. We tried the below : create table clcs.table_a ( store_nbr string, load_dttm timestamp default now(), …

ddl apache-kudu

asked Oct 31 '17 at 03:34

srikanth ramesh

4 answers

Spark structured stream to kudu context

I want to read kafka topic then write it to kudu table by spark streaming. My first approach // sessions and contexts val conf = new SparkConf().setMaster("local[2]").setAppName("TestMain") val sparkSession =…

apache-spark-sql spark-streaming apache-kudu

asked Oct 26 '17 at 07:28

Jihun No

0 answers

Impala KUDU table - howto bulk update

I need to perform updates on a Kudu table. Is there any option to do the update in bulk? The flow is as follows: 1. Fetch 1000 rows 2. Process rows, calculate new value for each row 3. Update the Kudu table with new values. Updating row by row with one DB…

impala apache-kudu

asked Oct 19 '17 at 14:02

Yuriy Homyakov

1 answer

Load a text file into Apache Kudu table?

How do you load a text file to an Apache Kudu table? Does the source file need to be in HDFS space first? If it doesn't share the same hdfs space as other hadoop ecosystem programs (ie/ hive, impala), is there Apache Kudu equivalent of: hdfs dfs…

cloudera apache-kudu

asked Jul 27 '17 at 21:44

boethius

3 answers

How to access to apache kudu table created from impala using apache spark

I downloaded the quickstart VM of Apache Kudu and followed the examples just as they appear on this page https://kudu.apache.org/docs/quickstart.html. In fact, I created the table named "sfmta", but when I tried to access the kudu…

apache-spark apache-spark-sql impala apache-kudu

asked May 23 '17 at 22:16

Joseratts

0 answers

Type VARCHAR(n) is not supported in Kudu error when creating table in Impala 3.4.0

I'm trying to create a table with a varchar(30) column in Impala 3.4.0/Kudu 1.14.0 According to this Jira ticket it is exactly Impala 3.4.0 where support for varchar columns was included for the first time. Is there any problem with my understanding…

impala apache-kudu

asked Aug 05 '21 at 23:11

Krzysztof J. Obara

0 answers

KuduSink fails to start

I'm trying to write an ETL pipeline from Kafka to HDFS using Flink. I'm using the Bahir KuduSink and a PojoOperationMapper. It throws an exception before starting. I've included my code, pom, and exception stack trace. Is there something obvious I'm…

flink-streaming apache-kudu apache-bahir

asked Jul 12 '21 at 15:43

Christopher Smith

0 answers

Docker (compose) networking

I have a setup which, without any configuration change, sometimes works and sometimes does not, and I would welcome any help to understand why (and have it work 100% of the time). Setup Platform: Windows 10 WSL2, ubuntu 21.04 docker compose 1.29.2 docker engine…

docker docker-compose apache-kudu

asked Jun 21 '21 at 13:09

Guillaume

0 answers

Installing apache kudu in docker in windows machine

When installing apache kudu in docker by executing the below command set: KUDU_QUICKSTART_IP=$(ifconfig | grep "inet " | grep -Fv 127.0.0.1 | awk '{print $2}' | tail -1) I get below error: tail: option used in invalid context -- 1 How to avoid…

apache-kudu

asked May 13 '21 at 04:08

scorpion private

0 answers

Query kerberosed database

I have an Impala Kudu database secured via Kerberos. Even if I specify the database name in connection string, this will be ignored and I need to use it in my query (which is annoying because I have a lot of queries generated dynamically). Due of…

jdbc connection-string kerberos impala apache-kudu

asked May 11 '21 at 09:30

AlleXyS

1 answer

Spark Scala DateType schema execution error

I get an execution error when I try to create a Schema for a dataframe in Spark Scala that says: Exception in thread "main" java.lang.IllegalArgumentException: No support for Spark SQL type DateType at…

scala apache-spark apache-kudu

asked Dec 17 '20 at 09:12

user2728349
