I have a KSQL table with fewer than 1,000 records in it. When I run the query select * from table_name, it takes up to 10 seconds before the query starts to return any data.

The machine running Kafka, ZooKeeper, KSQL, and Schema Registry is not overloaded.

I am using a dev setup with 1 broker and 1 KSQL server.

The table holds simple, un-nested JSON with two fields: an email and a user ID.

This is a problem because I can't use the table for a single lookup, for example with a query like this:

SELECT * FROM TABLE_NAME WHERE col='value';

since it takes too long to return a result. I expected results would be returned instantly.

The time taken is the same if I use streams instead of tables.

Matthias J. Sax
mungujn
  • Is this behavior reproducible only for table lookups? If you try the same thing with a stream, do you see different behavior? How many brokers do you have? How many KSQL servers do you have? What kind of data do you have in the table? – BogdanSucaciu May 24 '19 at 13:23
  • I have updated the question with my setup – mungujn May 24 '19 at 16:19
  • Interesting. I think it has something to do with the data that you put in the topic. Can you share some example keys and values? Also, does the KSQL server have any insightful logs? – BogdanSucaciu May 24 '19 at 19:04
  • `{"email": "john@example.com", "user_id": "hahdjic-andud-hahd"}` Basically randomly generated email addresses and user_ids. The message key is the same as the email. I don't know whether this bit is useful, but the user_ids are the same. It shouldn't be affecting the performance, but maybe it is. – mungujn May 27 '19 at 11:49

1 Answer

When I run this query select * from table_name it takes up to 15 seconds before the query starts to return any data.

It may take a little while until a streaming query in KSQL is fully up and running. Fifteen seconds sounds a bit too long, but depending on your local environment the startup delay may still explain the observed behavior.

I have a ksql table with less than 1000 records in it.

If the SELECT query is up and running, and you then write a few new records into the table's underlying topic (e.g. in another terminal), how quickly do these records show up in the SELECT query? This should happen much faster because there is no startup delay (the query should be fully up and running at that point).
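One way to check this is to record when each row arrives and compare the delay before the first row with the gaps between later rows. The following is only a minimal sketch, assuming the query results can be consumed as a Python iterator; `fake_query` is a hypothetical stand-in that simulates a slow start and is not a KSQL client, and the helper names are made up for illustration.

```python
import time
from typing import Iterable, Iterator, List, Tuple


def arrival_times(rows: Iterable[object]) -> List[float]:
    """Return the seconds elapsed since the start for each row as it arrives."""
    start = time.monotonic()
    return [time.monotonic() - start for _ in rows]


def split_latency(times: List[float]) -> Tuple[float, float]:
    """Split arrival times into (startup delay, largest gap between later rows)."""
    if not times:
        raise ValueError("no rows arrived")
    startup = times[0]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return startup, max(gaps, default=0.0)


def fake_query() -> Iterator[str]:
    """Hypothetical stand-in for a streaming SELECT: slow to start, fast afterwards."""
    time.sleep(0.2)  # simulated startup delay (assumption, not a measured value)
    for i in range(3):
        yield f"row-{i}"
        time.sleep(0.01)  # simulated per-record delay


if __name__ == "__main__":
    startup, worst_gap = split_latency(arrival_times(fake_query()))
    print(f"startup: {startup:.2f}s, worst gap after that: {worst_gap:.2f}s")
```

If the startup figure dominates and the later gaps are small, the delay comes from getting the query running rather than from processing individual records.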

miguno
  • New records appear instantly if the query is up and running. As for the delay, I actually timed it and it's 7 to 10 seconds, not 15 (I think I should edit the question). We have also tested the query on a larger setup (more powerful machines, more Kafka brokers) but still got a 7-second delay before results started coming in. – mungujn May 29 '19 at 07:55