0

We've seen issues where a kdb database is corrupted and are looking for a way to implement a check that every kdb column is the same length in a particular table. Any recommendation on how to do this?

i.e., would like to get a return value of each column in a table and it's length.

These tables have upwards of 200 columns. Any way to go about this efficiently?

Any help is appreciated. Thank you.

KDBbruh
  • 3
  • 1

1 Answers1

2

Something like this might work for you.

q)tables[]
`positions`quote`trade
q)count each flip trade
time  | 40000
sym   | 40000
src   | 40000
price | 40000
amount| 40000
side  | 40000

You can run the same thing on partitioned data as well.

q)count each flip select from ohlc where date=last date
date     | 1440
sym      | 1440
exchange | 1440
timestamp| 1440
open     | 1440
high     | 1440
low      | 1440
close    | 1440
volume   | 1440

EDIT: The methods used above will only work on tables which are not corrupted, which may not be best suited to your use case.

If the data is corrupted, you can get each column from its location on disk and count it.

q)cols[ohlc]!{count get hsym`$"/path/to/hdb/2020.02.09/ohlc/",string x}each cols ohlc
date     | 1440
sym      | 1440
date     | 1440
exchange | 1440
timestamp| 1439
open     | 1440
high     | 1440
low      | 1440
close    | 1440
volume   | 1440
Cathal O'Neill
  • 2,522
  • 1
  • 6
  • 17
  • Is there an elegant way to iterate through multiple tables? i.e., group by tables? – KDBbruh Mar 29 '21 at 22:08
  • ```tables[]!count''[flip each get each tables[]]``` creates a dict with each table as a key and its value being the dict of counts. – Cathal O'Neill Mar 29 '21 at 22:22
  • The fact that you can successfully execute `flip trade` or `select from ohlc where date=last date` means the tables in question do have columns of the same length. If the tables were corrupted your query would fail. – Igor Korkhov Mar 29 '21 at 23:24
  • Good point @IgorKorkhov. Have made an edit to the original post. – Cathal O'Neill Mar 29 '21 at 23:50