0

I am looking out for a way to select a given percentage of data from a table using calcite query, for example lets say we have a table name sample which has around 800 records and I want to select only 30% from the total data present in sample i.e i n result we get only 240 records from 800 records.

Adding to it if we can also use any criteria with the above situation will be an add on.

Thanks in advance.

bforblack
  • 65
  • 5

2 Answers2

1

Calcite's SQL parser supports the SQL standard TABLESAMPLE keyword, for example

SELECT *
FROM t TABLESAMPLE BERNOULLI(30) REPEATABLE(42)

But it is not documented in the Calcite SQL reference, and I've not tried it recently. Give it a try, and if it doesn't work, please log a Calcite JIRA case.

Julian Hyde
  • 1,239
  • 7
  • 10
0

You can use window functions:

select t.*
from (select t.*, count(*) over () as cnt,
             row_number() over (order by rand()) as seqnum
      from t
     ) t
where seqnum <= 0.4 * cnt;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786