2

This seems a conceptually easy question, but with no straightforward answer.

I have some time-serious data, with a whole bunch (millions) of unique events given by the event_ID variable, describing a set (of several) physical properties (phys_props).

I'm writing SQL and want to return every nth row (where I have freedom to choose n to be e.g. 5, 1000, 10000 etc.) for this time-series dataset.

However, if e.g. n=1000 and there are less than 1000 records for a given physical property, I don't want to return the event_ID records there at all.

Something like:

SELECT * 
FROM myTable
WHERE MOD(myTable, 1000) = 0 

is my starting point.

npross
  • 1,756
  • 6
  • 19
  • 38

3 Answers3

4

you can give an unique no to each of the row, and then do the calculation on the unique no. Something like this

with t as(    
select *,row_number() over (order by event_ID) SN from the_table_name
)

select * from t where SN%100=0
Tano Fotang
  • 449
  • 3
  • 7
buqing
  • 925
  • 8
  • 25
  • Yeah, I'd thought about something like this. But what if the DB is like millions (and often billions!!) of rows long? I don't know if giving new unique_id to every row is the best way forward in that case. – npross Oct 18 '18 at 18:41
  • I don't think that @Robin is suggesting that you create a column for the "unique no". Using a [window function](https://en.wikipedia.org/wiki/Window_function), as he has done, you do not need an extra column: Just alias the row number, and use the row num to limit the rows to return: `with t as(select *,row_number() over (order by event_ID) as SN from your_table_name order by event_iD) select * from t where SN%100=0`. – Tano Fotang Oct 18 '18 at 18:54
  • @TanoFotang okay great, got it. And I *think* this should also take care of the instances where e.g. phys_prop has less than 100 entries -- or would the SN%100 =0 line throw an error?? – npross Oct 18 '18 at 19:00
  • 1
    @npross Yes. Nothing is returned if SN>100 (or whatever count). I think this answer should be edited to show code that works (t%100 does not make sense). I can write out a new answer but that does not sound fair to the person who posted this answer\. – Tano Fotang Oct 18 '18 at 19:01
  • Ooops. I meant to say that "Nothing is returned if SN<100", not SN>100. – Tano Fotang Oct 18 '18 at 19:17
0

You can try something like this:

SELECT * FROM 
(SELECT EVENT_ID,  
       @rownum := @rownum + 1 AS rank
FROM myTable, 
       (SELECT @rownum := 0) r
) ds
WHERE MOD(rank, 5) = 0
;
0

You can try this:

         Select * from myTable
         Having mod(count(*), 1000)=0
Mohan
  • 334
  • 2
  • 12
  • Sure, but what if my 'phys_prop' rows have less than (in this case) 1000 entries, wont this crash? – npross Oct 18 '18 at 18:57
  • This doesn't do what you think it does. count is number of rows per group & there is one (implicit) group. – philipxy Oct 18 '18 at 20:07
  • @npross What do you mean, crash? Learn , aggregate functions & count. (Athough this doesn't do what you want.) – philipxy Oct 18 '18 at 20:09