0

I would like to filter a table by a condition on the frequency of groups in a certain column. Example:

Given table

tmp:([] id:`a`a`b`b`b`c; c2:1 2 3 4 5 6)

first find the frequencies of each group

ce:count each group tmp[`id]

then select the rows in tmp where the id's group count is more than 1

select from tmp where id in where ce > 1

id  c2
a   1
a   2
b   3
b   4
b   5
(row id=`c is gone because it appeared only once)

How can this be done more elegant?

Thanks

Thomas Smyth - Treliant
  • 4,993
  • 6
  • 25
  • 36
tenticon
  • 2,639
  • 4
  • 32
  • 76

1 Answers1

3

You can use fby e.g.

q)select from tmp where 1<(count;i) fby id
id c2
-----
a  1
a  2
b  3
b  4
b  5
Jonathon McMurray
  • 2,881
  • 1
  • 10
  • 22
  • 1
    it's neat to use `i`. it's also possible to use `select from tmp where 1<(count;id) fby id` – tenticon Jan 18 '18 at 09:41
  • Yep, you can use any column as they'll all be the same length, of course. Conventionally most people will use `i` when doing a count, but any will work. – Jonathon McMurray Jan 18 '18 at 09:43