1

Is there any difference between count(*) and count(column) in dolphindb?

What's the difference in their performance?

damie
  • 412
  • 2
  • 7

1 Answers1

0

Although this is not a duplicate, the difference is explained in count(*) versus count(column) in dolphindb

count(*) will count ALL rows, but count(column) will only count the records that have a non-null value in the specified column.

In terms of performance, count(*) should perform better, because it does not need to evaluate the values in each row, but ultimately the SQL count() function expression will execute the rowCount() function and there is no mention of special handling for count(*) or that the syntax is even supported.

In this instance if you have a large enough table you should be able to observe a difference and prove this for yourself. If you run two variants (reference a unique column that does not have any null values for the column name version) The count(*) should be faster


In terms of general SQL Count() there is a good explanation here

There is the perception that in some older SQL databases there were performance gains if you specifyied an arbitrary value instead of *, as in Count(1), true or not, modern RDBMS implementations will not try to evaluate that all columns are not null, they will evaluates count(*) as a special case to mean "count all rows."

This behaviour between 1 and * has less relevance in DolphinDb because the SQL command syntax is only a wrapper to the internal rowCount() function that will accept one or multiple vectors, tuples of vectors, matrices or tables as arguments.

Chris Schaller
  • 13,704
  • 3
  • 43
  • 81