I need to write a query against an SQLite database using DB Browser. The columns are: state, measure_id, measure_name, score. All of the data was entered as strings. I can cast the score strings to decimal, but some of the values in the score column are numeric and some are genuine text values (such as "High"). I need to ignore the genuine text values in my output. I also need to calculate the standard deviation (as well as min, max, and avg) for each measure_id.

How can I ignore the real string values and calculate the standard deviation?

Here is some sample data:

sample 1: AL, ID1, Ident, 52
sample 2: TX, ID2, Foo, High
sample 3: MI, ID3, Bar, 21

(I want to select only samples 1 and 3, then cast the strings to int and calculate the standard deviation.)

Vamsi Prabhala
zsad512

1 Answer

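If the numeric scores are always greater than 0, you can filter on the cast itself, because SQLite casts non-numeric text such as "High" to 0:

select avg(cast(score as decimal))
from t
where cast(score as decimal) > 0;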
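As a minimal sketch of how that filter extends to per-measure min, max, and avg, the following runs the query from Python's sqlite3 module. The table name t comes from the query above; the column names come from the question, and the inserted rows are invented for illustration only. It relies on the same assumption as the query: every real score is greater than 0.

```python
import sqlite3

# In-memory database with made-up rows; every column is stored as text,
# as described in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (state TEXT, measure_id TEXT, measure_name TEXT, score TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("AL", "ID1", "Ident", "52"),
        ("TX", "ID1", "Ident", "High"),  # non-numeric text, casts to 0
        ("MI", "ID1", "Ident", "21"),
        ("OH", "ID2", "Foo", "10"),
        ("NY", "ID2", "Foo", "Low"),     # non-numeric text, casts to 0
    ],
)

# Rows whose score casts to 0 are dropped, so this only works when
# genuine scores are always greater than 0.
query = """
    SELECT measure_id,
           MIN(CAST(score AS DECIMAL)) AS min_score,
           MAX(CAST(score AS DECIMAL)) AS max_score,
           AVG(CAST(score AS DECIMAL)) AS avg_score
    FROM t
    WHERE CAST(score AS DECIMAL) > 0
    GROUP BY measure_id
    ORDER BY measure_id
"""
summary = conn.execute(query).fetchall()

for measure_id, min_score, max_score, avg_score in summary:
    print(measure_id, min_score, max_score, avg_score)
```

If a measure can legitimately score 0 or below, this filter would silently drop those rows, so a different test for "is numeric" would be needed.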

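The standard deviation is a bit trickier to calculate. You can use the defining formula, but SQLite has no built-in standard deviation aggregate, and sqrt() is available only in builds compiled with the optional math functions (SQLite 3.35 and later).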
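One way to apply the defining formula is to let SQLite compute the population variance as AVG(x*x) - AVG(x)*AVG(x) and take the square root in Python. The sketch below assumes a table t with measure_id and score columns, uses invented sample rows, and again treats any score that casts to 0 as non-numeric. It computes the population (not sample) standard deviation.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (measure_id TEXT, score TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("ID1", s) for s in ["2", "4", "4", "4", "5", "5", "7", "9", "High"]]
    + [("ID2", s) for s in ["10", "20", "Low"]],
)

# Population variance per measure: mean of squares minus square of the mean.
# Casting to REAL keeps the arithmetic in floating point.
query = """
    SELECT measure_id,
           AVG(x * x) - AVG(x) * AVG(x) AS variance
    FROM (
        SELECT measure_id, CAST(score AS REAL) AS x
        FROM t
        WHERE CAST(score AS REAL) > 0
    )
    GROUP BY measure_id
    ORDER BY measure_id
"""

# Rounding can push a tiny variance slightly below zero, so clamp before
# taking the square root.
stdev_by_measure = {
    measure_id: math.sqrt(max(variance, 0.0))
    for measure_id, variance in conn.execute(query)
}
print(stdev_by_measure)
```

This one-pass form loses precision when the values are large relative to their spread, which is one reason to be careful with it on real data.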

You might want to move to another database, such as Postgres or MySQL, if you want to support these types of operations.

Gordon Linoff
  • I am forced to use SQLite in Python, so are you saying it's not possible? – zsad512 Jul 16 '17 at 23:31
  • @zsad512 . . . I would be very careful. You can calculate the variance (do the square root in Python). However, it is susceptible to numeric overflow. If you have smallish data with no outlandish values, you can do it in SQLite. However, it might be safer to bring the data into Python and do the calculation there. Note: That is not my normal recommendation. – Gordon Linoff Jul 16 '17 at 23:47
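The "bring the data into Python" route from the last comment might look like the sketch below. The table name t, the sample rows, and the helper to_number are hypothetical. Non-numeric scores are detected by attempting a float conversion, so genuine zeros are kept, and the standard library's statistics module does the arithmetic.

```python
import math
import sqlite3
import statistics
from collections import defaultdict


def to_number(text):
    """Return text as a finite float, or None if it is not numeric (e.g. "High")."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None
    return value if math.isfinite(value) else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (measure_id TEXT, score TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("ID1", s) for s in ["2", "4", "4", "4", "5", "5", "7", "9", "High"]]
    + [("ID2", s) for s in ["0", "10", "Low"]],
)

# Group the numeric scores by measure_id, skipping text values.
scores = defaultdict(list)
for measure_id, raw in conn.execute("SELECT measure_id, score FROM t"):
    value = to_number(raw)
    if value is not None:
        scores[measure_id].append(value)

# pstdev is the population standard deviation; use statistics.stdev
# instead if the sample standard deviation is wanted.
stats = {
    measure_id: {
        "min": min(values),
        "max": max(values),
        "avg": statistics.mean(values),
        "stdev": statistics.pstdev(values),
    }
    for measure_id, values in scores.items()
}
print(stats)
```

Compared with filtering on the cast in SQL, this keeps scores of 0 and avoids the overflow concern, at the cost of pulling every row into Python.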