-2

I am working in SQLLite Studio and trying to find all rows with a given CUSIP. However, when I add "where cusip='00080010'", my query returns no results, despite that fact that I can see that the first row has that cusip.

I first considered that cusip might actually be an integer, so I checked the table definition and it is indeed listed as TEXT.

I created this test query and found that string equality with my string literal is failing:

select case when cusip='00080010' then 1 else 0 end test, cusip
from msf 
order by date,cusip
limit 1

test    cusip
0       00080010

I tried running the query in python instead in case this was an issue with SQLLite Studio and got the following results:

db.execute("""select case when cusip='00080010' then 1 else 0 end test, cusip
    ...: from msf
    ...: order by date,cusip
    ...: limit 1""").fetchone()

(0, b'00080010')

So, possibly it's being stored as bytes. I then tried modifying my query:

select case when cusip=b'00080010' then 1 else 0 end test, cusip
from msf 
order by date,cusip
limit 1

Error while executing SQL query on database 'a_stock': near "'00080010'": syntax error

Second attempt:

select case when cusip=cast('00080010' as bytes) then 1 else 0 end test, cusip
from msf 
order by date,cusip
limit 1

test    cusip
0       00080010

In python:

db.execute("""select case when cusip=cast('00080010' as bytes) then 1 else 0 end test, cusip
    ...: from msf
    ...: order by date,cusip
    ...: limit 1""").fetchone()

(0, b'00080010')

I'm out of ideas now. Why is the string comparison failing, how do I fix it, and is there a way to know ahead of time if this is going to happen?

Edit: Sorry for the typo, the value is '00080010' and I put it into SO incorrectly. I retested the code to make sure I'm still having the same issue.

Kalev Maricq
  • 617
  • 1
  • 7
  • 24
  • does it matter that in your table cusip has an extra zero in the beginning? – M Z Aug 08 '20 at 20:08
  • 1
    `00080010` vs `0080010`? – M Z Aug 08 '20 at 20:08
  • @MZ Good catch! Sadly, that's not the issue, just a typo when I wrote the question. – Kalev Maricq Aug 09 '20 at 01:16
  • 1
    What does `typeof(cusip)` give you? – Shawn Aug 09 '20 at 05:27
  • @Shawn It gives blob as you predicted, despite the column type being text. How do I a) write a blob literal for comparison or b) change it to text/string? This dataset was created from a sas dataset via pandas read_sas function. I wonder why it chose to store this as a blob. – Kalev Maricq Aug 09 '20 at 22:09

1 Answers1

0

Could there be blank spaces? Have you tried the following?

select case when trim(cusip) = '00080010' then 1 else 0 end test, cusip
from msf 
order by date,cusip
limit 1

Or try this to take care of any leading zeroes:

select case when cast(cusip as integer) = 80010 then 1 else 0 end test, cusip
from msf 
order by date,cusip
limit 1
Kalev Maricq
  • 617
  • 1
  • 7
  • 24
Lars Skaug
  • 1,376
  • 1
  • 7
  • 13
  • These tests both work (return 1), which surprises me since the python output shows no leading or trailing characters. Does trim convert datatypes or could there be nonprintable characters? – Kalev Maricq Aug 09 '20 at 01:32
  • There must be white space in the original string (text). Would you mind accepting the answer of this solved your problem? – Lars Skaug Aug 09 '20 at 02:24
  • 1
    More likely the value is stored as a blob, and `trim()` returns a string so the comparison works. – Shawn Aug 09 '20 at 05:31
  • @LarsSkaug I'm not sure that whitespace is the issue. Length(cusip) returns 8 and cast(cusip as text)='00080010', so it looks like it might be a type issue instead. – Kalev Maricq Aug 09 '20 at 22:24