2

I am using pyodbc to extract data from a database table. Now, I would like to limit the query to only contain the records that were added within the last 24 hrs. The information about the time a record was added is contained in the DATE_ADDED column .

This is what I am trying:

pyodbc.pooling = False
conn_str = (
    r'DRIVER={Oracle in OraClient12Home1};'
   .....
   .....
   .....)

conn = pyodbc.connect(conn_str)
curs = conn.cursor()


date_today = datetime.datetime.today()
date_yesterday = date_today - datetime.timedelta(days=1)

sql = f"SELECT PROD_ID, ID, COUNT, DATE_ADDED FROM LZE.APODAT WHERE DATE_ADDED < {date_today} AND DATE_ADDED  > {date_yesterday}"

my_df = pd.DataFrame(pd.read_sql(sql, conn))

The error message I get is:

UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 218-219: illegal encoding

I'm sure it has to do with the different formatting/data types of the DATE_ADDED column but I am unsure how to approach this. In the database it looks as follows: 2019-04-24 10:21:05 AM

Any help would be greatly appreciated.

Parfait
  • 104,375
  • 17
  • 94
  • 125
akleefel
  • 87
  • 5
  • Try to surround your dates with quotes: `'{date_today}'` – olinox14 Jun 05 '19 at 13:17
  • i tried that before. It gives me the same error – akleefel Jun 05 '19 at 13:20
  • Does that help you? https://stackoverflow.com/questions/17478476/oracle-sql-query-for-date-format – olinox14 Jun 05 '19 at 13:26
  • if i try this: ```... WHERE TRUNC(DATE_ADDED) < TO_DATE('{date_today}','YYYY/MM/DD HH:MI:SS') AND TRUNC(DATE_ADDED) > TO_DATE('{date_yesterday}','YYYY/MM/DD HH:MI:SS')"``` i get the same error – akleefel Jun 05 '19 at 13:55
  • Yes, I can extract the full table if I remove the where clause. Where would you suspect the special character? In the Database column? – akleefel Jun 05 '19 at 14:45

2 Answers2

4

Consider parameterization (the industry standard when using SQL in application layer) as Python's datetime object may translate to Oracle's date type.

And yes, you can parameterize in Pandas with read_sql() using the params argument (which does not need a pd.DataFrame() wrapper). Plus, you do not need F-strings, so this version can work in any Python version.

sql = """SELECT PROD_ID, ID, COUNT, DATE_ADDED 
         FROM LZE.APODAT 
         WHERE DATE_ADDED < ? AND DATE_ADDED  > ?
      """

my_df = pd.read_sql(sql, conn, params=[date_today, date_yesterday])

Otherwise convert to string (removing decimal microseconds) and use Oracle's TO_DATE():

sql = """SELECT PROD_ID, ID, COUNT, DATE_ADDED 
         FROM LZE.APODAT 
         WHERE DATE_ADDED < TO_DATE(?, 'YYYY-MM-DD HH:MI:SS')
           AND DATE_ADDED > TO_DATE(?, 'YYYY-MM-DD HH:MI:SS')?
      """

my_df = pd.read_sql(sql, conn, params=[date_today.strftime("%Y-%m-%d %H:%M:%S"), 
                                       date_yesterday.strftime("%Y-%m-%d %H:%M:%S")]
                   )
Parfait
  • 104,375
  • 17
  • 94
  • 125
0

You say you want the last 24 hours...

You can say DATE_ADDED >= SYSDATE() - 1.0 in a where clause to get that. SYSDATE() gets you the present moment. Dates in oracle behave like floating point numbers, where 1.0 is 24 hours. So subtracting 1.0 from the present moment gets you the same time yesterday.

If you want everything from midnight yesterday, you can truncate the date. Use

DATE_ADDED >= TRUNC(SYSDATE() - 1.0)
O. Jones
  • 103,626
  • 17
  • 118
  • 172