1

I am connecting to Teradata with the following code which I execute on the python console:

conn = pyodbc.connect('DRIVER={Teradata}; DBCNAME=TdwB; UID=####;PWD=###;')
query = file(full_path).read()
opportunities = pd.read_sql(query, conn)
conn.close()

In query I read a very simple sql query from a file and everything works fine.

Then, I try running a much more complex query, expected to return about 350000 rows (0.2 GB). I am sure the query works because it has been executed perfectly on the SQL Assistant, Teradata query tool.

The script fails with DatabaseError: Execution failed on sql: SELECT after something like 5 minutes (I expect the query to run for about 10-20 minutes).

I am not sure how to tackle this because the error message is rather cryptic.

  • Is it a timeout?
  • Data formatting issue?

Anonymized query

Originally over 300 lines but it's just a select. Here are the main operations on data:

SELECT 

TRUNC (CAST (a.created_at AS DATE ), 'M') AS first_day_month
,d.country_name AS country
,d.contract_id AS contract_id 
,MAX (TRIM(CAST(REGEXP_REPLACE(contracts.BillingStreet, '\||\n|\r|\t', '_',1,0,'m') AS CHARACTER(2048))) || ', ' || TRIM(contracts.BillingPostalCode) || ', ' || TRIM(contracts.BillingCity)) AS FullAdress
,MIN (CAST (bills.created_at AS DATE )) AS first_transaction
,SUM (gross_revenue ) 

FROM db_1.billings AS bills
LEFT JOIN  db_2.contracts AS contracts ON bills.deal_id = contracts.deal_id 

WHERE bills.economic_area = 'NA'
AND CAST (bills.created_at AS DATE ) >= TRUNC (ADD_MONTHS (CURRENT_DATE -1, - 26) , 'MM')
AND bills.country_id = 110

GROUP BY 1,2,3

INTERESTINGLY:

conn = pyodbc.connect('DRIVER={Teradata}; DBCNAME=####; UID=####;PWD=####;', timeout=0)
cursor = conn.cursor()
query = file(full_path).read()
cursor.execute(query)
row = cursor.fetchone()
if row:
    print row
cursor.close()
conn.close()

results in

Traceback (most recent call last):
  File "C:\Users\mferrini\AppData\Local\Continuum\Anaconda\lib\site-packages\IPython\core\interactiveshell.py", line 2883, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-19-8f156c658077>", line 6, in <module>
    cursor.execute(query)
Error: ('HY000', '[HY000] [Teradata][ODBC Teradata Driver][Teradata Database] Wrong Format  (-9134) (SQLExecDirectW)')
NoIdeaHowToFixThis
  • 4,484
  • 2
  • 34
  • 69
  • What's the SQL query? – LCIII Feb 18 '15 at 15:54
  • Well, I am not sure I can share this online due to company restrictions. – NoIdeaHowToFixThis Feb 18 '15 at 15:58
  • 2
    You can anonymize the table/column names for us. We will need to see the query. Is there a timeout set to, say, 500 seconds? – Kris Gruttemeyer Feb 18 '15 at 16:09
  • The `Wrong Format` error - How is the data in `bills.created_at` formatted? Is it consistently formatted throughout the entire set of data in `bills`? – Rob Paller Feb 18 '15 at 17:28
  • @Rob Paller - uhm, it is a massive corporate database. What can I check precisely on my side? – NoIdeaHowToFixThis Feb 18 '15 at 17:39
  • Start off with the data type - is it character or timestamp? – Rob Paller Feb 18 '15 at 17:41
  • Uhm, I don't know as it is a view and views are resolved at runtime. We might be going in the right direction. Is there a way not to convert character to datetime when executing `cursor.execute(query)`? If character, it might prevent the error... – NoIdeaHowToFixThis Feb 18 '15 at 17:52
  • 1
    Actually, `SELECT TYPE (billings.created_at) FROM db_1.billings` yielded `TIMESTAMP(3)` for a 2,000 rows. I'll run a `SELECT DISTINCT` and see if it is consistent, but I would expect so. – NoIdeaHowToFixThis Feb 18 '15 at 18:05
  • I was just going to suggest `TYPE()`. Since it is a `TIMESTAMP` the explicit conversion to `DATE` should not be the cause the Wrong Format error. I'm afraid the error may be elsewhere. – Rob Paller Feb 18 '15 at 18:14
  • 2
    `SELECT TYPE (db_1.billings.created_at);` returns a single row, no need for FROM plus DISTINCT. You might also check the query's Explain for any additional typecast you didn't write in your Select. If `9134` is the actual error code it indicates a problem within a FastPath function, in your code it's either TRUNC or REGEXP_REPLACE, try to remove them one by one and check if it works. – dnoeth Feb 18 '15 at 22:55
  • Interesting. What is a FastPath function in this context? I am not sure google yielded the desired results. – NoIdeaHowToFixThis Feb 19 '15 at 10:00
  • 2
    Most of the new functions added in TD14 are not built-in the parser, they're a special kind of User Defined Function (faster) created by Teradata in a specific database TD_SYSFNLIB. In the "Functions & Operators" they're listed as "Embedded Services System Functions" – dnoeth Feb 19 '15 at 16:06
  • @dnoeth: I have found the error. It was indeed related to a UDF. If you move your comment and explanation on UDFs, I can vote you answer. – NoIdeaHowToFixThis Feb 23 '15 at 10:10

0 Answers0