2

I tried to insert row values for code column in statements table as a foreign key from companies Table. i took the following steps:

Creating Tables

cur.execute("CREATE TABLE IF NOT EXISTS companies (code INT NOT NULL PRIMARY KEY, short_name VARCHAR(255) NOT NULL, long_name VARCHAR(255) NOT NULL)")

cur.execute("CREATE TABLE IF NOT EXISTS statements (statement_id SERIAL NOT NULL PRIMARY KEY, statement_name VARCHAR(255) NOT NULL, code INT NOT NULL, FOREIGN KEY (code) REFERENCES companies_list (code))")

What code column contains in companies table ( i.e. )

 code |
-----------
  113
  221
  344

The next step is inserting wanted data to statements table as below :

statement_name = ["balance_sheet", "income_statement", "cash_flow"]

code = "SELECT code FROM companies_list WHERE code IS NOT NULL"

statements = [tuple((t,)) for t in zip(statement_name, code)]

query = "INSERT INTO statements (statement_name, code) VALUES %s"
cur.executemany(query, statements)

i got the following error :

psycopg2.DataError: invalid input syntax for integer: "S"
LINE 1: ...ents (statement_name, code) VALUES ('balance_sheet', 'S')

The Final result i want to get is like below :

statement_id |   statement_name    |   code
---------------------------------------------
     1           balance_sheet         113
     2           income_statement      113
     3           cash_flow             113
     4           balance_sheet         221
     5          income_statement       221
     6           cash_flow             221
T.M
  • 93
  • 9

1 Answers1

0

The error arises from this line:

code = "SELECT code FROM companies_list WHERE code IS NOT NULL"

This does not perform an actual query, it assigns the SQL select statement string to the code variable. The next line then zips the statement names with code which, because code is a string (an iterable), results in the first 3 characters of code being zipped with the items from statement_name, the result being:

[(('balance_sheet', 'S'),), (('income_statement', 'E'),), (('cash_flow', 'L'),)]

So that's where the 'S' is coming from - it's the first character of "SELECT" in the code string. 'S' is a string, not an integer as defined in the schema for the statements table, hence the error.

You can see the queries generated with cursor.mogrify():

>>> statement_name = ["balance_sheet", "income_statement", "cash_flow"]
>>> code = "SELECT code FROM companies_list WHERE code IS NOT NULL"
>>> statements = [tuple((t,)) for t in zip(statement_name, code)]
>>> query = "INSERT INTO statements (statement_name, code) VALUES %s"
>>> for args in statements:
...     print(cur.mogrify(query, args))
... 
INSERT INTO statements (statement_name, code) VALUES ('balance_sheet', 'S')
INSERT INTO statements (statement_name, code) VALUES ('income_statement', 'E')
INSERT INTO statements (statement_name, code) VALUES ('cash_flow', 'L')

One way of fixing this is to execute the query contained in code to get a list of company codes, then use that to construct the INSERT query:

import itertools

cur.execute("SELECT code FROM companies_list WHERE code IS NOT NULL")
codes = [row[0] for row in cur.fetchall()]
query = 'INSERT INTO statements (statement_name, code) VALUES (%s, %s)'
args = itertools.product(statement_name, codes)
cur.executemany(query, args)

Here itertools.product() is used to form the Cartesian product of the statement names and the company codes. This is mimicking database join functionality, so if the statement types are available in your database, it might be better to do it in SQL rather than Python.

mhawke
  • 84,695
  • 9
  • 117
  • 138
  • my bad i'm new at databse in general, i thought as i used query in executemany successfully before i can use code inside query.. so now if that doesn't work how can i solve this issue.. the result needed as in the last table (statements) .. – T.M Dec 21 '15 at 12:27
  • @T.M: you need to execute the query, or otherwise obtain the list of company codes. See updated answer. – mhawke Dec 21 '15 at 13:53
  • i tried your code but faced an error ( 'NoneType' object has no attribute 'fetchall') .. then changed it to 'symbols = cur.execute("SELECT symbols FROM companies_list WHERE symbols IS NOT NULL") symbols = cur.fetchall()' and faced another erorr (cur.executemany(query, args) TypeError: not all arguments converted during string formatting ) – T.M Dec 21 '15 at 19:43
  • For `psycopg2`, `cursor.execute()` returns `None`, hence the error. I have updated the code to handle that. – mhawke Dec 21 '15 at 23:11