I'm working on an application to read an .xlsx (Excel) file with headers and insert the rows, as-is, into a table in a Postgres DB. I presume I would be using the xlrd library. Any ideas on how to do this?

I'm using Python 2.6.6.
By far the easiest way will be to transform it to CSV, then use psql's \copy or psycopg2's COPY support. Any command-line .xlsx-to-CSV converter would then work for you when invoked via the built-in subprocess module, or popen/system on older Pythons. You wouldn't need any special Python libs. This would also let you use a converter, like OpenOffice / LibreOffice's command-line mode, that can handle all sorts of other tabular data too.
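As a minimal sketch of that approach, assuming LibreOffice's soffice binary is on your PATH, and with 'input.xlsx', my_table, and the connection string as placeholders for your own names:

    import subprocess
    import psycopg2

    # Convert the workbook to CSV using LibreOffice's headless mode.
    # This writes /tmp/input.csv (same base name as the input file).
    subprocess.check_call(['soffice', '--headless', '--convert-to', 'csv',
                           '--outdir', '/tmp', 'input.xlsx'])

    conn = psycopg2.connect('dbname=mydb')  # placeholder connection string
    cur = conn.cursor()
    with open('/tmp/input.csv') as f:
        # HEADER tells COPY to skip the spreadsheet's header row.
        cur.copy_expert("COPY my_table FROM STDIN WITH CSV HEADER", f)
    conn.commit()
    conn.close()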
If you want to do it all in Python you could use the xlrd lib you mentioned and feed the tuples one by one into psycopg2's copy_from, so you stream the rows to PostgreSQL using only Python code. You should find existing examples of psycopg2-based programs that use copy_from easy to adapt to using xlrd as a data source instead of Python lists / CSV input / whatever.
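A rough sketch of that route, again with 'input.xlsx', my_table, and the connection string as placeholders, and assuming an xlrd version that can open .xlsx files (0.8+). Note that copy_from wants a file-like object, and real data would need tabs/newlines inside cell values escaped:

    import xlrd
    import psycopg2
    from cStringIO import StringIO

    book = xlrd.open_workbook('input.xlsx')
    sheet = book.sheet_by_index(0)

    # Build a tab-separated buffer, skipping the header row (row 0).
    buf = StringIO()
    for rownum in range(1, sheet.nrows):
        cells = [unicode(c.value) for c in sheet.row(rownum)]
        buf.write(u'\t'.join(cells).encode('utf-8') + '\n')
    buf.seek(0)

    conn = psycopg2.connect('dbname=mydb')  # placeholder connection string
    cur = conn.cursor()
    cur.copy_from(buf, 'my_table')  # copy_from defaults to tab-separated input
    conn.commit()
    conn.close()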
If you have Excel, format the XLS file so that there are no headers and no blank columns (basically, just make a table). This also works with LibreOffice. Then save the file as a CSV. Then view the CSV file in a text editor and verify that things are in order.

Once you are happy with the content of the CSV, it's quite simple to write Python code that reads each row and builds a list of its fields. Here is a 30-second version:
    with open('example.csv', 'r') as csv_file:
        # Read the whole file and split it into lines.
        full = csv_file.read().split('\n')

    for line in full:
        values = line.split(',')  # naive split; the csv module handles quoted fields
        print values
Inside the for loop, you can access each element of the list individually, e.g. values[0].
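From there, a sketch of pushing each row into Postgres, using the csv module (which, unlike a bare split(','), copes with quoted commas). The table name my_table, its three columns, and the connection string are placeholders for your own schema:

    import csv
    import psycopg2

    conn = psycopg2.connect('dbname=mydb')  # placeholder connection string
    cur = conn.cursor()

    with open('example.csv', 'rb') as csv_file:  # binary mode for csv on Python 2
        for values in csv.reader(csv_file):
            # One %s placeholder per column; adjust to match your table.
            cur.execute("INSERT INTO my_table VALUES (%s, %s, %s)", values)

    conn.commit()
    conn.close()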