4

I'm trying to convert excel files to dbf(dBASEIII) using python and my current process is:

  1. Use xlrd to convert the excel file to a .csv
  2. I take the headers off of the .csv and I use
  3. Take the newly made .csv and use the dbf module (https://pypi.python.org/pypi/dbf) to convert to dbf

I remove the headers from the csv file and I run the following:

table = dbf.from_csv(origfname, filename='test.dbf', field_names=headers, dbf_type='db3')

As of now, when the process is over all the fields become memo fields, how do I make them into Char, Date, Number, etc... fields?

Jon Ngo
  • 93
  • 4

1 Answers1

3

The from_csv method is only intended to dump into Memo fields.

If you want more control then you should skip the csv step and go from xls to dbf, using xlrd and dbf.

It will take a little more work: You'll have to create the table first with the appropriate fields, but then it's a simple matter of iterating through the xls table and writing to the dbf table.

So something like this:

some_table = Dbf.Table(
        'whatever.dbf',
        'name C(25); age N(3); history M',
        codepage='cp1252',
        )

# get data from xlrd (specifics are not correct, psuedo-code only)
...
data = []
for row in sheet:
    for cell in row:
        data.append(cell.value)
    # back to real code
    some_table.append(tuple(data))
 # all data has been transferred
 some_table.close()

To automatically generate the field names and column types you would need to cycle through the first few rows of the spreadsheet to get the header names and value types. Here is an example data row I iterated through:

<type 'unicode'> u'PROD'
<type 'unicode'> u'cclark'
<type 'float'> 4.0
<type 'float'> 99.0
<type 'unicode'> u'501302'
<type 'unicode'> u'244026'
<type 'int'> 1
<type 'float'> 42444.0
<type 'str'> ''
<type 'unicode'> u'AB'
Ethan Furman
  • 63,992
  • 20
  • 159
  • 237
  • I suppose a follow question is that, when I use Excel 2003 to convert I don't have to define the Width of the column. Do you know the convention Excel uses to determine the appropriate width to set it to? – Jon Ngo Mar 10 '16 at 16:39
  • @JonNgo: The `width` in Excel is visual, not a limitation on the stored data. If your concern is properly storing `float` fields then you may want to use the `vfp` type (version `3` is the default) which will let you specify a `B` (double-precision float) type. Use `dbf_type='vfp'` when you create the table. – Ethan Furman Mar 10 '16 at 16:52
  • I'm a little confused about what you mean, sorry not a FoxPro expert. So I will try to explain what I mean again. When I save an .xls to a .dbf(dBaseIII) in Excel and open the subsequent dbf in FoxPro, when I modi stru, each column will be assigned a type (Num,Char,Date,etc...) and an arbitrary width. I guess what I mean is I would like to have my python program be as near identical to how Excel 2003 would convert an .xls to the .dbf(dBaseIII) – Jon Ngo Mar 10 '16 at 17:16
  • @JonNgo: Ah, I misunderstood your question. I don't know what process Excel uses, but you should be able to do roughly the same by checking the type of the value returned by `xlrd` (which shoud de `date`, `float`, etc.) I'll try to put something more comprehensive in my answer later. – Ethan Furman Mar 10 '16 at 17:24
  • @JonNgo: Updated answer. – Ethan Furman Mar 17 '16 at 20:40