
I have ~300 folders with .dbf files that I would like to convert to .csv files.

I am using os.walk to find all the .dbf files and then a for loop that uses the dbfpy module to convert each .dbf file to a .csv. It seems to be finding and reading the .dbf files correctly, but not converting them to .csv. I believe the csv.writer code is the issue. I am not receiving any errors, but the files stay as .dbf.

My code below is based on code found here.

import csv
from dbfpy import dbf
import os

path = r"\Documents\House\DBF"


for dirpath, dirnames, filenames in os.walk(path):
    for filename in filenames:
        if filename.endswith('.DBF'):
            in_db = dbf.Dbf(os.path.join(dirpath, filename))
            csv_fn = filename[:-4]+ ".csv"
            out_csv = csv.writer(open(csv_fn,'wb'))

        names = []
        for field in in_db.header.fields:
            names.append(field.name)
        out_csv.writerow(names)


        for rec in in_db:
            out_csv.writerow(rec.fieldData)

        in_db.close()
5tanczak

1 Answer


The original file you have will stay as a .dbf. You're not actually replacing it, but instead creating a new .csv file alongside it. The likely problem is that the write to disk never happens: the file handle passed to csv.writer is never closed, so its buffer never gets flushed.

Another problem is that out_csv is created conditionally, but the loops that write rows sit at the outer indentation level and run for every file. The first file that doesn't end in .DBF will raise a NameError (or quietly reuse the previous file's writer).
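
You can see the buffering problem in isolation with a few lines (a minimal sketch; the filename out.csv is just for illustration, and this is Python 2, same as dbfpy):

import csv

f = open('out.csv', 'wb')        # binary mode, the Python 2 csv convention
w = csv.writer(f)
w.writerow(['col_a', 'col_b'])   # the row goes into the file object's buffer
# If the script stops here, out.csv can still be empty on disk:
# nothing is guaranteed to be written until the buffer is flushed.
f.close()                        # close() flushes the buffer to disk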

Try using a context manager:

for dirpath, dirnames, filenames in os.walk(path):
    for filename in filenames:
        if filename.endswith('.DBF'):
            # build the output path next to the source .dbf; a bare
            # filename would land in whatever directory you ran the
            # script from, which is easy to mistake for no output at all
            csv_fn = os.path.join(dirpath, filename[:-4] + ".csv")
            with open(csv_fn, 'wb') as csvfile:
                in_db = dbf.Dbf(os.path.join(dirpath, filename))
                out_csv = csv.writer(csvfile)
                # header row from the dbf field names
                names = []
                for field in in_db.header.fields:
                    names.append(field.name)
                out_csv.writerow(names)
                # one csv row per dbf record
                for rec in in_db:
                    out_csv.writerow(rec.fieldData)
                in_db.close()

The 'with' statement (the context manager) will close the file and flush the buffer at the end, without you needing to do that explicitly.
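
If you want to see what that buys you, the with block above is roughly equivalent to this try/finally (a sketch, not the exact mechanics):

f = open(csv_fn, 'wb')
try:
    out_csv = csv.writer(f)
    # ... write the header and rows here ...
finally:
    f.close()   # always runs, even on an exception, so the buffer is flushed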

munk
  • Thanks for the response... unfortunately I am still not able to get this to work. I am getting an "invalid syntax" error for the second `as` before `in_db`. I tried a bunch of different things but no luck... ugh. Any suggestions would be very welcome! – 5tanczak Aug 29 '13 at 00:40
  • Sorry, that was sloppiness on my part. I hadn't run this and made a couple of typos. The moral - always test your code, especially when you're sure it works :) Try the edited code above and let me know if you still have trouble with it. – munk Aug 29 '13 at 21:20
  • Getting close I think. After I ran the new version I got the following error: "struct.error: unpack requires a string argument of length 8". I added `new=True` to `dbf.Dbf(os.path.join(dirpath, filename), new=True)` to alleviate this. Now I am getting an attribute error: "AttributeError: __exit__". I have done some searching and it appears this is common with the `with` statement, but it's unclear how to fix it. – 5tanczak Aug 30 '13 at 13:44
  • `with` requires the object you're creating to have `__enter__` and `__exit__` methods. Looking at the source for the Dbf class, it is missing these. The solution is to move the Dbf open inside the `with open(csv)` block. See my modifications above. – munk Aug 30 '13 at 14:27
  • Hmmm...that got rid of the error but the conversion is still not happening (files stay as dbf). I think nothing is executing in the for loops. If I put in a dummy print statement in the for loops it doesn't output anything. – 5tanczak Aug 30 '13 at 17:39
  • Does the first for loop execute? – munk Aug 30 '13 at 19:16
  • No, neither one does. And I realized that the dbf files in the path folder are emptied out after I run the script, meaning when I look them up in the folder they are now 0 KB (I have copies so not a data problem). – 5tanczak Aug 30 '13 at 19:52