2

I have quite a lot of xlsx files which is a pain to convert them one by one to tab delimited files

I would like to know if there is any solution to do this by python. Here what I found and what tried to do with failure This I found and I tried the solution but did not work Mass Convert .xls and .xlsx to .txt (Tab Delimited) on a Mac

I also tried to do it for one file to see how it works but with no success

#!/usr/bin/python
import xlrd
import csv


def main():
    # I open the xlsx file 
    myfile = xlrd.open_workbook('myfile.xlsx')
    # I don't know the name of sheet
    mysheet = myfile.sheet_by_index(0)
    # I open the output csv
    myCsvfile = open('my.csv', 'wb')
    # I write the file into it
    wr = csv.writer(myCsvfile, delimiter="\t")
    for rownum in xrange(mysheet.nrows):
        wr.writerow(mysheet.row_values(rownum))
    myCsvfile.close()
if __name__ == '__main__':
   main()
Community
  • 1
  • 1
nik
  • 2,500
  • 5
  • 21
  • 48
  • 1
    apart from all wrong variable names, your program works fine. what's your problem exactly? what's the error? – Jean-François Fabre Dec 18 '16 at 18:48
  • @Jean-François Fabre can you modify my variable names in my question ? I am using Mac with python2.7 – nik Dec 18 '16 at 18:48
  • @Jean-François Fabre there is no error, it just does not create any file – nik Dec 18 '16 at 18:49
  • 2
    Did you remember to run `main()`? – Alex Hall Dec 18 '16 at 18:50
  • quickfix: `def main(): # I open the xlsx file myfile = xlrd.open_workbook('Classeur1.xlsx') # I don't know the name of sheet mysheet = myfile.sheet_by_index(0) # I open the output csv myCsvfile = open('my.csv', 'wb') # I write the file into it wr = csv.writer(myCsvfile, delimiter="\t") for rownum in xrange(mysheet.nrows): wr.writerow(mysheet.row_values(rownum)) myCsvfile.close()` – Jean-François Fabre Dec 18 '16 at 18:50
  • That wasn't the only change @nik ... `MyCsvfile` needs to be lowercase `myCsvfile`, and `sh` also does not exist at `sh.nrows`. And you never called `main()`, so right, no file gets opened or created. – OneCricketeer Dec 18 '16 at 18:51
  • @cricket_007 I corrected the syntax but still does not work. so I should remove the `def main():` ? can you give me a reason ? when I remove that I get this error myfile = xlrd.open_workbook('myfile.xlsx') ^ IndentationError: unexpected indent – nik Dec 18 '16 at 18:57
  • All you did was define a function, it does not run unless you explicitly call it. Or you can remove it, but whitespace is very important in Python, so you have to fix the indentation. – OneCricketeer Dec 18 '16 at 18:59
  • @cricket_007 I don't know myself :-D it got messy once everybody gave me comment so I was confused. Now I changed it. but now it gives me this error IndentationError: unindent does not match any outer indentation level – nik Dec 18 '16 at 19:03

2 Answers2

2

No real need for the main function.

And not sure about your indentation problems, but this is how I would write what you have. (And should work, according to first comment above)

#!/usr/bin/python
import xlrd
import csv

# open the output csv
with open('my.csv', 'wb') as myCsvfile:
    # define a writer
    wr = csv.writer(myCsvfile, delimiter="\t")

    # open the xlsx file 
    myfile = xlrd.open_workbook('myfile.xlsx')
    # get a sheet
    mysheet = myfile.sheet_by_index(0)

    # write the rows
    for rownum in xrange(mysheet.nrows):
        wr.writerow(mysheet.row_values(rownum))
OneCricketeer
  • 179,855
  • 19
  • 132
  • 245
  • I am getting this error `wr.writerow(mysheet.row_values(rownum)) UnicodeEncodeError: 'ascii' codec can't encode characters in position 131-132: ordinal not in range(128)`do you know why? – nik Dec 18 '16 at 19:08
  • I liked and accepted your answer but I would appreciate if you could modify it to give me .txt instead .csv , if possible – nik Dec 18 '16 at 19:09
  • Umm.. `open('my.txt', 'wb')`? File extensions are highly irrelevant to the problem – OneCricketeer Dec 18 '16 at 19:09
  • it converts to it `.txt`but still my other function http://stackoverflow.com/questions/41207261/how-to-make-a-pair-combination-per-each-line does not work on this txt , I think it is because I am converting it by Mac :-(((( I've been working on this forever and no solution yet :-( – nik Dec 18 '16 at 19:14
  • I don't really know what to tell you. Open up your Python scripts in PyCharm or something, setup some breakpoints, then start debugging. I don't think the Mac is the problem - Python is cross-platform. – OneCricketeer Dec 18 '16 at 19:18
  • I have tried to debug it but no sucess. I am very sure that it is something to do with space in my input. I tried to do different line.split like `elems = line.split('\r')` or `elems = line.split('\n')`or `elems = line.split('\t')` . – nik Dec 18 '16 at 19:22
  • 1
    OH MY GOD, I solved it , look at this , I only add rU to open file `open('data.txt','rU')` it works like a charm :-D :-D :-D here i found info http://stackoverflow.com/questions/1785233/convert-r-text-to-n-so-readlines-works-as-intended – nik Dec 18 '16 at 19:28
  • 1
    That mode is deprecated, so I'm sure it'll work for now. https://docs.python.org/3/library/functions.html#open – OneCricketeer Dec 18 '16 at 23:44
  • you can also look at this question if you want http://stackoverflow.com/questions/41213905/string-manipulation-and-adding-values-based-on-row-they-are – nik Dec 18 '16 at 23:47
  • Regarding your new question, I don't think `mysheet.row_values(rownum))` works the way you think it does. The gist you have contains quotes as if it quoted the entire line of the spreadsheet – OneCricketeer Dec 19 '16 at 00:07
  • what is then the solution ? – nik Dec 19 '16 at 00:10
1

Why go with so much pain when you can do it in 3 lines:

import pandas as pd

file = pd.read_excel('myfile.xlsx')

file.to_csv('myfile.xlsx',
             sep="\t",
             index=False)