
The program I am running needs to identify whether an imported file is gzipped. The files come in through argparse:

parser.add_argument('file_sources', nargs='*', type=argparse.FileType('r'), default=sys.stdin, help='Accepts one or more fasta files, no arguments defaults to stdin')
options = parser.parse_args()
open_files = options.file_sources

They are stored in a list; the program loops through it and decides whether each file is gzipped based on its `.gz` extension:

open_files = options.file_sources
#if there is nothing in the list, then read from stdin
if not open_files:
    open_files = [sys.stdin]
#if there are files present in argparse then read their extensions
else:
    opened_files = []
    for _file in open_files:
        if _file.endswith(".gz"):
            _fh = gzip.open(_file,'r')
            opened_files.append(_fh)
        else:
            _fh = open(_file,'r')
            opened_files.append(_fh)

The code breaks at `_file.endswith(".gz")`, giving the error `'file' has no attribute 'endswith'`. If I delete the argparse `type`, `_file` goes from being a file object to a string. That makes `endswith()` work, but now all I have is a string holding the file's name rather than an open file.

How can I keep the functionality of the file while also interpreting its file extension (and not having to use absolute paths as in os.path.splitext since I'm just taking files from the program's current directory)?

Thomas Matthew
  • Why are you reopening files that are already open? The `argparse.FileType` opens the files when it reads the strings. If you want to open them yourself, just accept strings. Then you can apply the `.endswith` test. – hpaulj Nov 07 '14 at 20:33

1 Answer

A file object has no `endswith` attribute, but a string does, and the file's name is available as the string `_file.name`. So:

change:

if _file.endswith(".gz"):

to:

if _file.name.endswith(".gz"):

Since `_file` is a file object rather than a path, pass its name when opening it:

_fh = gzip.open(_file.name,'r')
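Another way to get the same result, along the lines hpaulj suggests in the comment on the question, is to drop `argparse.FileType` and accept plain strings, so that each file is opened only once. The following is a minimal sketch rather than a drop-in replacement. It assumes Python 3 (hence the `'rt'` text mode for gzip, whereas the code in the question appears to be Python 2), and the helper names `open_source` and `open_sources` are made up for illustration.

```python
import argparse
import gzip
import sys


def open_source(path):
    """Open path for text reading, using gzip when the name ends in .gz."""
    if path.endswith(".gz"):
        # 'rt' makes gzip return decoded text, like the built-in open()
        return gzip.open(path, "rt")
    return open(path, "r")


def open_sources(argv=None):
    """Parse the command line and return a list of open file handles."""
    parser = argparse.ArgumentParser()
    # No type=argparse.FileType here: the arguments stay plain strings,
    # so .endswith() works and nothing is opened twice.
    parser.add_argument(
        "file_sources",
        nargs="*",
        help="Accepts one or more fasta files, no arguments defaults to stdin",
    )
    options = parser.parse_args(argv)
    # With nargs='*' and no file arguments the list is empty: use stdin
    if not options.file_sources:
        return [sys.stdin]
    return [open_source(path) for path in options.file_sources]
```

Calling `open_sources()` with no argument parses `sys.argv` as usual. Relative paths work as they are, since both `open` and `gzip.open` resolve them against the current directory.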
Anzel
  • Thanks for the quick reply, @Anzel. Unfortunately, when I try your recommendation I get: `TypeError: coercing to Unicode: need string or buffer, file found` coming from line 94 of the `gzip.py` module. What does it mean to coerce the unicode and how can I get around this to convert filename to a string? – Thomas Matthew Nov 07 '14 at 11:07
  • @ThomasMatthew, please see my updates, it's about your code around `open` – Anzel Nov 07 '14 at 11:14
  • that must have worked, because I am on to new and better errors. Thanks! – Thomas Matthew Nov 07 '14 at 11:20
  • @ThomasMatthew, not a problem, glad it helps :) – Anzel Nov 07 '14 at 11:24