
I have tab-delimited data that looks like this:

Probes  FOO BAR
1452463_x_at    306.564     185.705 
1439374_x_at    393.742     330.495 
1426392_a_at    269.850     209.931 
1433432_x_at    636.145     487.012 
1415687_a_at    231.547     175.008 
1424736_at  248.926     189.500 
1435324_x_at    244.901     225.842 
1438688_at  180.511     187.407 
1426906_at  206.694     218.913 

I want to parse the data above, but first I have to skip the line that starts with Probes. Why does the code below fail on that line? What is the most Pythonic way to handle this kind of problem?

import sys
import csv
import re 

with open('Z100_data.txt','r') as tsvfile:
    tabreader = csv.reader(tsvfile,delimiter='\t')
    for row in tabreader:
        if re.match("^Probes",row):
            # Here we try to skip the first line.
            continue
        else:
            print ', '.join(row)
pdubois
  • `row` is a `list`; `re.match` doesn't know what to do with a list. `re.match('Probes',row[0])` would work, though simply throwing away the first row is more efficient than testing *every* row. – roippi Mar 12 '14 at 01:44
  • possible duplicate of [Using Python to analyze CSV data, how do I ignore the first line of data](http://stackoverflow.com/questions/11349333/using-python-to-analyze-csv-data-how-do-i-ignore-the-first-line-of-data) – Adam Smith Mar 12 '14 at 01:59

2 Answers

2

The Pythonic way to deal with this is to call next() on the reader, which consumes the header row before the loop starts:

with open('Z100_data.txt','r') as tsvfile:
    tabreader = csv.reader(tsvfile,delimiter = '\t')
    next(tabreader) # skips the first line
    for row in tabreader:
        print ', '.join(row)
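
Here is a minimal sketch of the same idea for Python 3 (an assumption; the code above uses the Python 2 print statement). It reads from an in-memory sample instead of Z100_data.txt, and the names SAMPLE and rows_without_header are hypothetical, introduced only for illustration. Passing None as the default to next() avoids a StopIteration if the input is empty.

```python
import csv
import io

# Hypothetical in-memory stand-in for the tab-delimited file.
SAMPLE = (
    "Probes\tFOO\tBAR\n"
    "1452463_x_at\t306.564\t185.705\n"
    "1439374_x_at\t393.742\t330.495\n"
)


def rows_without_header(tsvfile):
    """Return every row of a tab-delimited file object except the first."""
    tabreader = csv.reader(tsvfile, delimiter='\t')
    # Discard the header row; the None default covers an empty input.
    next(tabreader, None)
    return list(tabreader)


for row in rows_without_header(io.StringIO(SAMPLE)):
    print(', '.join(row))
```

With a real file, open it with newline='' as the csv module documentation recommends and pass the file object to the helper.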
Adam Smith
  • Use `csv.Sniffer().has_header` if you don't know in advance whether the header is there or not. – wim Mar 12 '14 at 01:41
  • @wim that sounds like a better answer but I don't know enough about `csv` to write it up -- you can edit my answer or give your own! – Adam Smith Mar 12 '14 at 01:47
  • Couldn't be bothered because it's a dupe and has already been answered [here](http://stackoverflow.com/a/11350095/674039), just wanted future readers to be able to see that comment :) – wim Mar 12 '14 at 01:56
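
A rough sketch of the `csv.Sniffer().has_header` suggestion from the comments, written for Python 3 (an assumption). The sample text and the helper name read_data_rows are hypothetical. Note that has_header is a heuristic that compares the first row against the rows after it, so it can guess wrong on unusual data.

```python
import csv
import io

# Hypothetical samples: the same data with and without a header row.
WITH_HEADER = (
    "Probes\tFOO\tBAR\n"
    "1452463_x_at\t306.564\t185.705\n"
    "1439374_x_at\t393.742\t330.495\n"
    "1424736_at\t248.926\t189.500\n"
)
WITHOUT_HEADER = WITH_HEADER.split("\n", 1)[1]


def read_data_rows(text):
    """Return the data rows, skipping the first row only if it looks like a header."""
    looks_like_header = csv.Sniffer().has_header(text)
    tabreader = csv.reader(io.StringIO(text), delimiter='\t')
    if looks_like_header:
        next(tabreader, None)  # drop the detected header row
    return list(tabreader)


for row in read_data_rows(WITH_HEADER):
    print(', '.join(row))
```

For a large file, passing only the first few kilobytes to has_header and then rewinding with seek(0) keeps the check cheap.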
1

Try:

if re.match("^Probes",' '.join(row)):

or:

if "Probes" in row:

or:

if "Probes" == row[0].strip():

csv.reader yields each row as a list of strings, split on the delimiter. You tried to apply re.match to that list, but it only works on strings.
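
To make that concrete, here is a small Python 3 sketch (an assumption; the question uses Python 2) that parses a single hypothetical header line and shows that re.match rejects the list but accepts its first element. The variable names are made up for illustration.

```python
import csv
import io
import re

# Parse one tab-delimited line held in memory.
reader = csv.reader(io.StringIO("Probes\tFOO\tBAR\n"), delimiter='\t')
row = next(reader)  # a list of strings, one per field

try:
    re.match("^Probes", row)  # passing the whole list raises TypeError
    list_matched = True
except TypeError:
    list_matched = False

# Matching against the first field, which is a string, works.
first_field_matched = re.match("^Probes", row[0]) is not None

print(type(row).__name__, list_matched, first_field_matched)
```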

m.wasowski