1

Hi everybody.

I can't find a Pythonic way to ignore "blank" lines in a CSV. I use quotes because I'm talking about lines that look like '','','','',''. Here is a CSV (the blank lines can appear anywhere):

id,name,age
1,alex,22
3,tiff,42
,,
,,
4,john,24

Here is the code:

import csv

def getDataFromCsv(path):
    dataSet = []
    with open(unicode(path), 'r') as stream:
        reader = csv.reader(stream, delimiter=',')
        next(reader)  # skip the header row
        for rowdata in reader:
            # how to check here?
            dataSet.append(rowdata)
    return dataSet

Here is a similar question that I've been reading, but it differs from this case in particular: python csv reader ignore blank row

alete

5 Answers

13

You can use any to check if any column in the row contains data:

for rowdata in reader:
    # how to check here?
    if any(x.strip() for x in rowdata):
        dataSet.append(rowdata)
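
To try the check end to end, here is a minimal sketch run against the sample data from the question. It assumes Python 3 (hence `next(reader)` rather than `reader.next()`), uses `io.StringIO` in place of a real file, and the function name `get_data_from_csv` is only illustrative.

```python
import csv
import io

# Sample data from the question, including two "blank" rows of empty fields.
SAMPLE = "id,name,age\n1,alex,22\n3,tiff,42\n,,\n,,\n4,john,24\n"

def get_data_from_csv(stream):
    """Return all data rows that have at least one non-blank field."""
    reader = csv.reader(stream, delimiter=',')
    next(reader, None)  # skip the header row (no error on an empty file)
    # x.strip() is '' (falsy) for empty or whitespace-only fields
    return [row for row in reader if any(x.strip() for x in row)]

rows = get_data_from_csv(io.StringIO(SAMPLE))
print(rows)  # [['1', 'alex', '22'], ['3', 'tiff', '42'], ['4', 'john', '24']]
```

With a real file, the same function could be called inside `with open(path, newline='') as stream:`.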
user2390182
  • 1
    Works as expected, it's clear and it taught me something. Thanks! – alete Jan 17 '18 at 01:44
  • Useful! If I understand correctly, this line `if any(x.strip() for x in rowdata):` can be read as: "if there are any values left after stripping all string values in the row, then the row has data and should be added to the dataSet" – grego Feb 24 '20 at 21:12
  • For those trying to scrub out rows of empty strings when using csv.DictReader: remember that x has to come from the list of values in your row dictionary, i.e. use this line: `if any(x.strip() for x in list(row.values())):` – grego Feb 24 '20 at 21:14
  • 1
    @grego *"if there are any values left after stripping all string values in the row"* -- that's almost correct: 1. ... any values left **that aren't the empty string** ... 2. `any` stops at the first truthy element, i.e. not necessarily **all** elements get stripped, but only as many as it takes. – user2390182 Feb 25 '20 at 06:27
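
The two points from the comments above can be sketched as follows. This assumes Python 3 and that every row has as many fields as the header (with `csv.DictReader`, a short row would yield `None` values, which have no `.strip()`); the helper `strip_and_log` is hypothetical and exists only to show where `any` stops.

```python
import csv
import io

SAMPLE = "id,name,age\n1,alex,22\n,,\n4,john,24\n"

# DictReader variant: each row is a dict keyed by the header,
# so the test runs over its values rather than over the row itself.
reader = csv.DictReader(io.StringIO(SAMPLE))
records = [row for row in reader if any(v.strip() for v in row.values())]
names = [r["name"] for r in records]
print(names)  # ['alex', 'john']

# Short-circuit demo: any() stops at the first truthy element,
# so 'b' is never stripped.
seen = []

def strip_and_log(x):
    seen.append(x)
    return x.strip()

found = any(strip_and_log(x) for x in ['', 'a', 'b'])
print(seen)  # ['', 'a']
```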
0

Danger zone.. Maybe reviving an old thread..

Why not use a filter? Then there are no memory issues for large csv files, I think.

Something like:

for data in filter(any, reader):
    print(data)
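
One caveat worth illustrating: `filter(any, reader)` tests the raw strings, so a field containing only spaces still counts as data. The sketch below assumes Python 3, where `filter` returns a lazy iterator (in Python 2 the built-in `filter` builds a list); `has_data` and `data_rows` are hypothetical helper names, and `list()` is used only to make the results easy to compare.

```python
import csv
import io

# The third data line holds whitespace-only fields.
SAMPLE = "id,name,age\n1,alex,22\n,,\n , ,\n4,john,24\n"

def has_data(row):
    # True if any field is non-blank after stripping whitespace
    return any(field.strip() for field in row)

def data_rows(text, keep):
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    # list() materializes the lazy filter for this demo only
    return list(filter(keep, reader))

plain = data_rows(SAMPLE, any)
stripped = data_rows(SAMPLE, has_data)
print(plain)     # [['1', 'alex', '22'], [' ', ' ', ''], ['4', 'john', '24']]
print(stripped)  # [['1', 'alex', '22'], ['4', 'john', '24']]
```

When looping over a large file, dropping the `list()` call and iterating over `filter(has_data, reader)` directly keeps only one row in memory at a time.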
vt220
-1

What about:

if len(rowdata) > 0:
    dataSet.append(rowdata)

Or am I missing a part of your question?

  • 3
    This won't work as these "empty" rows are still 3 long. You must test if all of the 3 strings in the row are empty. – user2390182 Jan 16 '18 at 22:56
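
To make the comment above concrete, here is a small sketch (Python 3 assumed, `io.StringIO` standing in for a file) comparing what `csv.reader` yields for a `,,` line and for a truly empty line, and how the `len` test differs from an `any` test.

```python
import csv
import io

# Second line is ",," (three empty fields); third line is completely empty.
SAMPLE = "1,alex,22\n,,\n\n4,john,24\n"

rows = list(csv.reader(io.StringIO(SAMPLE)))
print(rows)  # [['1', 'alex', '22'], ['', '', ''], [], ['4', 'john', '24']]

# len() only rejects the completely empty line ...
by_len = [r for r in rows if len(r) > 0]
# ... while any() also rejects the row made of empty strings.
by_any = [r for r in rows if any(r)]

print(len(by_len), len(by_any))  # 3 2
```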
-1

You can use the built-in function any:

for rowdata in reader:
    # how to check here?
    if not any(rowdata):
        continue
    dataSet.append(rowdata)
tommy.carstensen
-1
with open(fn, 'r') as csvfile:
    reader = csv.reader(csvfile)
    data = [row for row in reader if any(col for col in row)]
  • open CSV file
  • instantiate csv.reader() object
  • use a list comprehension to:
    • iterate over CSV rows
    • iterate over columns in the row
    • check if any column in the row has a value and if so, add to the list
T3metrics
  • 1
    Hi, welcome to Stack Overflow. When answering a question that already has many answers, please be sure to add some additional insight into why the response you're providing is substantive and not simply echoing what's already been vetted by the original poster. This is especially important in "code-only" answers such as the one you've provided. – chb Mar 27 '19 at 19:35
  • 1
    While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value. – undetected Selenium Mar 27 '19 at 21:22