There are many Stack Overflow questions about this error when reading from a CSV file, but my problem occurs while reading from STDIN.

Most SO solutions involve tweaking the open() call, which works when opening a CSV file directly but not when reading it through STDIN. Since my problem is with STDIN, please don't mark this as a duplicate.

My Python code is:

import sys, csv
def main(argv):
    reader = csv.reader(sys.stdin, delimiter=',')
    for line in reader:
        print line

and the returned error is:

Traceback (most recent call last):
  File "mapper.py", line 19, in <module>
    main(sys.argv)
  File "mapper.py", line 4, in main
    for line in reader:
_csv.Error: line contains NULL byte

It would be enough for me to simply skip the line containing the NULL byte inside the for loop, if that is possible.

Rakib
  • And what did you pipe in then? The Python CSV reader doesn't support UTF-16 or UTF-32 data, for example. – Martijn Pieters Nov 24 '14 at 09:54
  • I piped in a CSV file like `cat log.csv | python mapper.py`. I opened the CSV file in Sublime and looked at the line number where the problem occurred. That line contained the following: `2013-12-18,2013-12-18 08:19:15.0,2778,1003,1328,6112,116.68.205.197,http://jobs.example.com/jobapply_confirm.asp?mclcIbl[f^S_d[=am]np_&%604^35db__j^8NUL]21ad]_bacs_%20%60uaic.^M]e%60a]p%60a^1[5a1=^iapa&a1b3%602a=ci_ua8,;Firefox;25.0;Windows XP;;;,1024x768,view,438,2q8cfvt4,Undefined,BD,2q92rpej,returning,1382929479,1387351637,0,0,0` Notice the substring **`NUL`** in it? – Rakib Nov 24 '14 at 10:07

1 Answer

I solved it by catching csv.Error and skipping the offending row:

import csv
import sys

def main(argv):
    reader = csv.reader(sys.stdin, delimiter=',')
    lineCount = 0
    errorCount = 0
    while True:
        # keep calling next() until the reader (an iterator) is exhausted
        try:
            lineCount += 1
            line = next(reader)
            print "%d - %s" % (lineCount, line)
        except csv.Error:
            # raised when a malformed row is encountered (here, one containing a NULL byte); count it and continue
            errorCount += 1
            continue
        except StopIteration:
            # raised when next() reaches the end of the input; undo the last increment
            lineCount -= 1
            break
    print "total line: %d" % lineCount
    print "total error: %d" % errorCount

if __name__ == '__main__':
    main(sys.argv)
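
An alternative that may be worth considering is to drop the offending lines before the csv reader ever sees them, since `csv.reader` accepts any iterable of lines. The sketch below is written for Python 3 (unlike the Python 2 code above), and the function name `rows_without_nul_lines` is hypothetical. It assumes every CSV record fits on one physical line; a quoted field with an embedded newline would be split by this line-based filter.

```python
import csv


def rows_without_nul_lines(stream, delimiter=","):
    """Return a csv reader over `stream` that never sees lines containing NUL.

    `stream` is any iterable of text lines, for example sys.stdin.
    Assumption: one CSV record per physical line.
    """
    # Lazily filter out raw lines that contain a NUL character.
    clean_lines = (line for line in stream if "\0" not in line)
    return csv.reader(clean_lines, delimiter=delimiter)
```

It could be used as `for row in rows_without_nul_lines(sys.stdin): print(row)`. If keeping the rest of the row matters more than discarding it, replacing the filter with `line.replace("\0", "")` is another option under the same assumption.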
Rakib