How can I convert the first row of a text file into a list in Python, escaping NaNs?

Question

How can I convert the first line of a text file into a list in Python? I want to escape NaNs while converting it into the list.

import csv
with open ('data.txt', 'r') as f:
    first_row = [column[0] for column in csv.reader(f,delimiter='\t')]
    print (first_row)

Give some more context for this - what kind of input data? what do you mean by "escape NaNs"? — Jeff Tratner, Jun 12 '13 at 02:28
@lisa, you should revert your edit and ask a new question. Now none of the below answers have any context. — dansalmo, Jun 12 '13 at 04:23
seems lisa created a new sock puppet account http://stackoverflow.com/questions/17057641/creating-lists-from-text-file-using-pandas-in-python. — dansalmo, Jun 12 '13 at 04:29

Jeff Tratner · Answer 1 · 2013-06-12T02:49:20.770

4

Make it easier on yourself, use pandas:

import pandas
df = pandas.read_csv("data.txt")

If you need to explicitly tell pandas that a particular value is NaN, just pass it to the reader

df = pandas.read_csv("data.txt", na_values=["NAN"])

or, if you want to skip lines that have issues (older pandas used `error_bad_lines=False`; pandas 2.0 removed it in favor of `on_bad_lines`):

df = pandas.read_csv("data.txt", on_bad_lines="skip")

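To get row 1:

row1 = df.iloc[0]  # positional indexing; irow() no longer exists in current pandas

To get column 1:

col1 = df.iloc[:, 0]  # positional indexing; icol() no longer exists in current pandas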
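As a rough sketch of how the steps above could fit together for the original question, the snippet below reads tab-separated text, treats "NAN" as missing, and returns the first row as a plain list with the missing values dropped. The sample data is invented, and `io.StringIO` stands in for `data.txt`. It assumes the file has no header line (hence `header=None`) and passes `sep="\t"` explicitly because the question uses a tab delimiter.

```python
import io

import pandas as pd

# Invented sample standing in for data.txt (tab-separated, no header line).
sample = io.StringIO("1.0\tNAN\t3.5\n4\t5\t6\n")

# header=None keeps the first line as data instead of using it for column names.
df = pd.read_csv(sample, sep="\t", header=None, na_values=["NAN"])

# First row as a list, with missing values dropped.
first_row = df.iloc[0].dropna().tolist()

# First column as a list.
first_col = df.iloc[:, 0].tolist()

print(first_row)
print(first_col)
```

If the file does have a header line, dropping `header=None` should make `df.iloc[0]` the first data row rather than the header.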

edited Jun 12 '13 at 02:49

answered Jun 12 '13 at 02:34


Pandas is perfect for this, and you beat me to it, +1, but you don't need to specify the delimiter, pandas `read_csv` "sniffs" what the delimiter of the file is! – Ryan Saxe Jun 12 '13 at 02:38
@RyanSaxe okay, updated to reflect that (plus I misspelled "\t" to boot :P) – Jeff Tratner Jun 12 '13 at 02:40
@Jeff Tratner thank you i accepted your answer. then how can i extract only row one or column one? – lisa Jun 12 '13 at 02:46
Isn't usage of pandas heavyweight for this ? If one can do this easily using standard libraries from Python, doesn't usage of Pandas add an additional dependency ? – sateesh Jun 12 '13 at 02:53
I do agree that Pandas is a bit much for just this simple task, but you have to realize that the `error_bad_lines` option is not something easily available. It would be best to just use the `try` and `except` answer already given if the format of the empty rows was given, but if it's not, pandas makes this much easier and has many functions for dealing with NaNs. – Ryan Saxe Jun 12 '13 at 02:56
@Jeff Tratner, I have just tried your suggestion on a text file shown in the question and am not getting the expected results. row1 includes the header info. col1 displays the entire table. – dansalmo Jun 12 '13 at 04:22
@dansalmo use the `values` attribute. And if you really are only looking for the first row or first column, then pandas might be too much. – Jeff Tratner Jun 12 '13 at 09:50
@sateesh maybe, but the upside is that you can handle pretty much anything without worrying about it and be able to slice and dice it all later very easily – Jeff Tratner Jun 12 '13 at 09:51
@Jeff Tratner, the problem was caused by the white space in the text file. Using this `df = pandas.read_csv("test.txt", sep=r"\s+")` fixed it. The original question was changed here and moved to here by OP under a different account for some reason. http://stackoverflow.com/questions/17057641/creating-lists-from-text-file-using-pandas-in-python – dansalmo Jun 12 '13 at 15:28

score 1 · Answer 2 · answered Jun 12 '13 at 02:36

If you have a sure way of determining what constitutes an invalid value for a cell, you can use a string comparison and ignore those values.

If your purpose is to ignore values that Python cannot parse as floats, you can do something like this:

cell = "3.14"  # placeholder; substitute the actual cell value
try:
    f = float(cell)
    # store f somewhere
except ValueError:
    # ignore cell, or may be log this
    pass
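One caveat with the approach above: `float("NaN")` does not raise `ValueError`, so a NaN cell passes through the `try` block unchanged. A minimal sketch that handles both cases follows. The helper name `to_floats` and the sample cells are made up for illustration.

```python
import math


def to_floats(cells):
    """Convert cells to floats, skipping non-numeric text and NaN values."""
    values = []
    for cell in cells:
        try:
            value = float(cell)
        except ValueError:
            # Not parseable as a number (e.g. "" or "abc"): skip it.
            continue
        if math.isnan(value):
            # float("NaN") succeeds, so NaN needs an explicit check.
            continue
        values.append(value)
    return values


cleaned = to_floats(["1.5", "NaN", "", "abc", "2"])
print(cleaned)
```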

score 1 · Answer 3 · edited May 23 '17 at 11:57

csv.reader() returns an iterator that yields one list of column values per iteration (i.e. per line).

Simply put, this is sufficient to get you the first line of data.txt as a list:

import csv
with open('data.txt') as f:
    # next() pulls only the first line from the reader, as a list of strings
    first_row = next(csv.reader(f, delimiter='\t'))

It appears you also want to convert the list elements to floating-point numbers, which can be done using map(...) and float(...).

e.g.:

first_row = list(map(float, first_row))  # list() is needed on Python 3, where map() returns an iterator

If the list contains the text "NaN", float() converts it to the special value nan without raising an error.

e.g.:

>>> float("NaN")
nan
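Because the converted list still contains `nan` entries, dropping them takes one extra step. A minimal end-to-end sketch, with invented sample data and `io.StringIO` standing in for `open('data.txt')`, might look like this. It uses `math.isnan` because `nan` never compares equal to itself, so an `==` test would not find it.

```python
import csv
import io
import math

# Invented sample standing in for open('data.txt').
sample = io.StringIO("1.0\tNaN\t3.5\n4\t5\t6\n")

reader = csv.reader(sample, delimiter="\t")
first_row = next(reader)  # list of strings for the first line only

as_floats = [float(cell) for cell in first_row]

# Filter with math.isnan, since nan != nan.
without_nans = [x for x in as_floats if not math.isnan(x)]

print(first_row)
print(without_nans)
```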

score 0 · Answer 4 · answered Jun 12 '13 at 02:45

This worked for me (puts all cells on a row or line into a list):

import csv
with open ('data.txt', 'r') as f:
    for row in csv.reader(f,delimiter='\t'):
        print(row)  # prints a list of entries for the current row

cforbish

