Set a header with pandas

Question

I have some txt files that start with a lot of junk, and the useful part only begins after 20 to 30 lines. I want to use the last line before the numbers as my header. I know that if I knew the exact line number, I could set that line as the header (using pd.read_csv), but the number is different for each file (as I said, it is between 20 and 30). I do know that the line I am looking for starts with "Potential". Is there an easy way to use pd.read_csv and set the header correctly from the start?

Possible duplicate of [Convert row to column header for Pandas DataFrame](https://stackoverflow.com/questions/26147180/convert-row-to-column-header-for-pandas-dataframe) — Edeki Okoh, May 06 '19 at 19:14
@EdekiOkoh Does not look like a dupe of that (but still can be a dupe of something else). — DYZ, May 06 '19 at 19:16
Any chance that all of the header lines start with some specific single character? — ALollz, May 06 '19 at 19:39

score 5 · Answer 1 · answered May 06 '19 at 19:20

You can read the top of the file using "traditional" file I/O methods and count the rows until you find the header row. Once you know its number, reread the file with pandas.read_csv().

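import pandas as pd

# Scan the file to find the 0-based index of the header line.
with open(yourfile) as infile:
    for n, row in enumerate(infile):
        if row.startswith("Potential"):
            break
    else:
        raise ValueError("no line starts with 'Potential'")

# Skip everything before the header line; pandas then uses it as the header.
df = pd.read_csv(yourfile, skiprows=n)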
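
Since every file has its header at a different line, the same two steps can be wrapped in a small helper and applied file by file. This is only a sketch of that idea: the names `find_header_row` and `read_with_header` and the `marker` parameter are hypothetical, and it assumes the header line begins exactly with the marker text. Extra keyword arguments are passed straight to `pd.read_csv`, so a non-comma delimiter can be set with `sep`.

```python
import pandas as pd


def find_header_row(path, marker="Potential"):
    """Return the 0-based index of the first line starting with `marker`."""
    with open(path) as infile:
        for n, row in enumerate(infile):
            if row.startswith(marker):
                return n
    # Fail loudly instead of silently reading the wrong line as a header.
    raise ValueError(f"no line starting with {marker!r} in {path}")


def read_with_header(path, marker="Potential", **read_csv_kwargs):
    """Skip the preamble and let pandas use the marker line as the header."""
    n = find_header_row(path, marker)
    # skiprows=n drops the first n lines, so the marker line becomes row 0,
    # which read_csv treats as the header by default.
    return pd.read_csv(path, skiprows=n, **read_csv_kwargs)
```

For a tab-separated file, a call might look like `read_with_header("measurement.txt", sep="\t")`, where the file name is a placeholder.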

DYZ

That's a very cool method! Will defo be using it, my thanks to you sir. – Umar.H May 06 '19 at 19:30
@piRSquared I would of course accept it if I was the question asker, but in this case I'm not! – Umar.H May 06 '19 at 20:21
Ha! @Datanovice whoops (-: – piRSquared May 06 '19 at 20:22
