Skip first lines of CSV in Python - does not work? Why?

Question

My CSV data looks as follows:

$('table').each(function(index) {
    $(this).find('tr').each(function() {
      $(this).find('td:first-child').each(function(i) { 
        var a;
        var b;
        var c;
        $(this).find('a:first').each(function(i) { 
          a = $(this).text();
        });
        $(this).find('p:first').each(function(i) { 
          b = ($(this).text());
        });
        $(this).find('time:first').each(function(i) { 
          c = $(this).text();
        });
        console.log(a + ";"  + b + ";" + c);
      });
    });
  });
XA452:01 Description in Column 1;ID Column1;13.03.2018
AY102:22 Description in Column 2;ID Column2;13.03.2018
BC001:31 Description in Column 3;ID Column3;13.03.2018
DE223:34 Description in Column 4;ID Column4;13.03.2018
FG315:56 Description in Column 5;ID Column5;13.03.2018
HA212:34 Description in Column 6;ID Column6;13.03.2018
EE111:12 Description in Column 7;ID Column7;13.03.2018

I want to start parsing the data where the row begins with XA452:01.

I tried this:

import pandas as pd

testimport_data = pd.read_csv("C:/Users/fff/Desktop/test_data.txt", sep=";", skiprows = 19)

print(testimport_data)

It should work, shouldn't it? However, I get the following error message:

Traceback (most recent call last):
  File "C:/Users/fff/PycharmProjects/Test/Test.py", line 3, in <module>
    testimport_data = pd.read_csv("C:/Users/fff/Desktop/test_data", sep=";", skiprows = 19)
  File "C:\Users\fff\PycharmProjects\Test\venv\lib\site-packages\pandas\io\parsers.py", line 709, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\fff\PycharmProjects\Test\venv\lib\site-packages\pandas\io\parsers.py", line 449, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "C:\Users\fff\PycharmProjects\Test\venv\lib\site-packages\pandas\io\parsers.py", line 818, in __init__
    self._make_engine(self.engine)
  File "C:\Users\fff\PycharmProjects\Test\venv\lib\site-packages\pandas\io\parsers.py", line 1049, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "C:\Users\fff\PycharmProjects\Test\venv\lib\site-packages\pandas\io\parsers.py", line 1695, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas\_libs\parsers.pyx", line 565, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file

What am I doing wrong?

@Rahul: It worked with sep=" " for some reason. Then, the first 19 lines were skipped. But why? — Ferit, Mar 13 '18 at 18:53
@Ferit: Although it worked with sep=" ", that is wrong, because the actual separator is ";". Manually removing the JS code works. I don't know what's wrong. — Rahul, Mar 13 '18 at 18:55
@Rahul: Oh, yeah you're right. With sep = " " we're able to remove the js code for some reason (I really don't know why). If I try sep = ";", I get the error message. It seems like the skiprows command is somewhat ignored, isn't it? — Ferit, Mar 13 '18 at 18:57

Rahul · Accepted Answer · 2018-03-13T19:02:50.567

You can skip those lines while reading the file yourself.

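import pandas as pd

# Skip the first 19 lines (the JavaScript), then split each remaining line on ";".
# rstrip("\n") keeps the trailing newline out of the last column.
with open('pd_csv.csv') as f:
    data = [line.rstrip("\n").split(";") for line in f.readlines()[19:]]

testimport_data = pd.DataFrame(data)
print(testimport_data)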
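
The hard-coded 19 breaks as soon as the junk block changes length. As a minimal sketch, assuming the first data row always starts with a known marker such as XA452:01 (as the question states), the start line can be found by prefix instead. The helper name `rows_from_marker` and the shortened `SAMPLE` text are hypothetical and only serve the illustration; the sketch uses the standard-library csv module rather than pandas.

```python
import csv
import io

# Hypothetical in-memory stand-in for the file: two junk lines, then data rows.
SAMPLE = (
    'console.log(a + ";"  + b + ";" + c);\n'
    "});\n"
    "XA452:01 Description in Column 1;ID Column1;13.03.2018\n"
    "AY102:22 Description in Column 2;ID Column2;13.03.2018\n"
)


def rows_from_marker(handle, marker, sep=";"):
    """Return parsed rows, starting at the first line that begins with marker."""
    lines = handle.read().splitlines()
    for start, line in enumerate(lines):
        if line.startswith(marker):
            break
    else:
        raise ValueError(f"no line starts with {marker!r}")
    # QUOTE_NONE treats '"' as ordinary text, so stray quotes cannot merge lines.
    return list(csv.reader(lines[start:], delimiter=sep, quoting=csv.QUOTE_NONE))


rows = rows_from_marker(io.StringIO(SAMPLE), "XA452:01")
print(rows[0])  # ['XA452:01 Description in Column 1', 'ID Column1', '13.03.2018']
```

The resulting list of rows can be handed to pd.DataFrame in the same way as `data` in the answer's code.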

edited Mar 13 '18 at 19:02

answered Mar 13 '18 at 18:38

`XA452:01 Description in Column 1;ID Column1;13.03.2018` are you _sure_ ? – Jean-François Fabre Mar 13 '18 at 18:40

akozi · Answer 2 · 2018-03-13T19:15:53.890

Edit: What I say below is not completely true, as Rahul pointed out. The problem arises because the skipped text itself contains semicolons. Switching the separator from a semicolon to a comma therefore avoids the issue, since there are no commas in the skipped text.
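
One plausible mechanism, not confirmed anywhere in this thread, is that the skipped JavaScript contains `";"`: with ";" as the separator, each `"` then sits at the start of a field and is read as an opening quote. The sketch below shows that effect with Python's standard csv module only, on a shortened, hypothetical stand-in for the file. Whether pandas' parser tokenizes skipped rows the same way is an assumption here; if it does, passing `quoting=csv.QUOTE_NONE` to `pd.read_csv` might be worth trying.

```python
import csv

# Shortened, hypothetical stand-in for the file: one junk line that contains
# ";" inside double quotes, one closing line, then two data rows.
lines = [
    '    console.log(a + ";"  + b + ";" + c);\n',
    "  });\n",
    "XA452:01 Description in Column 1;ID Column1;13.03.2018\n",
    "AY102:22 Description in Column 2;ID Column2;13.03.2018\n",
]

# Default quoting: the last '"' on the junk line opens a quoted field that is
# never closed, so every following line is absorbed into that one field.
default_rows = list(csv.reader(lines, delimiter=";"))

# QUOTE_NONE: '"' is ordinary text, so each input line stays its own row.
raw_rows = list(csv.reader(lines, delimiter=";", quoting=csv.QUOTE_NONE))

print(len(default_rows))  # 1
print(len(raw_rows))      # 4
print(raw_rows[2])        # ['XA452:01 Description in Column 1', 'ID Column1', '13.03.2018']
```

With a comma or a space as the separator, the quotes on the junk line pair up differently and nothing is left open, which would be consistent with the observations in the comments.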

Old text: This is an issue with ';' as a separator. I'm not sure why, but if you switch the ';' in the text file to ',' then the program runs fine.

In the past I've gotten this to work by changing the engine pandas uses to read the file, in case you need to keep the semicolons.

The file I used is called values.txt

$('table').each(function(index) {
    $(this).find('tr').each(function() {
      $(this).find('td:first-child').each(function(i) { 
        var a;
        var b;
        var c;
        $(this).find('a:first').each(function(i) { 
          a = $(this).text();
        });
        $(this).find('p:first').each(function(i) { 
          b = ($(this).text());
        });
        $(this).find('time:first').each(function(i) { 
          c = $(this).text();
        });
        console.log(a + ";"  + b + ";" + c);
      });
    });
  });
XA452:01 Description in Column 1,ID Column1,13.03.2018
AY102:22 Description in Column 2,ID Column2,13.03.2018
BC001:31 Description in Column 3,ID Column3,13.03.2018
DE223:34 Description in Column 4,ID Column4,13.03.2018
FG315:56 Description in Column 5,ID Column5,13.03.2018
HA212:34 Description in Column 6,ID Column6,13.03.2018
EE111:12 Description in Column 7,ID Column7,13.03.2018

Then I ran it with:

>>> data = pd.read_csv('values.txt', sep=',', skiprows=19, names=['1', '2', '3'])
>>> data
                                  1           2           3
0  XA452:01 Description in Column 1  ID Column1  13.03.2018
1  AY102:22 Description in Column 2  ID Column2  13.03.2018
2  BC001:31 Description in Column 3  ID Column3  13.03.2018
3  DE223:34 Description in Column 4  ID Column4  13.03.2018
4  FG315:56 Description in Column 5  ID Column5  13.03.2018
5  HA212:34 Description in Column 6  ID Column6  13.03.2018
6  EE111:12 Description in Column 7  ID Column7  13.03.2018

The issue is not with ";" itself but with skiprows combined with ";". Without the initial rows it works fine. — Rahul, Mar 13 '18 at 19:11
@Rahul I added an edit explaining what I said was wrong. I will leave the example of the code running with ',' instead. Should I remove the incorrect statement outright? — akozi, Mar 13 '18 at 19:18
