Troubleshooting line continuation error for a long file path for read_csv

Question

I am trying to break up a long file path so that I can read it without having to move the screen to see it.

edgelist = pd.read_csv(r'https://gist.githubusercontent.com/brooksandrew' /
                   r'/e570c38bcc72a8d102422f2af836513b/raw' /
                   r'/89c76b2563dbc0e88384719a35cba0dfc04cd522' / 
                   r'/edgelist_sleeping_giant.csv')

However, I get this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-4-a0ff45f0f7db> in <module>
      2 edgelist = pd.read_csv(r'https://gist.githubusercontent.com/brooksandrew' /
      3                        r'/e570c38bcc72a8d102422f2af836513b/raw' /
----> 4                        r'/89c76b2563dbc0e88384719a35cba0dfc04cd522' /
      5                        r'/edgelist_sleeping_giant.csv')
      6 edgelist.head(10)

I've looked at some other Stack Overflow posts, but I don't understand them. I've tried a variety of combinations of removing the forward slash and repositioning the quotes, but I think I'm just grasping at straws. I would love a technical explanation of why I'm getting this error.

BTW, writing the load statement on one line with no ending [isolated] forward slashes (on lines 2, 3, and 4) works, but I can't see the entire statement without sliding the screen view. I'm looking for something readable in one view.

w-m · Accepted Answer · 2019-02-20T14:46:20.637


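Line continuations in Python are signaled with a backslash (\), but you have been using forward slashes (/). Python parses a forward slash between two string literals as the division operator, and dividing one string by another raises the TypeError.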

This should work as intended:

edgelist = pd.read_csv(r'https://gist.githubusercontent.com/brooksandrew' \
                       r'/e570c38bcc72a8d102422f2af836513b/raw' \
                       r'/89c76b2563dbc0e88384719a35cba0dfc04cd522' \
                       r'/edgelist_sleeping_giant.csv')
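For the technical explanation the question asks for, here is a minimal sketch of what the forward slash does. The two strings are short placeholders rather than the real URL, and nothing is downloaded:

```python
# A forward slash between two strings is parsed as division, not as a
# line continuation, so Python attempts str / str and raises TypeError.
try:
    url = 'https://example.com' / '/data.csv'  # placeholder strings
    message = 'no error'
except TypeError as exc:
    message = str(exc)

print(message)
```

The printed message names `/` as an unsupported operand between two `str` values, which is the same TypeError the traceback in the question points at.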

As there are no backslashes in the URL itself, you don't need to use raw string literals, and can just use standard string literals:

edgelist = pd.read_csv('https://gist.githubusercontent.com/brooksandrew' \
                       '/e570c38bcc72a8d102422f2af836513b/raw' \
                       '/89c76b2563dbc0e88384719a35cba0dfc04cd522' \
                       '/edgelist_sleeping_giant.csv')
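One alternative the answer does not mention: inside parentheses Python joins lines implicitly, and adjacent string literals are concatenated, so the backslashes can be dropped altogether. The sketch below only builds the URL string (the variable names are made up for illustration) and compares it with the backslash version:

```python
# Explicit continuation with backslashes, as in the answer.
url_backslash = 'https://gist.githubusercontent.com/brooksandrew' \
                '/e570c38bcc72a8d102422f2af836513b/raw' \
                '/89c76b2563dbc0e88384719a35cba0dfc04cd522' \
                '/edgelist_sleeping_giant.csv'

# Implicit continuation: open parentheses let the expression span lines.
url_parens = ('https://gist.githubusercontent.com/brooksandrew'
              '/e570c38bcc72a8d102422f2af836513b/raw'
              '/89c76b2563dbc0e88384719a35cba0dfc04cd522'
              '/edgelist_sleeping_giant.csv')

same_url = (url_backslash == url_parens)

# Without backslashes in the text, raw and regular literals are identical.
raw_equals_plain = (r'/raw' == '/raw')

print(same_url, raw_equals_plain)
```

Because the parentheses of `pd.read_csv(` are already open in the examples above, the same implicit joining applies there, and either string could be passed as `pd.read_csv(url_parens)`.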

You can even remove the intermediate quotes and write a single string literal, but then all the indentation needs to go as well, as the spaces would become part of the resulting string (which would no longer be a correct URL):

edgelist = pd.read_csv('https://gist.githubusercontent.com/brooksandrew\
/e570c38bcc72a8d102422f2af836513b/raw\
/89c76b2563dbc0e88384719a35cba0dfc04cd522\
/edgelist_sleeping_giant.csv')
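A small sketch of why the indentation matters in this last form, using a short placeholder URL instead of the real one. A backslash inside a string literal swallows only the newline; any leading whitespace on the next line stays in the string:

```python
# The continuation line is indented, so the spaces end up inside the URL.
indented = 'https://example.com/a\
    /b.csv'

# The continuation line starts at the first column, so nothing extra is added.
flush = 'https://example.com/a\
/b.csv'

print(repr(indented))
print(repr(flush))
```

A URL with embedded spaces like `indented` is no longer valid, which is a likely cause of the HTTP 406 error reported in the comments.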

edited Feb 20 '19 at 14:46

answered Feb 20 '19 at 13:44

w-m


This works, but I'd argue if '\' was truly a line continuation character, one should not need the 'r' on the subsequent lines. – spacedustpi Feb 20 '19 at 13:47

As a matter of fact, you don't need the `r` anywhere. Raw string literals are for encoding strings containing backslashes, but the URL only contains forward slashes. The '\' is a true line continuation. With the code above, a number of raw string literals are concatenated. If the `r` were missing on the next lines, it would concatenate string literals to a raw string literal (which, again, in this case would also work no problem). – w-m Feb 20 '19 at 13:55
I actually get this error: HTTPError: HTTP Error 406: Not Acceptable, when I put in: edgelist = pd.read_csv('https://gist.githubusercontent.com/brooksandrew \ /e570c38bcc72a8d102422f2af836513b/raw \ /89c76b2563dbc0e88384719a35cba0dfc04cd522 \ /edgelist_sleeping_giant.csv') - but if I leave the quotes in on every line (so 4 pairs of single quotes), it works. So I'd still have to say it is not a true line continuation character from a technical standpoint. – spacedustpi Feb 20 '19 at 14:24
Quotes left in: edgelist = pd.read_csv('https://gist.githubusercontent.com/brooksandrew' \ '/e570c38bcc72a8d102422f2af836513b/raw' \ '/89c76b2563dbc0e88384719a35cba0dfc04cd522' \ '/edgelist_sleeping_giant.csv') - works. – spacedustpi Feb 20 '19 at 14:30

Keep in mind that spaces matter when you don't finish the string literal with quotes. Put the `\` directly at the line ending and continue on the first character of the next line (no pretty formatting), and it will still work. – w-m Feb 20 '19 at 14:30
This did not work: edgelist = pd.read_csv('https://gist.githubusercontent.com/brooksandrew\ /e570c38bcc72a8d102422f2af836513b/raw\ /89c76b2563dbc0e88384719a35cba0dfc04cd522\ /edgelist_sleeping_giant.csv') – spacedustpi Feb 20 '19 at 14:40

I've edited the answer to give more examples, as newlines are not well preserved in the comments here. – w-m Feb 20 '19 at 14:46
My bad, yes, I did not take out the indents, although I think the indents make it more presentable, but I understand what you meant by 'no pretty formatting' now. So I'll just keep the quotes for presentation purposes while understanding they are not necessary. – spacedustpi Feb 20 '19 at 14:48
