1

I am trying to write a program that reads a webpage looking for file links, which it then attempts to download using curl/libcurl/pycurl. I have everything up to the pycurl correctly working, and when I use a curl command in the terminal, I can get the file to download. The curl command looks like the following:

curl -LO https://archive.org/download/TheThreeStooges/TheThreeStooges-001-WomanHaters1934moeLarryCurleydivxdabaron19m20s.mp4

This results in one redirect (a file that reads as all 0s on the output) and then it correctly downloads the file. When I remove the -L flag (so the command is just -O) it only reaches the first line, where it doesn't find a file, and stops.

But when I try to do the same operation using pycurl in a Python script, I am unable to successfully set [Curl object].FOLLOWLOCATION to 1, which is supposed to be the equivalent of the -L flag. The python code looks like the following:

c = [class Curl object] # get a Curl object
fp = open(file_name,'wb')
c.setopt(c.URL , full_url) # set the url
c.setopt(c.FOLLOWLOCATION, 1)
c.setopt(c.WRITEDATA , fp)
c.perform()

When this runs, it gets to c.perform() and shows the following:

python2.7: src/pycurl.c:272: get_thread_state: Assertion `self->ob_type == p_Curl_Type' failed.

Is it missing the redirect, or am I missing something else earlier because I am relatively new to cURL?

WilBur
  • 270
  • 4
  • 9

1 Answers1

0

When I enabled verbose output for the c.perform() step, I was able to uncover what I believe was/is the underlying problem that my program had. The first line, which was effectively flagged, indicated that an open connection was being reused.

I had originally packaged the file into an object oriented setup, as opposed to a script, so the curl object had been read and reused without being closed. Therefore after the first connection attempt, which failed because I didn't set options correctly, it was reusing the connection to the website/server (which presumably had the wrong connection settings). The problem was resolved by having the script close any existing Curl objects, and create a new one before the file download.

WilBur
  • 270
  • 4
  • 9