
I'm using a for loop to go down a list and build the arguments for the pdfkit module. It works fine for the first two items on the list, then raises an error on the third. This is my code:

import pdfkit
import time


link1 = "https://www."
link2 = ".com"
pdf = ".pdf"
for line in open('links.txt'):
  print(line.strip("\n\r"))
  newlink = link1 + line.strip("\n\r") + link2
  print(newlink)
  newpdf = line.strip("\n\r") + pdf
  print(newpdf)
  pdfkit.from_url(newlink, newpdf)
print('Finished')

And it's pulling from this list:

bing
yahoo
google

It successfully completes the first two items and writes a PDF for each of them, then I get an error that says:

Traceback (most recent call last):
  File "new.py", line 14, in <module>
    pdfkit.from_url(newlink, newpdf)
  File "/usr/local/lib/python2.7/dist-packages/pdfkit/api.py", line 26, in from_url
    return r.to_pdf(output_path)
  File "/usr/local/lib/python2.7/dist-packages/pdfkit/pdfkit.py", line 156, in to_pdf
    raise IOError('wkhtmltopdf reported an error:\n' + stderr)
IOError: wkhtmltopdf reported an error:

Does anybody know why I'm getting this error and how to fix it?

Haran Rajkumar
  • 2,155
  • 16
  • 24

2 Answers


When I ran the same code, it failed on "yahoo", whereas google and a few other websites I tried worked. It raised the following error for me:

raise IOError("wkhtmltopdf exited with non-zero code {0}. error:\n{1}".format(exit_code, stderr))
OSError: wkhtmltopdf exited with non-zero code 1. error:
Loading pages (1/6)
QFont::setPixelSize: Pixel size <= 0 (0)
QFont::setPixelSize: Pixel size <= 0 (0)
libpng warning: iCCP: known incorrect sRGB profile
Counting pages (2/6)                                               
QFont::setPixelSize: Pixel size <= 0 (0)
QFont::setPixelSize: Pixel size <= 0 (0)
Resolving links (4/6)                                                       
Loading headers and footers (5/6)                                           
Printing pages (6/6)
Done                                                                      
Exit with code 1 due to network error: ProtocolFailure

As the last line of the output shows, wkhtmltopdf exited with a network error (ProtocolFailure), which implies that it could not load something on the page for some reason. I suspect the error you received came from a similar source. So, if the choice of websites was arbitrary, choose websites that work.
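To check whether a failure is this kind of network error, the message attached to the exception can be searched for wkhtmltopdf's final status line. The following is only a minimal sketch: the helper name `network_error_reason` is made up for illustration, and the pattern assumes the status line is worded exactly as in the output above.

```python
import re

# Assumption: wkhtmltopdf ends its stderr with a line shaped like
# "Exit with code 1 due to network error: ProtocolFailure", as in the log above.
_EXIT_LINE = re.compile(r"Exit with code (\d+) due to network error: (\w+)")


def network_error_reason(message):
    """Return (exit_code, reason) if message reports a network error, else None.

    Hypothetical helper, not part of pdfkit or wkhtmltopdf.
    """
    match = _EXIT_LINE.search(message)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```

Inside an `except IOError as exc:` block, `network_error_reason(str(exc))` would then tell you whether the failure was a network error and which one.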

If not, let me know and I'll dig into the wkhtmltopdf documentation to try to figure out the source of the error.
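In the meantime, one way to stop a single failing site from ending the whole run is to catch the exception inside the loop and carry on with the next line. This is a rough sketch rather than a fix for the underlying error: `build_targets` and `convert_all` are hypothetical names, and the converter is passed in as a parameter so the same loop can be used with `pdfkit.from_url` or any other function that takes a URL and an output path.

```python
import os


def build_targets(name):
    """Return (url, pdf_path) for one entry of links.txt, e.g. 'bing'."""
    name = name.strip()
    return "https://www." + name + ".com", name + ".pdf"


def convert_all(names, convert):
    """Call convert(url, pdf_path) for every name and keep going on errors.

    Returns a list of (url, pdf_path, error, pdf_exists) tuples, where error
    is None on success or the exception message on failure.
    """
    results = []
    for raw in names:
        if not raw.strip():
            continue  # skip blank lines in the input
        url, pdf_path = build_targets(raw)
        try:
            convert(url, pdf_path)
            error = None
        except (IOError, OSError) as exc:
            # pdfkit raises IOError when wkhtmltopdf reports a problem
            error = str(exc)
        # A PDF may still have been written even though an error was raised
        results.append((url, pdf_path, error, os.path.exists(pdf_path)))
    return results


# Possible usage (assumes pdfkit is installed and links.txt exists):
#   import pdfkit
#   with open("links.txt") as handle:
#       for url, pdf_path, error, exists in convert_all(handle, pdfkit.from_url):
#           print(url, "ok" if error is None else "failed", "pdf written:", exists)
```

The last element of each tuple records whether the PDF file exists after the attempt, which makes it easy to see whether a PDF was produced despite the error.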

Haran Rajkumar
  • Thank you for doing extra research and trying to help me. I've been trying so hard to figure this problem out and am struggling, so thank you for the help. – lavarockman Jun 04 '18 at 04:19
  • I'm using this code for a much bigger project and was testing the fundamentals on basic sites, but all I need is to export the page as a PDF, just as if I were doing a Ctrl+P. – lavarockman Jun 04 '18 at 04:40
  • Alright, got it. Now, does the PDF get output even though you get an error? – Haran Rajkumar Jun 04 '18 at 05:10
  • Yeah, even with the errors it would export the PDF. Also, I was just about to try weasyprint, thank you. – lavarockman Jun 04 '18 at 13:17

I haven't been able to find a fix for the network error in wkhtmltopdf yet. Instead, I found an alternative library that works, called weasyprint.

Here is an alternative version of your code that uses weasyprint.

from weasyprint import HTML

link1 = "https://www."
link2 = ".com"
pdf = ".pdf"
for line in open('links.txt'):
  print(line.strip("\n\r"))
  newlink = link1 + line.strip("\n\r") + link2
  print("newlink "+newlink)
  newpdf = line.strip("\n\r") + pdf
  print(newpdf)
  HTML(newlink).write_pdf(newpdf)
print('Finished')

Hopefully this helps.

Haran Rajkumar