I am trying to use pdfkit to make a visual backup of our company wiki. The trouble is that the site requires the user to be logged in. I wrote a script using splinter that logs into the company wiki, but when pdfkit runs it returns the login page, so pdfkit must be opening its own separate session. How can I find out which credentials (cookies) are needed to access the pages on the site, save them in a variable, and hand them to pdfkit so I can grab these screenshots?
I am using Python 2.7.8 with splinter, requests, and pdfkit.
import pdfkit
from splinter import Browser

# Log in to the wiki through a real browser session
browser = Browser()
browser.visit('https://companywiki.com')
browser.find_by_id('login-link').click()
browser.fill('os_username', 'username')
browser.fill('os_password', 'password')
browser.find_by_name('login').click()

# This runs in a separate session, so it renders the login page instead
pdfkit.from_url("https://pagefromcompanywiki.com", "c:/out.pdf")
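From reading the pdfkit README, it seems wkhtmltopdf accepts a repeatable --cookie option, and pdfkit will repeat an option when its value is a list of tuples. If that's right, I imagine pulling the cookies out of the splinter browser and handing them to pdfkit would look roughly like this (untested sketch; I'm assuming browser.cookies.all() returns a plain name/value dict):

# Sketch: reuse the splinter session's cookies in wkhtmltopdf
cookies = browser.cookies.all()  # assumed to return e.g. {'JSESSIONID': '...'}

options = {
    # pdfkit repeats the option for each tuple in the list, producing
    # "--cookie name value" on the wkhtmltopdf command line
    'cookie': [(name, value) for name, value in cookies.items()],
}

pdfkit.from_url("https://pagefromcompanywiki.com", "c:/out.pdf", options=options)

Is that the right way to get the authenticated session across, or is there a cleaner approach?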
I also found the following script, which logs in and saves the credentials, but I'm not sure how to tie it into what I am trying to do.
import requests

EMAIL = ''
PASSWORD = ''
URL = 'https://company.wiki.com'

def main():
    # requests.session(config=...) was removed in requests 1.0;
    # a plain Session still persists cookies across requests
    session = requests.Session()
    login_data = {
        'loginemail': EMAIL,
        'loginpswd': PASSWORD,
        'submit': 'login',
    }
    r = session.post(URL, data=login_data)
    # the session now carries the auth cookies set by the login response
    r = session.get('https://pageoncompanywiki.com')

if __name__ == '__main__':
    main()
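If the requests route works, maybe I don't need pdfkit to fetch the page itself at all. One idea (untested) is to download the authenticated HTML with the session and feed it to pdfkit.from_string, though I'd guess relative links to CSS and images would break; alternatively, the session's cookies could be reused the same way as the splinter sketch above:

import pdfkit
# assuming `session` is the logged-in requests.Session from the script above

# Option 1: fetch the page with the authenticated session, render the HTML string
r = session.get('https://pageoncompanywiki.com')
pdfkit.from_string(r.text, 'c:/out.pdf')

# Option 2: hand the session's cookies to wkhtmltopdf via --cookie
options = {'cookie': list(session.cookies.get_dict().items())}
pdfkit.from_url('https://pageoncompanywiki.com', 'c:/out.pdf', options=options)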
Any ideas on how to accomplish this are appreciated.