
I'm trying to log in to Campaign Monitor to scrape some data from pages related to email campaign performance.

The "login-protected" URL of the page I'm trying to access looks like this:

https://mycompany.createsend.com/campaigns/reports/lists/DFGDF987GD98F7GD?s=BCV98B5XF54BVC54BC

Opening that page in a web browser redirects to the login page, whose URL looks like this:

https://login.createsend.com/l/98SDF76DS87F68S/DFGDF987GD98F7GD?ReturnUrl=%2Fcampaigns%2Freports%2Flists%2FBCV98B5XF54BVC54BC%3Fs%3BCV98B5XF54BVC54BC&s=7DS6F87S6DF876SDF76

What I've gathered from trying to solve this is that I need to open a session, authenticate on the redirect URL, then request the URL that I actually want (using the authenticated session).

Here is the code I'm using to try to accomplish that:

import requests

payload = {
    'username': 'myUsername',
    'password': 'myPassword',
}

redURL = 'https://login.createsend.com/l/98SDF76DS87F68S/DFGDF987GD98F7GD?ReturnUrl=%2Fcampaigns%2Freports%2Flists%2FBCV98B5XF54BVC54BC%3Fs%3BCV98B5XF54BVC54BC&s=7DS6F87S6DF876SDF76'

with requests.Session() as s:
    # Authenticate against the login (redirect) URL
    p = s.post(redURL, data=payload)

    # This prints the "success" message I've pasted below
    print(p.text)

    # Request the report page with the same session
    r = s.get('https://mycompany.createsend.com/campaigns/reports/lists/DFGDF987GD98F7GD?s=BCV98B5XF54BVC54BC')

    # This prints the HTML of the login page again, as if I'm not authenticated
    print(r.text)

Here is the "successful" response after the first POST for the session:

{"MultipleAccounts":false,"LoginStatus":"Success","SiteAddress":"https://mycompany.createsend.com","ErrorMessage":"","SessionExpired":false,"Url":"https://mycompany.createsend.com/login?Origin=Marketing\u0026ReturnUrl=%2fcampaigns%2freports%2flists%2f92D2FBCV98B5XF54BVC%3fs%7DS6F87S6DF876SDF76\u0026s=2FBCV98B5XF54BVC","DomainSwitchAddress":"https://mycompany.createsend.com","DomainSwitchAddressQueryString":null,"NeedsDomainSwitch":false}

Can someone please help me out with why the second request in the session prints the HTML of the login page instead of the HTML of the authenticated version of the page (ie. the page with the data I'm looking for)?

PaulJeans
  • If you login and access that report manually, keeping track of URLs along the way, does your `s` query string parameter stay the same? Could they be using that in addition (or as an alternative) to actual session cookies? Because in your code, the `s` parameter changes between logging in and the access that fails – jedwards Aug 03 '16 at 02:09
  • So I used Firebug to track the net traffic while logging in. I discovered that it goes (1) login/redirect page; the one I have for redURL, (2) an intermediary URL which I frankly don't completely understand, and (3) the report page that I want. What I did was change the redURL to URL (2) from above, then requested the report page I wanted, URL (3). Works like a charm. Thank you very much! I should have considered that. – PaulJeans Aug 03 '16 at 02:50
  • nice work! happy to hear you got it solved :) – jedwards Aug 03 '16 at 02:52
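The fix described in the comments (POST the credentials to the intermediary URL, then GET the report on the same session) can be sketched roughly as follows. This is only an illustration: the function name `fetch_report` is hypothetical, the intermediary URL is not given in the thread and has to be copied from the browser's network log, and `session` is assumed to be a `requests.Session` or any object with the same `post`/`get` methods.

```python
def fetch_report(session, login_url, report_url, payload):
    """POST credentials to login_url, then GET report_url on the same session."""
    # Step 1: authenticate; any cookies set here are stored on the session.
    login = session.post(login_url, data=payload)
    login.raise_for_status()

    # Step 2: request the report with the same session (and its cookies).
    report = session.get(report_url)
    report.raise_for_status()
    return report.text
```

With `requests` installed, it would be called along the lines of `fetch_report(requests.Session(), intermediary_url, report_url, payload)`, where `intermediary_url` is URL (2) from the network log and `report_url` is URL (3).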
