0

I want to build a simple python program that gets updates for tracking numbers from UPS, I couldn't get an account number with them so I can't use their API. I decided to try web scraping.

Here's an example of a tracking number: https://www.ups.com/track?loc=en_US&tracknum=1Z0X118AYW08592000&requester=WT/trackdetails

I want to get the scheduled delivery date, the problem is that what the requests module scrapes and what shows when I view the page source doesn't get all the information inside a tag called app-root. That's where the delivery date is.

I found a similar post that solves this problem with FedEx, but I can't get it to work with the ups website: Parsing HTML does not output desired data(tracking info for FedEx)

I installed an extension called HTTP Trace that shows all the requests that go through my server, I can't find the one that matches UPS, this is what I got from the extension when I searched for the tracking number, any ideas what I can do here?

https://wwwapps.ups.com/WebTracking/track?loc=en_IL
HTMLVersion: 5.0
loc: en_IL
track.x: Track
trackNums: 1Z0X118AYW08592000
ups-search: 1Z0X118AYW08592000

POST https://wwwapps.ups.com/WebTracking/track?loc=en_IL
Upgrade-Insecure-Requests: 1
Content-Type: application/x-www-form-urlencoded
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9

HTTP/1.1 302 Moved Temporarily
 Redirect to: https://www.ups.com/track?loc=en_IL&tracknum=1Z0X118AYW08592000&requester=WT
Server: Apache
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Strict-Transport-Security: max-age=31536000; includeSubDomains
Cache-Control: no-store, no-cache
Pragma: no-cache
Location: https://www.ups.com/track?loc=en_IL&tracknum=1Z0X118AYW08592000&requester=WT
Content-Length: 365
Content-Type: text/html
Date: Thu, 14 Jan 2021 00:32:34 GMT
Connection: keep-alive
Server-Timing: cdn-cache; desc=MISS
Server-Timing: edge; dur=164
Server-Timing: origin; dur=23
Debug-AK-TLS: No bypass

GET https://www.ups.com/track?loc=en_IL&tracknum=1Z0X118AYW08592000&requester=WT
Upgrade-Insecure-Requests: 1
Content-Type: application/x-www-form-urlencoded
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9

HTTP/1.1 200 OK
Server: Apache
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
Strict-Transport-Security: max-age=31536000; includeSubDomains
Cache-Control: no-store, no-cache
Pragma: no-cache
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Debug-AK-TLS: No bypass
X-Akamai-Transformed: 9 9152 0 pmb=mTOE,1mRUM,1
Date: Thu, 14 Jan 2021 00:32:34 GMT
Content-Length: 10947
Connection: keep-alive
Vary: Accept-Encoding
Server-Timing: cdn-cache; desc=MISS
Server-Timing: edge; dur=182
Server-Timing: origin; dur=201

https://www.facebook.com/tr/?id=969628123173894&ev=PageView&dl=https%3A%2F%2Fwww.ups.com%2Ftrack%3Floc%3Den_IL%26tracknum%3D1Z0X118AYW08592000%26requester%3DWT%2Ftrackdetails&rl=https%3A%2F%2Fwww.ups.com%2F&if=false&ts=1610584355509&sw=1920&sh=1080&v=2.9.32&r=stable&a=tmtealium&ec=0&o=30&fbp=fb.1.1598067407332.38393503&it=1610584355413&coo=false&dpo=LDU&dpoco=0&dpost=0&rqm=GET

GET https://www.facebook.com/tr/?id=969628123173894&ev=PageView&dl=https%3A%2F%2Fwww.ups.com%2Ftrack%3Floc%3Den_IL%26tracknum%3D1Z0X118AYW08592000%26requester%3DWT%2Ftrackdetails&rl=https%3A%2F%2Fwww.ups.com%2F&if=false&ts=1610584355509&sw=1920&sh=1080&v=2.9.32&r=stable&a=tmtealium&ec=0&o=30&fbp=fb.1.1598067407332.38393503&it=1610584355413&coo=false&dpo=LDU&dpoco=0&dpost=0&rqm=GET
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36
Accept: image/avif,image/webp,image/apng,image/*,*/*;q=0.8

HTTP/1.1 302
 Redirect to: https://cx.atdmt.com/?c=1479770850078954307&f=AYzL_IHfyiIJ9HIa7oqq8XcmRPtLo6M0aKkForULuTS_d5qgkpmUtO1x4Rmi3jkdZ4EPRHG7qxKZDTiWb-BA5MYf&id=969628123173894&l=3&v=0
cache-control: no-cache, no-store, must-revalidate
pragma: no-cache
expires: 0
date: Thu, 14 Jan 2021 00:32:36 GMT
location: https://cx.atdmt.com/?c=1479770850078954307&f=AYzL_IHfyiIJ9HIa7oqq8XcmRPtLo6M0aKkForULuTS_d5qgkpmUtO1x4Rmi3jkdZ4EPRHG7qxKZDTiWb-BA5MYf&id=969628123173894&l=3&v=0
content-type: text/plain
content-length: 0
server: proxygen-bolt
alt-svc: h3-29=":443"; ma=3600,h3-27=":443"; ma=3600

GET https://cx.atdmt.com/?c=1479770850078954307&f=AYzL_IHfyiIJ9HIa7oqq8XcmRPtLo6M0aKkForULuTS_d5qgkpmUtO1x4Rmi3jkdZ4EPRHG7qxKZDTiWb-BA5MYf&id=969628123173894&l=3&v=0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36
Accept: image/avif,image/webp,image/apng,image/*,*/*;q=0.8

HTTP/1.1 200
x-fb-rlafr: 0
content-type: image/gif
date: Wed, 13 Jan 2021 16:32:37 PST
x-content-type-options: nosniff
report-to: {"group":"coep_report","max_age":86400,"endpoints":[{"url":"https:\/\/www.facebook.com\/browser_reporting\/"}]}
cache-control: public, max-age=0
content-encoding: br
x-frame-options: DENY
cross-origin-resource-policy: cross-origin
expires: Wed, 13 Jan 2021 16:32:37 PST
vary: Accept-Encoding
cross-origin-embedder-policy-report-only: require-corp;report-to="coep_report"
pragma: public
x-fb-debug: ySiesinmQSMWtIWGg5+rMp+g66R70GGiqJJC3M0DowZMGuFf14OidRiX02DfG99gXxjUSjCaEtHosxh/9tl/hQ==
MarinaF
  • 35
  • 1
  • 2
  • 13
  • What prevents you getting an account and using the UPS Tracking API? – jarmod Jan 14 '21 at 00:40
  • did you try adding cookies and x-headers from browser to your request – Quantum Dreamer Jan 14 '21 at 00:41
  • - UPS requires a payment account for the API, and it wouldn't let me open one. - How do I add cookies and x-headers to the request? I'm not sure which request I should use, I tried some of the get requests and got back 200, but I don't know where to get the data from. – MarinaF Jan 14 '21 at 00:50

2 Answers2

1

I honestly do not believe this is possible. I checked how UPS loads its sites, and it seems to load the frontend first like this

Get request to website preview

then goes in to the api to grab the dates. For example, the delivered on date is stored in this api link ("https://www.ups.com/track/api/Track/GetStatus?loc=en_US") which needs a bunch of headers and has some akamai/security cookies (which may prevent you from scraping it).

If you really do not want to use an api, I would suggest using something like Selenium if you do not need it to be quick/do not have many links to work with.

sixsidedie
  • 45
  • 5
0

Your only choice is either Selenium OR API but that comes with a hitch. For what I can tell on the UPS website, their API only allows for queries at night. They only want "emergencies" to be hitting their API, which is preposterous since I would imagine web requests are hitting the same API.

Yaniv
  • 19
  • 4