I have a list of URL's and would like to scrape the location objects for each of their webpages. The data I am referring to is produced by typing "window.location" into your browser's console. For example, performing this action on www.github.com with Chrome would give you the something like following output:
Location {assign: function, replace: function, reload: function, ancestorOrigins: DOMStringList, origin: "https://github.com"…}
When expanded, you can see more information:
Location { ancestorOrigins: DOMStringList assign: function () { [native code] } hash: "" host: "github.com" hostname: "github.com" href: "https://github.com/" origin: "https://github.com" pathname: "/" port: "" protocol: "https:" reload: function () { [native code] } replace: function () { [native code] } search: "" toString: function toString() { [native code] } valueOf: function valueOf() { [native code] } __proto__: Location }
I have used Python and the Mechanize library to scrape in the past, but have never desired this functionality until now and am not sure how to proceed. Any suggestions would be welcomed.