I want to scrape pages client-side, not server-side, but the same-origin policy prevents me from doing this.
What I'm trying to understand is why I don't have read-only access to the DOM of another site.
What security risk does this pose to the site if I can get the same information by pulling the page onto my server and accessing it there anyway?
I simply want to pull basic information from a page like:
document.title
If I can do this server-side, why not client-side? The main difference is the extra round trip, which I don't want to pay for.
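To make concrete what I'm after, here is a sketch (hypothetical URL, simple regex instead of a real parser): the parsing itself is trivial once I have the HTML — it's the cross-origin fetch that the browser blocks.

```javascript
// Pulls the contents of the first <title> tag out of raw HTML.
// A regex is enough for this sketch; a real scraper would use a parser.
function extractTitle(html) {
  const match = html.match(/<title[^>]*>([^<]*)<\/title>/i);
  return match ? match[1].trim() : null;
}

// In a browser, this cross-origin fetch (hypothetical URL) is blocked by the
// same-origin policy unless the target server opts in via CORS headers:
// fetch("https://example.com/")
//   .then((res) => res.text())
//   .then((html) => console.log(extractTitle(html)));

// The parsing step on its own works fine:
console.log(extractTitle("<title>Example Domain</title>")); // prints "Example Domain"
```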
Obviously a user's private data should not be accessible; that much is clear and I don't need information on it. But in the same way that I can pull in a generic version of a page using
file_get_contents
and parse the DOM, I would like to do this client-side.
What is the technical limitation that keeps JavaScript from distinguishing between access to user-specific data and access to generic page data?
PHP can do it.
Why can't JavaScript?
What is the limitation?
I don't want to circumvent or hack it so much as understand its purpose better, and maybe find that it does not apply to my case: client-side page scrapes.
Related
Ways to circumvent the same-origin policy
How are bookmarklets (JavaScript in a link) verified by servers? How is security maintained?
http://en.wikipedia.org/wiki/Representational_state_transfer#Central_principle