I need to identify whether a page stores data in, or reads data from, HTML5 web storage. I am using HTMLUnit to crawl web pages, and the SourceForge listing says that support for HTML5 storage has been implemented. Does HTMLUnit actually create objects for localStorage, sessionStorage, etc.? If so, how can I access them?

I have also thought of scraping all the JavaScript on the page and searching it for the relevant keywords, but is there a better method than that?
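For reference, the keyword search mentioned above could look roughly like the sketch below. It uses only the JDK and assumes the script source has already been collected into a string; the class name `WebStorageScan` and the method `findStorageUses` are hypothetical names chosen for illustration. It only catches direct, dotted references such as `localStorage.setItem(...)`, so aliased or dynamically built references would be missed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: scans JavaScript source text for direct uses of the
// Web Storage interface (localStorage / sessionStorage members).
public class WebStorageScan {

    // Matches e.g. "localStorage.setItem" or "sessionStorage . getItem".
    private static final Pattern STORAGE_USE = Pattern.compile(
        "\\b(localStorage|sessionStorage)\\b\\s*\\.\\s*"
        + "(getItem|setItem|removeItem|clear|key|length)\\b");

    // Returns every match as "<storage>.<member>", in order of appearance.
    public static List<String> findStorageUses(String scriptSource) {
        List<String> hits = new ArrayList<>();
        Matcher m = STORAGE_USE.matcher(scriptSource);
        while (m.find()) {
            hits.add(m.group(1) + "." + m.group(2));
        }
        return hits;
    }

    public static void main(String[] args) {
        String js = "window.localStorage.setItem('key','value');"
                  + " var v = sessionStorage.getItem('key');";
        // prints [localStorage.setItem, sessionStorage.getItem]
        System.out.println(findStorageUses(js));
    }
}
```

A static scan like this cannot tell whether the matched code ever runs, which is part of why I am asking about a better method.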

kapa
  • Sorry, I didn't quite get the question. You want a way to find out if a webpage has HTML 5 doctype or not? – Mosty Mostacho Feb 09 '12 at 20:44
  • Hey thanks for commenting. I'm interested to identify if any webpage is storing data in or reading data from the web storage interfaces defined here: http://www.w3.org/TR/webstorage/#the-storage-interface. Typically this is done through javascript commands like `localStorage.setItem('key','value');` or `localStorage.getItem('key');`. Does HTMLUnit provide any way to identify the use of these storages by a webpage? – Indranil Datta Feb 09 '12 at 22:58

1 Answer

A simple test is to execute JavaScript that stores a value with setItem('key','value'), reads it back with getItem('key'), and then inspect the result. If the stored value comes back, the storage object exists and works. Something like the following:

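// Store a value in localStorage, then read it back with getItem.
ScriptResult result = currentPage.executeJavaScript("window.localStorage.setItem('some_key','some_value');window.localStorage.getItem('some_key');");
// Concatenate directly instead of calling toString(), so a null result does not throw.
System.out.println("script result: " + result.getJavaScriptResult());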
Birhanu Eshete