I am doing a Project in Java. In this project I have to work with DOM. For that I first load a dynamic page of any given URL, by using Selenium. Then I parse them using Jsoup.
I want to get the dynamic page source code of given URL
Code snapshot:
public static void main(String[] args) throws IOException {
// Selenium
WebDriver driver = new FirefoxDriver();
driver.get("ANY URL HERE");
String html_content = driver.getPageSource();
driver.close();
// Jsoup makes DOM here by parsing HTML content
Document doc = Jsoup.parse(html_content);
// OPERATIONS USING DOM TREE
}
But the problem is, Selenium takes around 95% of the whole processing time, that is undesirable.
Selenium first opens Firefox, then loads the given page, then gets the dynamic page source code.
Can you tell me how I can reduce the time taken by Selenium, by replacing this tool with another efficient tool. Any other advice would also be welcome.
Edit NO. 1
There is some code given on this link.
FirefoxProfile profile = new FirefoxProfile();
profile.setPreference("general.useragent.override", "some UA string");
WebDriver driver = new FirefoxDriver(profile);
But what is second line here, I didn't understand. As Documentation is also very poor of selenium.
Edit No. 2
System.out.println("Fetching %s..." + url1); System.out.println("Fetching %s..." + url2);
WebDriver driver = new FirefoxDriver(createFirefoxProfile());
driver.get("url1");
String hml1 = driver.getPageSource();
driver.get("url2");
String hml2 = driver.getPageSource();
driver.close();
Document doc1 = Jsoup.parse(hml1);
Document doc2 = Jsoup.parse(hml2);