I have the URL https://www.facebook.com/ads/library/?id=286238429359299 which gets redirected to https://www.facebook.com/ads/library/?active_status=all&ad_type=political_and_issue_ads&country=US&impression_search_field=has_impressions_lifetime&id=286238429359299&view_all_page_id=575939395898200 in the browser.

I'm using the following code:

    @Test
    public void createWebClient() throws IOException {
        getLogger("com.gargoylesoftware").setLevel(OFF);
        WebClient webClient = new WebClient(CHROME);
        WebClientOptions options = webClient.getOptions();
        options.setJavaScriptEnabled(true);
        options.setRedirectEnabled(true);
        webClient.waitForBackgroundJavaScriptStartingBefore(10000);
        // IMPORTANT: Without the country/language selection cookie the redirection does not work!
        URL s = webClient.getPage("https://www.facebook.com/ads/library/?id=286238429359299").getUrl();
    }

The above code doesn't follow the redirection. Is there something I am missing? I need to get the final URL that the original URL resolves to.

orange
    As the answers have noted, the "final" URL depends on who requests it and how. There may not be a single _final_ URL. – dimo414 Jul 25 '20 at 01:59

2 Answers

Actually, the URL https://www.facebook.com/ads/library/?id=286238429359299 returns a page that contains JavaScript. That JavaScript inspects the browser environment: for example, it checks whether the current browser is headless and whether the web driver is legitimate. So I think the solution is to analyze the JavaScript, and from that you will get the final URL.

sk l

I think it never actually resolves to the final URL because the browser is headless.

Load the same page in a regular browser, view the page source, and search for "page_uri"; you will see exactly the URI you are looking for.

If you check the HtmlUnit output by printing the page:

    System.out.println(page.asXml());

you will see that "page_uri" contains the originally entered URL. I suggest using Selenium WebDriver (not headless).
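To compare the "page_uri" value between the HtmlUnit output and a real browser's page source, something like the following could help. It is only a rough sketch: the class name `PageUriExtractor` and the method `extractPageUri` are made up for illustration, and it assumes the value appears in the source as an inline JSON string of the form `"page_uri":"https:\/\/..."` with escaped slashes, which may not match what Facebook actually serves.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PageUriExtractor {

    // Assumption: the page source embeds the value as a JSON string,
    // e.g. "page_uri":"https:\/\/www.facebook.com\/ads\/library\/?id=..."
    private static final Pattern PAGE_URI =
            Pattern.compile("\"page_uri\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Returns the first "page_uri" value found in the source, if any. */
    public static Optional<String> extractPageUri(String pageSource) {
        Matcher matcher = PAGE_URI.matcher(pageSource);
        if (!matcher.find()) {
            return Optional.empty();
        }
        // Undo the JSON escaping of forward slashes.
        return Optional.of(matcher.group(1).replace("\\/", "/"));
    }

    public static void main(String[] args) {
        // Hand-written sample standing in for page.asXml() or a browser's page source.
        String sample = "{\"page_uri\":\"https:\\/\\/www.facebook.com"
                + "\\/ads\\/library\\/?id=286238429359299\"}";
        System.out.println(extractPageUri(sample).orElse("page_uri not found"));
    }
}
```

Passing the string from `page.asXml()` to `extractPageUri` and doing the same with the source copied from a browser should show whether the two environments end up with different values.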

fg78nc