Here is the full code:

public class CrawlServlet implements Filter {

    public static String getFullURL(HttpServletRequest request) {
        StringBuffer requestURL = request.getRequestURL();
        String queryString = request.getQueryString();

        if (queryString == null) {
            return requestURL.toString();
        } else {
            return requestURL.append('?').append(queryString).toString();
        }
    }

    @Override
    public void destroy() {
        // TODO Auto-generated method stub
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response,
            FilterChain chain) throws IOException, ServletException {

        HttpServletRequest httpRequest = (HttpServletRequest) request;
        String fullURLQueryString = getFullURL(httpRequest);
        System.out.println(fullURLQueryString + " what wrong");

        if ((fullURLQueryString != null) && (fullURLQueryString.contains("_escaped_fragment_"))) {
            // remember to unescape any %XX characters
            fullURLQueryString = URLDecoder.decode(fullURLQueryString, "UTF-8");
            // rewrite the URL back to the original #! version
            String url_with_hash_fragment = fullURLQueryString.replace("?_escaped_fragment_=", "#!");

            final WebClient webClient = new WebClient();

            WebClientOptions options = webClient.getOptions();
            options.setCssEnabled(false);
            options.setThrowExceptionOnScriptError(false);
            options.setThrowExceptionOnFailingStatusCode(false);
            options.setJavaScriptEnabled(false);
            HtmlPage page = webClient.getPage(url_with_hash_fragment);

            // important!  Give the headless browser enough time to execute JavaScript
            // The exact time to wait may depend on your application.
            webClient.waitForBackgroundJavaScript(20000);

            // return the snapshot
            //String originalHtml=page.getWebResponse().getContentAsString();
            //System.out.println(originalHtml+" +++++++++");
            System.out.println(page.asXml() + " +++++++++");

            PrintWriter out = response.getWriter();
            out.println(page.asXml());
            //out.println(originalHtml);
        } else {
            try {
                // not an _escaped_fragment_ URL, so move up the chain of servlet (filters)
                chain.doFilter(request, response);
            } catch (ServletException e) {
                System.err.println("Servlet exception caught: " + e);
                e.printStackTrace();
            }
        }
    }

    @Override
    public void init(FilterConfig arg0) throws ServletException {
        // TODO Auto-generated method stub
    }
}

After I opened the URL "http://127.0.0.1:8888/Myproject.html?gwt.codesvr=127.0.0.1:9997?_escaped_fragment_=article", it showed the host page HTML, like this:

<html>

<head>
<meta name="fragment" content="!">
<meta http-equiv="content-type" content="text/html; charset=UTF-8"/>
<!-- -->
<!--
 Consider inlining CSS to reduce the number of requested files 
-->
<!-- -->
<link type="text/css" rel="stylesheet" href="MyProject.css"/>
<!-- -->
<!-- Any title is fine -->
<!-- -->
<title>MyProject</title>
<!-- -->
<!-- This script loads your compiled module. -->
<!-- If you add any GWT meta tags, they must -->
<!-- be added before this line. -->
<!-- -->
<script type="text/javascript" language="javascript" ></script>
<!-- -->
<!-- The body can have arbitrary html, or -->
<!-- you can leave the body empty if you want -->
<!-- to create a completely dynamic UI. -->
<!-- -->
</head>
<body>

<div id="loading">
Loading
<br/>
<img src="../images/loading.gif"/>
</div>
<!-- OPTIONAL: include this if you want history support -->
<iframe src="javascript:''" id="__gwt_historyFrame" tabindex="-1" style="position: absolute; width: 0;height: 0; border:0;"></iframe>
<!--
 RECOMMENDED if your web app will not function without JavaScript enabled 
-->
<noscript>

<div style="width: 22em; position: absolute; left: 50%; margin-left: -11em; color: red; background-color: white; border: 1px solid red; padding: 4px; font-family: sans-serif;">
Your web browser must have JavaScript enabled in order for this application to display correctly.
</div>
</noscript>
</body>
</html>

On the other hand, "http://127.0.0.1:8888/Myproject.html?gwt.codesvr=127.0.0.1:9997#!article" works fine and shows the article without any problem.

I also compiled the whole project and ran it under Tomcat 7, but I have the same problem: it always shows the HTML of the host page.

Note: the article page is a nested presenter embedded inside a header presenter. But I don't think that is the main reason, because it didn't even show the header page.

Tum

1 Answer

First, instead of ?_escaped_fragment_=article, try &_escaped_fragment_=article. You already have a ? for gwt.codesvr, so two ? characters may break URL parameter parsing.

Second, make sure that your filter handles the case where the gwt.codesvr parameter is present. It looks like your filter assumes _escaped_fragment_ is the first parameter, i.e., that it starts with ?. I believe the example here works either way.
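
To illustrate the second point, here is a minimal sketch of a URL rewrite that accepts either separator. The class and method names (`EscapedFragmentRewriter`, `toHashBangUrl`) are hypothetical and not part of the question's filter, and the sketch assumes `_escaped_fragment_` is the last query parameter, so everything after it is treated as the fragment.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, not taken from the original filter.
final class EscapedFragmentRewriter {

    private static final String PARAM = "_escaped_fragment_=";

    // Turns "...?_escaped_fragment_=x" or "...?a=b&_escaped_fragment_=x" into "...#!x".
    // Assumption: _escaped_fragment_ is the last query parameter.
    public static String toHashBangUrl(String fullUrl) {
        int idx = fullUrl.indexOf(PARAM);
        if (idx <= 0) {
            return fullUrl; // parameter not present, nothing to rewrite
        }
        char separator = fullUrl.charAt(idx - 1);
        if (separator != '?' && separator != '&') {
            return fullUrl; // not a real query parameter
        }
        // Drop the separator: other parameters (e.g. gwt.codesvr) stay in the base URL.
        String base = fullUrl.substring(0, idx - 1);
        String encodedFragment = fullUrl.substring(idx + PARAM.length());
        try {
            // Unescape any %XX characters in the fragment only.
            return base + "#!" + URLDecoder.decode(encodedFragment, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // prints http://127.0.0.1:8888/Myproject.html?gwt.codesvr=127.0.0.1:9997#!article
        System.out.println(toHashBangUrl(
            "http://127.0.0.1:8888/Myproject.html?gwt.codesvr=127.0.0.1:9997&_escaped_fragment_=article"));
        // prints http://mydomain.com/#!article
        System.out.println(toHashBangUrl("http://mydomain.com/?_escaped_fragment_=article"));
    }
}
```

In the filter, the result would replace the `fullURLQueryString.replace("?_escaped_fragment_=", "#!")` call. This only addresses URL rewriting; it does not by itself explain why the host page is returned.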

Patrick
  • That is not true. Why? Because I even hard-coded the URL before creating the WebClient to test, and it still shows the host page: url_with_hash_fragment="http://127.0.0.1:8888/myproject.html?gwt.codesvr=127.0.0.1:9997#!article"; final WebClient webClient = new WebClient(); – Tum May 17 '14 at 16:49
  • Besides, I even compiled my project and tested with "mydomain.com?_escaped_fragment_=article", and it has the same result. – Tum May 17 '14 at 16:51
  • Changing to String url_with_hash_fragment=fullURLQueryString.replace("&_escaped_fragment_=", "#!"); doesn't make it work either. – Tum May 17 '14 at 16:55
  • Anyway, I ran your code successfully and got the same result, that is, the host page. I even put the URL inside doFilter: String url_with_hash_fragment="http://127.0.0.1:8888/Ekajati.html?gwt.codesvr=127.0.0.1:9997#!article"; final URL urlWithHashFragment = new URL(url_with_hash_fragment); final WebRequest webRequest = new WebRequest(urlWithHashFragment); and it shows the same result. – Tum May 17 '14 at 17:17
  • I cannot comment on GWTP. Have you tried running against GWT-compiled JavaScript as opposed to dev mode (i.e., with the escaped fragment but without gwt.codesvr, after compiling)? Running GWT-compiled JavaScript locally works for me. However, dev mode no longer works for me on Linux, so unfortunately I am not in a position to test dev mode. That is annoying, but it is a separate problem! – Patrick May 17 '14 at 22:36
  • I did, and I got the same result. – Tum May 17 '14 at 22:49
  • what is your filter mapping ?? –  Sep 04 '14 at 12:10
  • See [here](http://stackoverflow.com/a/14012724/1143684) for a working version of my filter. – Patrick Sep 04 '14 at 16:32