Making a GWT app indexable by search engines is very important, but there is very little information that gives step-by-step guidelines on how to make a GWT app crawlable dynamically.
OK, here is what I understood, but I am not 100% sure that it is correct, so please correct me if you can.
To make a GWT page, e.g. myDomain.com#article;articleID=1
to be indexed by a search engine, you need to:
-1st, convert myDomain.com#article;articleID=1
to myDomain.com#!article;articleID=1
-2nd, when the Google/Yahoo bot visits that page (myDomain.com#!article;articleID=1), it will convert the URL into (myDomain.com?_escaped_fragment_=article;articleID=1) and request that URL from your web server. The web server will then try to render that page for the bot.
So if we have a static page for myDomain.com?_escaped_fragment_=article;articleID=1, then the content of that page will be read by the bot and can be indexed.
But the articles are dynamic, because a user could enter myDomain.com?_escaped_fragment_=article;articleID=2 or ...articleID=3 ..., and we can't manually make a static page for each article.
So the solution is HtmlUnit.
A tool like HtmlUnit (a headless browser) will load the page, run its JavaScript, and dynamically convert myDomain.com?_escaped_fragment_=article;articleID=2 into an HTML page that the bot can read.
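To make the URL mapping concrete, here is a minimal sketch of the first thing a server-side crawl filter would have to do: detect the `_escaped_fragment_` parameter and rebuild the `#!` URL that a headless browser should load. It assumes that, under the `#!` scheme, the whole fragment is carried as the (URL-encoded) value of `_escaped_fragment_`. The class name `EscapedFragmentUrls` and the method `toHashBangUrl` are made up for illustration; they are not part of GWT or HtmlUnit.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Hypothetical helper: maps a crawler URL back to the #! URL the GWT app understands. */
final class EscapedFragmentUrls {
    private static final String PARAM = "_escaped_fragment_=";

    private EscapedFragmentUrls() {}

    /** Returns the #! form of the URL, or null if this is not a crawler request. */
    static String toHashBangUrl(String crawlerUrl) {
        int q = crawlerUrl.indexOf('?');
        if (q < 0) {
            return null; // no query string, so no _escaped_fragment_
        }
        String base = crawlerUrl.substring(0, q);
        String query = crawlerUrl.substring(q + 1);

        StringBuilder keptParams = new StringBuilder();
        String fragment = null;
        for (String pair : query.split("&")) {
            if (pair.startsWith(PARAM)) {
                // The crawler URL-encodes the fragment, so decode it again.
                fragment = decode(pair.substring(PARAM.length()));
            } else if (!pair.isEmpty()) {
                // Keep every other query parameter untouched.
                if (keptParams.length() > 0) {
                    keptParams.append('&');
                }
                keptParams.append(pair);
            }
        }
        if (fragment == null) {
            return null; // ordinary request from a normal browser
        }
        String rebuilt = base;
        if (keptParams.length() > 0) {
            rebuilt += "?" + keptParams;
        }
        return rebuilt + "#!" + fragment;
    }

    private static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

In a real filter, a non-null result would be the URL handed to the headless browser, and a null result would mean the request is passed down the filter chain unchanged.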
But I don't know of any step-by-step guidelines on how to set HtmlUnit up.
Is HtmlUnit just a jar file that we can put into lib? What should we do next?
I used the solution from Patrik, compiled it, and put it into the webapp folder of Tomcat 7, but got this error in the browser:
type Exception report
message Filter execution threw an exception
description The server encountered an internal error that prevented it from fulfilling this request.
exception
javax.servlet.ServletException: Filter execution threw an exception
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:66)
com.gwtplatform.dispatch.server.AbstractHttpSessionSecurityCookieFilter.doFilter(AbstractHttpSessionSecurityCookieFilter.java:67)
com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
root cause
java.lang.NoClassDefFoundError: org/w3c/css/sac/ErrorHandler
myproject.server.CrawlFilter.doFilter(CrawlFilter.java:101)
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:66)
com.gwtplatform.dispatch.server.AbstractHttpSessionSecurityCookieFilter.doFilter(AbstractHttpSessionSecurityCookieFilter.java:67)
com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
root cause
java.lang.ClassNotFoundException: org.w3c.css.sac.ErrorHandler
org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1720)
org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
myproject.server.CrawlFilter.doFilter(CrawlFilter.java:101)
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:66)
com.gwtplatform.dispatch.server.AbstractHttpSessionSecurityCookieFilter.doFilter(AbstractHttpSessionSecurityCookieFilter.java:67)
com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
note The full stack trace of the root cause is available in the Apache Tomcat/7.0.53 logs.
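The root cause above is a `ClassNotFoundException` for `org.w3c.css.sac.ErrorHandler`, which suggests that one of the jars HtmlUnit depends on (probably the one providing the `org.w3c.css.sac` package) did not end up in the webapp's `WEB-INF/lib`. One way to narrow this down might be a small check like the sketch below, run from inside the webapp (for example at the start of the filter). `ClasspathCheck` and `isClassAvailable` are hypothetical names, not part of HtmlUnit or Tomcat.

```java
/** Hypothetical diagnostic helper: reports whether a class can be loaded at runtime. */
final class ClasspathCheck {
    private ClasspathCheck() {}

    /** Returns true if the named class is visible to the class loader that loaded this helper. */
    static boolean isClassAvailable(String className) {
        try {
            // 'false' means: locate the class but do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class itself or something it links against is missing.
            return false;
        }
    }
}
```

If `ClasspathCheck.isClassAvailable("org.w3c.css.sac.ErrorHandler")` returns false inside the deployed webapp, the missing jar would have to be added next to the HtmlUnit jar in `WEB-INF/lib`.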