I'm working with crawler4j in a Groovy and Grails application.
I have a BasicCrawler.groovy class in src/groovy, a domain class Crawler.groovy, and a controller called CrawlerController.groovy.
I have a few properties in the BasicCrawler class, such as url, parentUrl, and domain.
I want to persist these values to the database by passing them to the domain class while the crawl is running.
I tried doing this in my BasicCrawler class under src/groovy:
class BasicCrawler extends WebCrawler {
    Crawler obj = new Crawler()
    //crawling code

    @Override
    void visit(Page page) {
        //crawling code
        obj.url = page.getWebURL().getURL()
        obj.parentUrl = page.getWebURL().getParentUrl()
    }

    @Override
    protected void handlePageStatusCode(WebURL webUrl, int statusCode, String statusDescription) {
        //crawling code
        obj.httpstatus = "not found"
    }
}
And my domain class is as follows:
class Crawler extends BasicCrawler {
    String url
    String parentUrl
    String httpstatus
    static constraints = {}
}
But I got the following error:
ERROR crawler.WebCrawler - Exception while running the visit method. Message: 'No such property: url for class: mypackage.BasicCrawler
Possible solutions: obj' at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:50)
After this, I tried another approach. In src/groovy/BasicCrawler.groovy, I declared the url and parentUrl properties at the top of the class and then used data binding (I might be wrong, since I am just a beginner):
class BasicCrawler extends WebCrawler {
    String url
    String parentUrl

    @Override
    boolean shouldVisit(WebURL url) {
        //code
    }

    @Override
    void visit(Page page) {
        //code
    }

    @Override
    protected void handlePageStatusCode(WebURL webUrl, int statusCode, String statusDescription) {
        //code
    }

    def bindingMap = [url: url, parentUrl: parentUrl]
    def Crawler = new Crawler(bindingMap)
}
And my Crawler.groovy domain class is as follows:
class Crawler {
    String url
    String parentUrl
    static constraints = {}
}
Now it doesn't show any error, but the values are not being persisted to the database. I am using MongoDB as the backend.
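
To make the goal concrete, here is a minimal, self-contained sketch of the flow I am trying to achieve. It does not use Grails or crawler4j at all: CrawlRecord, its save() method, and the in-memory store list are hypothetical stand-ins for the Crawler domain class and MongoDB, and the visit closure only imitates what visit(Page) would do for each page. My assumption is that the real domain object needs an explicit save() call per visited page, which my second attempt never makes.

```groovy
// Hypothetical stand-in for the Crawler domain class (plain Groovy, no GORM).
class CrawlRecord {
    String url
    String parentUrl
    String httpstatus

    // In-memory list imitating the database collection (assumption, not MongoDB).
    static final List<CrawlRecord> store = []

    // Imitates a GORM-style save(): the object is only stored when this is called.
    CrawlRecord save() {
        store << this
        return this
    }
}

// Imitates what visit(Page) would do for each crawled page:
// build a record from a binding map, then save it explicitly.
def visit = { String pageUrl, String parent ->
    def bindingMap = [url: pageUrl, parentUrl: parent]
    new CrawlRecord(bindingMap).save()
}

// Constructing a record without calling save() stores nothing.
new CrawlRecord(url: 'http://example.com/unsaved')
assert CrawlRecord.store.isEmpty()

visit('http://example.com/a', 'http://example.com/')
visit('http://example.com/b', 'http://example.com/a')

// Print every stored record with its parent URL.
CrawlRecord.store.each { println "${it.url} <- ${it.parentUrl}" }
```

In the real application, I assume the equivalent would be calling save() on the Crawler domain instance from inside visit(), but I have not confirmed how that behaves on crawler4j's worker threads.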