
I am trying to scrape data from a site with SpookyJS and store it in MongoDB. I can get the data from the website, but I cannot save the scraped data from the SpookyJS environment to MongoDB. To save it, I passed my database model instance to SpookyJS, following the link below.

https://github.com/SpookyJS/SpookyJS/wiki/Introduction

Below is my code, where I extract the data into the prod_link_info variable and pass its values to MongoDB:

var product_model = require('./product').product_model;

//get results
spooky.then([{product_model: product_model}, function(){
        this.waitForSelector('li[id^="product_"]', function() {
            //  Get info on all elements matching this CSS selector
            var prod_link_info = this.evaluate(function() {
                var nodes = document.querySelectorAll('li[id^="product_"]');

                return [].map.call(nodes, function(node) { // Alternatively: return Array.prototype.map.call(...
                    return node.querySelector('a').getAttribute('href')+"\n";
                });
            });

            //insert values in mongodb
            for (var i = 0; i < prod_link_info.length; i++) {
                product_model.create(
                    {
                        prod_link_info: prod_link_info[i],
                    }, function(err, product){
                        if(err) console.log(err);
                        else console.log(product);
                    });
            }
        });
}]);

Below is the database schema and model used in the code above:

var mongoose=require('mongoose');
var Schema = mongoose.Schema;
// create a schema
var productSchema = new Schema({
    prod_link_info: String,

});

var product_model= mongoose.model('product_model', productSchema);

module.exports = {
    product_model: product_model
}

But when I run the code above, it gives me the following error: ReferenceError: Can't find variable: product_model.

I want to store the data extracted by SpookyJS in MongoDB. Please suggest what I am doing wrong.

1 Answer


When you pass a hash of variables to Spooky, it is converted to a string with JSON.stringify and then converted back to an object with JSON.parse in the Casper environment (please refer to the docs). Functions and methods do not survive that round trip, so it is impossible to pass a mongoose model to the Casper environment (moreover, there is no real reason to).
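A minimal sketch of that serialization round trip, runnable in plain Node.js without SpookyJS or mongoose. The fakeModel object is a hypothetical stand-in for a model, not the real mongoose API; it only illustrates what JSON.stringify followed by JSON.parse does to an object that carries methods:

```javascript
// Hypothetical stand-in for a model: plain data plus a method.
var fakeModel = {
    modelName: 'product_model',
    create: function (doc, callback) { callback(null, doc); }
};

// Serialize and parse again, as happens to the hash of variables
// on its way into the Casper environment.
var roundTripped = JSON.parse(JSON.stringify({ product_model: fakeModel }));

console.log(typeof fakeModel.create);                   // "function"
console.log(typeof roundTripped.product_model.create);  // "undefined"
console.log(roundTripped.product_model.modelName);      // "product_model"
```

Only the plain data survives; the create method is dropped, so there is nothing usable to call on the other side.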

To solve the problem, you should pass the data out of the Spooky (Casper) environment instead. As far as I know, the only way to do that is to emit the data and then handle it with spooky.on. Your example should look like this:

var product_model = require('./product').product_model;

//get results
spooky.then([{},function(){
        this.waitForSelector('li[id^="product_"]', function() {
           //  Get info on all elements matching this CSS selector
            var prod_link_info = this.evaluate(function() {
                var nodes = document.querySelectorAll('li[id^="product_"]');

                return [].map.call(nodes, function(node) { // Alternatively: return Array.prototype.map.call(...
                    return node.querySelector('a').getAttribute('href')+"\n";
                });
            });

            this.emit('data.ready', prod_link_info);
        });
}]);

spooky.on('data.ready', function (prod_link_info) {
    //insert values in mongodb
    for (var i = 0; i < prod_link_info.length; i++) {
        product_model.create(
            {
                prod_link_info:prod_link_info[i],
            }, function(err, product){
                if(err) console.log(err);
                else console.log(product);
            });
    } 
});
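
As an optional refinement, the emitted links could be cleaned up before they are saved, since each href has "\n" appended in the evaluate step. The sketch below assumes the handler receives an array of strings; toProductDocs is a hypothetical helper name, not part of SpookyJS or mongoose:

```javascript
// Hypothetical helper: turns the emitted array of href strings into
// plain documents shaped like productSchema ({ prod_link_info: String }).
function toProductDocs(links) {
    return (links || [])
        .filter(function (link) { return typeof link === 'string'; })
        .map(function (link) { return link.trim(); })            // drop the trailing "\n"
        .filter(function (link) { return link.length > 0; })     // skip empty entries
        .map(function (link) { return { prod_link_info: link }; });
}

// Example input resembling what the 'data.ready' handler might receive.
var docs = toProductDocs(['/item/1\n', '  ', '/item/2\n']);
console.log(docs.length); // 2
```

Inside the spooky.on('data.ready', ...) handler, the resulting array could then be passed to product_model.create in place of the loop, assuming the installed mongoose version accepts an array of documents there.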
Sergey Lapin