2

I'm having a hard time figuring out how to scrape this webpage to get this wedding list into my one-page site. It doesn't seem complicated at first, but when I get into the code I just can't get any results.

I've tried ygrab.js, which was fairly simple and got me somewhere, but I can't seem to scrape the images, and it only prints the output to the console (there isn't much documentation to go on).

    $(function() {
        var $listResult = $('#list-result');
        var kado = [];
        var data = [
            {
                url: 'https://www.kadolog.com/fr/list/liste-de-mariage-laura-julien',
                selector: '.kado-not-full',
                loop: true,
                result: [
                    {
                        name: 'photo',
                        find: '.views-field-field-photo',
                        grab: {
                            by: 'attr',
                            value: 'src'
                        }
                    },
                    {
                        name: 'title',
                        find: '.views-field-title .field-content',
                        grab: {
                            by: 'text',
                            value: ''
                        }
                    },
                    {
                        name: 'description',
                        find: '.views-field-body .field-content',
                        grab: {
                            by: 'text',
                            value: ''
                        }
                    },
                    {
                        name: 'price',
                        find: '.price',
                        grab: {
                            by: 'text',
                            value: ''
                        }
                    },
                    {
                        name: 'remaining',
                        find: '.topinfo',
                        grab: {
                            by: 'text',
                            value: ''
                        }
                    },
                    {
                        name: 'link',
                        find: '.views-field-nothing .field-content .btn',
                        grab: {
                            by: 'attr',
                            value: 'href'
                        }
                    }
                ]
            }
        ];
        ygrab(data, function(result) {
            console.log(JSON.stringify(result, null, 2)); // photos = undefined
        });
    });

Then there's Node.js with Request and Cheerio (I tried Crawler too), but I have no idea how Node works.

    var request = require("request");

This gives me an error in the console saying require is not defined. Fair enough, so I added require.js to the scripts in my page, and then got another error ("Uncaught Error: Mismatched anonymous define() module: ...").


My question is this: is there a simple JavaScript way (possibly without involving Node) to scrape the wedding list I'm trying to get? Or maybe a tutorial that walks through something similar step by step?

I'd be truly grateful for any help or advice.

Sourav Ghosh
Elydee

2 Answers

0

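I think your only issue is the photo selector: it targets the field wrapper rather than the img element inside it, and the wrapper has no src attribute to grab. Change this: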

    {
        name: 'photo',
        find: '.views-field-field-photo',
        grab: {
            by: 'attr',
            value: 'src'
        }
    },

to this:

    {
        name: 'photo',
        find: '.views-field-field-photo .field-content img',
        grab: {
            by: 'attr',
            value: 'src'
        }
    },

I can't test this right now, but it should work.
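Since the question also mentions that the output only reaches the console, here is a minimal sketch of turning the scraped items into markup for the page. It assumes (I have not verified this against ygrab.js) that the callback receives an array of plain objects keyed by the `name` values from the config. The function names `escapeHtml`, `renderGift` and `renderGiftList`, the CSS class names and the link text are all made up for illustration.

```javascript
// Escape scraped text before putting it into HTML.
function escapeHtml(value) {
  return String(value == null ? '' : value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the markup for one gift.
// Assumption: `gift` has the keys used as `name` in the ygrab config.
function renderGift(gift) {
  var photo = gift.photo
    ? '<img src="' + escapeHtml(gift.photo) + '" alt="">'
    : '';
  var link = gift.link
    ? '<a href="' + escapeHtml(gift.link) + '">View gift</a>'
    : '';
  return '<li class="gift">' +
    photo +
    '<h3>' + escapeHtml(gift.title) + '</h3>' +
    '<p>' + escapeHtml(gift.description) + '</p>' +
    '<p class="price">' + escapeHtml(gift.price) + '</p>' +
    link +
    '</li>';
}

// Build the whole list as one HTML string.
function renderGiftList(gifts) {
  return '<ul class="gift-list">' + gifts.map(renderGift).join('') + '</ul>';
}

// In the page, inside the ygrab callback (assumed result shape):
// $('#list-result').html(renderGiftList(result));
```

Building a string first and inserting it once keeps the DOM work to a single call, and the escaping step means scraped text containing `<` or `&` will not break the page.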

tjadli
0

Node.js is a separate application that executes JavaScript independently of a web page.

require is Node's way of importing packages and isn't defined by the browser. require.js is a JavaScript library for loading modules in the browser, but it doesn't work the same way as Node's require function.
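To make the distinction concrete, here is a small sketch that reports where a script is running and whether a `require` function exists there. The function name `detectRuntime` is just for illustration. Saved to a file and run with `node`, it takes the Node branch; loaded from a `<script>` tag in a page, it takes the browser branch.

```javascript
// Report which environment this script is running in.
function detectRuntime() {
  // Browsers expose a global `window` with a `document`.
  if (typeof window !== 'undefined' && typeof window.document !== 'undefined') {
    return 'browser';
  }
  // Node.js exposes a global `process` with its version information.
  if (typeof process !== 'undefined' && process.versions && process.versions.node) {
    return 'node';
  }
  return 'unknown';
}

// `typeof` is safe to use on a name that may not be defined at all.
console.log('runtime:', detectRuntime());
console.log('typeof require:', typeof require);
```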

To use request and cheerio, you'd need to install Node.js from here, then install request and cheerio with the following commands:

  • npm install request --save
  • npm install cheerio --save

Any script you then run with Node.js from that directory will have access to these modules.

Here's a tutorial on web scraping in Node.js with cheerio.
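As a rough idea of what such a script could look like, here is a minimal sketch using request and cheerio. The URL and selectors are copied from the question and the other answer, and I have not checked them against the live page, so treat them as assumptions. The helper names `cleanText`, `absoluteUrl` and `scrapeList` are made up for illustration.

```javascript
// scrape.js: run with `node scrape.js` after installing request and cheerio.
var LIST_URL = 'https://www.kadolog.com/fr/list/liste-de-mariage-laura-julien';

// Collapse runs of whitespace so scraped text is easier to display.
function cleanText(text) {
  return String(text || '').replace(/\s+/g, ' ').trim();
}

// Turn a possibly relative href/src into an absolute URL.
// `URL` is built into Node; returns null when the attribute is missing.
function absoluteUrl(href, base) {
  return href ? new URL(href, base).href : null;
}

// Fetch the page and pass an array of gift objects to callback(error, gifts).
function scrapeList(url, callback) {
  // Required here so the helpers above still load without the packages installed.
  var request = require('request');
  var cheerio = require('cheerio');

  request(url, function (error, response, body) {
    if (error) {
      return callback(error);
    }
    var $ = cheerio.load(body);
    var gifts = [];
    // Assumption: each gift on the page matches '.kado-not-full'.
    $('.kado-not-full').each(function (i, el) {
      var item = $(el);
      gifts.push({
        photo: absoluteUrl(item.find('.views-field-field-photo .field-content img').attr('src'), url),
        title: cleanText(item.find('.views-field-title .field-content').text()),
        description: cleanText(item.find('.views-field-body .field-content').text()),
        price: cleanText(item.find('.price').text()),
        link: absoluteUrl(item.find('.views-field-nothing .field-content .btn').attr('href'), url)
      });
    });
    callback(null, gifts);
  });
}

// Usage (uncomment to run against the live page):
// scrapeList(LIST_URL, function (error, gifts) {
//   if (error) { throw error; }
//   console.log(JSON.stringify(gifts, null, 2));
// });
```

The output is still just JSON in the terminal. To get it into a page you would have to write it to a file or serve it from a small server that the page can read.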

Spooze