I am trying to scrape https://www.freelance.nl/opdrachten/zoeken for data using Request and Cheerio but I am running into issues posting search terms.

When I use the site in the browser, I cannot see where the search string and the selected category are sent in the POST, so I don't know how to pass them to Request to automate searches from my Node application.

Basically, I want to send different search terms using Request so that I can scrape the returned HTML for the data I need.

So far I have this:

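var request = require('request');

// 'key' and 'value' are placeholders for the real form field names and values
request.post('https://www.freelance.nl/opdrachten/zoeken', { form: { key: 'value' } },
    function (error, response, body) {
        if (!error && response.statusCode === 200) {
            console.log(body);
        }
    }
);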

But as I cannot see where the form data is stored in dev tools, I cannot send the correct values in the 'form' object. I'm pretty sure it's in request payload, but how do I get to that from my node application?

Is there an easier way to do this? Am I completely wasting my time?

Dev tools screenshot

user2248441

2 Answers

Open your eyes ;) At the bottom of your screenshot, look at the Request Payload:

projectFilterForm[keywords]
projectFilterForm[category][]
projectFilterForm[province][]
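To make it clearer how those bracketed names end up in a request body, here is a minimal sketch using Node's built-in `querystring` module. It assumes the form is sent URL-encoded; if the browser actually sends `multipart/form-data`, the same field names would go into Request's `formData` option instead. The values `'java'`, `'1'` and `'2'` are made-up examples, not real category ids.

```javascript
var querystring = require('querystring');

// Field names copied from the Request Payload; the values are made-up examples.
var fields = {
    'projectFilterForm[keywords]': 'java',
    // An array produces one name=value pair per element
    'projectFilterForm[category][]': ['1', '2']
};

// URL-encoded body, as it would be sent with
// Content-Type: application/x-www-form-urlencoded
var body = querystring.stringify(fields);
console.log(body);

// Parsing the body back recovers the original names and values
var parsed = querystring.parse(body);
console.log(parsed['projectFilterForm[keywords]']);
console.log(parsed['projectFilterForm[category][]']);
```

The square brackets are percent-encoded in the body, which is normal; the server decodes them back into the field names shown above.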

Update

var request = require('request');

// Log outgoing requests and incoming responses while testing
request.debug = true;

var options = {
    followAllRedirects: true,
    uri: 'https://www.freelance.nl/opdrachten/zoeken',
    method: 'POST',
    // formData sends the fields as multipart/form-data and lets Request set
    // the boundary and Content-Length, so neither header is written by hand
    formData: {
        'projectFilterForm[keywords]': 'java'
    },
    headers: {
        'cache-control': 'no-cache',
        'origin': 'https://www.freelance.nl',
        'referer': 'https://www.freelance.nl/opdrachten/zoeken',
        'user-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.130 Safari/537.36'
    }
};

request(options, function (error, response, body) {
    console.log(body);
});

I tried everything =)) Nothing... after the redirect, we get the default page. Maybe they use some session-based protection?

This is not Node's problem. I tried to do this even in Chrome's Postman extension, with no luck.
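If the session guess is right, one thing worth trying is to load the search page first so that any cookies are stored, then post the form with the same cookie jar. The sketch below is untested against freelance.nl and rests on assumptions: that a plain GET is enough to obtain the session, and that the form is URL-encoded. `searchWithSession` and `buildForm` are hypothetical helper names; `jar`, `followAllRedirects`, `defaults` and `form` are Request options.

```javascript
var querystring = require('querystring');

var SEARCH_URL = 'https://www.freelance.nl/opdrachten/zoeken';

// Build the form fields from the names seen in the Request Payload.
function buildForm(keywords, categories) {
    var fields = { 'projectFilterForm[keywords]': keywords };
    if (categories && categories.length) {
        fields['projectFilterForm[category][]'] = categories;
    }
    // Encode here so array values become repeated name=value pairs
    return querystring.stringify(fields);
}

// `request` is passed in (normally require('request')) so the flow
// can also be exercised with a stub instead of the real network.
function searchWithSession(request, keywords, categories, callback) {
    // jar: true keeps cookies between the GET and the POST (assumption:
    // the site ties the search to a session cookie)
    var session = request.defaults({ jar: true, followAllRedirects: true });

    session.get(SEARCH_URL, function (error) {
        if (error) {
            return callback(error);
        }
        session.post(SEARCH_URL, { form: buildForm(keywords, categories) },
            function (postError, response, body) {
                callback(postError, body);
            });
    });
}
```

It would be called as `searchWithSession(require('request'), 'java', ['1'], function (err, html) { ... })`, and the returned HTML could then be handed to Cheerio. If the page also embeds a hidden token in the form, that token would have to be scraped from the GET response and added to the fields.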

Alexander R.

I've slightly modified your code:

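var request = require('request');

// Field names taken from the Request Payload; '1' is an example category id
var payload = {
    'projectFilterForm[keywords]': 'javascript',
    'projectFilterForm[category][]': '1'
};

// Request reads form fields from the `form` option, not `data`
request.post('https://www.freelance.nl/opdrachten/zoeken', { form: payload },
    function (error, response, body) {
        if (!error && response.statusCode === 200) {
            console.log(body);
        }
    }
);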

Igor Savinkin
  • It didn't help. It still gives me back the same HTML in the 'body' variable as if I hadn't entered any form data at all. – user2248441 Jul 08 '15 at 13:44