I'm using Cheerio (https://github.com/MatthewMueller/cheerio) to scrape websites and collect images for a project I'm working on. Is there an easy way in Node.js (or with another package) to convert the value of $(img).attr('src') to a fully qualified URL? Sometimes I get "image.jpg", other times "../../image.jpg", and other times "//somepath/image.jpg". Perhaps I'm just missing a regex of some sort... Thanks for your time :)
We will need the URL of the scraped site, or an example of a site like that. Either way, I recommend building yourself an extra function to parse these values. – Herman Junge Oct 26 '12 at 03:42
Ohh Brilliant !! I was troubled by the exact same thing, was manually writing out solutions for each of these. God bless SO ! – vishalv2050 May 31 '14 at 15:28
1 Answer
Look at the Node url module. Specifically, url.resolve(from, to) should be what you're looking for.
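
As a rough sketch of how that might look for the three cases in the question: the page URL below is a made-up placeholder (in practice it would be the URL of the page you scraped), and toAbsolute is just a hypothetical helper name. The src values are hard-coded here; in your scraper they would come from $(img).attr('src').

```javascript
const url = require('url');

// Assumption: the URL of the page the images were scraped from.
// This value is hypothetical and only here for illustration.
const pageUrl = 'http://example.com/a/b/page.html';

// Hypothetical helper: resolve an img src against the page it came from.
function toAbsolute(base, src) {
  return url.resolve(base, src);
}

// The three kinds of src values mentioned in the question.
const samples = ['image.jpg', '../../image.jpg', '//somepath/image.jpg'];

const resolved = samples.map((src) => toAbsolute(pageUrl, src));

// The WHATWG URL constructor should give the same results for these inputs.
const viaUrlClass = samples.map((src) => new URL(src, pageUrl).href);

console.log(resolved);
// [ 'http://example.com/a/b/image.jpg',
//   'http://example.com/image.jpg',
//   'http://somepath/image.jpg' ]
```

Note that the protocol-relative "//somepath/image.jpg" picks up the scheme of the page URL, so it is worth passing the real page URL rather than just the host. Newer Node versions mark url.resolve as legacy in the docs, so new URL(src, pageUrl).href may be the safer choice there.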

Waylon Flinn