I noticed that this has been asked before, but no one has received an answer yet, so I'll try my best to ask too.
For the last several months, my WordPress website, http://geekvision.tv/ , has been undetectable by Facebook's debugger. I…
I have an IMDb scraper from another site. It worked very well, but IMDb has since changed its HTML output and the regular expression no longer finds the poster. I'm a novice at regex, so maybe someone can help me.
This is the line:
$arr['poster'] =…
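Since the asker's actual line is truncated above, here is a hypothetical, minimal Python sketch of the technique in question: capturing an image URL from an HTML snippet with a regular expression. The sample markup, URL, and function name are all invented for illustration, not taken from the scraper.

```python
import re

# Hypothetical illustration: the asker's real regex and IMDb's real markup
# are not shown, so both the sample HTML and the pattern below are invented.
SAMPLE_HTML = '<div class="poster"><img src="https://example.com/poster.jpg" alt="Poster"></div>'

def extract_poster(html):
    """Return the src of the first <img> tag, or None if nothing matches."""
    match = re.search(r'<img[^>]+src="([^"]+)"', html)
    return match.group(1) if match else None

print(extract_poster(SAMPLE_HTML))  # https://example.com/poster.jpg
```

For real pages, an HTML parser is usually more robust than a regex, since markup changes like the one described here break regexes easily.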
I am new to Python and I am trying to scrape data from Yellow Pages. I was able to scrape it, but I got a messed-up result.
This is the result I got:
2013-03-24 20:26:47+0800 [scrapy] INFO: Scrapy 0.14.4 started (bot: eyp)
2013-03-24 20:26:47+0800…
I have code that parses through the text files in a folder and saves a predefined number of words around certain search words.
For example, it looks for words such as "date" and "year". If it finds both in the same sentence, it will save the sentence…
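A minimal sketch of the kind of extraction the question describes: collect a fixed window of words around each search-word hit. The function name, window size, and sample text are assumptions, not the asker's code.

```python
import re

def context_window(text, keywords, window=5):
    """Collect `window` words on each side of every keyword occurrence."""
    words = re.findall(r"\w+", text)
    hits = []
    for i, word in enumerate(words):
        if word.lower() in keywords:
            start = max(0, i - window)  # clamp at the start of the text
            hits.append(words[start:i + window + 1])
    return hits

text = "The date of the meeting was moved to the following year by the board."
for snippet in context_window(text, {"date", "year"}, window=3):
    print(" ".join(snippet))
```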
I’d like to do some hygiene on a bloated images folder/directory for a website of mine. I’m just a grade above novice working with JavaScript, but it seems like it might be possible to achieve a solution using JavaScript…
The solution I’m searching for…
It might just be an idiotic bug in the code that I haven't yet discovered, but it's been taking me quite some time: when parsing websites using Nokogiri and XPath, and trying to save the content of the XPaths to a .csv file, the CSV file has empty…
I'm trying to use Behat/Mink in order to load a website.
I've used Composer for the installation, this is my composer.json:
{
"require": {
"behat/mink": "*",
"behat/mink-goutte-driver": "*",
"behat/mink-selenium-driver":…
When I execute a scraper, it loads the URL using this method:
$html = scraperWiki::scrape("foo.html");
So every time I add new code to the scraper and want to try it, it loads the HTML again, which takes a fair amount of time.
Is there any way…
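One common fix for this is to cache the downloaded HTML on disk so repeated runs skip the network. The question uses ScraperWiki's PHP API; this is only a Python sketch of the idea, with the actual downloader injected as a parameter, and all names here are invented.

```python
import hashlib
import os
import tempfile

# Assumption: any writable directory works as a cache location.
CACHE_DIR = os.path.join(tempfile.gettempdir(), "html_cache")

def cached_fetch(url, fetcher):
    """Download `url` via `fetcher(url)` once; serve repeat calls from disk.

    `fetcher` stands in for whatever actually performs the request
    (urllib.request.urlopen, ScraperWiki's scrape method, etc.); injecting
    it keeps the sketch self-contained.
    """
    os.makedirs(CACHE_DIR, exist_ok=True)
    name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".html"
    path = os.path.join(CACHE_DIR, name)
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            return f.read()
    html = fetcher(url)
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
    return html
```

Deleting the cache directory (or one stale file in it) forces a re-download when the remote page actually changes.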
How do I download all images from a web page and prefix the image names with the web page's URL (all symbols replaced with underscores)?
For example, if I were to download all images from http://www.amazon.com/gp/product/B0029KH944/, then the main…
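The naming scheme described (page URL with symbols replaced by underscores, prepended to the image file name) can be sketched as a pure function; the downloading itself would go through urllib or similar. The function names are assumptions.

```python
import re
from urllib.parse import urlparse

def url_to_prefix(url):
    """Collapse every run of non-alphanumeric characters into one underscore."""
    return re.sub(r"[^A-Za-z0-9]+", "_", url).strip("_")

def prefixed_name(page_url, image_url):
    """File name for a downloaded image: page prefix + original image name."""
    image_name = urlparse(image_url).path.rsplit("/", 1)[-1]
    return url_to_prefix(page_url) + "_" + image_name

print(prefixed_name("http://www.amazon.com/gp/product/B0029KH944/",
                    "http://img.example.com/images/main.jpg"))
# http_www_amazon_com_gp_product_B0029KH944_main.jpg
```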
I've been trying to find a statistics-esque formula for calculating the rate of change for HTML tags that are either added to or removed from various websites.
So, for example, with the scraper I'm writing, I obtain the initial tag count and then…
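A sketch of one way to compute that rate of change: count opening tags in each snapshot with the standard-library HTML parser, then take the per-tag relative difference. The infinity convention for newly appearing tags is just one possible choice, not something from the question.

```python
from collections import Counter
from html.parser import HTMLParser

class TagCounter(HTMLParser):
    """Count every opening tag in an HTML document."""
    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1

def tag_counts(html):
    parser = TagCounter()
    parser.feed(html)
    return parser.counts

def rate_of_change(old, new):
    """Per-tag relative change (new - old) / old between two snapshots.
    Tags absent from the old snapshot are reported as float('inf')."""
    changes = {}
    for tag in set(old) | set(new):
        before, after = old.get(tag, 0), new.get(tag, 0)
        changes[tag] = (after - before) / before if before else float("inf")
    return changes
```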
I'm looking for a good way to do this: my current method doesn't seem to allow search depths beyond 30-40, even after editing the php.ini settings in the hope of increasing the default execution time as well as the maximum memory usage. Basically, as soon as the…
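A depth limit like the one described often comes from recursion hitting execution-time or stack limits; a common restructuring is an iterative breadth-first traversal with an explicit queue, so depth is bounded by memory rather than the call stack. This is a Python sketch under that assumption, with the page-fetching step injected as a function so the example stays self-contained.

```python
from collections import deque

def crawl(start, get_links, max_depth):
    """Breadth-first traversal with an explicit queue instead of recursion.

    `get_links(url)` stands in for whatever fetches a page and extracts
    its links; here it is a parameter so the sketch needs no network.
    """
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth >= max_depth:
            continue  # don't expand links past the requested depth
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order

graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
print(crawl("a", lambda u: graph.get(u, []), max_depth=1))  # ['a', 'b', 'c']
```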
I've developed an image scraper that scrapes specific images from remote sites and displays them upon pasting into a text field. The logic includes finding images that end in .jpg, .jpeg, .png, etc.
I'm running into an issue where a lot of sites…
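The extension-matching step can be sketched as a single regular expression; allowing a query string or fragment after the extension is one common reason strict end-of-URL matching misses images on many sites. The gif/webp alternatives stand in for the question's "etc." and are assumptions.

```python
import re

# Query strings and fragments may follow the extension, so the pattern ends
# with a lookahead for a delimiter rather than anchoring at end-of-string.
IMAGE_URL = re.compile(
    r"""https?://\S+?\.(?:jpe?g|png|gif|webp)(?=["'\s?#]|$)""",
    re.IGNORECASE,
)

def find_image_urls(text):
    """Return every image URL found in a blob of text or HTML."""
    return IMAGE_URL.findall(text)
```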
I'm trying to develop 'want' and 'own' buttons.
If I use the Facebook debug tool, it tells me the final URL is the home page; this happened because the page was redirected, which I don't want. I want the fetched URL to be scraped.
As a…
What can I use to achieve the following: script a browser, or otherwise make requests to the server, log in, browse the site, e.g. find links and navigate to those links?
For now, since I am into NodeJS, I was looking at node.io. It allows you to…
I suspect this is trivial, but I hope someone can help me with an lxml query I've got in a scraper I'm trying to build.
https://scraperwiki.com/scrapers/thisisscraper/
I'm working line by line through tutorial 3 and have got so far…