I'm trying to get arrest data from the police blotter of the Palm Beach County Sheriff's Office.

I've limited my search to the city of West Palm Beach, going back as far as the data goes (Oct. 31, 1974).

I'm using Firefox.

When I get the results, I open Firebug, check the HTML tab, and I can see the info I want from the page (i.e., the arrested person's name, arrest address, charges, etc.).

I checked the Net > XHR > Post tab to find the POST request parameters and put them into my code, yet the HTML it returns does not include the vital info I'm looking for.

Does anyone know if I'm just doing it wrong, or if the site is unscrapeable? Here's my code:

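require 'rubygems'
require 'nokogiri'
require 'restclient'

blotter_url = 'http://www.pbso.org/index.cfm?fa=blotter'

# Form fields copied from the POST request shown in Firebug
city = 'west palm beach'
fromrec = 1

# RestClient.post raises on 4xx/5xx responses, so no `if` guard is needed
page = RestClient.post(blotter_url, 'city_name' => city, 'fromrec' => fromrec)

# Parse the response body and print the resulting document
puts Nokogiri::HTML(page.body)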
Username

1 Answer

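It's because the page is populated by Ajax updates after it loads, so the HTML that RestClient gets back doesn't contain the results yet. watir-webdriver, which drives a real browser, is probably your best option.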
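
As a rough illustration of that approach, here is a minimal sketch, not a tested scraper for this site. It assumes the watir-webdriver and nokogiri gems are installed and Firefox is available. The method names (fetch_rendered_html, print_rendered_page) and the settle_seconds delay are made up for the example, and the fixed sleep is only a crude stand-in for waiting on a specific element.

```ruby
# Sketch only: needs the watir-webdriver and nokogiri gems plus Firefox.
# The requires sit inside the methods so the file loads without the gems.

# Open the URL in a real browser, give the Ajax calls time to finish,
# and return the HTML as the browser currently sees it.
def fetch_rendered_html(url, settle_seconds = 5)
  require 'watir-webdriver'

  browser = Watir::Browser.new :firefox
  browser.goto url
  # Crude wait; settle_seconds is a guess, not something the site documents.
  sleep settle_seconds
  browser.html
ensure
  # Close the browser even if something above raised.
  browser.close if browser
end

# Parse the rendered HTML with Nokogiri and print it, as in the question.
def print_rendered_page(url)
  require 'nokogiri'
  puts Nokogiri::HTML(fetch_rendered_html(url))
end
```

You would call it as print_rendered_page('http://www.pbso.org/index.cfm?fa=blotter'). This only loads the page; the search form still has to be filled in and submitted through the browser object before grabbing the HTML, and the locators for those fields have to come from inspecting the page in Firebug.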

pguardiario
  • Alright, I've got watir-webdriver installed. I'm looking for a good web-scraping guide for it. Which would you say is the best? Thanks for responding. – Username Jul 11 '12 at 19:56