0

I would like to open this page, select "Rüzgar" from the last dropdown, run the query with the "Sorgula" button, and extract all the coordinates stored in the table that appears when I click the first button in the first column of the main table. I want to do that for all the rows.

Unfortunately, I don't have enough programming experience to carry out this task. However, since I am a little familiar with programming, I think that if somebody pointed me to the right resources for learning how to do this (given the requirements of the web page I am trying to extract data from), I could build a small script for it, maybe with Scrapy or some other tool.

P.S.: I tried to do it with Scrapinghub's Portia, but that did not work either.

Sam

2 Answers

2

Take a look at the Python module called selenium, namely the webdriver part of it. Some quick code that would perform the search query you're after could be written like this:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
search_link = 'http://lisans.epdk.org.tr/epvys-web/faces/pages/lisans/elektrikUretimOnLisans/elektrikUretimOnLisansOzetSorgula.xhtml?lisansDurumu=7'

driver.get(search_link)
last_dropdown_menu = driver.find_element(By.ID, 'elektrikUretimOnLisansOzetForm:j_idt32')

last_dropdown_menu.click()  # send a click to the element to open the dropdown
last_dropdown_menu.send_keys('R')  # type 'R' to scroll to "Rüzgar"
# click the "Sorgula" button; click() returns None, so the result is not stored
driver.find_element(By.XPATH, '//*[@id="elektrikUretimOnLisansOzetForm:j_idt51"]/span[2]').click()

From there, you can figure out how to scrape the info you're after :-)
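For going through every row, here is a rough sketch of how the loop might look, not a tested solution for this page. The `row_button_xpath` argument is a hypothetical locator for the first-column button of each row, and `coord_table_id` is whatever ID the coordinates table turns out to have in the page source; both have to be found by inspecting the page. `parse_rows` and `scrape_coordinates` are names made up for this example.

```python
def parse_rows(row_texts):
    """Turn the visible text of each table row into a list of cell strings.

    Assumes cells are separated by whitespace and contain no spaces themselves.
    """
    return [text.split() for text in row_texts if text.strip()]


def scrape_coordinates(driver, row_button_xpath, coord_table_id, timeout=10):
    """Click each row's button in turn and collect the rows of the table shown.

    row_button_xpath: hypothetical XPath matching the first-column button of
    every row in the main table.
    coord_table_id: ID of the element holding the coordinate rows.
    """
    # Imported here so parse_rows() stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    results = []
    n_buttons = len(driver.find_elements(By.XPATH, row_button_xpath))
    for i in range(n_buttons):
        # Re-locate the buttons on every pass: the page may redraw the table,
        # which would make previously found elements stale.
        driver.find_elements(By.XPATH, row_button_xpath)[i].click()
        # Wait until the coordinates table is visible. If a previous pop-up
        # is still open, this returns immediately, so it may need closing first.
        table = WebDriverWait(driver, timeout).until(
            EC.visibility_of_element_located((By.ID, coord_table_id))
        )
        texts = [tr.text for tr in table.find_elements(By.TAG_NAME, "tr")]
        results.append(parse_rows(texts))
        # Closing the pop-up (if the page requires it) is left out of this sketch.
    return results
```

After running the query, this would be called as `scrape_coordinates(driver, '<your XPath here>', '<coordinates table ID>')`. It only covers the rows on the current results page, so paging through the results would need an extra loop.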

n1c9
  • Thanks for the very helpful answer. But I really need some more explanation of how I can automate scraping all the coordinates that appear when the first button in the first column is clicked (I need to automate going through all the rows) – Sam Mar 23 '16 at 19:23
  • 1
    Send the clicks needed to make the info you want to scrape pop up, then inspect the source of the page and see what kind of tags that info is in. Scrape the text in those tags like so (with `from selenium.webdriver.common.by import By`): `table_you_want = driver.find_element(By.ID, 'elektrikKoordinatViewDataTable_data')` and then `for tr in table_you_want.find_elements(By.TAG_NAME, 'tr'): print(tr.text)` – n1c9 Mar 23 '16 at 19:29
  • Thanks a lot, I will try to do it the way you have explained. – Sam Mar 23 '16 at 19:41
1

Selenium might be OK, since there are only 3 pages when you set the pagination at the bottom to 500. Nevertheless, I wouldn't go with Selenium because it's ... there are better ways.

All you do when you click the "Rüzgar" button is a POST request with the following arguments:

[Image: "it's just a post request"]

Open the Chrome developer tools and see for yourself what kind of requests you're making. You can replicate the request. If you're interested in this method, let me know and I may write some more.
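As a minimal sketch of what replicating the request could look like, here is a form-encoded POST built with Python's standard library. The URL and the field names in `example_fields` are placeholders, not the real arguments of this page; the real ones have to be copied from the request shown in the browser's network panel. `build_post_request` is a helper name made up for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_post_request(url, form_fields):
    """Build (but do not send) a form-encoded POST request."""
    body = urlencode(form_fields).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical field names and values: copy the real ones from the
# request shown in the browser's network panel.
example_fields = {"some_form_field": "Rüzgar", "another_field": "value"}
request = build_post_request("http://example.invalid/query", example_fields)

# Sending is left commented out so the sketch has no network side effects:
# with urlopen(request) as response:
#     html = response.read().decode("utf-8")
```

The server may also expect cookies or hidden form fields from the initial page load, which this sketch does not handle, so the first attempt may well need adjusting against what the browser actually sends.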

neverlastn