I am trying to use Selenium to obtain the value of the selected option from a drop-down list in a Scrapy spider, but I am unsure how to go about it. This is my first time using Selenium.
As you can see in the code below, I create a request in the parse function, which calls the parse_page function as a callback. In parse_page I want to extract the value of the selected option. I can't figure out how to attach the webdriver to the response passed into parse_page so that I can use it with Select. I have written some obviously wrong code below :(
import logging

import scrapy
from scrapy.spider import Spider
from scrapy.selector import Selector
from scrapy.http import Request
from scrapy.http import FormRequest
from scrapy.exceptions import CloseSpider
from scrapy.utils.response import open_in_browser
from selenium import webdriver
from selenium.webdriver.support.ui import Select

from activityadvisor.items import TruYog

logging.basicConfig()
logger = logging.getLogger()

class TrueYoga(Spider):
    name = "trueyoga"
    allowed_domains = ["trueyoga.com.sg", "trueclassbooking.com.sg"]
    start_urls = [
        "http://trueclassbooking.com.sg/frames/class-schedules.aspx",
    ]

    def parse(self, response):
        clubs = []
        clubs = Selector(response).xpath('//div[@class="club-selections"]/div/div/div/a/@rel').extract()
        clubs.sort()
        print 'length of clubs = ', len(clubs), '1st content of clubs = ', clubs
        req = []
        for club in clubs:
            payload = {'ctl00$cphContents$ddlClub': club}
            req.append(FormRequest.from_response(response, formdata=payload, dont_click=True, callback=self.parse_page))
        for request in req:
            yield request

    def parse_page(self, response):
        driver = webdriver.Firefox()
        driver.get(response)
        clubSelect = Select(driver.find_element_by_id("ctl00_cphContents_ddlClub"))
        option = clubSelect.first_selected_option
        print option.text

Is there any way to obtain this option value in Scrapy without using Selenium? My searches on Google and Stack Overflow haven't yielded any useful answers so far.
Thanks for the help!