I've written a script in python which is able to take search input
from an excel file, put it into the input_val
variable within my scraper and fetch the image link
matching with the search parameter. When I print it, I get the accurate results. I'm using openpyxl
to read and write the values.
However, when I try to write the populated results next to each search keyword in a new excel file I get stuck.
There are three search keyword (three movies) in my current excel file named item.xlsx
. Which are:
Shutter Island
Black Swan
True Grit
If I run my existing scraper without any modification then the results in a new file looks like:
Shutter Island
Black Swan
True Grit
image_link 1
image_link 2
image_link 3
I expect my scraper will fetch the image links and write those just next to each movies in a new excel file. As a result the output should look like below:
Column A Column B
Shutter Island image_link 1
Black Swan image_link 2
True Grit image_link 3
This Is what I've tried so far:
import requests
from bs4 import BeautifulSoup
from openpyxl import load_workbook
wb = load_workbook('item.xlsx')
ws = wb['Sheet1']
for row in range(1, ws.max_row + 1):
input_val = ws["A" + str(row)].value #the search keyword holds here
response = requests.get("http://www.boxofficemojo.com/search/?",params = {'q':input_val})
soup = BeautifulSoup(response.text,"lxml")
table = soup.select("table")[1]
for items in table.select('tr')[4:5]:
[elem.extract() for elem in soup.select("script")] #kicking out script from result
data = [img['src'] for img in items.select('td img')]
ws.append(data)
wb.save("new_one.xlsx")