Actually, you don't need to iterate over the whole "div #wob_wc" container. The current location, weather, date, temperature, precipitation, humidity, and wind each live in a single element that isn't repeated anywhere else on the page, so you can grab each one directly with select_one() or find() instead.
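As a minimal sketch of that point, the snippet below parses a small hand-written HTML fragment and reads two single-occurrence elements, one with select_one() and one with find(). The fragment is an assumption standing in for Google's real markup; only the `wob_*` ids are taken from this answer.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for the weather widget (assumption, not real Google HTML).
html = """
<div id="wob_wc">
  <span id="wob_tm">87</span>
  <span id="wob_hm">70%</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# CSS-selector lookup: returns the first matching element (or None).
temperature = soup.select_one("#wob_tm").text

# Attribute lookup with find(): same idea, different API.
humidity = soup.find(id="wob_hm").text

print(temperature, humidity)
```

Both calls return a single element, so no loop is needed for values that appear once.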
If you do want to iterate over something, the temperature forecast is a good candidate, for example:
for forecast in soup.select('.wob_df'):
    high_temp = forecast.select_one('.vk_gy .wob_t:nth-child(1)').text
    low_temp = forecast.select_one('.QrNVmd .wob_t:nth-child(1)').text
    print(f'High: {high_temp}, Low: {low_temp}')
'''
High: 67, Low: 55
High: 65, Low: 56
High: 68, Low: 55
'''
Have a look at the SelectorGadget Chrome extension, which lets you grab CSS
selectors by clicking on the desired element in your browser, and at a CSS
selectors reference.
Code and full example in the online IDE:
from bs4 import BeautifulSoup
import requests  # the lxml parser must be installed, but it does not need to be imported

# Browser-like User-Agent so the request looks like a regular browser visit
headers = {
    "User-Agent":
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

params = {
    "q": "phagwara weather",  # search query
    "hl": "en",               # interface language
    "gl": "us"                # country to search from
}

response = requests.get('https://www.google.com/search', headers=headers, params=params)
soup = BeautifulSoup(response.text, 'lxml')

# Current conditions: each id occurs only once, so select_one() is enough
weather_condition = soup.select_one('#wob_dc').text
temperature = soup.select_one('#wob_tm').text
precipitation = soup.select_one('#wob_pp').text
humidity = soup.select_one('#wob_hm').text
wind = soup.select_one('#wob_ws').text
current_time = soup.select_one('#wob_dts').text

print(f'Weather condition: {weather_condition}\n'
      f'Temperature: {temperature}°F\n'
      f'Precipitation: {precipitation}\n'
      f'Humidity: {humidity}\n'
      f'Wind speed: {wind}\n'
      f'Current time: {current_time}\n')

print('Forecast temperature:')

# Daily forecast: one .wob_df element per day, so this part is worth a loop
for forecast in soup.select('.wob_df'):
    day = forecast.select_one('.QrNVmd').text
    weather = forecast.select_one('img.uW5pk')['alt']
    high_temp = forecast.select_one('.vk_gy .wob_t:nth-child(1)').text
    low_temp = forecast.select_one('.QrNVmd .wob_t:nth-child(1)').text
    print(f'Day: {day}\nWeather: {weather}\nHigh: {high_temp}, Low: {low_temp}\n')
---------
'''
Weather condition: Partly cloudy
Temperature: 87°F
Precipitation: 5%
Humidity: 70%
Wind speed: 4 mph
Current time: Tuesday 4:00 PM
Forecast temperature:
Day: Tue
Weather: Partly cloudy
High: 90, Low: 76
...
'''
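The full example calls .text directly on each select_one() result. select_one() returns None when nothing matches, so a changed class name or a blocked request would raise an AttributeError. A minimal sketch of a more defensive lookup follows; the helper name `text_or_default` and the test fragment are hypothetical and not part of the original code.

```python
from bs4 import BeautifulSoup


def text_or_default(node, selector, default="n/a"):
    """Return the stripped text of the first match, or `default` if nothing matches."""
    element = node.select_one(selector)
    return element.get_text(strip=True) if element is not None else default


# Hand-written fragment (assumption): the humidity element is deliberately missing.
html = '<div id="wob_wc"><span id="wob_tm"> 87 </span></div>'
soup = BeautifulSoup(html, "html.parser")

temperature = text_or_default(soup, "#wob_tm")
humidity = text_or_default(soup, "#wob_hm")

print(temperature, humidity)
```

The same helper can be passed each `forecast` element inside the loop, since select_one() works on any tag as well as on the whole soup.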
Alternatively, you can achieve the same result with the Google Direct Answer Box API from SerpApi. It's a paid API with a free plan.
The main difference in your case is that you only need to iterate over already extracted, structured data, rather than building the parser from scratch or figuring out how to bypass blocks from Google.
Code to integrate:
import json
import os

from serpapi import GoogleSearch  # installed with the google-search-results package

params = {
    "engine": "google",
    "q": "phagwara weather",
    "api_key": os.getenv("API_KEY"),  # API key read from an environment variable
    "hl": "en",
    "gl": "us",
}

search = GoogleSearch(params)
results = search.get_dict()

# Everything is already parsed under the 'answer_box' key
loc = results['answer_box']['location']
weather_date = results['answer_box']['date']
weather = results['answer_box']['weather']
temp = results['answer_box']['temperature']
precipitation = results['answer_box']['precipitation']
humidity = results['answer_box']['humidity']
wind = results['answer_box']['wind']
forecast = results['answer_box']['forecast']

print(f'{loc}\n{weather_date}\n{weather}\n{temp}°F\n{precipitation}\n{humidity}\n{wind}\n')
print(json.dumps(forecast, indent=2))
---------
'''
Phagwara, Punjab, India
Tuesday 4:00 PM
Partly cloudy
87°F
5%
70%
4 mph
[
  {
    "day": "Tuesday",
    "weather": "Partly cloudy",
    "temperature": {
      "high": "90",
      "low": "76"
    },
    "thumbnail": "https://ssl.gstatic.com/onebox/weather/48/partly_cloudy.png"
  }
  ...
]
'''
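To illustrate iterating over already extracted data, here is a small sketch that walks a forecast list shaped like the output above. The single sample entry is copied from that output, and the `format_day` helper is a hypothetical name introduced only for this example; no API call is made.

```python
# Sample entry copied from the output above; a real response has one entry per day.
forecast = [
    {
        "day": "Tuesday",
        "weather": "Partly cloudy",
        "temperature": {"high": "90", "low": "76"},
        "thumbnail": "https://ssl.gstatic.com/onebox/weather/48/partly_cloudy.png",
    },
]


def format_day(entry):
    """Build a one-line summary; .get() tolerates missing keys."""
    temps = entry.get("temperature", {})
    return (f"{entry.get('day')}: {entry.get('weather')}, "
            f"high {temps.get('high')}, low {temps.get('low')}")


lines = [format_day(entry) for entry in forecast]
print("\n".join(lines))
```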
Disclaimer: I work for SerpApi.