Disclaimer: I'm learning to develop in Python, and I know my way of coding is probably rough, but I plan to keep improving while creating programs.
So I'm trying to build a scraper with Selenium that checks the prices of specific flights daily, and that part of the code is already done. Origin, destination, first flight date, second flight date and price will be saved every day. I'm saving that data to a file and then checking whether the price has changed since the previous day.
My aim is to check, for every compared flight, whether the price has changed by more than X percent and, if so, to print a message from the script.
import pandas as pd
import os.path
import numpy as np
#These are just sample data before integrating Selenium values
price = 230
departuredate = '20/02/2020'
returndate = '20/02/2020'
fromm = 'BOS'
to = 'JFK'
price2 = 630
departuredate2 = '20/02/2020'
returndate2 = '20/02/2020'
fromm2= 'CDG'
to2= 'JFK'
#End of sample data
flightdata = {'From': [fromm, fromm2], 'To': [to,to2], 'Departure date': [departuredate,departuredate2], 'Return date': [returndate,returndate2], 'Price': [price,price2]}
df = pd.DataFrame(flightdata, columns= ['From', 'To', 'Departure date', 'Return date', 'Price'])
#Check if the script is running for the first time
if os.path.exists('flightstoday.xls'):
    #Only delete yesterday's file if it exists (it does not on the second run)
    if os.path.exists('flightsyesterday.xls'):
        os.remove('flightsyesterday.xls')
    os.rename('flightstoday.xls', 'flightsyesterday.xls') #Rename the flights scraped on the previous run
    df.to_csv('flightstoday.xls', mode='a', header=True, sep='\t')
else:
    df.to_csv('flightstoday.xls', mode='w', header=True, sep='\t')
#Work with two dataframes
flightsyesterday = pd.read_csv("flightsyesterday.xls",sep='\t')
flightstoday = pd.read_csv("flightstoday.xls",sep='\t')
What I'm missing is how to compare the 'Price' column between the two dataframes and print a message saying that the flight in row N, with its 'From', 'To', 'Departure date' and 'Return date', has changed in price by X percent.
I have tried this code, but it only adds a column with the price difference to flightstoday, not the percentage, and of course it doesn't print a message when a price has changed.
flightstoday['PriceDiff'] = np.where(flightstoday['Price'] == flightsyesterday['Price'], 0, flightstoday['Price'] - flightsyesterday['Price'])
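To make the goal concrete, here is a minimal sketch of the kind of comparison I have in mind. It is only an illustration under some assumptions: the two sample dataframes stand in for the files read above, the names `KEYS`, `THRESHOLD` and `price_changes` are made up for this example, and a flight is assumed to be identified by its 'From', 'To', 'Departure date' and 'Return date' values rather than by its row position.

```python
import pandas as pd

# Hypothetical sample data standing in for flightsyesterday / flightstoday
flightsyesterday = pd.DataFrame({
    'From': ['BOS', 'CDG'],
    'To': ['JFK', 'JFK'],
    'Departure date': ['20/02/2020', '20/02/2020'],
    'Return date': ['20/02/2020', '20/02/2020'],
    'Price': [200, 630],
})
flightstoday = flightsyesterday.assign(Price=[230, 630])

# Assumption: these four columns together identify one flight
KEYS = ['From', 'To', 'Departure date', 'Return date']
THRESHOLD = 10.0  # assumed threshold, in percent


def price_changes(yesterday, today, threshold=THRESHOLD):
    """Return flights whose price moved by more than `threshold` percent."""
    # Match flights on their key columns instead of relying on row order
    merged = today.merge(yesterday, on=KEYS, how='inner',
                         suffixes=('_today', '_yesterday'))
    merged['PctChange'] = (
        (merged['Price_today'] - merged['Price_yesterday'])
        / merged['Price_yesterday'] * 100
    )
    # Keep increases and decreases alike by comparing the absolute change
    return merged[merged['PctChange'].abs() > threshold]


changes = price_changes(flightsyesterday, flightstoday)
messages = []
for _, row in changes.iterrows():
    messages.append(
        f"{row['From']} -> {row['To']} ({row['Departure date']} / "
        f"{row['Return date']}): price changed by {row['PctChange']:.1f}%"
    )
for message in messages:
    print(message)
```

With the sample values, only the BOS to JFK flight crosses the 10% threshold (200 to 230 is a 15% increase). Merging on the key columns means flights that appear in only one of the two files are simply left out of the comparison.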
Any help for this newbie will be greatly appreciated. Thank you!