I have looked at some similar posts:
Storing the results of Web Scraping into Database
save scraped data to database python
How to save multiple scraped data to mysql using python
Now my question: I am periodically scraping a government website (for example, patent data) with cron jobs and saving the results into a MySQL database using Python, with the scrape timestamp in one column. I have planned two approaches for saving and accessing the data:
- Can I have one master table to which only new data is added, rather than creating completely new tables again and again?
- If I save the data in separate MySQL tables on the server, how do I detect changes to any separate entity?
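For the first approach, one possible sketch (assuming each scraped row carries a stable unique key, here a hypothetical `patent_id` column) is to load the existing keys from the master table and append only the rows that are not already present. The example uses SQLite in-memory so it runs anywhere; swap the URL for your MySQL connection string:

```python
import pandas as pd
import sqlalchemy as db

# Demo engine; replace with 'mysql://user:pass@host/dbname' in production.
engine = db.create_engine('sqlite://')

# Existing master table keyed by a hypothetical unique column, patent_id.
master = pd.DataFrame({'patent_id': ['P1', 'P2'], 'title': ['a', 'b']})
master.to_sql('master', engine, index=False, if_exists='replace')

# Freshly scraped batch: P2 is already stored, P3 is new.
scraped = pd.DataFrame({'patent_id': ['P2', 'P3'], 'title': ['b', 'c']})

# Anti-join: keep only rows whose key is not already in the master table,
# then append just those rows.
existing = pd.read_sql('SELECT patent_id FROM master', engine)
new_rows = scraped[~scraped['patent_id'].isin(existing['patent_id'])]
new_rows.to_sql('master', engine, index=False, if_exists='append')
```

For large tables you would push the deduplication into SQL (a `UNIQUE` index plus `INSERT IGNORE`) instead of pulling all keys into pandas, but the idea is the same.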
My code is simple:
from datetime import datetime as dt  # the scrape time and date are stored in a column
import pandas as pd
import numpy as np
import sqlalchemy as db
from sqlalchemy import inspect
from sqlalchemy.orm import sessionmaker

date = dt.today().strftime('%Y-%m-%d %H:%M:%S')
engine = db.create_engine('mysql://xxxx:xxxxxx@127.0.0.1/')
Session = sessionmaker(bind=engine)
session = Session()
inspector = inspect(engine)

scraped = pd.DataFrame(np.random.rand(4, 7))  # placeholder for the scraped data
n1 = len(scraped.index)  # scraped data have different dimensions each run
scraped['date_loaded'] = [date] * n1
scraped.to_sql('scraped', engine, if_exists='append')  # scraped is a DataFrame
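For the second bullet, one way to detect changes to an entity between two scrape snapshots (again assuming a hypothetical `patent_id` key column) is an outer merge with `indicator=True`, followed by a field-by-field comparison:

```python
import pandas as pd

# Two snapshots of the same entity set; patent_id is a hypothetical key column.
old = pd.DataFrame({'patent_id': ['P1', 'P2'], 'status': ['pending', 'pending']})
new = pd.DataFrame({'patent_id': ['P2', 'P3'], 'status': ['granted', 'pending']})

# An outer merge flags rows present in only one snapshot via the _merge column.
diff = old.merge(new, on='patent_id', how='outer',
                 suffixes=('_old', '_new'), indicator=True)

added = diff[diff['_merge'] == 'right_only']['patent_id'].tolist()
removed = diff[diff['_merge'] == 'left_only']['patent_id'].tolist()
# Entities present in both snapshots whose tracked field changed.
changed = diff[(diff['_merge'] == 'both') &
               (diff['status_old'] != diff['status_new'])]['patent_id'].tolist()
```

This keeps the `date_loaded` idea intact: store every snapshot with its timestamp, and run the comparison between the two most recent snapshots to log what was added, removed, or modified.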
Please advise on which approach I should take.