I have a text file called temp.txt, and every day at 21:45 I want to delete all rows in it whose date is more than 24 hours old. I've done a lot of googling and can't find the answer anywhere. The text file is in this format, with no headers:
http://clipsexample1.com,clips1,clipexample123,2019-03-28 17:14:14
http://clipsexample12com,clips2,clipexample234,2019-03-27 18:56:20
Is there any way I could remove the whole row if it is older than 24 hours (the second clip in the example)?
EDIT: I have tried using this code, but that's just removing today's date. How do I get it to remove today minus 24 hours?
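import os
from datetime import date

today = date.today()  # assumed: a date, so str(today) looks like '2019-03-28'
save_path = 'clips/'
completeName = os.path.join(save_path, 'clips'+str(today)+'.txt')
good_dates = [str(today)]
with open('temp.txt') as oldfile, open(completeName, 'w') as newfile:
    for line in oldfile:
        # Only lines containing one of the good dates are written out.
        if any(good_date in line for good_date in good_dates):
            newfile.write(line)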
EDIT 30/03/2019: Here is my full code, to help show how the timestamp field is created:
# change UNIX to standard date format
def get_date(created_utc):
    return dt.datetime.fromtimestamp(created_utc)

_timestamp = topics_data["created_utc"].apply(get_date)
topics_data = topics_data.assign(timestamp = _timestamp)
timestamp = _timestamp
print(timestamp)
# remove UNIX data column
topics_data.drop('created_utc', axis=1, inplace=True)
# export clips to temp.txt
topics_data.to_csv('temp.txt', header=True, index=False)
import csv
from datetime import datetime, timedelta
import os

today = datetime.today()
cutoff = datetime(year=today.year, month=today.month, day=today.day,
                  hour=21, minute=45)
max_time_diff = timedelta(hours=24)
input_file = 'temp.txt'
save_path = './clips'
complete_name = os.path.join(save_path, 'clips'+today.strftime('%Y-%m-%d')+'.txt')
os.makedirs(save_path, exist_ok=True)  # Make sure dest directory exists.
with open(input_file, newline='') as oldfile, \
     open(complete_name, 'w', newline='') as newfile:
    reader = csv.reader(oldfile)
    writer = csv.writer(newfile)
    for line in reader:
        line_date = datetime.strptime(line[3], "%Y-%m-%d %H:%M:%S")
        if cutoff - line_date < max_time_diff:
            writer.writerow(line)
When I print the timestamp field, this is the result I get:
01 2019-03-29 01:22:09
02 2019-03-29 02:42:21
03 2019-03-28 17:14:14
04 2019-03-29 06:06:18
Name: created_utc, dtype: datetime64[ns]
And the error I am still getting is:
ValueError: time data 'timestamp' does not match format '%Y-%m-%d %H:%M:%S'
Why does this happen, even though the datetime is printing in that format?
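A possible explanation, offered as a guess rather than a confirmed diagnosis: temp.txt is written with header=True, so its first row would hold the column names, and csv.reader would pass the literal string 'timestamp' to strptime. The sketch below reproduces that situation with in-memory data and skips the header row before parsing. The column names other than timestamp, the filter_recent helper, and the fixed cutoff date are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime, timedelta

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def filter_recent(lines, cutoff, max_age=timedelta(hours=24), has_header=True):
    """Return the CSV rows whose 4th field is less than max_age before cutoff.

    Hypothetical helper: `lines` is any iterable of CSV text lines,
    for example an open file or an io.StringIO.
    """
    reader = csv.reader(lines)
    if has_header:
        next(reader, None)  # Discard the column-name row before parsing dates.
    kept = []
    for row in reader:
        if not row:  # Ignore blank lines.
            continue
        line_date = datetime.strptime(row[3], DATE_FORMAT)
        if cutoff - line_date < max_age:
            kept.append(row)
    return kept


# Stand-in for temp.txt as written with header=True.
# The header names url, title and id are made up; only timestamp is from the post.
sample = io.StringIO(
    "url,title,id,timestamp\n"
    "http://clipsexample1.com,clips1,clipexample123,2019-03-28 17:14:14\n"
    "http://clipsexample12com,clips2,clipexample234,2019-03-27 18:56:20\n"
)

# Fixed cutoff (21:45 on 2019-03-28) so the result does not depend on today's date.
cutoff = datetime(2019, 3, 28, 21, 45)
recent = filter_recent(sample, cutoff)
print(recent)
```

With this sample data only the clips1 row should be kept, since the clips2 row is more than 24 hours before the cutoff. If the header row is not needed in temp.txt at all, writing it with to_csv('temp.txt', header=False, index=False) would be another way to avoid the problem.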