I was trying to solve a problem in Python and got stuck on the logic. Here's the problem: I have multiple videos along with their run times. I want to maintain two lists: one "synced" and the other "non synced". Streams count as "synced" if the difference between their run times is less than or equal to 2 seconds. Otherwise, they're not synced. If multiple streams match, we take the streams with the highest match count.
I was able to come up with a very simple (and slow) method to split and pair these files. However, my logic failed when I got a different data set.
This is what I have written:
from datetime import datetime

# Earlier data set, kept for comparison.
same_streams_old = {
    "Stream_1": "0:24:08.925167",
    "Stream_2": "0:24:08.990644",
    "Stream_3": "0:24:08.990644",
    "Stream_4": "0:24:12.118778",
    "Stream_5": "0:24:12.118778",
    "Stream_6": "0:24:10.075066"
}

# Data set currently being processed.
same_streams = {
    "Stream_1": "0:24:08.925167",
    "Stream_2": "0:24:12.118778",
    "Stream_3": "0:23:11.057711",
    "Stream_4": "0:24:12.118778",
    "Stream_5": "0:24:10.075066",
    "Stream_6": "0:24:08.990644"
}

keys = list(same_streams)
final_synced_video_files = []
final_non_synced_video_files = []


def get_time_diff(episode_run_time, episode_time):
    """Return the absolute difference between two run times, in seconds."""
    prev_episode_time = datetime.strptime(episode_run_time, '%H:%M:%S.%f')
    current_episode_time = datetime.strptime(episode_time, '%H:%M:%S.%f')
    # total_seconds() keeps the fractional part; .seconds would truncate it.
    return abs(prev_episode_time - current_episode_time).total_seconds()


# Compare every stream against every other stream.
for key in keys:
    for _key in keys:
        if key != _key:
            diff = get_time_diff(same_streams[key], same_streams[_key])
            if diff <= 1.5:
                final_synced_video_files.append(key)

# A stream that matched at least one other stream is treated as synced.
final_synced_video_files = list(set(final_synced_video_files))
final_non_synced_video_files = list(set(keys) - set(final_synced_video_files))

print("Synced Files : {0}".format(final_synced_video_files))
print("Non Synced Files : {0}".format(final_non_synced_video_files))
As you can see, the streams with the most matches are stream_1, stream_2, stream_3 and stream_6.
What I've written doesn't compare the maximum counts yet. However, as I work on this, I feel like this is not an effective or good way to solve the problem. Does anyone have any input?
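For the missing "maximum counts" step, a rough sketch of counting matches per stream could look like the following. The helper names `to_seconds` and `match_counts` are made up for illustration, and the 2.0 second tolerance is taken from the problem statement rather than from the 1.5 used in the code above. It only counts pairwise matches; it does not yet pick a final group.

```python
def to_seconds(run_time):
    """Convert an 'H:MM:SS.ffffff' string to seconds as a float."""
    hours, minutes, seconds = run_time.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def match_counts(streams, tolerance=2.0):
    """Map each stream name to the number of other streams within `tolerance` seconds.

    `tolerance` is an assumed parameter; 2.0 follows the stated rule.
    """
    seconds = {name: to_seconds(value) for name, value in streams.items()}
    return {
        name: sum(
            1
            for other, other_time in seconds.items()
            if other != name and abs(own_time - other_time) <= tolerance
        )
        for name, own_time in seconds.items()
    }


same_streams_old = {
    "Stream_1": "0:24:08.925167",
    "Stream_2": "0:24:08.990644",
    "Stream_3": "0:24:08.990644",
    "Stream_4": "0:24:12.118778",
    "Stream_5": "0:24:12.118778",
    "Stream_6": "0:24:10.075066",
}

counts = match_counts(same_streams_old)
print(counts)
```

On this data set, streams 1, 2, 3 and 6 each have three matches, while streams 4 and 5 only match each other.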
I also tried an approach based on overlapping intervals and got this: REPL LINK
But if you run it against both same_streams dictionaries, you'll see the results are not what I'm trying to achieve. Any help with this would be great.
EDIT:
I need to get the streams whose run times are within 2 seconds of each other. For example:
In same_streams_old, the desired result would be streams 1, 2, 3 & 6. However, in the dictionary same_streams, the desired result is streams 2, 4 & 5.
Basically, I need to see which streams can be "muxed" together and which ones can't.
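One possible way to pick the group to mux, sketched under some assumptions: if "within 2 seconds of each other" means every pair in the group is within the tolerance, then for plain run times that is the same as the longest and shortest run time in the group differing by at most 2 seconds. Sorting the streams and sliding a window over them finds the largest such group. The name `split_synced`, the `tolerance` parameter, and the tie-breaking rule (the earliest largest window wins) are all illustrative choices, not part of the original code.

```python
def to_seconds(run_time):
    """Convert an 'H:MM:SS.ffffff' string to seconds as a float."""
    hours, minutes, seconds = run_time.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def split_synced(streams, tolerance=2.0):
    """Return (synced, non_synced) lists of stream names.

    `synced` is the largest group whose run times all lie within
    `tolerance` seconds of each other (assumed interpretation).
    """
    ordered = sorted(streams, key=lambda name: to_seconds(streams[name]))
    times = [to_seconds(streams[name]) for name in ordered]
    best_start, best_end = 0, 0  # half-open window [start, end)
    start = 0
    for end in range(len(ordered)):
        # Shrink the window from the left until it spans at most `tolerance`.
        while times[end] - times[start] > tolerance:
            start += 1
        # Strict comparison: on a tie, the earliest window is kept.
        if end + 1 - start > best_end - best_start:
            best_start, best_end = start, end + 1
    synced = ordered[best_start:best_end]
    non_synced = [name for name in streams if name not in synced]
    return synced, non_synced


same_streams_old = {
    "Stream_1": "0:24:08.925167",
    "Stream_2": "0:24:08.990644",
    "Stream_3": "0:24:08.990644",
    "Stream_4": "0:24:12.118778",
    "Stream_5": "0:24:12.118778",
    "Stream_6": "0:24:10.075066",
}

synced, non_synced = split_synced(same_streams_old)
print("Synced Files : {0}".format(synced))
print("Non Synced Files : {0}".format(non_synced))
```

For same_streams_old this gives streams 1, 2, 3 and 6 as synced. Note that with a strict 2.0 second tolerance, stream 5 (0:24:10.075066) is about 2.04 seconds away from streams 2 and 4 (0:24:12.118778), so on same_streams this sketch would not return streams 2, 4 & 5 unless the tolerance is loosened slightly or fractional seconds are truncated before comparing.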