I have a .mkv video file and a .srt subtitle file. I want to create a new video file from them with the subtitles embedded (video + subtitles). I also want each speech in the video to be played twice (only the speech repeats; the other parts play only once). I think the repetition can be done properly using the time frames from the subtitle file. Please help me write a Python 3 script that creates a new file, 'repeated.mkv', in which every speech is played twice and the embedded subtitle is shown for each speech.
I have figured out some steps but couldn't implement them:
- Parse the video and the subtitle file.
- List each speech's starting and ending time from the subtitle file.
- Embed the subtitle file in the video file.
- Loop through the input video file; whenever the ending time of a speech is reached, concatenate the corresponding sub-clip (starting_time to ending_time of the speech) to the final output file once more for the repetition. Otherwise, concatenate everything to the final output file as is.
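The last step can be sketched without touching the video at all, by first working out which source segments go into the output and in what order. This is only a rough sketch under my own assumptions: the speeches are sorted by start time and do not overlap, times are in seconds, and `build_play_order` and the 60-second `video_duration` are hypothetical names and values used for illustration.

```python
def build_play_order(speeches, video_duration):
    """Return (start, end) source segments in output order.

    Each speech segment is listed twice; everything else is listed once.
    Assumes speeches are sorted by start time and do not overlap.
    """
    segments = []
    cursor = 0.0  # how far into the source video we have covered so far
    for start, end in speeches:
        if start > cursor:
            segments.append((cursor, start))  # non-speech gap, played once
        segments.append((start, end))  # speech, first pass
        segments.append((start, end))  # speech, repeated
        cursor = end
    if cursor < video_duration:
        segments.append((cursor, video_duration))  # tail after the last speech
    return segments


# The two speeches from the sample subtitles, in seconds.
speeches = [(49.966, 52.760), (52.969, 55.137)]
for segment in build_play_order(speeches, video_duration=60.0):
    print(segment)
```

Each tuple in the result could then be cut from the input and the pieces concatenated in that order with whatever video tool ends up being used.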
The subtitle file's time-frame format is as follows:
1
00:00:49,966 --> 00:00:52,760
There's nothing to tell.
It's just some guy I work with.
2
00:00:52,969 --> 00:00:55,137
Come on.
You're going out with a guy.
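To do arithmetic on these time frames, the `HH:MM:SS,mmm` strings probably need to be turned into numbers first. A minimal sketch, assuming every timestamp has exactly this shape; `srt_time_to_ms` is a hypothetical helper name, and integer milliseconds are used to avoid floating-point rounding.

```python
import re

# Matches an SRT timestamp such as 00:00:49,966
TIMESTAMP = re.compile(r'(\d{2}):(\d{2}):(\d{2}),(\d{3})')


def srt_time_to_ms(timestamp):
    """Convert 'HH:MM:SS,mmm' to an integer number of milliseconds."""
    match = TIMESTAMP.fullmatch(timestamp.strip())
    if match is None:
        raise ValueError(f'not an SRT timestamp: {timestamp!r}')
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


start_ms = srt_time_to_ms('00:00:49,966')
end_ms = srt_time_to_ms('00:00:52,760')
print(start_ms, end_ms, end_ms - start_ms)  # 49966 52760 2794
```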
So far I have gotten as far as listing the start and end time of each speech from the subtitle file:
    import re

    video_path = 'Friends.S01E01.720p.BluRay.x264.mkv'
    subtitle_path = 'Friends.S01E01.720p.BluRay.x264.srt'
    subtitle_timeframes = []

    # Matches a time-frame line such as "00:00:49,966 --> 00:00:52,760"
    pattern = r'(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})'

    with open(subtitle_path, 'r') as subtitle_file:
        # Check every line instead of stepping by 4: entries can have a
        # different number of text lines, and lines[i+1] could run past the end.
        for line in subtitle_file:
            match = re.search(pattern, line)
            if match:
                start_time, end_time = match.group(1), match.group(2)
                subtitle_timeframes.append((start_time, end_time))
                print(f"Subtitle {len(subtitle_timeframes)} - Start: {start_time} End: {end_time}")
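One thing to keep in mind once the time frames are listed: if every speech is played twice, the original subtitle times no longer line up with the output, because each repeated speech pushes everything after it later by its own duration. Below is a rough sketch of how the time frames might be shifted, assuming they are already in integer milliseconds, sorted, and non-overlapping; `retime_for_repeats` and `ms_to_srt_time` are hypothetical helper names.

```python
def ms_to_srt_time(total_ms):
    """Format integer milliseconds as an SRT timestamp 'HH:MM:SS,mmm'."""
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f'{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}'


def retime_for_repeats(cues):
    """Return two output time frames per cue: first pass and repeat.

    cues is a list of (start_ms, end_ms) pairs, sorted and non-overlapping.
    """
    retimed = []
    offset = 0  # total duration of repeats inserted before the current cue
    for start, end in cues:
        duration = end - start
        retimed.append((start + offset, end + offset))  # first pass
        retimed.append((end + offset, end + offset + duration))  # repeat
        offset += duration
    return retimed


cues = [(49_966, 52_760), (52_969, 55_137)]
for new_start, new_end in retime_for_repeats(cues):
    print(ms_to_srt_time(new_start), '-->', ms_to_srt_time(new_end))
```

For the two sample speeches this prints four time frames, the last one ending at 00:01:00,099. A subtitle file written with these shifted times is what would need to be embedded in 'repeated.mkv', rather than the original .srt.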