4

I found a subtitle that goes like this:

<transcript>
<text start="2.906" dur="3">TEXT 1</text>
<text start="7.907" dur="3.914">TEXT 2</text>
......

What is that format called?

Vadim Kotov
casolorz

2 Answers

5

That is YouTube's "Timed Text" transcript format, which is based on W3C TTML.

The sample you posted looks like an older layout of that format. Videos from around 2011 have a similar layout, while 2017 transcripts look slightly different because they now use timedtext format="3".
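For reference, here is a minimal sketch of reading that older layout with Python's standard library. It assumes the document is well-formed XML with a closing `</transcript>` tag (the posted sample is truncated, so that part is a guess), and `parse_transcript` is just an illustrative name, not part of any YouTube API.

```python
import xml.etree.ElementTree as ET

# Sample in the older layout; the closing tag is assumed,
# since the snippet in the question is cut off.
SAMPLE = """<transcript>
<text start="2.906" dur="3">TEXT 1</text>
<text start="7.907" dur="3.914">TEXT 2</text>
</transcript>"""


def parse_transcript(xml_text):
    """Return a list of (start_seconds, duration_seconds, text) tuples."""
    root = ET.fromstring(xml_text)
    cues = []
    for node in root.iter("text"):
        # "start" and "dur" are decimal seconds stored as attribute strings.
        start = float(node.get("start", "0"))
        dur = float(node.get("dur", "0"))
        cues.append((start, dur, node.text or ""))
    return cues


for start, dur, text in parse_transcript(SAMPLE):
    print(f"{start:.3f} -> {start + dur:.3f}: {text}")
```

Each cue only carries a start time and a duration, so the end time has to be computed as `start + dur` if you want to convert to something like SRT.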

When you enable the CC option on a video, the transcript markup is loaded. You can see this by checking the "Network" requests in your browser's Developer Tools.

Open the timedtext>key= link in a new tab to view the content of the transcript.

Edit:
PS: If you want it displayed in a style similar to the one you posted, edit the end of the timedtext>key= URL, changing `&fmt=srv3` to `&fmt=srv1`.
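If you would rather do that swap in code than by hand, something like the following should work. This is only a sketch: the URL is a made-up placeholder (not a real timedtext link), and `with_fmt` is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_fmt(url, fmt):
    """Return url with its fmt query parameter set to the given value."""
    parts = urlsplit(url)
    # Keep every existing parameter except fmt, then append the new fmt.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "fmt"
    ]
    query.append(("fmt", fmt))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL for illustration only.
url = "https://example.com/api/timedtext?v=VIDEO_ID&key=KEY&fmt=srv3"
print(with_fmt(url, "srv1"))
```

Note that `urlencode` re-encodes the other parameter values, so if the link carries a signature that is sensitive to the exact encoding, a plain string replacement of the `fmt` value may be the safer choice.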

VC.One
  • This seems to be right, thank you. Even though the subtitle came from another site, I guess it was from Google Drive. Seems like the format can be changed, but I haven't figured out how yet. – casolorz Sep 30 '17 at 20:55
  • @casolorz Changed how? You mean into another format like, for example, SRT? If you want it displayed similar to your posted style, just replace the ending `&fmt=srv3` with `&fmt=srv1`. – VC.One Sep 30 '17 at 21:27
  • Yeah that is what I was trying but I got errors with most things I tried, like vtt or srt. Right now the format was 1, 3 worked as well, but couldn't find anything else that worked. – casolorz Sep 30 '17 at 21:44
  • 2
    Looks like if I put fmt=vtt it actually works, the browser just doesn't like it but I can get it using curl. – casolorz Oct 02 '17 at 15:48
5

Adding `&fmt=vtt` will actually convert the subtitle, but the browser might show an error because the response is returned with the header `Content-Type: text/xml` even though the content is not XML.
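Since the header can't be trusted here, one option is to look at the body itself. A WebVTT file starts with the string `WEBVTT` (optionally after a byte order mark), so a rough check could look like the sketch below. The function names are mine and purely illustrative, and the check only covers the two cases discussed in this thread.

```python
def looks_like_webvtt(body: bytes) -> bool:
    """True if the payload starts with the WebVTT signature."""
    # "utf-8-sig" drops a leading byte order mark if one is present.
    text = body.decode("utf-8-sig", errors="replace")
    return text.startswith("WEBVTT")


def classify(body: bytes, content_type: str) -> str:
    """Pick a format from the body first and the header second."""
    if looks_like_webvtt(body):
        return "vtt"
    if "xml" in content_type.lower():
        return "xml"
    return "unknown"


# A VTT body served with the misleading XML header.
print(classify(b"WEBVTT\n\n00:00:02.906 --> 00:00:05.906\nTEXT 1\n", "text/xml"))
```

This is the same reason curl works where the browser fails: curl just hands you the bytes without trying to render them as XML.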

casolorz