DVD subtitles

As of January 9th, 2001, the latest version of this document can
be found here :

Preamble

One of the last things we missed in DVD decoding under my system was the decoding of subtitles. I found no information about them on the web or Usenet, apart from a few words in the DVD FAQ saying that they are run-length encoded. So we decided to reverse-engineer their format (it's completely legal in France, since we did it for interoperability purposes), and managed to work out almost all of it.

Basics

DVD subtitles are hidden in private PS packets
( Within the PS packet, there are PES packets, and like AC3, the header
for the ones containing subtitles have a

I'll suppose you know how to extract AC3 from a DVD, and jump to the interesting part of this documentation. Anyway, you're unlikely to have understood what I said without already being familiar with MPEG2.

The data structure

A subtitle packet, after its parts have been collected and appended, looks like this :

   +----------------------------------------------------------+
   |                                                          |
   |   0    2                                         size    |
   |   +----+------------------------+-----------------+      |
   |   |size|       data packet      |     control     |      |
   |   +----+------------------------+-----------------+      |
   |                                                          |
   |                    a subtitle packet                     |
   |                                                          |
   +----------------------------------------------------------+

Here is the structure of the data packet :

   +----------------------------------------------------------+
   |                                                          |
   |   2    4                                         S0      |
   |   +----+------------------------------------------+      |
   |   | S0 |                   data                   |      |
   |   +----+------------------------------------------+      |
   |                                                          |
   |                     the data packet                      |
   |                                                          |
   +----------------------------------------------------------+

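As a rough illustration of the two layouts above, here is a minimal Python sketch that splits a reassembled subtitle packet into its data part and its control part. It assumes that size and S0 are both big-endian 16-bit words and that S0, like size, is counted from the start of the subtitle packet. The name split_subtitle_packet and the synthetic demo packet are made up for this example.

```python
import struct


def split_subtitle_packet(packet: bytes):
    """Split a reassembled subtitle packet into (data, control).

    Assumed layout, offsets counted from the start of the packet:
      0..2      size  total packet size (big-endian 16-bit word)
      2..4      S0    offset where the control packet starts
      4..S0     run-length encoded picture data
      S0..size  control sequences
    """
    if len(packet) < 4:
        raise ValueError("packet too short to hold size and S0")
    size, s0 = struct.unpack_from(">HH", packet, 0)
    if not 4 <= s0 <= size <= len(packet):
        raise ValueError("inconsistent size / S0 fields")
    data = packet[4:s0]
    control = packet[s0:size]
    return data, control


# Synthetic packet, not taken from a real DVD: 3 data bytes followed by
# a 5-byte control area (date, offset to next sequence, end marker).
demo = struct.pack(">HH", 12, 7) + b"\xaa\xbb\xcc" + b"\x00\x00\x00\x07\xff"
data, control = split_subtitle_packet(demo)
```

The bounds check is only there so that a truncated or corrupt packet fails early instead of giving silently wrong slices.
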
Here's the structure of the control packet :

   +--------------------------------------------------+
   |                                                  |
   |   S0                                       size  |
   |   +----------+----------+-----+--------------+   |
   |   | ctrl seq | ctrl seq | ... | end ctrl seq |   |
   |   +----------+----------+-----+--------------+   |
   |                                                  |
   |                the control packet                |
   |                                                  |
   +--------------------------------------------------+

A control packet consists of several control sequences. Here is the structure of a control sequence :

   +----------------------------------------------------------------------------+
   |                                                                            |
   |   +---------+---------+---------+-------+---------+-------+-----+------+   |
   |   | date(2) | next(2) | cmd1(1) | args1 | cmd2(1) | args2 | ... | 0xff |   |
   |   +---------+---------+---------+-------+---------+-------+-----+------+   |
   |                                                                            |
   |                             a control sequence                             |
   |                                                                            |
   +----------------------------------------------------------------------------+

A control sequence starts with a date coded on 2 bytes, then an offset to the next control sequence coded on 2 bytes. If the offset to the next control sequence equals the offset of the current control sequence, it means we are on the last control sequence.

The data in a control sequence after the offset consists of one-byte commands, each followed by arguments that depend on the command. The last
byte is always

Control sequence commands

Here are the control sequences I know of. I know there are many more, to control subtitle fading and such things, but I found no information on them and could not reverse-engineer any, because I never actually came across one.

Control packet decoding example

   00000a0c01030231040ff0050002cf00223e06000604e9ff00930a0c02ff

Let's decode this sample control packet. The first control sequence is :

   (0000) (0a0c) (01) (03 0231) (04 0ff0) (05 0002cf00223e) (06 000604e9) (ff)

We can deduce from this that the effect date is zero hundredths of a
second after the PES packet's time ( Then we learn that the sequence is a display sequence
(

The second control sequence is :

   (0093) (0a0c) (02) (ff)

We can deduce from this that the effect date is 1.47 seconds after
the PES packet's time ( This control sequence just tells us it is a stop display sequence
(

Decoding the graphics

The graphics are rather easy to decode (at least, when you know how to do it). The picture is interlaced; for instance, for a 40-line picture :

   line 0  ---------------#----------
   line 2  ------#-------------------
   ...
   line 38 ------------#-------------
   line 1  ------------------#-------
   line 3  --------#-----------------
   ...
   line 39 -------------#------------

When decoding you should get :

   line 0  ---------------#----------
   line 1  ------------------#-------
   line 2  ------#-------------------
   line 3  --------#-----------------
   ...
   line 38 ------------#-------------
   line 39 -------------#------------

If the display resolution is low, you can choose to only display even lines, for instance.

The pixels are run-length encoded. The encoded values are :

   1 nibble values : 0xf 0xe 0xd 0xc 0xb 0xa 0x9 0x8 0x7 0x6 0x5 0x4
   2 nibble values : 0x3* 0x2* 0x1*
   3 nibble values : 0x0f* 0x0e* 0x0d* 0x0c* 0x0b* 0x0a* 0x09* 0x08* 0x07* 0x06* 0x05* 0x04*
   4 nibble values : 0x03** 0x02** 0x01** 0x000*

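To make the table above a bit more concrete, here is a small Python sketch that reads such variable-length values from a stream of nibbles (0xf is one nibble, 0x3* two, and so on). The table only gives the shapes of the values, so the rest is an assumption on my part: the two low bits of a value are taken as the colour index, the remaining bits as the run length, and the zero-length value 0x000* as the carriage return that fills the rest of the line. This matches how DVD subpicture RLE is commonly described, but take it as a sketch, not a reference decoder. The function names are made up for this example.

```python
def to_nibbles(data: bytes):
    """Expand bytes into a list of 4-bit values, high nibble first."""
    out = []
    for b in data:
        out.append(b >> 4)
        out.append(b & 0xF)
    return out


def read_rle_code(nibbles, pos):
    """Read one value starting at nibble index pos.

    Returns (run_length, colour, new_pos). A run_length of 0 is the
    0x000* value, assumed here to mean "fill to the end of the line".
    """
    code = nibbles[pos]
    pos += 1
    if code < 0x4:                 # not a 1-nibble value: take a 2nd nibble
        code = (code << 4) | nibbles[pos]
        pos += 1
        if code < 0x10:            # 0x0*: take a 3rd nibble
            code = (code << 4) | nibbles[pos]
            pos += 1
            if code < 0x040:       # 0x00* .. 0x03*: take a 4th nibble
                code = (code << 4) | nibbles[pos]
                pos += 1
    # Assumed split: low 2 bits = colour, remaining bits = run length.
    return code >> 2, code & 0x3, pos


def decode_line(nibbles, pos, width):
    """Decode one line of `width` pixels; return (pixels, new_pos)."""
    pixels = []
    while len(pixels) < width:
        length, colour, pos = read_rle_code(nibbles, pos)
        if length == 0:            # carriage return: fill the rest
            length = width - len(pixels)
        pixels.extend([colour] * min(length, width - len(pixels)))
    if pos % 2:                    # skip one nibble to stay byte-aligned
        pos += 1
    return pixels, pos


# Hand-made 8-pixel line: 0x5, then 0x1a, then 0x0003, plus a padding nibble.
line, next_pos = decode_line(to_nibbles(bytes([0x51, 0xA0, 0x00, 0x30])), 0, 8)
```

The demo line is hand-made, not taken from a disc; it just exercises a 1-nibble, a 2-nibble and a 4-nibble value in a row.
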
After a carriage return, the parser should be byte-aligned, so one nibble might have to be skipped. It should then read a line from the other interlaced picture, and swap like this after each carriage return.

Code

Misc information

No colour information is stored within the subtitle packet; the colours are defined elsewhere, in the IFO file. I will put some information about it here when I have some time.

I don't know what the other control sequences are.

Credits

Thanks to Michel Lespinasse <walken at via dot ecp dot fr> for his great help on understanding the RLE stuff, and for all the ideas he had.

Thanks to mass (David Waite) and taaz (David I. Lehn) from irc at openprojects.net for sending me their subtitles.

Other contributors include Bob Ives <bob at rebelact dot com> for pointing out a small error.

Changes