Diffstat (limited to 'DOCS/tech/nut.txt')
-rw-r--r-- DOCS/tech/nut.txt 1075
1 files changed, 0 insertions, 1075 deletions
diff --git a/DOCS/tech/nut.txt b/DOCS/tech/nut.txt
deleted file mode 100644
index ba55997874..0000000000
--- a/DOCS/tech/nut.txt
+++ /dev/null
@@ -1,1075 +0,0 @@
-==================================
-NUT Open Container Format 20061104
-==================================
-
-
-
-Intro:
-======
-
-NUT is a free multimedia container format for storage of audio, video,
-subtitles and related user defined streams. It provides exact timestamps for
-synchronization and seeking, is simple, has low overhead and can recover
-in case of errors in the stream.
-
-Other common multimedia container formats are AVI, Ogg, Matroska, MP4, MOV,
-ASF, MPEG-PS and MPEG-TS.
-
-
-Features / goals:
- (supported by the format, not necessarily by a specific implementation)
-
-Simplicity
- Use the same encoding for nearly all fields.
- Simple decoding, so slow CPUs (and embedded systems) can handle it.
-
-Extensibility
- No limit for the possible values of all fields (using universal vlc).
- Allow adding of new headers in the future.
- Allow adding more fields at the end of headers.
-
-Compactness
- ~0.2% overhead for normal bitrates.
- The index is <100kb per hour.
- A typical file header is about 100 bytes (audio + video headers together).
- A packet header is about 1-5 bytes.
-
-Error resistance
- Seeking / playback is possible without an index.
- Headers & index can be repeated.
- Damaged files can be played back with minimal data loss and fast
- resynchronization times.
-
-The specification is frozen. All files following the specification will be
-compatible unless the specification is unfrozen.
-
-
-Definitions:
-============
-
-MUST The specific part must be done to conform to this standard.
-SHOULD It is recommended to be done that way, but not strictly required.
-
-keyframe
- A keyframe is a frame from which you can start decoding. A more
- exact definition follows.
- The nth frame is a keyframe if and only if frames n, n+1, ... in
- presentation order (that is, all frames with a pts >= frame[n].pts) can
- be decoded successfully without reference to frames prior to n in storage
- order (that is, all frames with a dts < frame[n].dts).
- If no such frames exist (for example due to using overlapped transforms
- like the MDCT in an audio codec), then the definition shall be extended
- by dropping n out of the set of frames which must be decodable, if this
- is still insufficient then n+1 shall be dropped, and so on until there is
- a keyframe.
- Every frame which is marked as a keyframe MUST be a keyframe according to
- the definition above, a muxer MUST mark every frame it knows is a keyframe
- as such, a muxer SHOULD NOT analyze future frames to determine the
- keyframe status of the current frame but instead just set the frame as
- non-keyframe.
- (FIXME maybe move somewhere else?)
-pts
- Presentation time of the first frame/sample that is completed by decoding
- the coded frame.
-dts
- The time when a frame is input into a synchronous 1-in-1-out decoder.
-
-
-Syntax:
-=======
-
-Since NUT heavily uses variable length fields, the simplest way to describe it
-is using a pseudocode approach.
-
-
-
-Conventions:
-============
-
-The data types have a name, used in the bitstream syntax description, a short
-text description and a pseudocode (functional) definition, optional notes may
-follow:
-
-name (text description)
- functional definition
- [Optional notes]
-
-The bitstream syntax elements have a tagname and a functional definition, they
-are presented in a bottom-up approach, again optional notes may follow and
-are reproduced in the tag description:
-
-name: (optional note)
- functional definition
- [Optional notes]
-
-The in-depth tag description follows the bitstream syntax.
-The functional definition has a C-like syntax.
-
-
-
-Type definitions:
-=================
-
-f(n) (n fixed bits in big-endian order)
-u(n) (unsigned number encoded in n bits in MSB-first order)
-
-v (variable length value, unsigned)
- value=0
- do{
- more_data u(1)
- data u(7)
- value= 128*value + data
- }while(more_data)
-
-s (variable length value, signed)
- temp v
- temp++
- if(temp&1) value= -(temp>>1)
- else value= (temp>>1)
-
-b (binary data or string, to be used in vb, see below)
- for(i=0; i<length; i++){
- data[i] u(8)
- }
- [Note: strings MUST be encoded in UTF-8]
- [Note: the character NUL (U+0000) is not legal within
- or at the end of a string.]
-
-vb (variable length binary data or string)
- length v
- value b
-
-t (v coded universal timestamp)
- tmp v
- id= tmp % time_base_count
- value= (tmp / time_base_count) * time_base[id]
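
The s and t types are both thin layers over v. The following is a minimal C sketch of that mapping, assuming the underlying v value has already been read from the stream. The names nut_s_from_v, nut_t_from_v and NutTimestamp are hypothetical and not part of the specification.

```c
#include <assert.h>
#include <stdint.h>

/* Map an already decoded v value to the signed s value, following the
 * pseudocode for s: 0 -> 0, 1 -> 1, 2 -> -1, 3 -> 2, 4 -> -2, ...
 * Assumption: v is below UINT64_MAX, so v + 1 does not wrap around. */
static int64_t nut_s_from_v(uint64_t v)
{
    uint64_t temp = v + 1;
    if (temp & 1)
        return -(int64_t)(temp >> 1);
    return (int64_t)(temp >> 1);
}

/* Hypothetical helper type: a t value split into its two parts. */
typedef struct NutTimestamp {
    uint64_t id;    /* index into the time_base table */
    uint64_t ticks; /* number of ticks, counted in time_base[id] */
} NutTimestamp;

/* Split a v-coded universal timestamp into time base index and ticks.
 * The time in seconds would be ticks * time_base[id]. */
static NutTimestamp nut_t_from_v(uint64_t v, uint64_t time_base_count)
{
    NutTimestamp ts;
    assert(time_base_count > 0); /* time_base_count MUST NOT be 0 */
    ts.id    = v % time_base_count;
    ts.ticks = v / time_base_count;
    return ts;
}
```

With three time bases, for example, a coded value of 7 would select time base 1 with a tick count of 2.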
-
-
-Bitstream syntax:
-=================
-
-file:
- file_id_string
- while(!eof){
- if(next_byte == 'N'){
- packet_header
- switch(startcode){
- case main_startcode: main_header; break;
- case stream_startcode:stream_header; break;
- case info_startcode: info_packet; break;
- case index_startcode: index; break;
- case syncpoint_startcode: syncpoint; break;
- }
- packet_footer
- }else
- frame
- }
-
-The structure of an undamaged file should look like the following, but
-demuxers should be flexible and be able to deal with damaged headers so the
-above is a better loop in practice (not to mention it is simpler).
-Note: Demuxers MUST be able to deal with new and unknown headers.
-
-file:
- file_id_string
- while(!eof){
- packet_header, main_header, packet_footer
- reserved_headers
- for(i=0; i<stream_count; i++){
- packet_header, stream_header, packet_footer
- reserved_headers
- }
- while(next_code == info_startcode){
- packet_header, info_packet, packet_footer
- reserved_headers
- }
- if(next_code == index_startcode){
- packet_header, index, packet_footer
- }
- if (!eof) while(next_code != main_startcode){
- if(next_code == syncpoint_startcode){
- packet_header, syncpoint, packet_footer
- }
- frame
- reserved_headers
- }
- }
-
-
-Common elements:
-----------------
-
-reserved_bytes:
- for(i=0; i<forward_ptr - length_of_non_reserved; i++)
- reserved u(8)
- [A demuxer MUST ignore any reserved bytes.
- A muxer MUST NOT write any reserved bytes, as this would make it
- impossible to add new fields at the end of packets in the future
- in a compatible way.]
-
-packet_header
- startcode f(64)
- forward_ptr v
- if(forward_ptr > 4096)
- header_checksum u(32)
-
-packet_footer
- checksum u(32)
-
-reserved_headers
- while(next_byte == 'N' && next_code != main_startcode
- && next_code != stream_startcode
- && next_code != info_startcode
- && next_code != index_startcode
- && next_code != syncpoint_startcode){
- packet_header
- reserved_bytes
- packet_footer
- }
-
- Headers:
-
-main_header:
- version v
- stream_count v
- max_distance v
- time_base_count v
- for(i=0; i<time_base_count; i++){
- time_base_num v
- time_base_denom v
- time_base[i]= time_base_num/time_base_denom
- }
- tmp_pts=0
- tmp_mul=1
- tmp_stream=0
- for(i=0; i<256; ){
- tmp_flag v
- tmp_fields v
- if(tmp_fields>0) tmp_pts s
- if(tmp_fields>1) tmp_mul v
- if(tmp_fields>2) tmp_stream v
- if(tmp_fields>3) tmp_size v
- else tmp_size=0
- if(tmp_fields>4) tmp_res v
- else tmp_res=0
- if(tmp_fields>5) count v
- else count= tmp_mul - tmp_size
- for(j=6; j<tmp_fields; j++){
- tmp_reserved[i] v
- }
- for(j=0; j<count && i<256; j++, i++){
- if (i == 'N') {
- flags[i]= FLAG_INVALID;
- j--;
- continue;
- }
- flags[i]= tmp_flag;
- stream_id[i]= tmp_stream;
- data_size_mul[i]= tmp_mul;
- data_size_lsb[i]= tmp_size + j;
- pts_delta[i]= tmp_pts;
- reserved_count[i]= tmp_res;
- }
- }
- reserved_bytes
-
-stream_header:
- stream_id v
- stream_class v
- fourcc vb
- time_base_id v
- msb_pts_shift v
- max_pts_distance v
- decode_delay v
- stream_flags v
- codec_specific_data vb
- if(stream_class == video){
- width v
- height v
- sample_width v
- sample_height v
- colorspace_type v
- }else if(stream_class == audio){
- samplerate_num v
- samplerate_denom v
- channel_count v
- }
- reserved_bytes
-
- Basic Packets:
-
-frame:
- frame_code f(8)
- frame_flags= flags[frame_code]
- frame_res= reserved_count[frame_code]
- if(frame_flags&FLAG_CODED){
- coded_flags v
- frame_flags ^= coded_flags
- }
- if(frame_flags&FLAG_STREAM_ID){
- stream_id v
- }
- if(frame_flags&FLAG_CODED_PTS){
- coded_pts v
- }
- if(frame_flags&FLAG_SIZE_MSB){
- data_size_msb v
- }
- if(frame_flags&FLAG_RESERVED)
- frame_res v
- for(i=0; i<frame_res; i++)
- reserved v
- if(frame_flags&FLAG_CHECKSUM){
- checksum u(32)
- }
- data
-
-index:
- max_pts t
- syncpoints v
- for(i=0; i<syncpoints; i++){
- syncpoint_pos_div16 v
- }
- for(i=0; i<stream_count; i++){
- last_pts= -1
- for(j=0; j<syncpoints; ){
- x v
- type= x & 1
- x>>=1
- n=j
- if(type){
- flag= x & 1
- x>>=1
- while(x--)
- has_keyframe[n++][i]=flag
- has_keyframe[n++][i]=!flag;
- }else{
- while(x != 1){
- has_keyframe[n++][i]=x&1;
- x>>=1;
- }
- }
- for(; j<n && j<syncpoints; j++){
- if (!has_keyframe[j][i]) continue
- A v
- if(!A){
- A v
- B v
- eor_pts[j][i] = last_pts + A + B
- }else
- B=0
- keyframe_pts[j][i] = last_pts + A
- last_pts += A + B
- }
- }
- }
- reserved_bytes
- index_ptr u(64)
-
-info_packet:
- stream_id_plus1 v
- chapter_id s (Note: Due to a typo this was v
- until 2006-11-04.)
- chapter_start t
- chapter_len v
- count v
- for(i=0; i<count; i++){
- name vb
- value s
- if (value==-1){
- type= "UTF-8"
- value vb
- }else if (value==-2){
- type vb
- value vb
- }else if (value==-3){
- type= "s"
- value s
- }else if (value==-4){
- type= "t"
- value t
- }else if (value<-4){
- type= "r"
- value.den= -value-4
- value.num s
- }else{
- type= "v"
- }
- }
- reserved_bytes
-
-syncpoint:
- global_key_pts t
- back_ptr_div16 v
- reserved_bytes
-
- Complete definition:
-
-
-Tag description:
-----------------
-
-file_id_string
- "nut/multimedia container\0"
- The very first thing in every NUT file, useful for identifying NUT files.
-
-*_startcode (f(64))
- all startcodes start with 'N'
-
-main_startcode (f(64))
- 0x7A561F5F04ADULL + (((uint64_t)('N'<<8) + 'M')<<48)
-
-stream_startcode (f(64))
- 0x11405BF2F9DBULL + (((uint64_t)('N'<<8) + 'S')<<48)
-
-syncpoint_startcode (f(64))
- 0xE4ADEECA4569ULL + (((uint64_t)('N'<<8) + 'K')<<48)
-
-index_startcode (f(64))
- 0xDD672F23E64EULL + (((uint64_t)('N'<<8) + 'X')<<48)
-
-info_startcode (f(64))
- 0xAB68B596BA78ULL + (((uint64_t)('N'<<8) + 'I')<<48)
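
As an illustration of how the startcode expressions above evaluate, here is a small C sketch. The helper names nut_startcode and nut_startcode_first_byte are hypothetical; only the constants come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Build a 64-bit startcode from its 48-bit constant and the letter that
 * follows 'N', mirroring magic + ((('N'<<8) + letter)<<48). */
static uint64_t nut_startcode(uint64_t magic48, char letter)
{
    uint64_t prefix = ((uint64_t)'N' << 8) + (uint64_t)(unsigned char)letter;
    return magic48 + (prefix << 48);
}

/* First byte of a startcode as stored in the file. f(64) is stored in
 * big-endian order, so this is the most significant byte. */
static uint8_t nut_startcode_first_byte(uint64_t startcode)
{
    return (uint8_t)(startcode >> 56);
}
```

Under these definitions main_startcode evaluates to 0x4E4D7A561F5F04AD, and every startcode begins with the byte 'N' (0x4E), which is why frame_code 78 is forbidden.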
-
-version (v)
- NUT version. The current value is 3. All lower values are pre-freeze.
-
-stream_count (v)
- number of streams in this file
-
-time_base_count (v)
- number of different time bases in this file
- This MUST NOT be 0.
-
-forward_ptr (v)
- Size of the packet data (exactly the distance from the first byte
- after the packet_header to the first byte of the next packet).
- Every NUT packet contains a forward_ptr immediately after its startcode
- with the exception of frame_code-based packets. The forward pointer
- can be used to skip over the packet without decoding its contents.
-
-max_distance (v)
- maximum distance between startcodes. If p1 and p2 are the byte
- positions of the first byte of two consecutive startcodes, then
- p2-p1 MUST be less than or equal to max_distance unless the entire
- span from p1 to p2 comprises a single packet or a syncpoint
- followed by a single frame. This imposition places efficient upper
- bounds on seek operations and allows for the detection of damaged
- frame headers, should a chain of frame headers pass max_distance
- without encountering any startcode.
-
- Syncpoints SHOULD be placed immediately before a keyframe if the
- previous frame of the same stream was a non-keyframe, unless such
- non-keyframe - keyframe transitions are very frequent.
-
- SHOULD be set to <=32768.
- If the stored value is >65536 then max_distance MUST be set to 65536.
-
- This is also half the maximum frame size without a checksum after the
- frame header.
-
-
-max_pts_distance (v)
- Maximum absolute difference of the pts of the new frame from last_pts in
- the timebase of the stream, without a checksum after the frame header.
- A frame header MUST include a checksum if abs(pts-last_pts) is
- strictly greater than max_pts_distance.
- Note that last_pts is not necessarily the pts of the last frame
- on the same stream, as it is altered by syncpoint timestamps.
- SHOULD NOT be higher than 1/timebase.
-
-stream_id (v)
- Stream identifier
- stream_id MUST be < stream_count
-
-stream_class (v)
- 0 video
- 1 audio
- 2 subtitles
- 3 userdata
- Note: The remaining values are reserved and MUST NOT be used.
- A demuxer MUST ignore streams with reserved classes.
-
-fourcc (vb)
- identification for the codec
- example: "H264"
- MUST contain 2 or 4 bytes, note, this might be increased in the future
- if needed.
- The ID values used are the same as in AVI, so if a codec uses a specific
- FourCC in AVI then the same FourCC MUST be used here.
-
-time_base_num (v) / time_base_denom (v) = time_base
- the length of a timer tick in seconds, this MUST be equal to 1/fps
- if FLAG_FIXED_FPS is set
- time_base_num and time_base_denom MUST NOT be 0
- time_base_num and time_base_denom MUST be relatively prime
- time_base_denom MUST be < 2^31
- examples:
- fps time_base_num time_base_denom
- 30 1 30
- 29.97 1001 30000
- 23.976 1001 24000
- There MUST NOT be 2 identical timebases in a file.
- There SHOULD NOT be more timebases than streams.
-
-time_base_id (v)
- index into the time_base table
- MUST be < time_base_count.
-
-convert_ts
- To convert a timestamp from one timebase to another, the following
- calculation is defined:
-
- ln = from_time_base_num*to_time_base_denom
- sn = from_timestamp
- d1 = from_time_base_denom
- d2 = to_time_base_num
- timestamp = (ln/d1*sn + ln%d1*sn/d1)/d2
- Note: This calculation MUST be done with unsigned 64 bit integers, and
- is equivalent to (ln*sn)/(d1*d2) but this would require a 96 bit integer.
-
-compare_ts
- Compares timestamps from 2 different timebases,
- if a is before b then compare_ts(a, b) = -1
- if a is after b then compare_ts(a, b) = 1
- else compare_ts(a, b) = 0
-
- Care must be taken that this is done exactly with no rounding errors,
- simply casting to float or double and doing the obvious
- a*timebase > b*timebase is not compliant or correct, neither is the
- same with integers, and
- a*a_timebase.num*b_timebase.den > b*b_timebase.num*a_timebase.den
- will overflow. One possible implementation which shouldn't overflow
- within the range of legal timestamps and timebases is:
-
- if (convert_ts(a, a_timebase, b_timebase) < b) return -1;
- if (convert_ts(b, b_timebase, a_timebase) < a) return 1;
- return 0;
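
The convert_ts and compare_ts definitions above can be sketched in C as follows. This is only an illustration under assumptions: the NutTimeBase struct and the four-argument form of compare_ts are hypothetical conveniences, and the code relies on the timebase limits stated above to stay within unsigned 64 bit arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for one entry of the time_base table. */
typedef struct NutTimeBase {
    uint64_t num;
    uint64_t den;
} NutTimeBase;

/* Convert timestamp sn from timebase 'from' to timebase 'to', using the
 * split multiplication from the text so that no 96 bit product is needed. */
static uint64_t convert_ts(uint64_t sn, NutTimeBase from, NutTimeBase to)
{
    uint64_t ln = from.num * to.den;
    uint64_t d1 = from.den;
    uint64_t d2 = to.num;
    return (ln / d1 * sn + ln % d1 * sn / d1) / d2;
}

/* Return -1 if a is before b, 1 if a is after b, 0 otherwise.
 * Each timestamp is converted into the other's timebase, as in the
 * implementation suggested in the text. */
static int compare_ts(uint64_t a, NutTimeBase a_tb, uint64_t b, NutTimeBase b_tb)
{
    if (convert_ts(a, a_tb, b_tb) < b) return -1;
    if (convert_ts(b, b_tb, a_tb) < a) return  1;
    return 0;
}
```

For example, 30 ticks of a 1/30 timebase convert to 1000 ticks of a 1/1000 timebase, and the two compare as equal.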
-
-msb_pts_shift (v)
- number of bits in lsb_pts
- MUST be <16.
-
-decode_delay (v)
- Size of the reordering buffer used to convert pts to dts.
- Codecs which do not support B-frames normally use 0.
- MPEG-1/MPEG-2-style codecs with B-frames use 1.
- H.264-style B-pyramid uses 2.
- H.264 and future codecs might need values >2.
- Audio codecs generally use 0. (We are not aware of any, but it
- is theoretically possible that a codec might need a value >0.)
- decode_delay MUST NOT be set higher than necessary for a codec.
-
-stream_flags (v)
- Bit Name Description
- 1 FLAG_FIXED_FPS indicates that the fps is fixed
-
-codec_specific_data (vb)
- Private global data for a codec (could be huffman tables or ...).
- If a codec has a global header it SHOULD be placed in here instead of
- at the start of every keyframe.
- The exact format is specified in the codec specification.
- For H.264 the NAL units MUST be formatted as in a bytestream
- (with 00 00 01 prefixes).
- codec_specific_data SHOULD contain exactly the essential global packets
- needed to decode a stream, more specifically it SHOULD NOT contain packets
- which contain only non essential metadata like author, title, ...
- It also MUST NOT contain normal packets which cause the reference decoder
- to generate any specific decoded samples.
- The encoder name and version shall be considered essential as it is very
- useful to work around possible encoder bugs.
- The global headers MUST consist of the normal
- sequence of header packets required for codec initialization, in the
- order defined in the codec spec. An implementation MAY strip metadata and
- other redundant information not necessary for correct playback from the
- global headers as long as no incorrect values are stored and as long as
- the stripped result is not less valid per codec spec as before stripping.
-
-frame_code (f(8))
- frame_code is an 8-bit field which exists before every frame, it can
- store part of the size of the frame, the stream number, the timestamp
- and some flags amongst other things. What is not directly stored
- in it but is needed is stored in various fields immediately after it.
- The values stored in it can be found in the main header.
- The value 78 ('N') is forbidden to ensure that the byte is always
- different from the first byte of any startcode.
- A muxer SHOULD mark 0x00 and 0xFF as invalid to improve error
- detection.
-
-flags[frame_code], frame_flags (v)
- Bit Name Description
- 0 FLAG_KEY If set, the frame is a keyframe.
- 1 FLAG_EOR If set, the stream has no relevance on
- presentation. (EOR)
- 3 FLAG_CODED_PTS If set, coded_pts is in the frame header.
- 4 FLAG_STREAM_ID If set, stream_id is coded in the frame header.
- 5 FLAG_SIZE_MSB If set, data_size_msb is coded in the frame header,
- otherwise data_size_msb is 0.
- 6 FLAG_CHECKSUM If set, the frame header contains a checksum.
- 7 FLAG_RESERVED If set, reserved_count is coded in the frame header.
- 12 FLAG_CODED If set, coded_flags are stored in the frame header.
- 13 FLAG_INVALID If set, frame_code is invalid.
-
- EOR frames MUST be zero-length and must be marked as keyframes.
- All streams SHOULD end with EOR, where the pts of the EOR indicates the
- end presentation time of the final frame.
- An EOR set stream is unset by the first content frames.
- EOR can only be unset in streams with zero decode_delay.
- FLAG_CHECKSUM MUST be set if the frame's data_size is strictly greater than
- 2*max_distance or the difference abs(pts-last_pts) is strictly greater than
- max_pts_distance (where pts represents this frame's pts and last_pts is
- defined as below).
-
-last_pts
- The timestamp of the last frame with the same stream_id as the current.
- If there is no such frame between the last syncpoint and the current
- frame then the syncpoint timestamp is used, see global_key_pts.
-
-stream_id[frame_code] (v)
- If FLAG_STREAM_ID is not set then this is the stream number for the
- frame following this frame_code.
- If FLAG_STREAM_ID is set then this value has no meaning.
- MUST be <250.
-
-data_size_mul[frame_code] (v)
- If FLAG_SIZE_MSB is set then data_size_msb which is stored after the
- frame code is multiplied with it and forms the more significant part
- of the size of the following frame.
- If FLAG_SIZE_MSB is not set then this field has no meaning.
- MUST be <16384.
-
-data_size_lsb[frame_code] (v)
- The less significant part of the size of the following frame.
- This added together with data_size_mul*data_size_msb is the size of
- the following frame.
- MUST be <16384.
-
-pts_delta[frame_code] (s)
- If FLAG_CODED_PTS is set in the flags of the current frame then this
- value MUST be ignored, if FLAG_CODED_PTS is not set then pts_delta is the
- difference between the current pts and last_pts.
- MUST be <16384 and >-16384.
-
-reserved_count[frame_code] (v)
- MUST be <256.
-
-data_size
- The size of the following frame.
- data_size = data_size_lsb + data_size_msb * data_size_mul ;
-
-coded_pts (v)
- If coded_pts < ( 1 << msb_pts_shift ) then it is an lsb
- pts, otherwise it is a full pts + ( 1 << msb_pts_shift ).
- lsb pts is converted to a full pts by:
- mask = ( 1 << msb_pts_shift ) - 1;
- delta = last_pts - mask / 2
- pts = ( (pts_lsb - delta) & mask ) + delta
-
-lsb_pts
- Least significant bits of the pts in time_base precision.
- Example: IBBP display order
- keyframe pts=0 -> pts=0
- frame lsb_pts=3 -> pts=3
- frame lsb_pts=1 -> pts=1
- frame lsb_pts=2 -> pts=2
- ...
- keyframe msb_pts=257 -> pts=257
- frame lsb_pts=255 -> pts=255
- frame lsb_pts=0 -> pts=256
- frame lsb_pts=4 -> pts=260
- frame lsb_pts=2 -> pts=258
- frame lsb_pts=3 -> pts=259
- All pts values of keyframes of a single stream MUST be monotone.
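
The lsb to full pts conversion and the example above can be reproduced with a short C sketch. The function names are hypothetical. The example values correspond to msb_pts_shift = 8, which is an inference from the numbers shown rather than something stated in the text.

```c
#include <assert.h>
#include <stdint.h>

/* Expand an lsb pts to a full pts close to last_pts, following
 *   mask  = (1 << msb_pts_shift) - 1
 *   delta = last_pts - mask / 2
 *   pts   = ((pts_lsb - delta) & mask) + delta */
static int64_t nut_lsb_to_pts(int64_t pts_lsb, int64_t last_pts,
                              unsigned msb_pts_shift)
{
    int64_t mask  = ((int64_t)1 << msb_pts_shift) - 1;
    int64_t delta = last_pts - mask / 2;
    /* Mask in unsigned arithmetic so that a negative difference wraps in
     * a well defined way. */
    return (int64_t)((uint64_t)(pts_lsb - delta) & (uint64_t)mask) + delta;
}

/* Decode a coded_pts field: small values are lsb pts, larger values are
 * a full pts offset by 1 << msb_pts_shift. */
static int64_t nut_decode_coded_pts(uint64_t coded_pts, int64_t last_pts,
                                    unsigned msb_pts_shift)
{
    uint64_t limit = (uint64_t)1 << msb_pts_shift;
    assert(msb_pts_shift < 16); /* msb_pts_shift MUST be <16 */
    if (coded_pts < limit)
        return nut_lsb_to_pts((int64_t)coded_pts, last_pts, msb_pts_shift);
    return (int64_t)(coded_pts - limit);
}
```

With last_pts = 255, an lsb_pts of 0 expands to 256, which matches the third frame after the second keyframe in the example.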
-
-dts
- decoding timestamp
- The dts of a frame is the timestamp of the first sample which is
- output by a decoder when it is fed with the frame. Note that the
- data output is not necessarily what is coded in the frame, but may
- be data from previous frames.
- dts is calculated by using a decode_delay + 1 sized buffer for each
- stream, into which the current pts is inserted and the element with
- the smallest value is removed. This is then the current dts.
- This buffer is initialized with decode_delay elements of value -1.
-
- pts of all frames in all streams MUST be bigger or equal to dts of all
- previous frames in all streams, compared in common timebase. (EOR
- frames are NOT exempt from this rule.)
- dts of all frames MUST be bigger or equal to dts of all previous frames
- in the same stream.
-
-width (v) / height (v)
- Width and height of the video in pixels.
- MUST be set to the coded width/height, MUST NOT be 0.
-
-sample_width (v) / sample_height (v) (aspect ratio)
- sample_width is the horizontal distance between samples.
- sample_width and sample_height MUST be relatively prime if not zero.
- Both MUST be 0 if unknown otherwise both MUST be nonzero.
-
-colorspace_type (v)
- 0 unknown
- 1 ITU Rec 624 / ITU Rec 601 Y range: 16..235 Cb/Cr range: 16..240
- 2 ITU Rec 709 Y range: 16..235 Cb/Cr range: 16..240
- 17 ITU Rec 624 / ITU Rec 601 Y range: 0..255 Cb/Cr range: 0..255
- 18 ITU Rec 709 Y range: 0..255 Cb/Cr range: 0..255
-
-samplerate_num (v) / samplerate_denom (v) = samplerate
- The number of samples per second, MUST NOT be 0.
-
-crc32 checksum
- Generator polynomial is 0x104C11DB7. Starting value is zero.
-
-checksum (u(32))
- crc32 checksum
- The checksum is calculated for the area pointed to by forward_ptr
- not including the checksum itself (from first byte after the
- packet_header until last byte before the checksum).
- For frame headers the checksum contains the framecode byte and all
- following bytes up to the checksum itself.
-
-header_checksum (u(32))
- Checksum over the startcode and forward pointer.
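
A bitwise C sketch of the crc32 described above follows. The generator polynomial and the zero starting value come from the text. The MSB-first bit order without reflection or final XOR is an assumption of this sketch, and nut_crc32 is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC-32 with generator polynomial 0x104C11DB7 and starting value zero.
 * Assumption: bits are processed MSB first, with no reflection and no
 * final XOR. A table driven version would be faster. */
static uint32_t nut_crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0; /* starting value is zero */
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= (uint32_t)data[i] << 24;
        for (bit = 0; bit < 8; bit++) {
            if (crc & 0x80000000u)
                crc = (crc << 1) ^ 0x04C11DB7u; /* low 32 bits of the polynomial */
            else
                crc <<= 1;
        }
    }
    return crc;
}
```

Under these assumptions, running the function over a packet body together with its big-endian checksum yields zero, which gives a demuxer a simple validity check.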
-
-Syncpoint tags:
----------------
-
-back_ptr_div16 (v)
- back_ptr = back_ptr_div16 * 16 + 15
- back_ptr must point to a position up to 15 bytes before a syncpoint
- startcode, relative to position of current syncpoint. The syncpoint
- pointed to MUST be the closest syncpoint such that at least one keyframe
- with a pts lower or equal to the current syncpoint's global_key_pts for
- all streams lies between it and the current syncpoint.
-
- A stream where EOR is set is to be ignored for back_ptr.
-
-global_key_pts (t)
- After a syncpoint, last_pts of each stream is to be set to:
- last_pts[i] = convert_ts(global_key_pts, time_base[id], time_base[i])
-
- global_key_pts MUST be bigger or equal to dts of all past frames across
- all streams, and smaller or equal to pts of all future frames.
-
-Index tags:
------------
-
-max_pts (t)
- the highest pts in the entire file
-
-syncpoints (v)
- number of indexed syncpoints
-
-syncpoint_pos_div16 (v)
- The offset from the beginning of the file to up to 15 bytes before the
- syncpoint referred to in this index entry. Relative to position of last
- syncpoint.
-
-has_keyframe
- Indicates whether this stream has a keyframe between this syncpoint and
- the last syncpoint.
-
-keyframe_pts
- The pts of the first keyframe for this stream in the region between the
- 2 syncpoints, in the stream's timebase. (EOR frames are also keyframes.)
-
-eor_pts
- Coded only if EOR is set at the position of the syncpoint. The pts of
- that EOR. EOR is unset by the first keyframe after it.
-
-index_ptr (u(64))
- Length in bytes of the entire index, from the first byte of the
- startcode until the last byte of the checksum.
- Note: A demuxer can use this to find the index when it is written at
- EOF, as index_ptr will always be 12 bytes before the end of file if
- there is an index at all.
-
-
-Info tags:
-----------
-
-stream_id_plus1 (v)
- Stream this info packet applies to. If zero, packet applies to the
- whole file.
-
-chapter_id (s)
- The ID of the chapter this packet applies to. If zero, the packet applies
- to the whole file. Positive chapter_id values represent real chapters and
- MUST NOT overlap.
- A negative chapter_id indicates a sub region of the file and not a real
- chapter. chapter_id MUST be unique to the region it represents.
- chapter_id n MUST NOT be used unless there are at least n chapters in the
- file.
-
-chapter_start (t)
- timestamp of start of chapter
-
-chapter_len (v)
- Length of chapter in the same timebase as chapter_start.
-
-count (v)
- number of name/value pairs in this info packet
-
-type
- for example: "UTF-8" -> string or "JPEG" -> JPEG image
- "v" -> unsigned integer
- "s" -> signed integer
- "r" -> rational
- Note: Nonstandard fields should be prefixed by "X-".
- Note: MUST be less than 6 bytes long (might be increased to 64 later).
-
-info packet types
- The name of the info entry. Valid names are
- "Author"
- "Description"
- "Copyright"
- "Encoder"
- The name & version of the software used for encoding.
- "Title"
- "Cover" (allowed types are "PNG" and "JPEG")
- image of the (CD, DVD, VHS, ..) cover (preferably PNG or JPEG)
- "Source"
- "DVD", "VCD", "CD", "MD", "FM radio", "VHS", "TV", "LD"
- Optional: Appended PAL, NTSC, SECAM, ... in parentheses.
- "SourceContainer"
- "nut", "mkv", "mov", "avi", "ogg", "rm", "mpeg-ps", "mpeg-ts", "raw"
- "SourceCodecTag"
- The source codec ID like a FourCC which was used to store a specific
- stream in its SourceContainer.
- "CaptureDevice"
- "BT878", "BT848", "webcam", ... (or more precise names)
- "CreationTime"
- "2003-01-20 20:13:15Z", ...
- (ISO 8601 format, see http://www.cl.cam.ac.uk/~mgk25/iso-time.html)
- Note: Do not forget the timezone.
- "Keywords"
- "Language"
- An ISO 639-2 (three-letter) language code, optionally followed by an
- ISO 3166-1 country code that is separated from the language
- code by a hyphen. All codes defined in ISO 639-2 are allowed,
- including "und" (Undetermined), "mul" (Multiple languages).
- See http://www.loc.gov/standards/iso639-2/
- and http://www.din.de/gremien/nas/nabd/iso3166ma/codlstp1/en_listp1.html
- the language code
- A demuxer MUST ignore unknown language and country codes instead of
- treating them as an error.
- "Disposition"
- "original", "dub" (translated), "comment", "lyrics", "karaoke"
- Note: If someone needs some others, please tell us about them, so we
- can add them to the official standard (if they are sane).
- Note: Nonstandard fields should be prefixed by "X-".
- Note: Names of fields SHOULD be in English if a word with the same
- meaning exists in English.
- Note: MUST be less than 64 bytes long.
-
-value
- value of this name/type pair
-
-stuffing
- 0x80 can be placed in front of any type v entry for stuffing purposes.
- Exceptions are the forward_ptr and all fields in the frame header where
- a maximum of 8 stuffing bytes per field are allowed.
-
-
-Structure:
-----------
-
-The headers MUST be in exactly the following order (to simplify demuxer design).
-
-main header
-stream_header (id=0)
-stream_header (id=1)
-...
-stream_header (id=n)
-
-Headers may be repeated, but if they are, then they MUST all be repeated
-together and repeated headers MUST be identical.
-
-Each set of repeated headers not at the beginning or end of the file SHOULD
-be stored at the earliest possible position after 2^x where x is an integer
-and the end of the file. So the headers may be repeated at 4102 if that is
-the closest position after 2^12=4096 at which the headers can be placed.
-
-Note: This allows an implementation reading the file to locate backup
-headers in O(log filesize) time as opposed to O(filesize).
-
-Headers MUST be placed at least at the start of the file and immediately before
-the index or at the end of the file if there is no index.
-Headers MUST be repeated at least twice (so they exist three times in a file).
-
-There MUST be a syncpoint immediately before the first frame after any headers.
-
-
-Index:
-------
-
-Note: With realtime streaming, there is no end, so no index there either.
-Index MAY only be repeated after main headers.
-If an index is written anywhere in the file, it MUST be written at end of
-file as well.
-
-
-Info:
------
-
-If an info packet is stored anywhere then a muxer MUST also store an identical
-info packet after every main-stream-header set.
-
-If a demuxer has seen several info packets with the same chapter_id and
-stream_id then it MUST ignore all but the one with the highest position in
-the file.
-
-Demuxers SHOULD NOT search the whole file for info packets.
-
-demuxer (non-normative):
-------------------------
-
-In the absence of a valid header at the beginning, players SHOULD search for
-backup headers starting at offset 2^x; for each x players SHOULD end their
-search at a particular offset when any startcode (including a syncpoint) is
-found.
-
-
-Seeking without an index (non-normative):
------------------------------------------
-A. backward seeking
- 1. Perform a binary search on the syncpoint timestamps finding the one
- which is largest and <= the target timestamp.
-B. forward seeking
- 1a. Perform a binary search on the syncpoint timestamps finding the one
- which is smallest and >= the target timestamp.
- 1b. Perform a binary search on the syncpoint back pointers finding the
- smallest one which has a back ptr >= the position of what was found in 1.
-2. Follow the back pointer to the corresponding syncpoint.
-
-Seeking with an index (non-normative):
---------------------------------------
-The demuxer only has to find the appropriate keyframe in the index and
-start demuxing from the previous syncpoint.
-
-Note, more complicated seeking methods exist which are capable of quickly
-seeking to the optimal point in the presence of an index even if only a
-subset of all streams is active.
-
-A muxer SHOULD place syncpoints so that simple low complexity seeking
-works with fine granularity. That is, syncpoints should be placed prior
-to keyframes instead of non-keyframes and with high enough frequency
-(once per second unless there are no keyframes between this and the previous
-syncpoint).
-
-Encoders SHOULD place keyframes so that the number of points where all
-streams have a keyframe at the same time is maximized. This ensures that
-seeking (complicated or not) does not need to demux and decode significant
-amounts of data to reach a point where a presentable frame for each stream
-is available after seeking.
-
-
-Semantic requirements:
-======================
-
-If more than one stream of a given stream class is present, each one SHOULD
-have info tags specifying disposition, and if applicable, language.
-It often highly improves usability and is therefore strongly encouraged.
-
-A demuxer MUST NOT demux a stream which contains more than one stream, or which
-is wrapped in a structure to facilitate more than one stream or otherwise
-duplicate the role of a container. Any such file is to be considered invalid.
-For example Vorbis in Ogg in NUT is invalid, as is
-mpegvideo + mpegaudio in MPEG-PS/TS in NUT or dvvideo + dvaudio in DV in NUT.
-
-
-
-Sample code (Public Domain, & untested):
-========================================
-
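-typedef struct BufferContext{
-    uint8_t *buf;
-    uint8_t *buf_ptr;
-    uint8_t *buf_end; // assumed field: one past the last byte of buf
-}BufferContext;
-
-// assumed helper (not in the original sample): bytes left in the buffer
-#define space_left(bc) ((int)((bc)->buf_end - (bc)->buf_ptr))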
-
-static inline uint64_t get_bytes(BufferContext *bc, int count){
-    uint64_t val=0;
-    int i;
-
-    assert(count>0 && count<9);
-
-    for(i=0; i<count; i++){
-        val <<=8;
-        val += *(bc->buf_ptr++);
-    }
-
-    return val;
-}
-
-static inline void put_bytes(BufferContext *bc, int count, uint64_t val){
-    int i;
-
-    assert(count>0 && count<9);
-
-    // write the count least significant bytes of val, most significant first
-    for(i=count-1; i>=0; i--){
-        *(bc->buf_ptr++)= val >> (8*i);
-    }
-}
-
-static inline uint64_t get_v(BufferContext *bc){
- uint64_t val= 0;
-
- for(; space_left(bc) > 0; ){
- int tmp= *(bc->buf_ptr++);
- if(tmp&0x80)
- val= (val<<7) + tmp - 0x80;
- else
- return (val<<7) + tmp;
- }
-
- return -1;
-}
-
-static inline int put_v(BufferContext *bc, uint64_t val){
- int i;
-
- if(space_left(bc) < 9) return -1;
-
- val &= 0x7FFFFFFFFFFFFFFFULL; // FIXME: Can only encode up to 63 bits ATM.
- for(i=7; ; i+=7){
- if(val>>i == 0) break;
- }
-
- for(i-=7; i>0; i-=7){
- *(bc->buf_ptr++)= 0x80 | (val>>i);
- }
- *(bc->buf_ptr++)= val&0x7F;
-
- return 0;
-}
-
-static int64_t get_dts(int64_t pts, int64_t *pts_cache, int delay, int reset){
- if(reset) memset(pts_cache, -1, delay*sizeof(int64_t));
-
- while(delay--){
- int64_t t= pts_cache[delay];
- if(t < pts){
- pts_cache[delay]= pts;
- pts= t;
- }
- }
-
- return pts;
-}
-
-
-
-Authors:
-========
-
-Folks from the MPlayer developers mailing list (http://www.mplayerhq.hu/).
-Authors in alphabetical order: (FIXME! Tell us if we left you out)
- Beregszaszi, Alex (alex@fsn.hu)
- Bunkus, Moritz (moritz@bunkus.org)
- Diedrich, Tobias (ranma+mplayer@tdiedrich.de)
- Felker, Rich (dalias@aerifal.cx)
- Franz, Fabian (FabianFranz@gmx.de)
- Gereoffy, Arpad (arpi@thot.banki.hu)
- Hess, Andreas (jaska@gmx.net)
- Niedermayer, Michael (michaelni@gmx.at)
- Shimon, Oded (ods15@ods15.dyndns.org)