123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459 |
- <?xml version="1.0" standalone="no"?>
- <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
- "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
- ]>
- <section id="vorbis-spec-codebook">
- <sectioninfo>
- <releaseinfo>
- $Id$
- </releaseinfo>
- </sectioninfo>
- <title>Probability Model and Codebooks</title>
- <section>
- <title>Overview</title>
- <para>
- Unlike practically every other mainstream audio codec, Vorbis has no
- statically configured probability model, instead packing all entropy
- decoding configuration, VQ and Huffman, into the bitstream itself in
- the third header, the codec setup header. This packed configuration
- consists of multiple 'codebooks', each containing a specific
- Huffman-equivalent representation for decoding compressed codewords as
- well as an optional lookup table of output vector values to which a
- decoded Huffman value is applied as an offset, generating the final
- decoded output corresponding to a given compressed codeword.</para>
- <section><title>Bitwise operation</title>
- <para>
- The codebook mechanism is built on top of the vorbis bitpacker. Both
- the codebooks themselves and the codewords they decode are unrolled
- from a packet as a series of arbitrary-width values read from the
- stream according to <xref linkend="vorbis-spec-bitpacking"/>.</para>
- </section>
- </section>
- <section>
- <title>Packed codebook format</title>
- <para>
- For purposes of the examples below, we assume that the storage
- system's native byte width is eight bits. This is not universally
- true; see <xref linkend="vorbis-spec-bitpacking"/> for discussion
- relating to non-eight-bit bytes.</para>
- <section><title>codebook decode</title>
- <para>
- A codebook begins with a 24 bit sync pattern, 0x564342:
- <screen>
- byte 0: [ 0 1 0 0 0 0 1 0 ] (0x42)
- byte 1: [ 0 1 0 0 0 0 1 1 ] (0x43)
- byte 2: [ 0 1 0 1 0 1 1 0 ] (0x56)
- </screen></para>
- <para>
- 16 bit <varname>[codebook_dimensions]</varname> and 24 bit <varname>[codebook_entries]</varname> fields:
- <screen>
- byte 3: [ X X X X X X X X ]
- byte 4: [ X X X X X X X X ] [codebook_dimensions] (16 bit unsigned)
- byte 5: [ X X X X X X X X ]
- byte 6: [ X X X X X X X X ]
- byte 7: [ X X X X X X X X ] [codebook_entries] (24 bit unsigned)
- </screen></para>
- <para>
- Next is the <varname>[ordered]</varname> bit flag:
- <screen>
- byte 8: [ X ] [ordered] (1 bit)
- </screen></para>
- <para>
- Each entry, numbering a
- total of <varname>[codebook_entries]</varname>, is assigned a codeword length.
- We now read the list of codeword lengths and store these lengths in
- the array <varname>[codebook_codeword_lengths]</varname>. Decode of lengths is
- according to whether the <varname>[ordered]</varname> flag is set or unset.
- <itemizedlist>
- <listitem>
- <para>If the <varname>[ordered]</varname> flag is unset, the codeword list is not
- length ordered and the decoder needs to read each codeword length
- one-by-one.</para>
- <para>The decoder first reads one additional bit flag, the
- <varname>[sparse]</varname> flag. This flag determines whether or not the
- codebook contains unused entries that are not to be included in the
- codeword decode tree:
- <screen>
- byte 8: [ X 1 ] [sparse] flag (1 bit)
- </screen></para>
- <para>
- The decoder now performs for each of the <varname>[codebook_entries]</varname>
- codebook entries:
- <screen>
-
- 1) if([sparse] is set){
- 2) [flag] = read one bit;
- 3) if([flag] is set){
- 4) [length] = read a five bit unsigned integer;
- 5) codeword length for this entry is [length]+1;
- } else {
- 6) this entry is unused. mark it as such.
- }
- } else the sparse flag is not set {
- 7) [length] = read a five bit unsigned integer;
- 8) the codeword length for this entry is [length]+1;
-
- }
- </screen></para>
- </listitem>
- <listitem>
- <para>If the <varname>[ordered]</varname> flag is set, the codeword list for this
- codebook is encoded in ascending length order. Rather than reading
- a length for every codeword, the encoder reads the number of
- codewords per length. That is, beginning at entry zero:
- <screen>
- 1) [current_entry] = 0;
- 2) [current_length] = read a five bit unsigned integer and add 1;
- 3) [number] = read <link linkend="vorbis-spec-ilog">ilog</link>([codebook_entries] - [current_entry]) bits as an unsigned integer
- 4) set the entries [current_entry] through [current_entry]+[number]-1, inclusive,
- of the [codebook_codeword_lengths] array to [current_length]
- 5) set [current_entry] to [number] + [current_entry]
- 6) increment [current_length] by 1
- 7) if [current_entry] is greater than [codebook_entries] ERROR CONDITION;
- the decoder will not be able to read this stream.
- 8) if [current_entry] is less than [codebook_entries], repeat process starting at 3)
- 9) done.
- </screen></para>
- </listitem>
- </itemizedlist>
- After all codeword lengths have been decoded, the decoder reads the
- vector lookup table. Vorbis I supports three lookup types:
- <orderedlist>
- <listitem>
- <simpara>No lookup</simpara>
- </listitem><listitem>
- <simpara>Implicitly populated value mapping (lattice VQ)</simpara>
- </listitem><listitem>
- <simpara>Explicitly populated value mapping (tessellated or 'foam'
- VQ)</simpara>
- </listitem>
- </orderedlist>
- </para>
- <para>
- The lookup table type is read as a four bit unsigned integer:
- <screen>
- 1) [codebook_lookup_type] = read four bits as an unsigned integer
- </screen></para>
- <para>
- Codebook decode precedes according to <varname>[codebook_lookup_type]</varname>:
- <itemizedlist>
- <listitem>
- <para>Lookup type zero indicates no lookup to be read. Proceed past
- lookup decode.</para>
- </listitem><listitem>
- <para>Lookup types one and two are similar, differing only in the
- number of lookup values to be read. Lookup type one reads a list of
- values that are permuted in a set pattern to build a list of vectors,
- each vector of order <varname>[codebook_dimensions]</varname> scalars. Lookup
- type two builds the same vector list, but reads each scalar for each
- vector explicitly, rather than building vectors from a smaller list of
- possible scalar values. Lookup decode proceeds as follows:
- <screen>
- 1) [codebook_minimum_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer)
- 2) [codebook_delta_value] = <link linkend="vorbis-spec-float32_unpack">float32_unpack</link>( read 32 bits as an unsigned integer)
- 3) [codebook_value_bits] = read 4 bits as an unsigned integer and add 1
- 4) [codebook_sequence_p] = read 1 bit as a boolean flag
- if ( [codebook_lookup_type] is 1 ) {
-
- 5) [codebook_lookup_values] = <link linkend="vorbis-spec-lookup1_values">lookup1_values</link>(<varname>[codebook_entries]</varname>, <varname>[codebook_dimensions]</varname> )
- } else {
- 6) [codebook_lookup_values] = <varname>[codebook_entries]</varname> * <varname>[codebook_dimensions]</varname>
- }
- 7) read a total of [codebook_lookup_values] unsigned integers of [codebook_value_bits] each;
- store these in order in the array [codebook_multiplicands]
- </screen></para>
- </listitem><listitem>
- <para>A <varname>[codebook_lookup_type]</varname> of greater than two is reserved
- and indicates a stream that is not decodable by the specification in this
- document.</para>
- </listitem>
- </itemizedlist>
- </para>
- <para>
- An 'end of packet' during any read operation in the above steps is
- considered an error condition rendering the stream undecodable.</para>
- <section><title>Huffman decision tree representation</title>
- <para>
- The <varname>[codebook_codeword_lengths]</varname> array and
- <varname>[codebook_entries]</varname> value uniquely define the Huffman decision
- tree used for entropy decoding.</para>
- <para>
- Briefly, each used codebook entry (recall that length-unordered
- codebooks support unused codeword entries) is assigned, in order, the
- lowest valued unused binary Huffman codeword possible. Assume the
- following codeword length list:
- <screen>
- entry 0: length 2
- entry 1: length 4
- entry 2: length 4
- entry 3: length 4
- entry 4: length 4
- entry 5: length 2
- entry 6: length 3
- entry 7: length 3
- </screen></para>
- <para>
- Assigning codewords in order (lowest possible value of the appropriate
- length to highest) results in the following codeword list:
- <screen>
- entry 0: length 2 codeword 00
- entry 1: length 4 codeword 0100
- entry 2: length 4 codeword 0101
- entry 3: length 4 codeword 0110
- entry 4: length 4 codeword 0111
- entry 5: length 2 codeword 10
- entry 6: length 3 codeword 110
- entry 7: length 3 codeword 111
- </screen></para>
- <note>
- <para>
- Unlike most binary numerical values in this document, we
- intend the above codewords to be read and used bit by bit from left to
- right, thus the codeword '001' is the bit string 'zero, zero, one'.
- When determining 'lowest possible value' in the assignment definition
- above, the leftmost bit is the MSb.</para>
- </note>
- <para>
- It is clear that the codeword length list represents a Huffman
- decision tree with the entry numbers equivalent to the leaves numbered
- left-to-right:
- <mediaobject>
- <imageobject>
- <imagedata fileref="hufftree.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>[huffman tree illustration]</phrase>
- </textobject>
- </mediaobject>
- </para>
- <para>
- As we assign codewords in order, we see that each choice constructs a
- new leaf in the leftmost possible position.</para>
- <para>
- Note that it's possible to underspecify or overspecify a Huffman tree
- via the length list. In the above example, if codeword seven were
- eliminated, it's clear that the tree is unfinished:
- <mediaobject>
- <imageobject>
- <imagedata fileref="hufftree-under.png" format="PNG"/>
- </imageobject>
- <textobject>
- <phrase>[underspecified huffman tree illustration]</phrase>
- </textobject>
- </mediaobject>
- </para>
- <para>
- Similarly, in the original codebook, it's clear that the tree is fully
- populated and a ninth codeword is impossible. Both underspecified and
- overspecified trees are an error condition rendering the stream
- undecodable.</para>
- <para>
- Codebook entries marked 'unused' are simply skipped in the assigning
- process. They have no codeword and do not appear in the decision
- tree, thus it's impossible for any bit pattern read from the stream to
- decode to that entry number.</para>
- </section>
- <section><title>VQ lookup table vector representation</title>
- <para>
- Unpacking the VQ lookup table vectors relies on the following values:
- <programlisting>
- the [codebook_multiplicands] array
- [codebook_minimum_value]
- [codebook_delta_value]
- [codebook_sequence_p]
- [codebook_lookup_type]
- [codebook_entries]
- [codebook_dimensions]
- [codebook_lookup_values]
- </programlisting>
- </para>
- <para>
- Decoding (unpacking) a specific vector in the vector lookup table
- proceeds according to <varname>[codebook_lookup_type]</varname>. The unpacked
- vector values are what a codebook would return during audio packet
- decode in a VQ context.</para>
- <section><title>Vector value decode: Lookup type 1</title>
- <para>
- Lookup type one specifies a lattice VQ lookup table built
- algorithmically from a list of scalar values. Calculate (unpack) the
- final values of a codebook entry vector from the entries in
- <varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname>
- is the output vector representing the vector of values for entry number
- <varname>[lookup_offset]</varname> in this codebook):
- <screen>
- 1) [last] = 0;
- 2) [index_divisor] = 1;
- 3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) {
-
- 4) [multiplicand_offset] = ( [lookup_offset] divided by [index_divisor] using integer
- division ) integer modulo [codebook_lookup_values]
- 5) vector [value_vector] element [i] =
- ( [codebook_multiplicands] array element number [multiplicand_offset] ) *
- [codebook_delta_value] + [codebook_minimum_value] + [last];
- 6) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i]
- 7) [index_divisor] = [index_divisor] * [codebook_lookup_values]
- }
-
- 8) vector calculation completed.
- </screen></para>
- </section>
- <section><title>Vector value decode: Lookup type 2</title>
- <para>
- Lookup type two specifies a VQ lookup table in which each scalar in
- each vector is explicitly set by the <varname>[codebook_multiplicands]</varname>
- array in a one-to-one mapping. Calculate [unpack] the
- final values of a codebook entry vector from the entries in
- <varname>[codebook_multiplicands]</varname> as follows (<varname>[value_vector]</varname>
- is the output vector representing the vector of values for entry number
- <varname>[lookup_offset]</varname> in this codebook):
- <screen>
- 1) [last] = 0;
- 2) [multiplicand_offset] = [lookup_offset] * [codebook_dimensions]
- 3) iterate [i] over the range 0 ... [codebook_dimensions]-1 (once for each scalar value in the value vector) {
- 4) vector [value_vector] element [i] =
- ( [codebook_multiplicands] array element number [multiplicand_offset] ) *
- [codebook_delta_value] + [codebook_minimum_value] + [last];
- 5) if ( [codebook_sequence_p] is set ) then set [last] = vector [value_vector] element [i]
- 6) increment [multiplicand_offset]
- }
-
- 7) vector calculation completed.
- </screen></para>
- </section>
- </section>
- </section>
- </section>
- <section>
- <title>Use of the codebook abstraction</title>
- <para>
- The decoder uses the codebook abstraction much as it does the
- bit-unpacking convention; a specific codebook reads a
- codeword from the bitstream, decoding it into an entry number, and then
- returns that entry number to the decoder (when used in a scalar
- entropy coding context), or uses that entry number as an offset into
- the VQ lookup table, returning a vector of values (when used in a context
- desiring a VQ value). Scalar or VQ context is always explicit; any call
- to the codebook mechanism requests either a scalar entry number or a
- lookup vector.</para>
- <para>
- Note that VQ lookup type zero indicates that there is no lookup table;
- requesting decode using a codebook of lookup type 0 in any context
- expecting a vector return value (even in a case where a vector of
- dimension one) is forbidden. If decoder setup or decode requests such
- an action, that is an error condition rendering the packet
- undecodable.</para>
- <para>
- Using a codebook to read from the packet bitstream consists first of
- reading and decoding the next codeword in the bitstream. The decoder
- reads bits until the accumulated bits match a codeword in the
- codebook. This process can be though of as logically walking the
- Huffman decode tree by reading one bit at a time from the bitstream,
- and using the bit as a decision boolean to take the 0 branch (left in
- the above examples) or the 1 branch (right in the above examples).
- Walking the tree finishes when the decode process hits a leaf in the
- decision tree; the result is the entry number corresponding to that
- leaf. Reading past the end of a packet propagates the 'end-of-stream'
- condition to the decoder.</para>
- <para>
- When used in a scalar context, the resulting codeword entry is the
- desired return value.</para>
- <para>
- When used in a VQ context, the codeword entry number is used as an
- offset into the VQ lookup table. The value returned to the decoder is
- the vector of scalars corresponding to this offset.</para>
- </section>
- </section>
- <!-- end section of probablity model and codebooks -->
|