<?xml version="1.0" standalone="no"?>
<!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
]>
<section id="vorbis-spec-bitpacking">
<sectioninfo>
<releaseinfo>
$Id$
</releaseinfo>
</sectioninfo>
<title>Bitpacking Convention</title>
<section>
<title>Overview</title>
<para>
The Vorbis codec uses relatively unstructured raw packets containing
arbitrary-width binary integer fields. Logically, these packets are a
bitstream in which bits are coded one-by-one by the encoder and then
read one-by-one in the same monotonically increasing order by the
decoder. Most current binary storage arrangements group bits into a
native word size of eight bits (octets), sixteen bits, thirty-two bits
or, less commonly, other fixed word sizes. The Vorbis bitpacking
convention specifies the correct mapping of the logical packet
bitstream into an actual representation in fixed-width words.
</para>
<section><title>octets, bytes and words</title>
<para>
In most contemporary architectures, a 'byte' is synonymous with an
'octet', that is, eight bits. This has not always been the case;
seven, ten, eleven and sixteen bit 'bytes' have been used. For
purposes of the bitpacking convention, a byte implies the native,
smallest integer storage representation offered by a platform. On
modern platforms, this is generally assumed to be eight bits, not
necessarily because of the processor but because of the
filesystem/memory architecture: modern filesystems invariably offer
bytes as the fundamental atom of storage. A 'word' is an integer
size that is a grouped multiple of this smallest size.</para>
<para>
The most ubiquitous architectures today consider a 'byte' to be an
octet (eight bits) and a word to be a group of two, four or eight
bytes (16, 32 or 64 bits). Note, however, that the Vorbis bitpacking
convention is still well defined for any native byte size; Vorbis uses
the native bit-width of a given storage system. This document assumes
that a byte is one octet for purposes of example.</para>
</section><section><title>bit order</title>
<para>
A byte has a well-defined 'least significant' bit (LSb), which is the
only bit set when the byte is storing the two's complement integer
value +1. A byte's 'most significant' bit (MSb) is at the opposite
end of the byte. Bits in a byte are numbered from zero at the LSb to
<emphasis>n</emphasis> (<emphasis>n</emphasis>=7 in an octet) for the
MSb.</para>
</section>
<section><title>byte order</title>
<para>
Words are native groupings of multiple bytes. Several byte orderings
are possible in a word; the common ones are 3-2-1-0 ('big endian' or
'most significant byte first' in which the highest-valued byte comes
first), 0-1-2-3 ('little endian' or 'least significant byte first' in
which the lowest-valued byte comes first) and, less commonly, 3-1-2-0
and 0-2-1-3 ('mixed endian').</para>
<para>
The Vorbis bitpacking convention specifies storage and bitstream
manipulation at the byte, not word, level; thus host word ordering is
of concern only during optimization, when writing high-performance
code that operates on a word of storage at a time rather than by byte.
Logically, bytes are always coded and decoded in order from byte zero
through byte <emphasis>n</emphasis>.</para>
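<para>
To illustrate why host word order matters only for optimized code, the
following is a minimal sketch in C, assuming octet bytes, of loading
four consecutive stream bytes into a 32-bit window so that stream bit
<emphasis>k</emphasis> lands at word bit <emphasis>k</emphasis> on any
host. The name <literal>load_window32</literal> is hypothetical and is
not taken from any Vorbis library.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build a 32-bit window from four stream bytes.
   Byte 0 becomes the least significant byte of the window, so the
   result does not depend on the byte order of the host. */
static uint32_t load_window32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```
]]></programlisting>
<para>
For the four-byte stream FC 48 CE 06 (hexadecimal) built in the coding
example later in this section, the window is 0x06CE48FC, and the 13-bit
field that starts at stream bit 14 is
<literal>(window &gt;&gt; 14) &amp; 0x1FFF</literal>, that is, 6969.</para>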
</section>
<section><title>coding bits into byte sequences</title>
<para>
The Vorbis codec needs to code arbitrary bit-width integers, from
zero to 32 bits wide, into packets. These integer fields are not
aligned to the boundaries of the byte representation; the next field
is written at the bit position at which the previous field ends.</para>
<para>
The encoder logically packs integers by writing the LSb of a binary
integer to the logical bitstream first, followed by the next least
significant bit, and so on, until the requested number of bits has been
coded. When packing the bits into bytes, the encoder begins by
placing the LSb of the integer to be written into the least
significant unused bit position of the destination byte, followed by
the next least significant bit of the source integer and so on up to
the requested number of bits. When all bits of the destination byte
have been filled, encoding continues by zeroing all bits of the next
byte and writing the next bit into bit position 0 of that byte.
Decoding follows the same process as encoding, but by reading bits
from the byte stream and reassembling them into integers.</para>
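<para>
The packing procedure described above can be sketched in C as follows.
This is an illustration under stated assumptions, not a reference
implementation: bytes are octets, the destination is a caller-supplied
fixed-size buffer, and the names <literal>bit_writer</literal>,
<literal>bw_init</literal>, <literal>bw_write</literal> and
<literal>bw_bytes</literal> are hypothetical. It writes one bit at a
time for clarity rather than speed.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical writer state (illustrative names only). */
typedef struct {
    uint8_t *buf;     /* destination bytes */
    size_t   cap;     /* capacity of buf, in bytes */
    size_t   bitpos;  /* number of bits written so far */
} bit_writer;

static void bw_init(bit_writer *w, uint8_t *buf, size_t cap)
{
    w->buf = buf;
    w->cap = cap;
    w->bitpos = 0;
}

/* Write the low `bits` bits of `value`, least significant bit first.
   Returns 0 on success, -1 if the buffer is too small. */
static int bw_write(bit_writer *w, uint32_t value, int bits)
{
    assert(bits >= 0 && bits <= 32);
    if ((size_t)bits > w->cap * 8 - w->bitpos)
        return -1;
    for (int i = 0; i < bits; i++) {
        size_t   byte = w->bitpos / 8;
        unsigned bit  = (unsigned)(w->bitpos % 8);
        if (bit == 0)
            w->buf[byte] = 0;          /* zero each new byte before use */
        if ((value >> i) & 1u)
            w->buf[byte] |= (uint8_t)(1u << bit);
        w->bitpos++;
    }
    return 0;
}

/* Length of the byte stream so far; a partial byte counts as one. */
static size_t bw_bytes(const bit_writer *w)
{
    return (w->bitpos + 7) / 8;
}
```
]]></programlisting>
<para>
Writing the four fields of the coding example in this section (12 in 4
bits, -1 in 3 bits, 17 in 7 bits and 6969 in 13 bits) with this sketch
yields the bytes FC 48 CE 06 (hexadecimal). A negative value such as -1
is passed as its two's complement bit pattern; only the low
<literal>bits</literal> bits are written.</para>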
</section>
<section><title>signedness</title>
<para>
The signedness of a specific number resulting from decode is to be
interpreted by the decoder given decode context. That is, the
three-bit binary pattern 'b111' can be taken to represent either 'seven' as
an unsigned integer, or '-1' as a signed, two's complement integer.
The encoder and decoder are responsible for knowing if fields are to
be treated as signed or unsigned.</para>
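<para>
When decode context calls for a signed field, the raw bits can be
sign-extended after reading. The following C sketch shows one way to do
this; <literal>sign_extend</literal> is a hypothetical helper name, and
the field width is assumed to be between 1 and 32 bits.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: interpret the low `bits` bits of `raw` as a
   two's complement integer. */
static int32_t sign_extend(uint32_t raw, int bits)
{
    assert(bits >= 1 && bits <= 32);
    uint32_t sign = (uint32_t)1 << (bits - 1);  /* the field's sign bit */
    uint32_t mask = sign | (sign - 1u);         /* all bits of the field */
    raw &= mask;
    /* Flipping the sign bit and subtracting its weight maps the upper
       half of the unsigned range onto the negative numbers. The 64-bit
       intermediate keeps the arithmetic in range for 32-bit fields. */
    return (int32_t)((int64_t)(raw ^ sign) - (int64_t)sign);
}
```
]]></programlisting>
<para>
With this sketch, the pattern 'b111' read as a three-bit field gives
-1 when sign-extended and 7 when left unsigned.</para>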
</section>
<section><title>coding example</title>
<para>
Code the 4 bit integer value '12' [b1100] into an empty bytestream.
Bytestream result:
<screen>
              |
              V
        7 6 5 4 3 2 1 0
byte 0 [0 0 0 0 1 1 0 0]  &lt;-
byte 1 [               ]
byte 2 [               ]
byte 3 [               ]
             ...
byte n [               ]  bytestream length == 1 byte
</screen>
</para>
<para>
Continue by coding the 3 bit integer value '-1' [b111]:
<screen>
        |
        V
        7 6 5 4 3 2 1 0
byte 0 [0 1 1 1 1 1 0 0]  &lt;-
byte 1 [               ]
byte 2 [               ]
byte 3 [               ]
             ...
byte n [               ]  bytestream length == 1 byte
</screen>
</para>
<para>
Continue by coding the 7 bit integer value '17' [b0010001]:
<screen>
          |
          V
        7 6 5 4 3 2 1 0
byte 0 [1 1 1 1 1 1 0 0]
byte 1 [0 0 0 0 1 0 0 0]  &lt;-
byte 2 [               ]
byte 3 [               ]
             ...
byte n [               ]  bytestream length == 2 bytes
                          bit cursor == 6
</screen>
</para>
<para>
Continue by coding the 13 bit integer value '6969' [b110 11001110 01]:
<screen>
                |
                V
        7 6 5 4 3 2 1 0
byte 0 [1 1 1 1 1 1 0 0]
byte 1 [0 1 0 0 1 0 0 0]
byte 2 [1 1 0 0 1 1 1 0]
byte 3 [0 0 0 0 0 1 1 0]  &lt;-
             ...
byte n [               ]  bytestream length == 4 bytes
</screen>
</para>
</section>
<section><title>decoding example</title>
<para>
Reading from the beginning of the bytestream encoded in the above example:
<screen>
                      |
                      V
        7 6 5 4 3 2 1 0
byte 0 [1 1 1 1 1 1 0 0]  &lt;-
byte 1 [0 1 0 0 1 0 0 0]
byte 2 [1 1 0 0 1 1 1 0]
byte 3 [0 0 0 0 0 1 1 0]  bytestream length == 4 bytes
</screen>
</para>
<para>
We read two two-bit integer fields, resulting in the returned numbers
'b00' and 'b11'. Two things are worth noting here:
<itemizedlist>
<listitem>
<para>Although these four bits were originally written as a single
four-bit integer, reading some other combination of bit-widths from the
bitstream is well defined. There are no artificial alignment
boundaries maintained in the bitstream.</para>
</listitem>
<listitem>
<para>The second value is the
two-bit-wide integer 'b11'. This value may be interpreted either as
the unsigned value '3', or the signed value '-1'. Signedness is
dependent on decode context.</para>
</listitem>
</itemizedlist>
</para>
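<para>
The reads in this example can be reproduced with a small C sketch of
the decode side. The names <literal>bit_reader</literal> and
<literal>br_read</literal> are hypothetical, bytes are assumed to be
octets, and for brevity the caller is assumed never to request more
bits than remain in the packet.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state (illustrative names only). */
typedef struct {
    const uint8_t *buf;  /* packet bytes */
    size_t len;          /* packet length, in bytes */
    size_t bitpos;       /* index of the next bit to read */
} bit_reader;

/* Read `bits` bits, least significant bit first, and reassemble them
   into an unsigned integer. The caller must not read past the end. */
static uint32_t br_read(bit_reader *r, int bits)
{
    assert(bits >= 0 && bits <= 32);
    assert((size_t)bits <= r->len * 8 - r->bitpos);
    uint32_t value = 0;
    for (int i = 0; i < bits; i++) {
        uint32_t bit = (r->buf[r->bitpos / 8] >> (r->bitpos % 8)) & 1u;
        value |= bit << i;   /* the i-th bit read becomes bit i of the value */
        r->bitpos++;
    }
    return value;
}
```
]]></programlisting>
<para>
Reading two two-bit fields from the bytes FC 48 CE 06 (hexadecimal)
returns 0 and then 3. Reading the same stream with the original widths
of 4, 3, 7 and 13 bits instead returns 12, 7, 17 and 6969; whether the
7 means 'seven' or '-1' is left to decode context.</para>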
</section>
<section><title>end-of-packet alignment</title>
<para>
The typical use of bitpacking is to produce many independent
byte-aligned packets which are embedded into a larger byte-aligned
container structure, such as an Ogg transport bitstream. Externally,
each bytestream (encoded bitstream) must begin and end on a byte
boundary. Often, the encoded bitstream is not an integer number of
bytes, and so there is unused (uncoded) space in the last byte of a
packet.</para>
<para>
Unused space in the last byte of a bytestream is always zeroed during
the coding process. Thus, should this unused space be read, it will
return binary zeroes.</para>
<para>
Attempting to read past the end of an encoded packet results in an
'end-of-packet' condition. End-of-packet is not to be considered an
error; it is merely a state indicating that there is insufficient
remaining data to fulfill the desired read size. Vorbis uses truncated
packets as a normal mode of operation, and as such, decoders must
handle reading past the end of a packet as a typical mode of
operation. Any further read operations after an 'end-of-packet'
condition shall also return 'end-of-packet'.</para>
</section>
<section><title>reading zero bits</title>
<para>
Reading a zero-bit-wide integer returns the value '0' and does not
increment the stream cursor. Reading to the end of the packet (but
not past, such that an 'end-of-packet' condition has not triggered)
and then reading a zero-bit integer shall succeed, returning 0, and
not trigger an end-of-packet condition. Reading a zero-bit-wide
integer after a previous read has set 'end-of-packet' shall also fail
with 'end-of-packet'.</para>
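<para>
The end-of-packet and zero-bit rules above can be sketched together in
C as a reader with a sticky end-of-packet flag. The names
<literal>packet_reader</literal> and <literal>pr_read</literal> and the
0 / -1 return convention are hypothetical. Leaving the cursor unchanged
on a failed read is an assumption of this sketch; the text above does
not specify it.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state with a sticky end-of-packet flag. */
typedef struct {
    const uint8_t *buf;  /* packet bytes */
    size_t len;          /* packet length, in bytes */
    size_t bitpos;       /* index of the next bit to read */
    int    eop;          /* nonzero once end-of-packet has been hit */
} packet_reader;

/* Returns 0 and stores the field in *out on success.
   Returns -1 on end-of-packet and leaves *out untouched. */
static int pr_read(packet_reader *r, int bits, uint32_t *out)
{
    assert(bits >= 0 && bits <= 32);
    if (r->eop)
        return -1;   /* sticky: even a zero-bit read now fails */
    if ((size_t)bits > r->len * 8 - r->bitpos) {
        r->eop = 1;  /* not an error, just insufficient data */
        return -1;   /* assumption: cursor is left where it was */
    }
    uint32_t value = 0;
    for (int i = 0; i < bits; i++) {
        uint32_t bit = (r->buf[r->bitpos / 8] >> (r->bitpos % 8)) & 1u;
        value |= bit << i;
        r->bitpos++;
    }
    *out = value;    /* a zero-bit read yields 0 and moves nothing */
    return 0;
}
```
]]></programlisting>
<para>
With a one-byte packet, an 8-bit read followed by a zero-bit read both
succeed under this sketch; a subsequent 1-bit read reports
end-of-packet, after which a zero-bit read reports end-of-packet as
well.</para>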
</section>
</section>
</section>