  1. <?xml version="1.0" standalone="no"?>
  2. <!DOCTYPE section PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  3. "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
  4. ]>
  5. <section id="vorbis-spec-residue">
  6. <sectioninfo>
  7. <releaseinfo>
  8. $Id$
  9. </releaseinfo>
  10. </sectioninfo>
  11. <title>Residue setup and decode</title>
  12. <section>
  13. <title>Overview</title>
  14. <para>
  15. A residue vector represents the fine detail of the audio spectrum of
  16. one channel in an audio frame after the encoder subtracts the floor
  17. curve and performs any channel coupling. A residue vector may
  18. represent spectral lines, spectral magnitude, spectral phase or
  19. hybrids as mixed by channel coupling. The exact semantic content of
  20. the vector does not matter to the residue abstraction.</para>
  21. <para>
  22. Whatever the exact qualities, the Vorbis residue abstraction codes the
  23. residue vectors into the bitstream packet, and then reconstructs the
  24. vectors during decode. Vorbis makes use of three different encoding
  25. variants (numbered 0, 1 and 2) of the same basic vector encoding
  26. abstraction.</para>
  27. </section>
  28. <section>
  29. <title>Residue format</title>
  30. <para>
Residue format partitions each vector in the vector bundle into chunks,
classifies each chunk, encodes the chunk classifications and finally
encodes the chunks themselves using the specific VQ arrangement
defined for each selected classification.
The exact interleaving and partitioning vary by residue encoding number;
however, the high-level process used to classify and encode the residue
vector is the same in all three variants.</para>
  38. <para>
All residue vectors coded as a set have the same length. The high-level
coding structure, ignoring for the moment exactly how a partition is
encoded and simply trusting that it is, is as follows:</para>
  42. <para>
  43. <itemizedlist>
<listitem><para>Each vector is partitioned into multiple equal-sized chunks
according to the configured partition size. If we have a vector size of
<emphasis>n</emphasis>, a partition size <emphasis>residue_partition_size</emphasis>, and a total
of <emphasis>ch</emphasis> residue vectors, the total number of partitioned chunks
coded is <emphasis>n</emphasis>/<emphasis>residue_partition_size</emphasis>*<emphasis>ch</emphasis>. It is
important to note that the integer division truncates. In the
example below, we assume a <emphasis>residue_partition_size</emphasis> of 8.</para></listitem>
  51. <listitem><para>Each partition in each vector has a classification number that
  52. specifies which of multiple configured VQ codebook setups are used to
  53. decode that partition. The classification numbers of each partition
  54. can be thought of as forming a vector in their own right, as in the
illustration below. Just as the residue vectors are coded in grouped
partitions to increase encoding efficiency, the classification vector
is also partitioned into chunks. The classification numbers in each
chunk are packed into a single scalar, the classification codeword,
that represents all of the classification numbers in that chunk. In the
example below, the classification codeword encodes two classification
numbers.</para></listitem>
  62. <listitem><para>The values in a residue vector may be encoded monolithically in a
  63. single pass through the residue vector, but more often efficient
  64. codebook design dictates that each vector is encoded as the additive
  65. sum of several passes through the residue vector using more than one
  66. VQ codebook. Thus, each residue value potentially accumulates values
  67. from multiple decode passes. The classification value associated with
  68. a partition is the same in each pass, thus the classification codeword
  69. is coded only in the first pass.</para></listitem>
  70. </itemizedlist>
  71. </para>
  72. <mediaobject>
  73. <imageobject>
  74. <imagedata fileref="residue-pack.png" format="PNG"/>
  75. </imageobject>
  76. <textobject>
  77. <phrase>[illustration of residue vector format]</phrase>
  78. </textobject>
  79. </mediaobject>
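<para>
The arithmetic in the list above can be sketched in a few lines of
Python. This is an illustration only: the function names are
hypothetical, and the packing order (first classification number in the
most significant position) is an assumption chosen to agree with the
unpacking loop given in the packet decode pseudocode.</para>
<programlisting><![CDATA[
```python
def partition_count(n, residue_partition_size, ch):
    """Total number of partitions coded for ch vectors of length n.

    The division truncates, so a trailing partial partition is not coded.
    """
    return (n // residue_partition_size) * ch


def pack_classifications(class_numbers, residue_classifications):
    """Pack one chunk of classification numbers into a single scalar.

    The chunk is treated as the digits of a number in base
    residue_classifications, first element most significant
    (an assumed convention, see the lead-in).
    """
    scalar = 0
    for c in class_numbers:
        if not 0 <= c < residue_classifications:
            raise ValueError("classification number out of range")
        scalar = scalar * residue_classifications + c
    return scalar
```
]]></programlisting>
<para>
For example, two vectors of length 100 with a partition size of 8 give
12 partitions per vector, 24 in total; the last four values of each
vector fall outside any partition.</para>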
  80. </section>
  81. <section><title>residue 0</title>
  82. <para>
  83. Residue 0 and 1 differ only in the way the values within a residue
  84. partition are interleaved during partition encoding (visually treated
  85. as a black box--or cyan box or brown box--in the above figure).</para>
  86. <para>
Residue encoding 0 interleaves VQ encoding according to the
dimension of the codebook used to encode a partition in a specific
pass. The dimension of the codebook need not be the same in multiple
passes; however, the partition size must be an integer multiple of the
codebook dimension.</para>
  92. <para>
As an example, assume a partition vector of size eight, to be encoded
by residue 0 using codebook dimensions of 8, 4, 2 and 1:</para>
  95. <programlisting>
  96. original residue vector: [ 0 1 2 3 4 5 6 7 ]
  97. codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ]
  98. codebook dimensions = 4 encoded as: [ 0 2 4 6 ], [ 1 3 5 7 ]
  99. codebook dimensions = 2 encoded as: [ 0 4 ], [ 1 5 ], [ 2 6 ], [ 3 7 ]
  100. codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
  101. </programlisting>
  102. <para>
  103. It is worth mentioning at this point that no configurable value in the
  104. residue coding setup is restricted to a power of two.</para>
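<para>
The residue 0 grouping shown in the example can be reproduced with a
short Python sketch. The function name is hypothetical, and the sketch
only shows which scalars end up in which VQ vector; it performs no
actual codebook encoding.</para>
<programlisting><![CDATA[
```python
def residue0_groups(partition, dim):
    """Split a partition into the vectors that residue 0 VQ-encodes.

    With step = len(partition) / dim, vector number i collects every
    step-th scalar of the partition, starting at scalar i.
    """
    n = len(partition)
    if n % dim != 0:
        raise ValueError("partition size must be a multiple of dim")
    step = n // dim
    return [partition[i::step] for i in range(step)]
```
]]></programlisting>
<para>
Calling <literal>residue0_groups(list(range(8)), 4)</literal> yields
<literal>[[0, 2, 4, 6], [1, 3, 5, 7]]</literal>, matching the second row
of the example.</para>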
  105. </section>
  106. <section><title>residue 1</title>
  107. <para>
  108. Residue 1 does not interleave VQ encoding. It represents partition
  109. vector scalars in order. As with residue 0, however, partition length
  110. must be an integer multiple of the codebook dimension, although
  111. dimension may vary from pass to pass.</para>
  112. <para>
As an example, assume a partition vector of size eight, to be encoded
by residue 1 using codebook dimensions of 8, 4, 2 and 1:</para>
  115. <programlisting>
  116. original residue vector: [ 0 1 2 3 4 5 6 7 ]
  117. codebook dimensions = 8 encoded as: [ 0 1 2 3 4 5 6 7 ]
  118. codebook dimensions = 4 encoded as: [ 0 1 2 3 ], [ 4 5 6 7 ]
  119. codebook dimensions = 2 encoded as: [ 0 1 ], [ 2 3 ], [ 4 5 ], [ 6 7 ]
  120. codebook dimensions = 1 encoded as: [ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ]
  121. </programlisting>
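<para>
For comparison, the in-order grouping of residue 1 can be sketched in
Python as follows. The function name is hypothetical, and the sketch
only shows the grouping of scalars, not the codebook encoding itself.</para>
<programlisting><![CDATA[
```python
def residue1_groups(partition, dim):
    """Split a partition into the vectors that residue 1 VQ-encodes.

    Scalars are taken in order, dim at a time, with no interleave.
    """
    n = len(partition)
    if n % dim != 0:
        raise ValueError("partition size must be a multiple of dim")
    return [partition[i:i + dim] for i in range(0, n, dim)]
```
]]></programlisting>
<para>
Calling <literal>residue1_groups(list(range(8)), 4)</literal> yields
<literal>[[0, 1, 2, 3], [4, 5, 6, 7]]</literal>.</para>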
  122. </section>
  123. <section><title>residue 2</title>
  124. <para>
Residue type 2 can be thought of as a variant of residue type 1.
Rather than encoding multiple passed-in vectors as in residue type 1,
the <emphasis>ch</emphasis> passed-in vectors of length <emphasis>n</emphasis> are first
interleaved and flattened into a single vector of length
<emphasis>ch</emphasis>*<emphasis>n</emphasis>. Encoding then proceeds as in type 1. Decoding is
as in type 1 with decode interleave reversed. If operating on a single
vector to begin with, residue type 1 and type 2 are equivalent.</para>
  132. <mediaobject>
  133. <imageobject>
  134. <imagedata fileref="residue2.png" format="PNG"/>
  135. </imageobject>
  136. <textobject>
  137. <phrase>[illustration of residue type 2]</phrase>
  138. </textobject>
  139. </mediaobject>
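<para>
The encode-side interleave can be sketched in Python as below. The
function name is hypothetical, and the element order (element
<emphasis>i</emphasis> of vector <emphasis>j</emphasis> lands at
position <emphasis>i</emphasis>*<emphasis>ch</emphasis>+<emphasis>j</emphasis>)
is taken from the deinterleave step given in the format 2 decode
description.</para>
<programlisting><![CDATA[
```python
def interleave(vectors):
    """Flatten ch vectors of length n into one vector of length ch*n.

    Element i of vector j is placed at position i*ch + j.
    """
    ch = len(vectors)
    n = len(vectors[0])
    if any(len(vec) != n for vec in vectors):
        raise ValueError("all vectors must have the same length")
    flat = [0] * (ch * n)
    for j, vec in enumerate(vectors):
        flat[j::ch] = vec  # every ch-th slot, starting at slot j
    return flat
```
]]></programlisting>
<para>
With a single input vector the result is the vector itself, which is
why types 1 and 2 coincide in that case.</para>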
  140. </section>
  141. <section>
  142. <title>Residue decode</title>
  143. <section><title>header decode</title>
  144. <para>
  145. Header decode for all three residue types is identical.</para>
  146. <programlisting>
  147. 1) [residue_begin] = read 24 bits as unsigned integer
  148. 2) [residue_end] = read 24 bits as unsigned integer
  149. 3) [residue_partition_size] = read 24 bits as unsigned integer and add one
  150. 4) [residue_classifications] = read 6 bits as unsigned integer and add one
  151. 5) [residue_classbook] = read 8 bits as unsigned integer
  152. </programlisting>
  153. <para>
<varname>[residue_begin]</varname> and
<varname>[residue_end]</varname> select the specific sub-portion of
each vector that is actually coded; this acts much like a bandpass
where, for coding purposes, the vector effectively begins at element
<varname>[residue_begin]</varname> and ends at
<varname>[residue_end]</varname>. Preceding and following values in
the unpacked vectors are zeroed. Note that for residue type 2, these
values as well as <varname>[residue_partition_size]</varname> apply to
the interleaved vector, not the individual vectors before interleave.
<varname>[residue_partition_size]</varname> is as explained above,
<varname>[residue_classifications]</varname> is the number of possible
classifications to which a partition can belong and
<varname>[residue_classbook]</varname> is the codebook number used to
code classification codewords. The number of dimensions in book
<varname>[residue_classbook]</varname> determines how many
classification values are grouped into a single classification
codeword. Note that the number of entries and dimensions in book
<varname>[residue_classbook]</varname>, along with
<varname>[residue_classifications]</varname>, overdetermines the
possible number of classification codewords. If
<varname>[residue_classifications]</varname>^<varname>[residue_classbook]</varname>.dimensions
exceeds <varname>[residue_classbook]</varname>.entries, the
bitstream should be regarded as undecodable.</para>
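<para>
The consistency condition on the classification book can be written as
a one-line check. The following Python sketch uses a hypothetical
function name and takes the book's dimensions and entries as plain
integers; how those values are obtained from the codebook setup is
outside the scope of the sketch.</para>
<programlisting><![CDATA[
```python
def classbook_is_consistent(residue_classifications, dimensions, entries):
    """Return False when the stream should be treated as undecodable.

    A codeword groups `dimensions` classification values, each in the
    range 0 .. residue_classifications-1, so the book must provide at
    least residue_classifications ** dimensions entries.
    """
    return residue_classifications ** dimensions <= entries
```
]]></programlisting>
<para>
For instance, 3 classifications grouped 2 per codeword need at least 9
entries; a classbook with only 8 entries fails the check.</para>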
  177. <para>
  178. Next we read a bitmap pattern that specifies which partition classes
  179. code values in which passes.</para>
  180. <programlisting>
  1) iterate [i] over the range 0 ... [residue_classifications]-1 {
  2)   [high_bits] = 0
  3)   [low_bits] = read 3 bits as unsigned integer
  4)   [bitflag] = read one bit as boolean
  5)   if ( [bitflag] is set ) then [high_bits] = read five bits as unsigned integer
  6)   vector [residue_cascade] element [i] = [high_bits] * 8 + [low_bits]
     }
  7) done
  189. </programlisting>
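<para>
A minimal Python sketch of the cascade read follows. It assumes a
caller-supplied function <literal>read_bits(n)</literal> that returns
the next <emphasis>n</emphasis> bits of the packet as an unsigned
integer; that function and the name
<literal>read_residue_cascade</literal> are hypothetical stand-ins, not
part of any particular decoder.</para>
<programlisting><![CDATA[
```python
def read_residue_cascade(read_bits, residue_classifications):
    """Read one 8-bit cascade bitmap per classification.

    read_bits(n) is assumed to return the next n packet bits as an
    unsigned integer (hypothetical helper supplied by the caller).
    """
    cascade = []
    for _ in range(residue_classifications):
        high_bits = 0
        low_bits = read_bits(3)
        if read_bits(1):            # bitflag: upper five bits are present
            high_bits = read_bits(5)
        cascade.append(high_bits * 8 + low_bits)
    return cascade
```
]]></programlisting>
<para>
Because the low field is three bits and the high field five, each
cascade element fits in eight bits, which is where the limit of eight
passes comes from.</para>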
  190. <para>
Finally, we read in a list of book numbers, each corresponding to a
specific bit set in the cascade bitmap. We loop over the possible
codebook classifications and the maximum possible number of encoding
stages (8 in Vorbis I, as constrained by the elements of the cascade
bitmap being eight bits):</para>
  196. <programlisting>
  1) iterate [i] over the range 0 ... [residue_classifications]-1 {
  2)   iterate [j] over the range 0 ... 7 {
  3)     if ( vector [residue_cascade] element [i] bit [j] is set ) {
  4)       array [residue_books] element [i][j] = read 8 bits as unsigned integer
         } else {
  5)       array [residue_books] element [i][j] = unused
         }
       }
     }
  6) done
  207. </programlisting>
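<para>
The book list read can be sketched in Python along the same lines. As
before, <literal>read_bits(n)</literal> is a hypothetical
caller-supplied bit reader. The sketch assumes that bit
<emphasis>j</emphasis> of a cascade element means the bit with value
2^<emphasis>j</emphasis>, and it uses <literal>None</literal> as its own
stand-in for the 'unused' marker.</para>
<programlisting><![CDATA[
```python
UNUSED = None  # stand-in chosen for this sketch to mean 'unused'


def read_residue_books(read_bits, residue_cascade):
    """Read one codebook number for every set bit in the cascade bitmaps.

    Returns a list with one row per classification and eight columns,
    one per pass; passes whose bit is clear hold UNUSED.
    """
    books = []
    for cascade in residue_cascade:
        row = []
        for j in range(8):
            if cascade & (1 << j):      # assumed: bit j has value 2**j
                row.append(read_bits(8))
            else:
                row.append(UNUSED)
        books.append(row)
    return books
```
]]></programlisting>
<para>
A decoder would additionally reject any book number larger than the
highest codebook configured for the stream.</para>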
  208. <para>
  209. An end-of-packet condition at any point in header decode renders the
  210. stream undecodable. In addition, any codebook number greater than the
  211. maximum numbered codebook set up in this stream also renders the
  212. stream undecodable.</para>
  213. </section>
  214. <section><title>packet decode</title>
  215. <para>
Format 0 and 1 packet decode is identical except for specific
partition interleave. Format 2 packet decode can be built out of the
format 1 decode process. Thus we first describe the decode
infrastructure common to all three formats.</para>
  220. <para>
  221. In addition to configuration information, the residue decode process
  222. is passed the number of vectors in the submap bundle and a vector of
  223. flags indicating if any of the vectors are not to be decoded. If the
  224. passed in number of vectors is 3 and vector number 1 is marked 'do not
  225. decode', decode skips vector 1 during the decode loop. However, even
  226. 'do not decode' vectors are allocated and zeroed.</para>
  227. <para>
Depending on the values of <varname>[residue_begin]</varname> and
<varname>[residue_end]</varname>, the encoded
portion of a residue vector may be the entire possible residue vector
or some strict subset of the actual residue vector size with
zero padding at either uncoded end. However, it is also possible to
set <varname>[residue_begin]</varname> and
<varname>[residue_end]</varname> to specify a range partially or
wholly beyond the maximum vector size. Before beginning residue
decode, limit <varname>[residue_begin]</varname> and
<varname>[residue_end]</varname> to the maximum possible vector size
as follows. We assume that the number of vectors being encoded,
<varname>[ch]</varname>, is provided by the higher level decoding
process.</para>
  241. <programlisting>
  1) [actual_size] = current blocksize/2;
  2) if residue encoding is format 2
  3)   [actual_size] = [actual_size] * [ch];
  4) [limit_residue_begin] = minimum of ([residue_begin],[actual_size]);
  5) [limit_residue_end] = minimum of ([residue_end],[actual_size]);
  247. </programlisting>
  248. <para>
The following convenience values are useful for clarifying
the decode process:</para>
  251. <programlisting>
  252. 1) [classwords_per_codeword] = [codebook_dimensions] value of codebook [residue_classbook]
  253. 2) [n_to_read] = [limit_residue_end] - [limit_residue_begin]
  254. 3) [partitions_to_read] = [n_to_read] / [residue_partition_size]
  255. </programlisting>
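<para>
The range limiting and the convenience values can be combined into one
small Python sketch. The function name and argument order are
hypothetical. The sketch clamps with a minimum, since the stated intent
is to limit both values to the actual vector size.</para>
<programlisting><![CDATA[
```python
def residue_decode_limits(blocksize, ch, residue_type, residue_begin,
                          residue_end, residue_partition_size):
    """Clamp the coded range to the vector size and derive read counts.

    Returns (limit_residue_begin, limit_residue_end, n_to_read,
    partitions_to_read).
    """
    actual_size = blocksize // 2
    if residue_type == 2:
        actual_size *= ch           # type 2 works on the interleaved vector
    limit_begin = min(residue_begin, actual_size)
    limit_end = min(residue_end, actual_size)
    n_to_read = limit_end - limit_begin
    partitions_to_read = n_to_read // residue_partition_size
    return limit_begin, limit_end, n_to_read, partitions_to_read
```
]]></programlisting>
<para>
With a blocksize of 256, two channels, a configured range of 0 to 200
and a partition size of 32, a type 1 residue reads 4 partitions per
vector (the range is clamped to 128), while a type 2 residue reads 6
partitions of the 256-element interleaved vector.</para>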
  256. <para>
  257. Packet decode proceeds as follows, matching the description offered earlier in the document. </para>
  258. <programlisting>
  1) allocate and zero all vectors that will be returned.
  2) if ([n_to_read] is zero), stop; there is no residue to decode.
  3) iterate [pass] over the range 0 ... 7 {
  4)   [partition_count] = 0
  5)   while [partition_count] is less than [partitions_to_read] {
  6)     if ([pass] is zero) {
  7)       iterate [j] over the range 0 ... [ch]-1 {
  8)         if vector [j] is not marked 'do not decode' {
  9)           [temp] = read from packet using codebook [residue_classbook] in scalar context
 10)           iterate [i] descending over the range [classwords_per_codeword]-1 ... 0 {
 11)             array [classifications] element [j],([i]+[partition_count]) =
                   [temp] integer modulo [residue_classifications]
 12)             [temp] = [temp] / [residue_classifications] using integer division
               }
             }
           }
         }
 13)     iterate [i] over the range 0 ... ([classwords_per_codeword] - 1) while [partition_count]
           is also less than [partitions_to_read] {
 14)       iterate [j] over the range 0 ... [ch]-1 {
 15)         if vector [j] is not marked 'do not decode' {
 16)           [vqclass] = array [classifications] element [j],[partition_count]
 17)           [vqbook] = array [residue_books] element [vqclass],[pass]
 18)           if ([vqbook] is not 'unused') {
 19)             decode partition into output vector number [j], starting at scalar
                   offset [limit_residue_begin]+[partition_count]*[residue_partition_size] using
                   codebook number [vqbook] in VQ context
               }
             }
           }
 20)       increment [partition_count] by one
         }
       }
     }
 21) done
  293. </programlisting>
  294. <para>
  295. An end-of-packet condition during packet decode is to be considered a
  296. nominal occurrence. Decode returns the result of vector decode up to
  297. that point.</para>
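<para>
Steps 10 through 12 of the packet decode loop unpack one classification
codeword into its individual classification numbers. A minimal Python
sketch of just that unpacking, with a hypothetical function name, is:</para>
<programlisting><![CDATA[
```python
def unpack_classword(temp, residue_classifications, classwords_per_codeword):
    """Split one decoded classification codeword into class numbers.

    The codeword is read as a base-residue_classifications number; the
    last partition of the group is the least significant digit, so the
    list is filled from its end towards its start.
    """
    classes = [0] * classwords_per_codeword
    for i in range(classwords_per_codeword - 1, -1, -1):
        classes[i] = temp % residue_classifications
        temp //= residue_classifications
    return classes
```
]]></programlisting>
<para>
With 3 classifications and 2 classwords per codeword, a decoded value
of 7 unpacks to classes 2 and 1 for the two partitions in the group.</para>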
  298. </section>
  299. <section><title>format 0 specifics</title>
  300. <para>
  301. Format zero decodes partitions exactly as described earlier in the
  302. 'Residue Format: residue 0' section. The following pseudocode
  303. presents the same algorithm. Assume:</para>
  304. <para>
  305. <itemizedlist>
  306. <listitem><simpara> <varname>[n]</varname> is the value in <varname>[residue_partition_size]</varname></simpara></listitem>
  307. <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
  308. <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
  309. </itemizedlist>
  310. </para>
  311. <programlisting>
  1) [step] = [n] / [codebook_dimensions]
  2) iterate [i] over the range 0 ... [step]-1 {
  3)   vector [entry_temp] = read vector from packet using current codebook in VQ context
  4)   iterate [j] over the range 0 ... [codebook_dimensions]-1 {
  5)     vector [v] element ([offset]+[i]+[j]*[step]) =
           vector [v] element ([offset]+[i]+[j]*[step]) +
           vector [entry_temp] element [j]
       }
     }
  6) done
  322. </programlisting>
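<para>
The same partition decode can be sketched in Python. The function name
is hypothetical, and <literal>read_vector()</literal> is an assumed
caller-supplied function that returns the next VQ vector decoded from
the packet with the current codebook.</para>
<programlisting><![CDATA[
```python
def decode_partition_format0(v, offset, n, codebook_dimensions, read_vector):
    """Accumulate one format 0 partition into v, in place.

    read_vector() is assumed to return a sequence of
    codebook_dimensions values (hypothetical helper).
    """
    step = n // codebook_dimensions
    for i in range(step):
        entry_temp = read_vector()
        for j in range(codebook_dimensions):
            # Element j of the i-th vector lands step scalars apart.
            v[offset + i + j * step] += entry_temp[j]
```
]]></programlisting>
<para>
Values are added to <literal>v</literal> rather than assigned, so that
later passes accumulate on top of earlier ones.</para>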
  323. </section>
  324. <section><title>format 1 specifics</title>
  325. <para>
  326. Format 1 decodes partitions exactly as described earlier in the
  327. 'Residue Format: residue 1' section. The following pseudocode
  328. presents the same algorithm. Assume:</para>
  329. <para>
  330. <itemizedlist>
  331. <listitem><simpara> <varname>[n]</varname> is the value in
  332. <varname>[residue_partition_size]</varname></simpara></listitem>
  333. <listitem><simpara><varname>[v]</varname> is the residue vector</simpara></listitem>
  334. <listitem><simpara><varname>[offset]</varname> is the beginning read offset in [v]</simpara></listitem>
  335. </itemizedlist>
  336. </para>
  337. <programlisting>
  1) [i] = 0
  2) vector [entry_temp] = read vector from packet using current codebook in VQ context
  3) iterate [j] over the range 0 ... [codebook_dimensions]-1 {
  4)   vector [v] element ([offset]+[i]) =
         vector [v] element ([offset]+[i]) +
         vector [entry_temp] element [j]
  5)   increment [i]
     }
  6) if ( [i] is less than [n] ) continue at step 2
  7) done
  348. </programlisting>
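<para>
A Python sketch of the format 1 partition decode follows. The function
name is hypothetical, and <literal>read_vector()</literal> is an assumed
caller-supplied function that returns the next VQ vector decoded from
the packet with the current codebook. The sketch also assumes
<emphasis>n</emphasis> is at least one, so that a plain while loop
matches the pseudocode.</para>
<programlisting><![CDATA[
```python
def decode_partition_format1(v, offset, n, codebook_dimensions, read_vector):
    """Accumulate one format 1 partition into v, in place.

    Vectors are consumed in order, each filling the next
    codebook_dimensions consecutive scalars of the partition.
    """
    i = 0
    while i < n:
        entry_temp = read_vector()
        for j in range(codebook_dimensions):
            v[offset + i] += entry_temp[j]
            i += 1
```
]]></programlisting>
<para>
Because the partition size is a multiple of the codebook dimension, the
loop ends exactly at the partition boundary.</para>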
  349. </section>
  350. <section><title>format 2 specifics</title>
  351. <para>
  352. Format 2 is reducible to format 1. It may be implemented as an additional step prior to and an additional post-decode step after a normal format 1 decode.
  353. </para>
  354. <para>
Format 2 handles 'do not decode' vectors differently from residue 0 or
1; if all vectors are marked 'do not decode', no decode occurs.
However, if at least one vector is to be decoded, all the vectors are
decoded. We then request normal format 1 to decode a single vector
representing all output channels, rather than a vector for each
channel. After decode, deinterleave the vector into independent vectors, one for each output channel. That is:</para>
  361. <orderedlist>
<listitem><simpara>If all vectors 0 through <emphasis>ch</emphasis>-1 are marked 'do not decode', allocate and clear a single vector <varname>[v]</varname> of length <emphasis>ch*n</emphasis> and skip step 2 below; proceed directly to the post-decode step.</simpara></listitem>
<listitem><simpara>Rather than performing format 1 decode to produce <emphasis>ch</emphasis> vectors of length <emphasis>n</emphasis> each, call format 1 decode to produce a single vector <varname>[v]</varname> of length <emphasis>ch*n</emphasis>.</simpara></listitem>
<listitem><para>Post decode: Deinterleave the single vector <varname>[v]</varname> returned by format 1 decode as described above into <emphasis>ch</emphasis> independent vectors, one for each output channel, according to:
  365. <programlisting>
  1) iterate [i] over the range 0 ... [n]-1 {
  2)   iterate [j] over the range 0 ... [ch]-1 {
  3)     output vector number [j] element [i] = vector [v] element ([i] * [ch] + [j])
       }
     }
  4) done
  372. </programlisting>
  373. </para></listitem>
  374. </orderedlist>
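<para>
The post-decode deinterleave can be sketched in Python as below. The
function name is hypothetical; the index mapping is the one given in
the pseudocode above.</para>
<programlisting><![CDATA[
```python
def deinterleave(v, ch):
    """Split the interleaved vector v into ch output vectors.

    Output vector j receives elements j, j+ch, j+2*ch, ... of v, so
    output[j][i] == v[i*ch + j].
    """
    if len(v) % ch != 0:
        raise ValueError("length of v must be a multiple of ch")
    return [v[j::ch] for j in range(ch)]
```
]]></programlisting>
<para>
For two channels, <literal>deinterleave([0, 10, 1, 11, 2, 12], 2)</literal>
returns <literal>[[0, 1, 2], [10, 11, 12]]</literal>.</para>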
  375. </section>
  376. </section>
  377. </section>