  1. <html>
  2. <head>
  3. <title>libvorbisenc - API Overview</title>
  4. <link rel=stylesheet href="style.css" type="text/css">
  5. </head>
  6. <body bgcolor=white text=black link="#5555ff" alink="#5555ff" vlink="#5555ff">
  7. <table border=0 width=100%>
  8. <tr>
  9. <td><p class=tiny>libvorbisenc documentation</p></td>
  10. <td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td>
  11. </tr>
  12. </table>
  13. <h1>Libvorbisenc API Overview</h1>
  14. <p>Libvorbisenc is an encoding convenience library intended to
  15. encapsulate the elaborate setup that libvorbis requires for encoding.
  16. Libvorbisenc gives easy access to all high-level adjustments an
  17. application may require when encoding and also exposes some low-level
  18. tuning parameters to allow applications to make detailed adjustments
  19. to the encoding process. <p>
  20. All the <b>libvorbisenc</b> routines are declared in "vorbis/vorbisenc.h".
  21. <em>Note: libvorbis and libvorbisenc always
  22. encode in a single pass. Thus, all possible encoding setups will work
  23. properly with live input and produce streams that decode properly when
  24. streamed. See the subsection titled <a href="#BBR">"managed bitrate
  25. modes"</a> for details on setting limits on bitrate usage when Vorbis
  26. streams are used in a limited-bandwidth environment.</em>
  27. <h2>workflow</h2>
  28. <p>Libvorbisenc is used only during encoder setup; its function
  29. is to automate initialization of a multitude of settings in a
  30. <tt>vorbis_info</tt> structure which libvorbis then uses as a reference
  31. during the encoding process. Libvorbisenc plays no part in the
  32. encoding process after setup.
  33. <p>Encode setup using libvorbisenc consists of three steps:
  34. <ol>
  35. <li>high-level initialization of a <tt>vorbis_info</tt> structure by
  36. calling one of <a
  37. href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
  38. href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
  39. with the basic input audio parameters (rate and channels) and the
  40. basic desired encoded audio output parameters (VBR quality or ABR/CBR
  41. bitrate)<p>
  42. <li>optional adjustment of the basic setup defaults using <a
  43. href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a><p>
  44. <li>calling <a
  45. href="vorbis_encode_setup_init.html">vorbis_encode_setup_init()</a> to
  46. finalize the high-level setup into the detailed low-level reference
  47. values needed by libvorbis to encode audio. The <tt>vorbis_info</tt>
  48. structure is then ready to use for encoding by libvorbis.<p>
  49. </ol>
  50. These three steps can be collapsed into a single call by using <a
  51. href="vorbis_encode_init_vbr.html">vorbis_encode_init_vbr</a> to set up a
  52. quality-based VBR stream or <a
  53. href="vorbis_encode_init.html">vorbis_encode_init</a> to set up a managed
  54. bitrate (ABR or CBR) stream.<p>
  55. <h2>adjustable encoding parameters</h2>
  56. <h3>input audio parameters</h3>
  57. <p>
  58. <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
  59. <tr bgcolor=#cccccc>
  60. <td><b>parameter</b></td>
  61. <td><b>description</b></td>
  62. </tr>
  63. <tr valign=top>
  64. <td>sampling rate</td>
  65. <td>
  66. The sampling rate (in samples per second) of the input audio. Common examples are 8000 for telephony, 44100 for CD audio and 48000 for DAT. Note that a mono sample (one center value) and a stereo sample (one left value and one right value) each count as a single sample.
  67. </td>
  68. </tr>
  69. <tr valign=top>
  70. <td>channels</td>
  71. <td>
  72. The number of channels encoded in each input sample. By default,
  73. stereo input modes (two channels) are 'coupled' by Vorbis 1.1 such
  74. that the stereo relationship between the samples is taken into account
  75. when encoding. Stereo coupling may be disabled by using <a
  76. href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
  77. href="vorbis_encode_ctl.html#OV_ECTL_COUPLE_SET">OV_ECTL_COUPLE_SET</a>.
  78. </td>
  79. </tr>
  80. </table>
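<p>To make the sample-counting convention above concrete, here is a minimal sketch in plain C. It is not part of libvorbisenc; the helper names <tt>pcm_sample_count</tt> and <tt>pcm_seconds</tt> are hypothetical, and the sketch assumes interleaved PCM in which every channel value has the same byte width.

```c
#include <stddef.h>

/* Number of samples in a buffer of interleaved PCM. One sample spans
   all channels, so a stereo sample (left + right value) counts once.
   Hypothetical helper, not a libvorbisenc function. */
static size_t pcm_sample_count(size_t bytes, int channels, int bytes_per_value)
{
    return bytes / ((size_t)channels * (size_t)bytes_per_value);
}

/* Duration in seconds of `samples` samples at `rate` samples per second. */
static double pcm_seconds(size_t samples, long rate)
{
    return (double)samples / (double)rate;
}
```

For example, 176400 bytes of 16-bit stereo audio hold 44100 samples, which is one second of audio at a sampling rate of 44100.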
  81. <h3>quality and VBR modes</h3>
  82. Vorbis is natively a VBR codec; a user requests a given constant
  83. <em>quality</em> and the encoder keeps the encoding quality constant
  84. while allowing the bitrate to vary. 'Quality' modes (Variable BitRate)
  85. will always produce the most consistent encoding results as well as
  86. the highest quality for the number of bits used.
  87. <p>
  88. <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
  89. <tr bgcolor=#cccccc>
  90. <td><b>parameter</b></td>
  91. <td><b>description</b></td>
  92. </tr>
  93. <tr valign=top>
  94. <td>quality</td>
  95. <td>
  96. A decimal float value requesting a desired quality. Libvorbisenc 1.1 allows quality requests in the range of -0.1 (lowest quality, smallest files) through +1.0 (highest quality, largest files). Quality -0.1 is intended as an ultra-low setting in which low bitrate is much more important than quality consistency. Quality settings 0.0 and above are intended to produce consistent results at all times.
  97. </td>
  98. </tr>
  99. </table>
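<p>The two quality ranges described above can be expressed as simple checks that an application might apply to user input before calling a setup function. This is only a sketch; the helper names are hypothetical and are not part of libvorbisenc.

```c
/* Range accepted by libvorbisenc 1.1 according to the table above:
   -0.1 (lowest quality) through +1.0 (highest quality).
   Hypothetical helper, not a libvorbisenc function. */
static int quality_is_valid(float q)
{
    return q >= -0.1f && q <= 1.0f;
}

/* Sub-range intended to give consistent results at all times
   (0.0 and above); settings below 0.0 trade consistency for bitrate. */
static int quality_is_consistent(float q)
{
    return q >= 0.0f && q <= 1.0f;
}
```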
  100. <a name="BBR"></a>
  101. <h3>managed bitrate modes</h3>
  102. Although the Vorbis codec is natively VBR, libvorbis includes
  103. infrastructure for 'managing' the bitrate of streams by setting
  104. minimum and maximum usage constraints, as well as functionality for
  105. nudging a stream toward a desired average value. These features
  106. should <em>only</em> be used when there is a requirement to limit
  107. bitrate in some way. Although the difference is usually slight,
  108. managed bitrate modes will always produce output inferior to VBR
  109. (given equal bitrate usage). Setting overly or impossibly tight
  110. bitrate management requirements can affect output quality dramatically
  111. for the worse.<p>
  112. Beginning in libvorbis 1.1, bitrate management is implemented using a
  113. <em>bit-reservoir</em> algorithm. The encoder has a fixed-size
  114. reservoir used as a 'savings account' in encoding. When a frame is
  115. smaller than the target rate, the unused bits go into the reservoir so
  116. that they may be used by future frames. When a frame is larger than
  117. target bitrate, it draws 'banked' bits out of the reservoir. Encoding
  118. is managed so that the reservoir never goes negative (when a maximum
  119. bitrate is specified) or fills beyond a fixed limit (when a minimum
  120. bitrate is specified). An 'average bitrate' request is used as the
  121. set-point in a long-range bitrate tracker which adjusts the encoder's
  122. aggressiveness up or down depending on whether or not frames are coming
  123. in larger or smaller than the requested average point.
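<p>The reservoir accounting described above can be modelled in a few lines of C. This is a simplified sketch, not the libvorbis implementation: it assumes a single per-frame target (as if minimum and maximum bitrate were equal), and the type <tt>bit_reservoir</tt> and function <tt>reservoir_account</tt> are hypothetical names.

```c
/* Simplified model of a bit reservoir ("savings account").
   Hypothetical types and names; not libvorbis internals. */
typedef struct {
    long capacity_bits; /* fixed reservoir size */
    long banked_bits;   /* bits currently saved, 0..capacity_bits */
} bit_reservoir;

/* Account for one frame. `wanted_bits` is the size the encoder would
   like to emit and `target_bits` is the per-frame target. Returns the
   frame size that keeps the reservoir within its limits. */
static long reservoir_account(bit_reservoir *r, long target_bits, long wanted_bits)
{
    long frame = wanted_bits;
    /* Frames under target bank bits; frames over target draw them. */
    long fill = r->banked_bits + (target_bits - frame);

    if (fill < 0) {
        /* Reservoir would go negative: shrink the frame by the shortfall. */
        frame += fill;
        fill = 0;
    } else if (fill > r->capacity_bits) {
        /* Reservoir would overflow: grow the frame by the excess. */
        frame += fill - r->capacity_bits;
        fill = r->capacity_bits;
    }
    r->banked_bits = fill;
    return frame;
}
```

With a 1000-bit reservoir holding 200 banked bits and a 500-bit target, a 400-bit frame passes unchanged and raises the balance to 300; a following 900-bit request is cut to 800 bits, which empties the reservoir.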
  124. <p>
  125. <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
  126. <tr bgcolor=#cccccc>
  127. <td><b>parameter</b></td>
  128. <td><b>description</b></td>
  129. </tr>
  130. <tr valign=top>
  131. <td>maximum bitrate</td> <td> The maximum allowed bitrate, set in bits
  132. per second. If the bitrate would otherwise rise such that oversized
  133. frames would underflow the bit-reservoir by consuming banked bits,
  134. bitrate management will force the encoder to use fewer bits per frame
  135. by encoding with a more aggressive psychoacoustic model.<p> This
  136. setting is a hard limit; the bitstream will never be allowed, under
  137. any circumstances, to exceed the specified bitrate when averaged over
  138. the period spanned by the reservoir, although it may momentarily rise
  139. above the limit if inspected at a granularity much finer than that
  140. period. Normally, the encoder will conserve bits gracefully by
  141. using more aggressive psychoacoustics to shrink a frame when forced
  142. to. However, if the encoder runs out of means of gracefully shrinking
  143. a frame, it will simply take the smallest frame it can otherwise
  144. generate and truncate it to the maximum allowed length. Note that
  145. this is not an error: although truncation will adversely affect
  146. audio quality, a Vorbis decoder is still able to decode a truncated
  147. frame into audio.
  148. </td>
  149. </tr>
  150. <tr valign=top>
  151. <td>average bitrate</td>
  152. <td>
  153. The desired average bitrate of a stream, set
  154. in bits per second. Like minimum and maximum bitrate, average bitrate is
  155. tracked via a reservoir; however, the averaging reservoir does not
  156. impose a hard limit; it is used to nudge the bitrate toward the
  157. desired average by slowly adjusting the psychoacoustic aggressiveness.
  158. As such, the reservoir size does not affect the average bitrate
  159. behavior. Because this setting alone is not used to impose hard
  160. bitrate limits, the bitrate of a stream produced using only the
  161. <tt>average bitrate</tt> constraint will track the average over time
  162. but not necessarily adhere strictly to that average for any given
  163. period. Should a strict localized average be required, <tt>average
  164. bitrate</tt> should be used along with <tt>minimum bitrate</tt> and
  165. <tt>maximum bitrate</tt>.
  166. </td>
  167. </tr>
  168. <tr valign=top>
  169. <td>minimum bitrate</td>
  170. <td>
  171. The minimum allowed bitrate, set in bits per second. If
  172. the bitrate would otherwise fall such that undersized frames would
  173. overflow the bit-reservoir with unused bits, bitrate management will
  174. force the encoder to use more bits per frame by encoding with a less
  175. aggressive psychoacoustic model.<p> This setting is a hard limit; the
  176. bitstream will never be allowed, under any circumstances, to drop
  177. below the specified bitrate when averaged over the period spanned by the
  178. reservoir, although it may momentarily fall below the limit if inspected
  179. at a granularity much finer than that period. Normally,
  180. the encoder will fill out undersized frames with additional useful
  181. coding information by increasing the perceived quality of the stream.
  182. If the encoder runs out of useful ways to consume more bits, it will
  183. pad frames out with zeroes.
  184. </td>
  185. </tr>
  186. <tr valign=top>
  187. <td>reservoir size</td> <td> The size of the minimum/maximum bitrate
  188. tracking reservoir, set in bits. The reservoir is used as a 'bit
  189. bank' to average out localized surges and dips in bitrate while
  190. providing predictable, guaranteed buffering behavior for streams to be
  191. used in situations with constrained transport bandwidth. The default
  192. setting is two seconds of average bitrate.<p>
  193. When a single frame is larger than the maximum allowed overall
  194. bitrate, the bits are 'borrowed' from the bitrate reservoir; if the
  195. reservoir contains insufficient bits to cover the deficit, the encoder
  196. must find some way to reduce the frame size. <p>
  197. When a frame is under the minimum limit, the surplus bits are placed
  198. into the reservoir, banking them for future use. If the reservoir is
  199. already full of banked bits, the encoder is forced to find some way to
  200. make the frame larger.<p>
  201. If the frame size is between the minimum and maximum rates (thus
  202. implying the minimum and maximum allowed rates are different), the
  203. reservoir gravitates toward a fill point configured by the
  204. <tt>reservoir bias</tt> setting described next. If the reservoir is
  205. fuller than the fill point (a 'surplus of surplus'), the encoder will
  206. consume a number of bits from the reservoir equal to the number of
  207. bits by which the frame exceeds the minimum size. If the reservoir is
  208. emptier than the fill point (a 'surplus of deficit'), bits are returned
  209. to the reservoir equal to the number of bits by which the current frame
  210. falls under the maximum frame size. The idea of the fill point is to buffer against
  211. both underruns and overruns, by trying to hold the reservoir to a
  212. middle course.
  213. </td>
  214. </tr>
  215. <tr valign=top>
  216. <td>reservoir bias</td>
  217. <td>
  218. Reservoir bias is a setting between 0.0 and 1.0 that biases bitrate
  219. management toward smoothing bitrate spikes (0.0) or bitrate drops
  220. (1.0); the default setting is 0.1.<p>
  221. Using settings toward 0.0 causes the bitrate manager to hoard bits in
  222. the bit reservoir such that there is a large pool of banked surplus to
  223. draw upon during short spikes in bitrate. As a result, the encoder
  224. will react less aggressively and less drastically to curtail frame size
  225. during brief surges in bitrate.<p>
  226. Using settings toward 1.0 causes the bitrate manager to empty the bit
  227. reservoir such that there is a large buffer available to store surplus
  228. bits during sudden drops in bitrate. As a result, the encoder will
  229. react less aggressively and less drastically to support minimum frame
  230. sizes during drops in bitrate and will tend not to store any extra
  231. bits in the reservoir for future bitrate spikes.<p>
  232. </td>
  233. </tr>
  234. <tr valign=top>
  235. <td>average track damping</td>
  236. <td>
  237. A decimal value, in seconds, that controls how quickly the average
  238. bitrate tracker is allowed to slew from enforcing minimum frame sizes
  239. to enforcing maximum frame sizes and vice versa. The default value is 1.5
  240. seconds.<p>
  241. When the 'average bitrate' setting is in use, the average bitrate
  242. tracker uses an unbounded reservoir to track overall bitrate-to-date
  243. in the stream. When bitrates are too low, the tracker will try to
  244. nudge bitrates up and when the bitrate is too high, nudge it down.
  245. The damping value regulates the maximum strength of the nudge; it
  246. describes, in seconds, how quickly the tracker may transition from an
  247. extreme nudge in one direction to an extreme nudge in the other.<p>
  248. </td>
  249. </tr>
  250. </table>
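<p>The reservoir parameters in the table above can be related to one another with a little arithmetic. The sketch below is an illustrative model only, with hypothetical helper names. The default reservoir size (two seconds of average bitrate) comes from the table. The linear mapping from <tt>reservoir bias</tt> to a target number of banked bits and the slew-rate reading of <tt>average track damping</tt> are assumptions chosen to match the descriptions, not the libvorbis implementation.

```c
/* Default reservoir size in bits: two seconds of average bitrate. */
static long default_reservoir_bits(long average_bps)
{
    return 2L * average_bps;
}

/* ASSUMPTION: the fill point varies linearly with the bias. A bias of
   0.0 hoards bits (reservoir full of banked bits) and 1.0 keeps it
   empty, matching the description of the two extremes. */
static long fill_point_banked_bits(long reservoir_bits, double bias)
{
    if (bias < 0.0) bias = 0.0;
    if (bias > 1.0) bias = 1.0;
    /* Round to the nearest whole bit. */
    return (long)((1.0 - bias) * (double)reservoir_bits + 0.5);
}

/* ASSUMPTION: the nudge is a value in [-1, +1] and the damping time is
   how long a full swing from one extreme to the other may take, so the
   nudge may change by at most 2 * dt / damping per step. */
static double slew_nudge(double current, double desired,
                         double dt_seconds, double damping_seconds)
{
    double max_step = 2.0 * dt_seconds / damping_seconds;
    double step = desired - current;
    if (step > max_step) step = max_step;
    if (step < -max_step) step = -max_step;
    return current + step;
}
```

Under these assumptions, a 128000 bits-per-second average gives a default reservoir of 256000 bits, and the default bias of 0.1 aims to keep about 230400 of those bits banked. With the default damping of 1.5 seconds, a nudge at one extreme needs the full 1.5 seconds to reach the other.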
  251. <h3>encoding model adjustments</h3>
  252. The <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> call provides
  253. a generalized interface for making encoding setup adjustments to the
  254. basic high-level setup provided by <a
  255. href="vorbis_encode_setup_vbr.html">vorbis_encode_setup_vbr()</a> or <a
  256. href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>.
  257. In reality, these two calls use <a
  258. href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> internally, and <a
  259. href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can be used to adjust
  260. most of the parameters set by other calls.<p>
  261. In Vorbis 1.1, <a href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> can
  262. adjust the following additional parameters not described elsewhere:
  263. <p>
  264. <table border=1 color=black width=50% cellspacing=0 cellpadding=7>
  265. <tr bgcolor=#cccccc>
  266. <td><b>parameter</b></td>
  267. <td><b>description</b></td>
  268. </tr>
  269. <tr valign=top>
  270. <td>management mode</td> <td> Configures whether bitrate
  271. management is in use. Normally, this value is set implicitly
  272. during encoding setup; however, the supported means of selecting a
  273. quality mode by bitrate (that is, requesting a true VBR stream, but
  274. doing so by asking for an approximate bitrate) is to use <a
  275. href="vorbis_encode_setup_managed.html">vorbis_encode_setup_managed()</a>
  276. and then to explicitly turn off bitrate management by calling <a
  277. href="vorbis_encode_ctl.html">vorbis_encode_ctl()</a> with <a
  278. href="vorbis_encode_ctl.html#OV_ECTL_RATEMANAGE2_SET">OV_ECTL_RATEMANAGE2_SET</a>.
  279. </td>
  280. </tr>
  281. <tr valign=top>
  282. <td>coupling</td> <td> Stereo streams (and, in the future, surround
  283. streams) are normally encoded assuming the channels form a stereo
  284. image and that lossy-stereo modelling is appropriate; this is called
  285. 'coupling'. Stereo coupling may be explicitly enabled or disabled.
  286. </td>
  287. </tr>
  288. <tr valign=top>
  289. <td>lowpass</td> <td> Sets the hard lowpass of a given encoding mode;
  290. this may be used to conserve a few bits in high-rate audio that has
  291. limited bandwidth, or in testing of the encoder's acoustic model. The
  292. encoder is generally already configured with ideal lowpasses (if any
  293. at all) for given modes; use of this parameter is strongly discouraged
  294. if the point is to try to 'improve' a given encoding mode for general
  295. encoding.
  296. </td>
  297. </tr>
  298. <tr valign=top>
  299. <td>impulse coding aggressiveness</td> <td>By default, libvorbis
  300. attempts to compromise between preventing wide bitrate swings and
  301. high-resolution impulse coding (which is required for the crispest
  302. possible attacks, but also requires a relatively large momentary
  303. bitrate increase). This parameter allows an application to tune the
  304. compromise or eliminate it; a value of 0.0 indicates normal behavior
  305. while a value of -15.0 requests maximum possible impulse
  306. resolution.</td>
  307. </tr>
  308. </table>
  309. <br><br>
  310. <hr noshade>
  311. <table border=0 width=100%>
  312. <tr valign=top>
  313. <td><p class=tiny>copyright &copy; 2004 Vorbis team</p></td>
  314. <td align=right><p class=tiny><a href="http://www.xiph.org/ogg/vorbis/index.html">Ogg Vorbis</a><br><a href="mailto:team@vorbis.org">team@vorbis.org</a></p></td>
  315. </tr><tr>
  316. <td><p class=tiny>libvorbisenc documentation</p></td>
  317. <td align=right><p class=tiny>libvorbisenc release 1.1 - 20040709</p></td>
  318. </tr>
  319. </table>
  320. </body>
  321. </html>