<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-15"/>
<title>Ogg Vorbis Documentation</title>
<style type="text/css">
body {
margin: 0 18px 0 18px;
padding-bottom: 30px;
font-family: Verdana, Arial, Helvetica, sans-serif;
color: #333333;
font-size: .8em;
}
a {
color: #3366cc;
}
img {
border: 0;
}
#xiphlogo {
margin: 30px 0 16px 0;
}
#content p {
line-height: 1.4;
}
h1, h1 a, h2, h2 a, h3, h3 a {
font-weight: bold;
color: #ff9900;
margin: 1.3em 0 8px 0;
}
h1 {
font-size: 1.3em;
}
h2 {
font-size: 1.2em;
}
h3 {
font-size: 1.1em;
}
li {
line-height: 1.4;
}
#copyright {
margin-top: 30px;
line-height: 1.5em;
text-align: center;
font-size: .8em;
color: #888888;
clear: both;
}
</style>
</head>
<body>
<div id="xiphlogo">
<a href="http://www.xiph.org/"><img src="fish_xiph_org.png" alt="Fish Logo and Xiph.org"/></a>
</div>
<h1>Programming with Xiph.org <tt>libvorbis</tt></h1>
<h2>Description</h2>
<p>Libvorbis is the Xiph.org Foundation's portable Ogg Vorbis CODEC
implemented as a programmatic library. Libvorbis provides primitives
to handle framing and manipulation of Ogg bitstreams (used by
Vorbis for streaming), a full analysis (encoding) interface, and
packet decoding and synthesis for playback.</p>
<p>The libvorbis library does not provide any system interface; a
full-featured demonstration player included with the library
distribution provides example code for a variety of system interfaces
as well as a working example of using libvorbis in production code.</p>
<h2>Encoding Overview</h2>
<h2>Decoding Overview</h2>
<p>Decoding a bitstream with libvorbis roughly follows these
steps:</p>
<ol>
<li>Frame the incoming bitstream into pages</li>
<li>Sort the pages by logical bitstream and buffer them into logical streams</li>
<li>Decompose the logical streams into raw packets</li>
<li>Reconstruct segments of the original data from each packet</li>
<li>Glue the reconstructed segments back into a decoded stream</li>
</ol>
<h3>Framing</h3>
<p>An Ogg bitstream is logically arranged into pages, but to decode
the pages, we have to find them first. The raw bitstream is first fed
into an <tt>ogg_sync_state</tt> buffer using <tt>ogg_sync_buffer()</tt>
and <tt>ogg_sync_wrote()</tt>. After each block we submit to the sync
buffer, we should check whether we can frame and extract one or more
complete pages using <tt>ogg_sync_pageout()</tt>. Pages that are not
extracted stay buffered; allowing them to build up in the
<tt>ogg_sync_state</tt> buffer will eventually exhaust memory.</p>
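<p>To make the idea of framing concrete, the following is a minimal,
self-contained sketch of how the length of one page can be worked out
from raw bytes. It is not libogg code: the function name
<tt>sketch_page_length</tt> is hypothetical, and the byte offsets are
taken from the Ogg framing specification rather than from this
document. The real <tt>ogg_sync_pageout()</tt> also verifies the page
checksum and searches forward for the next capture pattern after a
mismatch.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fixed part of an Ogg page header: capture pattern through segment count. */
#define SKETCH_FIXED_HEADER 27

/*
 * Try to frame one page at the start of buf (hypothetical helper).
 * Returns the total page length (header + body) if a whole page is
 * present, 0 if more data is needed, and -1 if buf does not start with
 * the "OggS" capture pattern.
 */
long sketch_page_length(const unsigned char *buf, size_t len)
{
    size_t header_len, body_len = 0, i;

    assert(buf != NULL);
    if (len < 4)
        return 0;                       /* cannot even test the capture pattern */
    if (memcmp(buf, "OggS", 4) != 0)
        return -1;                      /* not positioned on a page boundary */
    if (len < SKETCH_FIXED_HEADER)
        return 0;

    /* Byte 26 holds the number of entries in the segment (lacing) table. */
    header_len = SKETCH_FIXED_HEADER + buf[26];
    if (len < header_len)
        return 0;

    /* The body length is the sum of all lacing values. */
    for (i = 0; i < buf[26]; i++)
        body_len += buf[SKETCH_FIXED_HEADER + i];

    if (len < header_len + body_len)
        return 0;
    return (long)(header_len + body_len);
}
```

<p>A caller would keep appending input until the function returns a
positive length, consume that many bytes as one page, and repeat; a
return of -1 corresponds to the case where the stream has to be
recaptured.</p>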
<p>The Ogg pages returned from <tt>ogg_sync_pageout</tt> need not be
decoded further to be used as landmarks in seeking; seeking can be
either a rough process of simply jumping to approximately intuited
portions of the bitstream, or it can be a precise bisection process
that captures pages and inspects data position. When seeking,
however, sequential multiplexing (chaining) must be accounted for;
beginning play in a new logical bitstream requires initializing a
synthesis engine with the headers from that bitstream. Vorbis
bitstreams do not make use of concurrent multiplexing (grouping).</p>
<h3>Sorting</h3>
<p>The pages produced by <tt>ogg_sync_pageout</tt> are then sorted by
serial number to separate logical bitstreams. Initialize logical
bitstream buffers (<tt>ogg_stream_state</tt>) using
<tt>ogg_stream_init()</tt>. Pages are submitted to the matching
logical bitstream buffer using <tt>ogg_stream_pagein</tt>; the serial
number of the page and the stream buffer must match, or the page will
be rejected. A page submitted out of sequence will simply be noted,
and in the course of outputting packets, the hole will be flagged
(<tt>ogg_sync_pageout</tt> and <tt>ogg_stream_packetout</tt> will
return a negative value at positions where they had to recapture the
stream).</p>
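<p>The routing step can be pictured with a small, self-contained
sketch. The <tt>sketch_stream</tt> type and
<tt>sketch_route_page</tt> function are hypothetical stand-ins, not
part of libogg; in real code each entry would be an
<tt>ogg_stream_state</tt> and the accepted page would be passed to
<tt>ogg_stream_pagein</tt>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one logical stream buffer. */
typedef struct {
    long serialno;        /* serial number the stream was initialized with */
    int  pages_accepted;  /* how many pages have been routed here */
} sketch_stream;

/*
 * Route a page to the stream whose serial number matches.
 * Returns the index of the accepting stream, or -1 if no stream
 * matches (the caller may then create a new stream or drop the page).
 */
int sketch_route_page(sketch_stream *streams, size_t count, long page_serialno)
{
    size_t i;

    assert(streams != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (streams[i].serialno == page_serialno) {
            streams[i].pages_accepted++;
            return (int)i;
        }
    }
    return -1;
}
```

<p>A linear search is enough here because a physical bitstream usually
carries only a handful of logical bitstreams at once.</p>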
<h3>Extracting packets</h3>
<p>After submitting one or more pages to a logical stream, read the
available packets using <tt>ogg_stream_packetout</tt>.</p>
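<p>Packet boundaries inside a page come from the page's segment
(lacing) table. The sketch below shows that rule in isolation; it is
an illustration based on the Ogg framing specification, not libogg
code, and <tt>sketch_packet_sizes</tt> is a hypothetical name. In
practice <tt>ogg_stream_packetout</tt> does this work internally.</p>

```c
#include <assert.h>
#include <stddef.h>

/*
 * Walk a page's segment table and record the size of each packet that
 * ends on the page (hypothetical helper).
 *
 * A lacing value below 255 terminates a packet. A run of 255s that
 * reaches the end of the table belongs to a packet that continues on
 * the next page; its partial size is reported through *trailing.
 *
 * Returns the number of completed packets. Only the first max_sizes
 * sizes are stored in sizes[].
 */
size_t sketch_packet_sizes(const unsigned char *lacing, size_t nsegments,
                           size_t *sizes, size_t max_sizes, size_t *trailing)
{
    size_t i, count = 0, current = 0;

    assert(lacing != NULL && sizes != NULL && trailing != NULL);
    for (i = 0; i < nsegments; i++) {
        current += lacing[i];
        if (lacing[i] < 255) {          /* packet ends in this segment */
            if (count < max_sizes)
                sizes[count] = current;
            count++;
            current = 0;
        }
    }
    *trailing = current;                /* bytes of an unfinished packet, if any */
    return count;
}
```

<p>With the lacing values 255, 10, 0, 255, 255 the function reports two
completed packets (265 bytes and 0 bytes) and 510 trailing bytes that
belong to a packet finished on a later page.</p>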
<h3>Decoding packets</h3>
<h3>Reassembling data segments</h3>
<h2>Ogg Bitstream Manipulation Structures</h2>
<p>Two of the Ogg bitstream data structures are intended to be
transparent to the developer; the fields should be used directly.</p>
<h3>ogg_packet</h3>
<pre>
typedef struct {
  unsigned char *packet;
  long bytes;
  long b_o_s;
  long e_o_s;
  size64 granulepos;
} ogg_packet;
</pre>
<dl>
<dt>packet:</dt>
<dd>a pointer to the byte data of the raw packet</dd>
<dt>bytes:</dt>
<dd>the size of the packet's raw data</dd>
<dt>b_o_s:</dt>
<dd>beginning of stream; nonzero if this is the first packet of
the logical bitstream</dd>
<dt>e_o_s:</dt>
<dd>end of stream; nonzero if this is the last packet of the
logical bitstream</dd>
<dt>granulepos:</dt>
<dd>the absolute position of this packet in the original
uncompressed data stream.</dd>
</dl>
<h4>encoding notes</h4>
<p>The encoder is responsible for setting all of
the fields of the packet to appropriate values before submission to
<tt>ogg_stream_packetin()</tt>; note, however, that the value in
<tt>b_o_s</tt> is ignored; the first page produced from a given
<tt>ogg_stream_state</tt> structure will be stamped as the initial
page. <tt>e_o_s</tt>, however, must be set; this is the means by
which the stream encoding primitives handle end of stream and cleanup.</p>
<h4>decoding notes</h4>
<p><tt>ogg_stream_packetout()</tt> sets the fields
to appropriate values. Note that <tt>granulepos</tt> will be &gt;= 0 only in the
case that the given packet actually represents that position (i.e., only
the last packet completed on any page will have a meaningful
<tt>granulepos</tt>). Intervening packets will see <tt>granulepos</tt> set
to -1.</p>
<h3>ogg_page</h3>
<pre>
typedef struct {
  unsigned char *header;
  long header_len;
  unsigned char *body;
  long body_len;
} ogg_page;
</pre>
<dl>
<dt>header:</dt>
<dd>pointer to the page header data</dd>
<dt>header_len:</dt>
<dd>length of the page header in bytes</dd>
<dt>body:</dt>
<dd>pointer to the page body</dd>
<dt>body_len:</dt>
<dd>length of the page body</dd>
</dl>
<p>Note that although the <tt>header</tt> and <tt>body</tt> pointers do
not necessarily point into a single contiguous page vector, the page
body must immediately follow the header in the bitstream.</p>
<h2>Ogg Bitstream Manipulation Functions</h2>
<h3>
int ogg_page_bos(ogg_page *og);
</h3>
<p>Returns the 'beginning of stream' flag for the given Ogg page. The
beginning of stream flag is set on the initial page of a logical
bitstream.</p>
<p>Zero indicates the flag is cleared (this is not the initial page of a
logical bitstream). Nonzero indicates the flag is set (this is the
initial page of a logical bitstream).</p>
<h3>
int ogg_page_continued(ogg_page *og);
</h3>
<p>Returns the 'packet continued' flag for the given Ogg page. The packet
continued flag indicates whether or not the body data of this page
begins with a packet continued from a preceding page.</p>
<p>Zero (unset) indicates that the body data begins with a new packet.
Nonzero (set) indicates that the first packet data on the page is a
continuation from the preceding page.</p>
<h3>
int ogg_page_eos(ogg_page *og);
</h3>
<p>Returns the 'end of stream' flag for a given Ogg page. The end of stream
flag is set on the last (terminal) page of a logical bitstream.</p>
<p>Zero (unset) indicates that this is not the last page of a logical
bitstream. Nonzero (set) indicates that this is the last page of a
logical bitstream and that no additional pages belonging to this
bitstream may follow.</p>
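<p>The three flag accessors above all read the same header byte. The
sketch below shows one way such accessors could be written against a
raw page header; the bit values come from the Ogg framing
specification, and the <tt>sketch_</tt> names are hypothetical, so
this should be read as an illustration rather than as libogg's
implementation.</p>

```c
#include <assert.h>

/* Bit values of the header type byte (byte 5 of the page header). */
#define SKETCH_FLAG_CONTINUED 0x01  /* first packet continues from the previous page */
#define SKETCH_FLAG_BOS       0x02  /* first page of a logical bitstream */
#define SKETCH_FLAG_EOS       0x04  /* last page of a logical bitstream */

/* Each accessor returns zero when the flag is clear, nonzero when set. */
int sketch_page_continued(const unsigned char *header)
{
    assert(header != 0);
    return header[5] & SKETCH_FLAG_CONTINUED;
}

int sketch_page_bos(const unsigned char *header)
{
    assert(header != 0);
    return header[5] & SKETCH_FLAG_BOS;
}

int sketch_page_eos(const unsigned char *header)
{
    assert(header != 0);
    return header[5] & SKETCH_FLAG_EOS;
}
```

<p>Because the accessors return the masked bit rather than 1, callers
should test the result against zero instead of comparing it to 1.</p>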
<h3>
size64 ogg_page_granulepos(ogg_page *og);
</h3>
<p>Returns the position of this page as an absolute position within the
original uncompressed data. The position, as returned, is 'frames
encoded to date up to and including the last whole packet on this
page'. Partial packets begun on this page but continued to the
following page are not included. If no packet ends on this page, the
frame position value will be equal to the frame position value of the
preceding page. If none of the original uncompressed data is yet
represented in the logical bitstream (for example, the first page of a
bitstream consists only of a header packet; this packet encodes only
metadata), the value shall be zero.</p>
<p>The units of the frame number are determined by the media mapping. A
Vorbis audio bitstream, for example, defines one frame to be the
channel values from a single sampling period (e.g., a 16-bit stereo
bitstream carries two samples of two bytes each per sampling period,
so a frame is four bytes). A video stream defines one
frame to be a single frame of video.</p>
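<p>As a worked example of these units, the sketch below reads the
position field from a raw page header and converts an audio frame
count into seconds and bytes. The field offset and byte order come
from the Ogg framing specification; the function names and the use of
<tt>int64_t</tt> in place of <tt>size64</tt> are assumptions made for
illustration only.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Read the 64-bit little-endian position stored in header bytes 6..13. */
int64_t sketch_page_granulepos(const unsigned char *header)
{
    uint64_t value = 0;
    int i;

    assert(header != 0);
    for (i = 7; i >= 0; i--)
        value = (value << 8) | header[6 + i];
    return (int64_t)value;
}

/* For audio, one frame is one sampling period, so seconds = frames / rate. */
double sketch_granule_seconds(int64_t granulepos, long sample_rate)
{
    assert(sample_rate > 0);
    return (double)granulepos / (double)sample_rate;
}

/* Size in bytes of one uncompressed PCM frame (all channels, one period). */
long sketch_pcm_frame_bytes(int channels, int bits_per_sample)
{
    assert(channels > 0 && bits_per_sample > 0 && bits_per_sample % 8 == 0);
    return (long)channels * (bits_per_sample / 8);
}
```

<p>For a 44100 Hz stream, a position of 44100 frames corresponds to one
second of audio, and each 16-bit stereo frame occupies four bytes once
decoded.</p>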
<h3>
int ogg_page_pageno(ogg_page *og);
</h3>
<p>Returns the sequential page number of the given Ogg page. The first
page in a logical bitstream is numbered zero; following pages are
numbered in monotonically increasing order.</p>
<h3>
int ogg_page_serialno(ogg_page *og);
</h3>
<p>Returns the serial number of the given Ogg page. The serial number is
used as a handle to distinguish various logical bitstreams in a
physical Ogg bitstream. Every logical bitstream within a
physical bitstream must use a unique (within the scope of the physical
bitstream) serial number, which is stamped on all bitstream pages.</p>
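<p>Both the page number and the serial number are stored as 32-bit
little-endian fields in the page header. The following sketch reads
them directly from header bytes; the offsets are those of the Ogg
framing specification, and the <tt>sketch_</tt> function names are
hypothetical rather than part of the library.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Read an unsigned 32-bit little-endian value. */
static uint32_t sketch_read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Serial number of the logical bitstream: header bytes 14..17. */
uint32_t sketch_page_serialno(const unsigned char *header)
{
    assert(header != 0);
    return sketch_read_le32(header + 14);
}

/* Sequence number of the page within its logical bitstream: bytes 18..21. */
uint32_t sketch_page_pageno(const unsigned char *header)
{
    assert(header != 0);
    return sketch_read_le32(header + 18);
}
```

<p>Assembling the value byte by byte keeps the code independent of the
host's byte order.</p>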
<h3>
int ogg_page_version(ogg_page *og);
</h3>
<p>Returns the revision of the Ogg bitstream structure of the given page.
Currently, the only permitted number is zero. Later revisions of the
bitstream spec will increment this version should any changes be
incompatible.</p>
<h3>
int ogg_stream_clear(ogg_stream_state *os);
</h3>
<p>Clears and deallocates the internal storage of the given Ogg stream.
After clearing, the stream structure is not initialized for use;
<tt>ogg_stream_init</tt> must be called to reinitialize for use.
Use <tt>ogg_stream_reset</tt> to reset the stream state
to a fresh, initialized state.</p>
<p><tt>ogg_stream_clear</tt> does not call <tt>free()</tt> on the pointer
<tt>os</tt>, allowing use of this call on stream structures in static
or automatic storage. <tt>ogg_stream_destroy</tt> is a complementary
function that frees the pointer as well.</p>
<p>Returns zero on success and non-zero on failure. This function always
succeeds.</p>
<h3>
int ogg_stream_destroy(ogg_stream_state *os);
</h3>
<p>Clears and deallocates the internal storage of the given Ogg stream,
then frees the storage associated with the pointer <tt>os</tt>.</p>
<p><tt>ogg_stream_clear</tt> does not call <tt>free()</tt> on the pointer
<tt>os</tt>, allowing use of that call on stream structures in static
or automatic storage.</p>
<p>Returns zero on success and non-zero on failure. This function always
succeeds.</p>
<h3>
int ogg_stream_init(ogg_stream_state *os,int serialno);
</h3>
<p>Initializes the storage associated with <tt>os</tt> for use as an Ogg
stream. This call is used to initialize a stream for both encode and
decode. The given serial number is the serial number that will be
stamped on pages of the produced bitstream (during encode), or used as
a check that pages match (during decode).</p>
<p>Returns zero on success, nonzero on failure.</p>
<h3>
int ogg_stream_packetin(ogg_stream_state *os, ogg_packet *op);
</h3>
<p>Used during encoding to add the given raw packet to the given Ogg
bitstream. The contents of <tt>op</tt> are copied;
<tt>ogg_stream_packetin</tt> does not retain any pointers into
<tt>op</tt>'s storage. The encoding process buffers incoming packets
until enough packets have been assembled to form an entire page;
<tt>ogg_stream_pageout</tt> is used to read complete pages.</p>
<p>Returns zero on success, nonzero on failure.</p>
<h3>
int ogg_stream_packetout(ogg_stream_state *os,ogg_packet *op);
</h3>
<p>Used during decoding to read raw packets from the given logical
bitstream. <tt>ogg_stream_packetout</tt> will only return complete
packets for which checksumming indicates no corruption. The size and
contents of the packet exactly match those given in the encoding
process.</p>
<p>Returns zero if the next packet is not ready to be read (not buffered
or incomplete), positive if it returned a complete packet in
<tt>op</tt>, and negative if there is a gap, extra bytes or corruption
at this position in the bitstream (essentially that the bitstream had
to be recaptured). A negative value is not necessarily an error. It
would be a common occurrence when seeking, for example, which requires
recapture of the bitstream at the position where decoding continues.</p>
<p>If the return value is positive, <tt>ogg_stream_packetout</tt> placed
a packet in <tt>op</tt>. The data in <tt>op</tt> points to static
storage that is valid until the next call to
<tt>ogg_stream_pagein</tt>, <tt>ogg_stream_clear</tt>,
<tt>ogg_stream_reset</tt>, or <tt>ogg_stream_destroy</tt>. The
pointers are not invalidated by more calls to
<tt>ogg_stream_packetout</tt>.</p>
<h3>
int ogg_stream_pagein(ogg_stream_state *os, ogg_page *og);
</h3>
<p>Used during decoding to buffer the given complete, pre-verified page
for decoding into raw Ogg packets. The given page must be framed,
normally produced by <tt>ogg_sync_pageout</tt>, and from the logical
bitstream associated with <tt>os</tt> (the serial numbers must match).
The contents of the given page are copied; <tt>ogg_stream_pagein</tt>
retains no pointers into <tt>og</tt>'s storage.</p>
<p>Returns zero on success and non-zero on failure.</p>
<h3>
int ogg_stream_pageout(ogg_stream_state *os, ogg_page *og);
</h3>
<p>Used during encode to read complete pages from the stream buffer. The
returned page is ready for sending out to the real world.</p>
<p>Returns zero if there is no complete page ready for reading. Returns
nonzero when it has placed data for a complete page into
<tt>og</tt>. Note that the data returned in <tt>og</tt> points into internal
storage; the pointers in <tt>og</tt> are valid until the next call to
<tt>ogg_stream_pageout</tt>, <tt>ogg_stream_packetin</tt>,
<tt>ogg_stream_reset</tt>, <tt>ogg_stream_clear</tt> or
<tt>ogg_stream_destroy</tt>.</p>
<h3>
int ogg_stream_reset(ogg_stream_state *os);
</h3>
<p>Resets the given stream's state to that of a blank, unused stream;
this may be used during encode or decode.</p>
<p>Note that if used during encode, it does not alter the stream's serial
number. In addition, the next page produced during encoding will be
marked as the 'initial' page of the logical bitstream.</p>
<p>When used during decode, this simply clears the data buffer of any
pending pages. Beginning and end of stream cues are read from the
bitstream and are unaffected by reset.</p>
<p>Returns zero on success and non-zero on failure. This function always
succeeds.</p>
<h3>
char *ogg_sync_buffer(ogg_sync_state *oy, long size);
</h3>
<p>This call is used to buffer a raw bitstream for framing and
verification. <tt>ogg_sync_buffer</tt> handles stream capture and
recapture, checksumming, and division into Ogg pages (as required by
<tt>ogg_stream_pagein</tt>).</p>
<p><tt>ogg_sync_buffer</tt> exposes a buffer area into which the decoder
copies the next (up to) <tt>size</tt> bytes. We expose the buffer
(rather than taking a buffer) in order to avoid an extra copy in many
uses; this way, for example, <tt>read()</tt> can transfer data
directly into the stream buffer without first needing to place it in
temporary storage.</p>
<p>Returns a pointer into <tt>oy</tt>'s internal bitstream sync buffer;
the remaining space in the sync buffer is at least <tt>size</tt>
bytes. The decoder need not write all of <tt>size</tt> bytes;
<tt>ogg_sync_wrote</tt> is used to inform the engine how many bytes
were actually written. Use of <tt>ogg_sync_wrote</tt> after writing
into the exposed buffer is mandatory.</p>
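<p>The "expose a buffer, then report how much was written" pattern can
be shown with a toy buffer that has no connection to libogg's
internals. In the sketch below, <tt>sketch_sync</tt>,
<tt>sketch_sync_buffer</tt> and <tt>sketch_sync_wrote</tt> are
hypothetical names that only mirror the shape of the calls described
above; the growth policy in particular is an assumption.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a sync buffer: a growable byte array with a fill mark. */
typedef struct {
    unsigned char *data;
    size_t capacity;  /* bytes allocated */
    size_t fill;      /* bytes committed so far */
} sketch_sync;

/*
 * Expose at least size bytes of writable space after the committed data.
 * Returns a pointer the caller may write into, or NULL if allocation fails.
 */
unsigned char *sketch_sync_buffer(sketch_sync *s, size_t size)
{
    assert(s != NULL);
    if (s->capacity - s->fill < size) {
        size_t new_capacity = s->fill + size;
        unsigned char *grown = (unsigned char *)realloc(s->data, new_capacity);
        if (grown == NULL)
            return NULL;
        s->data = grown;
        s->capacity = new_capacity;
    }
    return s->data + s->fill;
}

/*
 * Commit the bytes the caller actually wrote into the exposed space.
 * Returns 0 on success and -1 if bytes exceeds the exposed space.
 */
int sketch_sync_wrote(sketch_sync *s, size_t bytes)
{
    assert(s != NULL);
    if (bytes > s->capacity - s->fill)
        return -1;
    s->fill += bytes;
    return 0;
}
```

<p>A reader loop would request space, let <tt>read()</tt> or
<tt>fread()</tt> fill it in place, and then commit the byte count that
call returned, which is how the extra copy through temporary storage is
avoided.</p>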
<h3>
int ogg_sync_clear(ogg_sync_state *oy);
</h3>
<p><tt>ogg_sync_clear</tt>
clears and deallocates the internal storage of the given Ogg sync
buffer. After clearing, the sync structure is not initialized for
use; <tt>ogg_sync_init</tt> must be called to reinitialize for use.
Use <tt>ogg_sync_reset</tt> to reset the sync state and buffer to a
fresh, initialized state.</p>
<p><tt>ogg_sync_clear</tt> does not call <tt>free()</tt> on the pointer
<tt>oy</tt>, allowing use of this call on sync structures in static
or automatic storage. <tt>ogg_sync_destroy</tt> is a complementary
function that frees the pointer as well.</p>
<p>Returns zero on success and non-zero on failure. This function always
succeeds.</p>
<h3>
int ogg_sync_destroy(ogg_sync_state *oy);
</h3>
<p>Clears and deallocates the internal storage of the given Ogg sync
buffer, then frees the storage associated with the pointer
<tt>oy</tt>.</p>
<p>An alternative function, <tt>ogg_sync_clear</tt>, does not call
<tt>free()</tt> on the pointer <tt>oy</tt>, allowing use of that call on
sync structures in static or automatic storage.</p>
<p>Returns zero on success and non-zero on failure. This function always
succeeds.</p>
<h3>
int ogg_sync_init(ogg_sync_state *oy);
</h3>
<p>Initializes the sync buffer <tt>oy</tt> for use.</p>
<p>Returns zero on success and non-zero on failure. This function always
succeeds.</p>
<h3>
int ogg_sync_pageout(ogg_sync_state *oy, ogg_page *og);
</h3>
<p>Reads complete, framed, verified Ogg pages from the sync buffer,
placing the page data in <tt>og</tt>.</p>
<p>Returns zero when there are no complete pages buffered for
retrieval. Returns negative when a loss of sync or recapture occurred
(this is not necessarily an error; recapture would be required after
seeking, for example). Returns positive when a page is returned in
<tt>og</tt>. Note that the data in <tt>og</tt> points into the sync
buffer storage; the pointers are valid until the next call to
<tt>ogg_sync_buffer</tt>, <tt>ogg_sync_clear</tt>,
<tt>ogg_sync_destroy</tt> or <tt>ogg_sync_reset</tt>.</p>
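<p>Page verification rests on a 32-bit checksum carried in each page
header. The sketch below computes a checksum with the parameters given
in the Ogg framing specification (generator polynomial 0x04c11db7,
initial value and final XOR of zero, bits processed most significant
first). It is a bit-by-bit illustration under that assumption, with a
hypothetical function name, and not the table-driven code a library
would normally use.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 with polynomial 0x04C11DB7, initial value 0, no bit
 * reflection and no final XOR. When checking a page, the four checksum
 * bytes in the header are treated as zero while the CRC is computed
 * over the whole page (header followed by body).
 */
uint32_t sketch_ogg_crc(const unsigned char *data, size_t len)
{
    uint32_t crc = 0;
    size_t i;
    int bit;

    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++) {
        crc ^= (uint32_t)data[i] << 24;     /* feed the next byte, MSB first */
        for (bit = 0; bit < 8; bit++) {
            if (crc & 0x80000000u)
                crc = (crc << 1) ^ 0x04C11DB7u;
            else
                crc <<= 1;
        }
    }
    return crc;
}
```

<p>A page whose stored checksum differs from the recomputed value would
be treated as corrupt, which is one of the situations in which
<tt>ogg_sync_pageout</tt> reports a negative result.</p>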
<h3>
int ogg_sync_reset(ogg_sync_state *oy);
</h3>
<p><tt>ogg_sync_reset</tt> resets the sync state in <tt>oy</tt> to a
clean, empty state. This is useful, for example, when seeking to a
new location in a bitstream.</p>
<p>Returns zero on success, nonzero on failure.</p>
<h3>
int ogg_sync_wrote(ogg_sync_state *oy, long bytes);
</h3>
<p>Used to inform the sync state as to how many bytes were actually
written into the exposed sync buffer. It must be equal to or less
than the size of the buffer requested.</p>
<p>Returns zero on success and non-zero on failure; failure occurs only
when the number of bytes written was larger than the buffer.</p>
<div id="copyright">
The Xiph Fish Logo is a
trademark (&trade;) of Xiph.Org.<br/>
These pages &copy; 1994 - 2005 Xiph.Org. All rights reserved.
</div>
</body>
</html>