  1. <HTML><HEAD><TITLE>xiph.org: Ogg Vorbis documentation</TITLE>
  2. <BODY bgcolor="#ffffff" text="#202020" link="#006666" vlink="#000000">
  3. <nobr><img src="white-ogg.png"><img src="vorbisword2.png"></nobr><p>
  4. <h1><font color=#000070>
  5. Programming with Xiphophorus <tt>libvorbis</tt>
  6. </font></h1>
  7. <em>Last update to this document: July 22, 1999</em><br>
  8. <h2>Description</h2>
Libvorbis is Xiphophorus's portable Ogg Vorbis CODEC, implemented as a
programmatic library. Libvorbis provides primitives to handle framing
and manipulation of Ogg bitstreams (used by Vorbis for
streaming), a full analysis (encoding) interface, and packet
decoding and synthesis for playback. <p>
The libvorbis library does not provide any system interface; a
full-featured demonstration player included with the library
distribution provides example code for a variety of system interfaces
as well as a working example of using libvorbis in production code.
  18. <h2>Encoding Overview</h2>
  19. <h2>Decoding Overview</h2>
Decoding a bitstream with libvorbis roughly follows these
steps:
  22. <ol>
  23. <li>Frame the incoming bitstream into pages
<li>Sort the pages by logical bitstream and buffer them into logical streams
  25. <li>Decompose the logical streams into raw packets
  26. <li>Reconstruct segments of the original data from each packet
  27. <li>Glue the reconstructed segments back into a decoded stream
  28. </ol>
  29. <h3>Framing</h3>
  30. An Ogg bitstream is logically arranged into pages, but to decode
  31. the pages, we have to find them first. The raw bitstream is first fed
  32. into an <tt>ogg_sync_state</tt> buffer using <tt>ogg_sync_buffer()</tt>
  33. and <tt>ogg_sync_wrote()</tt>. After each block we submit to the sync
  34. buffer, we should check to see if we can frame and extract a complete
  35. page or pages using <tt>ogg_sync_pageout()</tt>. Extra pages are
  36. buffered; allowing them to build up in the <tt>ogg_sync_state</tt>
  37. buffer will eventually exhaust memory.<p>
The Ogg pages returned from <tt>ogg_sync_pageout</tt> need not be
decoded further to be used as landmarks in seeking; seeking can be
either a rough process of simply jumping to approximately intuited
portions of the bitstream, or a precise bisection process
that captures pages and inspects data position. When seeking,
however, sequential multiplexing (chaining) must be accounted for;
beginning play in a new logical bitstream requires initializing a
synthesis engine with the headers from that bitstream. Vorbis
bitstreams do not make use of concurrent multiplexing (grouping).<p>
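As a rough illustration of where framing has to start, the sketch below
scans a byte buffer for the four-byte <tt>OggS</tt> capture pattern that
opens every Ogg page. The name <tt>find_capture</tt> is hypothetical and
is not part of libvorbis, and the capture pattern itself comes from the
Ogg framing specification rather than from this page. The real
<tt>ogg_sync_pageout()</tt> does more than this: it also verifies the
page before returning it.<p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a libvorbis call.
 * Return the offset of the first "OggS" capture pattern within the
 * first len bytes of buf, or -1 if the pattern is not present. */
long find_capture(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "OggS", 4) == 0)
            return (long)i;
    }
    return -1;
}
```

A rough seek could jump to an arbitrary file offset, read a block, and
use a scan like this to locate the next candidate page boundary.<p>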
  47. <h3>Sorting</h3>
The pages produced by <tt>ogg_sync_pageout</tt> are then sorted by
serial number to separate logical bitstreams. Initialize logical
bitstream buffers (<tt>ogg_stream_state</tt>) using
<tt>ogg_stream_init()</tt>. Pages are submitted to the matching
logical bitstream buffer using <tt>ogg_stream_pagein</tt>; the serial
number of the page and the stream buffer must match, or the page will
be rejected. A page submitted out of sequence will simply be noted,
and in the course of outputting packets, the hole will be flagged
(<tt>ogg_sync_pageout</tt> and <tt>ogg_stream_packetout</tt> will
return a negative value at positions where they had to recapture the
stream).
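<p>
The sorting step can be pictured as a small routing table keyed by
serial number. The sketch below is only an illustration of that idea:
<tt>stream_slot</tt>, <tt>route_page</tt> and <tt>MAX_STREAMS</tt> are
hypothetical names, and a real decoder would keep an
<tt>ogg_stream_state</tt> in each slot and hand the page to
<tt>ogg_stream_pagein</tt> instead of counting it.<p>

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STREAMS 8   /* hypothetical limit for this sketch */

typedef struct {
    int  in_use;     /* nonzero once a serial number is assigned */
    int  serialno;   /* serial number this slot accepts */
    long pages;      /* pages routed to this slot so far */
} stream_slot;

/* Route one page to the slot matching its serial number, claiming a
 * free slot the first time a serial number is seen. Returns the slot
 * index, or -1 if every slot is already taken. */
int route_page(stream_slot *slots, int serialno)
{
    int i, free_slot = -1;

    assert(slots != NULL);
    for (i = 0; i < MAX_STREAMS; i++) {
        if (slots[i].in_use && slots[i].serialno == serialno) {
            slots[i].pages++;
            return i;
        }
        if (!slots[i].in_use && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0)
        return -1;
    slots[free_slot].in_use = 1;
    slots[free_slot].serialno = serialno;
    slots[free_slot].pages = 1;
    return free_slot;
}
```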
  59. <h3>Extracting packets</h3>
After submitting one or more pages to a logical stream, read available packets
using <tt>ogg_stream_packetout</tt>.
  62. <h3>Decoding packets</h3>
  63. <h3>Reassembling data segments</h3>
<h2>Ogg Bitstream Manipulation Structures</h2>
  65. Two of the Ogg bitstream data structures are intended to be
  66. transparent to the developer; the fields should be used directly.<p>
  67. <h3>ogg_packet</h3>
  68. <pre>
typedef struct {
  unsigned char *packet;
  long  bytes;
  long  b_o_s;
  long  e_o_s;
  size64 frameno;
} ogg_packet;
  76. </pre>
  77. <dl>
  78. <dt>packet: <dd>a pointer to the byte data of the raw packet
<dt>bytes: <dd>the size of the packet's raw data
  80. <dt>b_o_s: <dd>beginning of stream; nonzero if this is the first packet of
  81. the logical bitstream
  82. <dt>e_o_s: <dd>end of stream; nonzero if this is the last packet of the
  83. logical bitstream
  84. <dt>frameno: <dd>the absolute position of this packet in the original
  85. uncompressed data stream.
  86. </dl>
<h4>encoding notes</h4> The encoder is responsible for setting all of
the fields of the packet to appropriate values before submission to
<tt>ogg_stream_packetin()</tt>. Note, however, that the value in
<tt>b_o_s</tt> is ignored; the first page produced from a given
<tt>ogg_stream_state</tt> structure will be stamped as the initial
page. <tt>e_o_s</tt>, however, must be set; this is the means by
which the stream encoding primitives handle end of stream and cleanup.
<h4>decoding notes</h4><tt>ogg_stream_packetout()</tt> sets the fields
to appropriate values. Note that <tt>frameno</tt> will be &gt;= 0 only in the
case that the given packet actually represents that position (i.e., only
the last packet completed on any page will have a meaningful
<tt>frameno</tt>). Intervening packets will have <tt>frameno</tt> set
to -1.
  100. <h3>ogg_page</h3>
  101. <pre>
typedef struct {
  unsigned char *header;
  long header_len;
  unsigned char *body;
  long body_len;
} ogg_page;
  108. </pre>
  109. <dl>
  110. <dt>header: <dd>pointer to the page header data
  111. <dt>header_len: <dd>length of the page header in bytes
  112. <dt>body: <dd>pointer to the page body
  113. <dt>body_len: <dd>length of the page body
  114. </dl>
  115. Note that although the <tt>header</tt> and <tt>body</tt> pointers do
  116. not necessarily point into a single contiguous page vector, the page
  117. body must immediately follow the header in the bitstream.<p>
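To make the relationship between the header and body lengths concrete,
here is a minimal sketch that derives both from the raw bytes of a page.
It assumes the page header layout given in the Ogg framing specification
(a 27-byte fixed part whose last byte is the segment count, followed by
one length byte per segment); that layout is not described on this page.
The function name <tt>page_lengths</tt> is hypothetical.<p>

```c
#include <assert.h>
#include <stddef.h>

#define OGG_FIXED_HEADER 27u   /* bytes before the segment table */

/* Hypothetical helper, not a libvorbis call.
 * Given the first `avail` bytes of a page, compute the header and body
 * lengths. Returns 1 on success, or 0 if more data is needed before
 * the whole segment table is visible. */
int page_lengths(const unsigned char *page, size_t avail,
                 long *header_len, long *body_len)
{
    size_t nsegs, i;
    long body = 0;

    assert(page != NULL && header_len != NULL && body_len != NULL);
    if (avail < OGG_FIXED_HEADER)
        return 0;
    nsegs = page[26];                      /* number of segments */
    if (avail < OGG_FIXED_HEADER + nsegs)
        return 0;
    for (i = 0; i < nsegs; i++)            /* sum the segment sizes */
        body += page[OGG_FIXED_HEADER + i];
    *header_len = (long)(OGG_FIXED_HEADER + nsegs);
    *body_len = body;
    return 1;
}
```

With both lengths known, <tt>body</tt> is simply the address
<tt>header_len</tt> bytes past <tt>header</tt> when the page sits in one
contiguous buffer.<p>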
<h2>Ogg Bitstream Manipulation Functions</h2>
  119. <h3>
  120. int ogg_page_bos(ogg_page *og);
  121. </h3>
  122. Returns the 'beginning of stream' flag for the given Ogg page. The
  123. beginning of stream flag is set on the initial page of a logical
  124. bitstream.<P>
  125. Zero indicates the flag is cleared (this is not the initial page of a
  126. logical bitstream). Nonzero indicates the flag is set (this is the
  127. initial page of a logical bitstream).<p>
  128. <h3>
  129. int ogg_page_continued(ogg_page *og);
  130. </h3>
Returns the 'packet continued' flag for the given Ogg page. The packet
continued flag indicates whether or not the body data of this page
begins with a packet continued from a preceding page.<p>
Zero (unset) indicates that the body data begins with a new packet.
Nonzero (set) indicates that the first packet data on the page is a
continuation from the preceding page.
  137. <h3>
  138. int ogg_page_eos(ogg_page *og);
  139. </h3>
Returns the 'end of stream' flag for a given Ogg page. The end of stream
flag is set on the last (terminal) page of a logical bitstream.<p>
Zero (unset) indicates that this is not the last page of a logical
bitstream. Nonzero (set) indicates that this is the last page of a
logical bitstream and that no additional pages belonging to this
bitstream may follow.<p>
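The three flag queries above can be sketched as bit tests on a single
header byte. This is an illustration only: the <tt>sketch_</tt> function
names are hypothetical, and the byte offset and bit values are taken
from the Ogg framing specification, not from this page. The sketch takes
a pointer to the raw header bytes (the <tt>header</tt> field of
<tt>ogg_page</tt>) rather than an <tt>ogg_page</tt>.<p>

```c
#include <assert.h>
#include <stddef.h>

/* Header-type flag bits, stored in byte 5 of the page header. */
#define FLAG_CONTINUED 0x01   /* first packet continues from prior page */
#define FLAG_BOS       0x02   /* beginning of stream */
#define FLAG_EOS       0x04   /* end of stream */

/* Each function returns zero if the flag is clear, nonzero if set. */
int sketch_page_continued(const unsigned char *header)
{
    assert(header != NULL);
    return header[5] & FLAG_CONTINUED;
}

int sketch_page_bos(const unsigned char *header)
{
    assert(header != NULL);
    return header[5] & FLAG_BOS;
}

int sketch_page_eos(const unsigned char *header)
{
    assert(header != NULL);
    return header[5] & FLAG_EOS;
}
```

Because the results are raw bit masks, callers should compare them
against zero rather than against one.<p>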
  146. <h3>
  147. size64 ogg_page_frameno(ogg_page *og);
  148. </h3>
Returns the position of this page as an absolute position within the
original uncompressed data. The position, as returned, is 'frames
encoded to date up to and including the last whole packet on this
page'. Partial packets begun on this page but continued to the
following page are not included. If no packet ends on this page, the
frame position value will be equal to the frame position value of the
preceding page. If none of the original uncompressed data is yet
represented in the logical bitstream (for example, the first page of a
bitstream consists only of a header packet; this packet encodes only
metadata), the value shall be zero.<p>
The units of the frame number are determined by media mapping. A
Vorbis audio bitstream, for example, defines one frame to be the
channel values from a single sampling period (e.g., a 16 bit stereo
bitstream consists of two samples of two bytes for a total of four
bytes, thus a frame would be four bytes). A video stream defines one
frame to be a single frame of video.<p>
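A small sketch may help tie the two paragraphs above together. The first
function reads the 64-bit position field from the raw page header,
assuming the little-endian layout at header bytes 6 through 13 given in
the Ogg framing specification. The second reproduces the frame size
arithmetic from the audio example. Both function names are hypothetical,
and <tt>int64_t</tt> stands in for the <tt>size64</tt> type used on this
page.<p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the 64-bit little-endian position field (header bytes 6..13). */
int64_t sketch_page_position(const unsigned char *header)
{
    uint64_t v = 0;
    int i;

    assert(header != NULL);
    for (i = 7; i >= 0; i--)              /* most significant byte first */
        v = (v << 8) | header[6 + i];
    return (int64_t)v;
}

/* Bytes of uncompressed PCM in one frame: one sample per channel. */
long pcm_bytes_per_frame(int channels, int bits_per_sample)
{
    assert(channels > 0 && bits_per_sample > 0);
    return (long)channels * (bits_per_sample / 8);
}
```

For the 16 bit stereo case in the text,
<tt>pcm_bytes_per_frame(2, 16)</tt> yields four bytes, so a page position
of N corresponds to 4*N bytes of decoded audio.<p>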
  165. <h3>
  166. int ogg_page_pageno(ogg_page *og);
  167. </h3>
Returns the sequential page number of the given Ogg page. The first
page in a logical bitstream is numbered zero; following pages are
numbered in monotonically increasing order.<p>
  171. <h3>
  172. int ogg_page_serialno(ogg_page *og);
  173. </h3>
Returns the serial number of the given Ogg page. The serial number is
used as a handle to distinguish various logical bitstreams in a
physical Ogg bitstream. Every logical bitstream within a
physical bitstream must use a unique (within the scope of the physical
bitstream) serial number, which is stamped on all bitstream pages.<p>
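The page number and serial number are both stamped into the page header
as 32-bit fields. The sketch below reads them from raw header bytes,
assuming the little-endian layout in the Ogg framing specification
(serial number at bytes 14 through 17, page sequence number at bytes 18
through 21). The function names are hypothetical, and the sketch returns
<tt>uint32_t</tt> where the calls documented here return
<tt>int</tt>.<p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a 32-bit little-endian value from four bytes. */
uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Serial number of the logical bitstream this page belongs to. */
uint32_t sketch_page_serialno(const unsigned char *header)
{
    return read_le32(header + 14);
}

/* Sequential page number within that logical bitstream. */
uint32_t sketch_page_pageno(const unsigned char *header)
{
    return read_le32(header + 18);
}
```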
  179. <h3>
  180. int ogg_page_version(ogg_page *og);
  181. </h3>
Returns the revision of the Ogg bitstream structure of the given page.
Currently, the only permitted number is zero. Later revisions of the
bitstream spec will increment this version should any changes be
incompatible.<p>
  186. <h3>
  187. int ogg_stream_clear(ogg_stream_state *os);
  188. </h3>
Clears and deallocates the internal storage of the given Ogg stream.
After clearing, the stream structure is not initialized for use;
<tt>ogg_stream_init</tt> must be called to reinitialize for use.
Use <tt>ogg_stream_reset</tt> to reset the stream state
to a fresh, initialized state.<p>
<tt>ogg_stream_clear</tt> does not call <tt>free()</tt> on the pointer
<tt>os</tt>, allowing use of this call on stream structures in static
or automatic storage. <tt>ogg_stream_destroy</tt> is a complementary
function that frees the pointer as well.<p>
  198. Returns zero on success and non-zero on failure. This function always
  199. succeeds.<p>
  200. <h3>
  201. int ogg_stream_destroy(ogg_stream_state *os);
  202. </h3>
  203. Clears and deallocates the internal storage of the given Ogg stream,
  204. then frees the storage associated with the pointer <tt>os</tt>.<p>
  205. <tt>ogg_stream_clear</tt> does not call <tt>free()</tt> on the pointer
  206. <tt>os</tt>, allowing use of that call on stream structures in static
  207. or automatic storage.<p>
  208. Returns zero on success and non-zero on failure. This function always
  209. succeeds.<p>
  210. <h3>
  211. int ogg_stream_init(ogg_stream_state *os,int serialno);
  212. </h3>
  213. Initialize the storage associated with <tt>os</tt> for use as an Ogg
  214. stream. This call is used to initialize a stream for both encode and
  215. decode. The given serial number is the serial number that will be
  216. stamped on pages of the produced bitstream (during encode), or used as
  217. a check that pages match (during decode).<p>
  218. Returns zero on success, nonzero on failure.<p>
  219. <h3>
  220. int ogg_stream_packetin(ogg_stream_state *os, ogg_packet *op);
  221. </h3>
Used during encoding to add the given raw packet to the given Ogg
bitstream. The contents of <tt>op</tt> are copied;
<tt>ogg_stream_packetin</tt> does not retain any pointers into
<tt>op</tt>'s storage. The encoding process buffers incoming packets
until enough packets have been assembled to form an entire page;
<tt>ogg_stream_pageout</tt> is used to read complete pages.<p>
  228. Returns zero on success, nonzero on failure.<p>
  229. <h3>
  230. int ogg_stream_packetout(ogg_stream_state *os,ogg_packet *op);
  231. </h3>
  232. Used during decoding to read raw packets from the given logical
  233. bitstream. <tt>ogg_stream_packetout</tt> will only return complete
  234. packets for which checksumming indicates no corruption. The size and
  235. contents of the packet exactly match those given in the encoding
  236. process. <p>
Returns zero if the next packet is not ready to be read (not buffered
or incomplete), positive if it returned a complete packet in
<tt>op</tt>, and negative if there is a gap, extra bytes or corruption
at this position in the bitstream (essentially that the bitstream had
to be recaptured). A negative value is not necessarily an error. It
would be a common occurrence when seeking, for example, which requires
recapture of the bitstream at the position where decoding continues.<p>
If and only if the return value is positive, <tt>ogg_stream_packetout</tt> placed
a packet in <tt>op</tt>. The data in <tt>op</tt> points to static
storage that is valid until the next call to
<tt>ogg_stream_pagein</tt>, <tt>ogg_stream_clear</tt>,
<tt>ogg_stream_reset</tt>, or <tt>ogg_stream_destroy</tt>. The
pointers are not invalidated by more calls to
<tt>ogg_stream_packetout</tt>.<p>
  251. <h3>
  252. int ogg_stream_pagein(ogg_stream_state *os, ogg_page *og);
  253. </h3>
  254. Used during decoding to buffer the given complete, pre-verified page
  255. for decoding into raw Ogg packets. The given page must be framed,
  256. normally produced by <tt>ogg_sync_pageout</tt>, and from the logical
  257. bitstream associated with <tt>os</tt> (the serial numbers must match).
  258. The contents of the given page are copied; <tt>ogg_stream_pagein</tt>
  259. retains no pointers into <tt>og</tt> storage.<p>
  260. Returns zero on success and non-zero on failure.<p>
  261. <h3>
  262. int ogg_stream_pageout(ogg_stream_state *os, ogg_page *og);
  263. </h3>
  264. Used during encode to read complete pages from the stream buffer. The
  265. returned page is ready for sending out to the real world.<p>
Returns zero if there is no complete page ready for reading. Returns
nonzero when it has placed data for a complete page into
<tt>og</tt>. Note that the storage returned in <tt>og</tt> points into internal
storage; the pointers in <tt>og</tt> are valid until the next call to
<tt>ogg_stream_pageout</tt>, <tt>ogg_stream_packetin</tt>,
<tt>ogg_stream_reset</tt>, <tt>ogg_stream_clear</tt> or
<tt>ogg_stream_destroy</tt>.
  273. <h3>
  274. int ogg_stream_reset(ogg_stream_state *os);
  275. </h3>
  276. Resets the given stream's state to that of a blank, unused stream;
  277. this may be used during encode or decode. <p>
  278. Note that if used during encode, it does not alter the stream's serial
  279. number. In addition, the next page produced during encoding will be
  280. marked as the 'initial' page of the logical bitstream.<p>
  281. When used during decode, this simply clears the data buffer of any
  282. pending pages. Beginning and end of stream cues are read from the
  283. bitstream and are unaffected by reset.<p>
  284. Returns zero on success and non-zero on failure. This function always
  285. succeeds.<p>
  286. <h3>
  287. char *ogg_sync_buffer(ogg_sync_state *oy, long size);
  288. </h3>
  289. This call is used to buffer a raw bitstream for framing and
  290. verification. <tt>ogg_sync_buffer</tt> handles stream capture and
  291. recapture, checksumming, and division into Ogg pages (as required by
  292. <tt>ogg_stream_pagein</tt>).<p>
<tt>ogg_sync_buffer</tt> exposes a buffer area into which the decoder
copies the next (up to) <tt>size</tt> bytes. We expose the buffer
(rather than taking a buffer) in order to avoid an extra copy in many
uses; this way, for example, <tt>read()</tt> can transfer data
directly into the stream buffer without first needing to place it in
temporary storage.<p>
Returns a pointer into <tt>oy</tt>'s internal bitstream sync buffer;
the remaining space in the sync buffer is at least <tt>size</tt>
bytes. The decoder need not write all of <tt>size</tt> bytes;
<tt>ogg_sync_wrote</tt> is used to inform the engine how many bytes
were actually written. Use of <tt>ogg_sync_wrote</tt> after writing
into the exposed buffer is mandatory.<p>
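The expose-then-commit pattern described above can be sketched with a
toy buffer. Everything here is a hypothetical stand-in written for
illustration: <tt>toy_sync</tt>, <tt>toy_sync_buffer</tt> and
<tt>toy_sync_wrote</tt> are not libvorbis names, and the real
<tt>ogg_sync_state</tt> tracks more than this.<p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for ogg_sync_state: a growable byte buffer. */
typedef struct {
    unsigned char *data;
    size_t storage;   /* bytes allocated */
    size_t fill;      /* bytes holding committed data */
} toy_sync;

/* Expose at least `size` writable bytes at the end of the buffer.
 * Returns NULL if memory cannot be obtained. */
unsigned char *toy_sync_buffer(toy_sync *ts, size_t size)
{
    assert(ts != NULL);
    if (ts->storage - ts->fill < size) {
        size_t want = ts->fill + size;
        unsigned char *p = realloc(ts->data, want);
        if (p == NULL)
            return NULL;
        ts->data = p;
        ts->storage = want;
    }
    return ts->data + ts->fill;
}

/* Commit `bytes` of the exposed region. Returns zero on success and
 * nonzero if the caller claims more bytes than were exposed. */
int toy_sync_wrote(toy_sync *ts, size_t bytes)
{
    assert(ts != NULL);
    if (bytes > ts->storage - ts->fill)
        return -1;
    ts->fill += bytes;
    return 0;
}
```

A caller zero-initializes a <tt>toy_sync</tt>, asks for space, lets
<tt>read()</tt> or a similar call fill the returned pointer directly, and
then reports the byte count actually received. Skipping the commit step
would leave the new data invisible to the buffer, which is why the
matching call is mandatory in the real interface.<p>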
  305. <h3>
  306. int ogg_sync_clear(ogg_sync_state *oy);
  307. </h3>
<tt>ogg_sync_clear</tt>
clears and deallocates the internal storage of the given Ogg sync
buffer. After clearing, the sync structure is not initialized for
use; <tt>ogg_sync_init</tt> must be called to reinitialize for use.
Use <tt>ogg_sync_reset</tt> to reset the sync state and buffer to a
fresh, initialized state.<p>
<tt>ogg_sync_clear</tt> does not call <tt>free()</tt> on the pointer
<tt>oy</tt>, allowing use of this call on sync structures in static
or automatic storage. <tt>ogg_sync_destroy</tt> is a complementary
function that frees the pointer as well.<p>
  318. Returns zero on success and non-zero on failure. This function always
  319. succeeds.<p>
  320. <h3>
  321. int ogg_sync_destroy(ogg_sync_state *oy);
  322. </h3>
  323. Clears and deallocates the internal storage of the given Ogg sync
  324. buffer, then frees the storage associated with the pointer
  325. <tt>oy</tt>.<p>
<tt>ogg_sync_clear</tt> does not call <tt>free()</tt> on the pointer
<tt>oy</tt>, allowing use of that call on sync structures in static
or automatic storage.<p>
  329. Returns zero on success and non-zero on failure. This function always
  330. succeeds.<p>
  331. <h3>
  332. int ogg_sync_init(ogg_sync_state *oy);
  333. </h3>
  334. Initializes the sync buffer <tt>oy</tt> for use.<p>
  335. Returns zero on success and non-zero on failure. This function always
  336. succeeds.<p>
  337. <h3>
  338. int ogg_sync_pageout(ogg_sync_state *oy, ogg_page *og);
  339. </h3>
  340. Reads complete, framed, verified Ogg pages from the sync buffer,
  341. placing the page data in <tt>og</tt>.<p>
Returns zero when there are no complete pages buffered for
retrieval. Returns negative when a loss of sync or recapture occurred
(this is not necessarily an error; recapture would be required after
seeking, for example). Returns positive when a page is returned in
<tt>og</tt>. Note that the data in <tt>og</tt> points into the sync
buffer storage; the pointers are valid until the next call to
<tt>ogg_sync_buffer</tt>, <tt>ogg_sync_clear</tt>,
<tt>ogg_sync_destroy</tt> or <tt>ogg_sync_reset</tt>.
  350. <h3>
  351. int ogg_sync_reset(ogg_sync_state *oy);
  352. </h3>
  353. <tt>ogg_sync_reset</tt> resets the sync state in <tt>oy</tt> to a
  354. clean, empty state. This is useful, for example, when seeking to a
  355. new location in a bitstream.<p>
  356. Returns zero on success, nonzero on failure.<p>
  357. <h3>
  358. int ogg_sync_wrote(ogg_sync_state *oy, long bytes);
  359. </h3>
Used to inform the sync state as to how many bytes were actually
written into the exposed sync buffer. It must be equal to or less
than the size of the buffer requested.<p>
Returns zero on success and non-zero on failure; failure occurs only
when the number of bytes written was larger than the buffer.<p>
  365. <hr>
  366. <a href="http://www.xiph.org/">
  367. <img src="white-xifish.png" align=left border=0>
  368. </a>
  369. <font size=-2 color=#505050>
  370. Ogg is a <a href="http://www.xiph.org">Xiphophorus</a> effort to
  371. protect essential tenets of Internet multimedia from corporate
  372. hostage-taking; Open Source is the net's greatest tool to keep
  373. everyone honest. See <a href="http://www.xiph.org/about.html">About
  374. Xiphophorus</a> for details.
  375. <p>
  376. Ogg Vorbis is the first Ogg audio CODEC. Anyone may
  377. freely use and distribute the Ogg and Vorbis specification,
  378. whether in a private, public or corporate capacity. However,
  379. Xiphophorus and the Ogg project (xiph.org) reserve the right to set
  380. the Ogg/Vorbis specification and certify specification compliance.<p>
  381. Xiphophorus's Vorbis software CODEC implementation is distributed
  382. under the Lesser/Library GNU Public License. This does not restrict
  383. third parties from distributing independent implementations of Vorbis
  384. software under other licenses.<p>
  385. OggSquish, Vorbis, Xiphophorus and their logos are trademarks (tm) of
  386. <a href="http://www.xiph.org/">Xiphophorus</a>. These pages are
  387. copyright (C) 1994-2000 Xiphophorus. All rights reserved.<p>
  388. </body>