  1. Network Working Group Phil Kerr
  2. Internet-Draft Ogg Vorbis Community
  3. October 27, 2003 OpenDrama
  4. Expires: April 27, 2003
  5. RTP Payload Format for Vorbis Encoded Audio
  6. <draft-kerr-avt-vorbis-rtp-03.txt>
  7. Status of this Memo
  8. This document is an Internet-Draft and is in full conformance
  9. with all provisions of Section 10 of RFC2026.
  10. Internet-Drafts are working documents of the Internet Engineering
  11. Task Force (IETF), its areas, and its working groups. Note that
  12. other groups may also distribute working documents as
  13. Internet-Drafts.
  14. Internet-Drafts are draft documents valid for a maximum of six
  15. months and may be updated, replaced, or obsoleted by other
  16. documents at any time. It is inappropriate to use Internet-
  17. Drafts as reference material or to cite them other than as
  18. "work in progress".
  19. The list of current Internet-Drafts can be accessed at
  20. http://www.ietf.org/ietf/1id-abstracts.txt
  21. The list of Internet-Draft Shadow Directories can be accessed at
  22. http://www.ietf.org/shadow.html.
  23. Copyright Notice
  24. Copyright (C) The Internet Society (2003). All Rights Reserved.
  25. Abstract
  26. This document describes an RTP payload format for transporting
  27. Vorbis encoded audio. It details the RTP encapsulation mechanism
  28. for raw Vorbis data and details the delivery mechanisms for the
  29. decoder probability model, referred to as a codebook, metadata
  30. and other setup information.
  31. [Note to RFC Editor: All references to RFC XXXX are to be replaced
  32. by references to the RFC number of this memo, when published.]
  33. Kerr Expires April 27, 2003 [Page 1]
  34. Internet Draft draft-kerr-avt-vorbis-rtp-03.txt October 27, 2003
  35. Table of Contents
  36. 1. Introduction ........................................ 2
  37. 1.1 Terminology ......................................... 2
  38. 2. Payload Format ...................................... 3
  39. 2.1 RTP Header .......................................... 3
  40. 2.2 Payload Header ...................................... 4
  41. 2.3 Payload Data ........................................ 5
  42. 2.4 Example RTP Packet .................................. 5
  43. 3. Frame Packetizing ................................... 6
  44. 3.1 Example Fragmented Vorbis Packet .................... 6
  45. 3.2 Packet Loss ......................................... 8
  46. 4. Configuration Headers ............................... 8
  47. 4.1 RTCP Based Config Header Transmission ............... 9
  48. 4.2 Codebook Caching .................................... 11
  49. 5. Session Description ................................. 11
  50. 5.1 SDP Based Config Header Transmission ................ 12
  51. 6. IANA Considerations ................................. 13
  52. 7. Congestion Control .................................. 13
  53. 8. Security Considerations ............................. 14
  54. 9. Acknowledgements .................................... 14
  55. 10. Normative References ................................ 14
  56. 10.1 Informative References ................................ 15
  57. 11. Full Copyright Statement ............................ 15
  58. 11.1 IPR Statement ....................................... 15
  59. 12. Authors Address ..................................... 15
  60. 1 Introduction
  61. Vorbis is a general purpose perceptual audio codec intended to allow
  62. maximum encoder flexibility, thus allowing it to scale competitively
  63. over an exceptionally wide range of bitrates. At the high
  64. quality/bitrate end of the scale (CD or DAT rate stereo,
  65. 16/24 bits), it is in the same league as MPEG-2 and MPC. Similarly,
  66. the 1.0 encoder can encode high-quality CD and DAT rate stereo at
  67. below 48k bits/sec without resampling to a lower rate. Vorbis is
  68. also intended for lower and higher sample rates (from 8kHz
  69. telephony to 192kHz digital masters) and a range of channel
  70. representations (monaural, polyphonic, stereo, quadraphonic, 5.1,
  71. ambisonic, or up to 255 discrete channels).
  72. Vorbis encoded audio is generally encapsulated within an Ogg format
  73. bitstream [1], which provides framing and synchronization. For the
  74. purposes of RTP transport, this layer is unnecessary, and so raw
  75. Vorbis packets are used in the payload.
  76. 1.1 Terminology
  77. The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
  78. "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
  79. document are to be interpreted as described in RFC 2119 [2].
  82. 2 Payload Format
  83. For RTP based transportation of Vorbis encoded audio the standard
  84. RTP header is followed by an 8 bit payload header, then the payload
  85. data. The payload header is used to signify if the following packet
  86. contains fragmented Vorbis data and/or the number of whole Vorbis
  87. data frames. The payload data contains the raw Vorbis bitstream
  88. information.
  89. 2.1 RTP Header
  90.   0                   1                   2                   3
  91.   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  92.  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  93.  |V=2|P|X|  CC   |M|     PT      |       sequence number         |
  94.  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  95.  |                           timestamp                           |
  96.  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  97.  |           synchronization source (SSRC) identifier            |
  98.  +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
  99.  |            contributing source (CSRC) identifiers             |
  100. |                              ...                              |
  101. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  102. The RTP header begins with an octet of fields (V, P, X, and CC) to
  103. support specialized RTP uses (see [4] and [5] for details). For
  104. Vorbis RTP, the following values are used.
  105. Version (V): 2 bits
  106. This field identifies the version of RTP. The version
  107. used by this specification is two (2).
  108. Padding (P): 1 bit
  109. Padding MAY be used with this payload format according to
  110. section 5.1 of [3].
  111. Extension (X): 1 bit
  112. Always set to 0, as this payload format does not use an RTP header
  113. extension.
  114. CSRC count (CC): 4 bits
  115. The CSRC count is used in accordance with [3].
  118. Marker (M): 1 bit
  119. Set to zero, as audio silence suppression is not used. This conforms
  120. to section 4.1 of [5].
  121. Payload Type (PT): 7 bits
  122. An RTP profile for a class of applications is expected to assign
  123. a payload type for this format, or a dynamically allocated
  124. payload type SHOULD be chosen which designates the payload as
  125. Vorbis.
  126. Sequence number: 16 bits
  127. The sequence number increments by one for each RTP data packet
  128. sent, and may be used by the receiver to detect packet loss and
  129. to restore packet sequence. This field is detailed further in
  130. [3].
  131. Timestamp: 32 bits
  132. A timestamp representing the sampling time of the first sample of
  133. the first Vorbis packet in the RTP packet. The clock frequency
  134. MUST be set to the sample rate of the encoded audio data and is
  135. conveyed out-of-band as an SDP attribute.
  136. SSRC/CSRC identifiers:
  137. These two fields, 32 bits each with one SSRC field and a maximum
  138. of 15 CSRC fields, are as defined in [3].
  139. 2.2 Payload Header
  140. After the RTP Header section the next octet is the Payload Header.
  141. This octet is split into a number of bitfields detailing the format
  142. of the following Payload Data packets.
  143.   0   1   2   3   4   5   6   7
  144. +---+---+---+---+---+---+---+---+
  145. | C | F | R |   # of packets    |
  146. +---+---+---+---+---+---+---+---+
  147. Continuation (C): 1 bit
  148. Set to one if this is a continuation of a fragmented packet.
  149. Fragmented (F): 1 bit
  150. Set to one if the payload contains complete packets or if it
  151. contains the last fragment of a fragmented packet.
  152. Reserved (R): 1 bit
  153. Reserved, MUST be set to zero by senders, and ignored by
  154. receivers.
  155. The last 5 bits are the number of complete packets in this payload.
  156. A 5 bit field holds values from 0 to 31, so a payload carries at
  157. most 31 Vorbis packets. If C is set to one, this number MUST be 0.
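As an illustration of the bit layout above, the payload header octet could be built and parsed as in the following sketch. It assumes the usual RFC diagram convention that bit 0 is the most significant bit; the function names are hypothetical and are not defined by this document.

```python
from typing import Tuple


def pack_payload_header(continuation: bool, final: bool, count: int) -> int:
    """Build the one-octet payload header; bit 0 (C) is the most significant bit."""
    if not 0 <= count <= 31:
        raise ValueError("packet count must fit in 5 bits")
    if continuation and count != 0:
        raise ValueError("packet count MUST be 0 when C is set")
    # The reserved bit (0x20) is always left at zero by senders.
    return (int(continuation) << 7) | (int(final) << 6) | count


def unpack_payload_header(octet: int) -> Tuple[bool, bool, int]:
    """Return (C, F, packet count); the reserved bit is ignored."""
    return bool(octet & 0x80), bool(octet & 0x40), octet & 0x1F
```

With this layout, a payload carrying two complete packets (C=0, F=1, count=2) has the header octet 0x42.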
  160. 2.3 Payload Data
  161. Vorbis packets are unbounded in length currently. At some future
  162. point there will likely be a practical limit placed on packet
  163. length.
  164. Typical Vorbis packet sizes are from very small (2-3 bytes) to
  165. quite large (8-12 kilobytes). The reference implementation [11]
  166. typically produces packets less than ~800 bytes, except for the
  167. header packets which are ~4-12 kilobytes.
  168. Within an RTP context the maximum Vorbis packet SHOULD be kept below
  169. the MTU size, typically 1500 octets, including the RTP and payload
  170. headers, to avoid fragmentation. For the delivery of Vorbis audio
  171. using RTP the maximum size of the header block is limited to 64K.
  172. If the payload contains a single Vorbis packet or a Vorbis packet
  173. fragment, the Vorbis packet data follows the payload header.
  174. For payloads which consist of multiple Vorbis packets, payload data
  175. consists of one octet representing the packet length followed by
  176. the packet data for each of the Vorbis packets in the payload.
  177. The Vorbis packet length field is the length of the Vorbis data
  178. block minus one octet.
  179. The payload packing of the Vorbis data packets SHOULD follow the
  180. guidelines set out in section 4.4 of [5] where the oldest packet
  181. occurs immediately after the RTP packet header.
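The multiple-packet layout described above might be packed and unpacked as in this sketch. It assumes that the length octet stores the packet size minus one (so sizes of 1 to 256 octets are representable), that packets are supplied oldest first, and that a bundle holds between 2 and 31 packets; the function names are hypothetical.

```python
from typing import List


def pack_bundle(packets: List[bytes]) -> bytes:
    """Payload header, then (length-minus-one octet, data) per packet, oldest first."""
    if not 2 <= len(packets) <= 31:
        raise ValueError("a bundle carries 2..31 packets")
    out = bytearray([0x40 | len(packets)])  # C=0, F=1, R=0, 5 bit count
    for pkt in packets:
        if not 1 <= len(pkt) <= 256:
            raise ValueError("bundled packets must be 1..256 octets")
        out.append(len(pkt) - 1)            # length field is size minus one
        out += pkt
    return bytes(out)


def unpack_bundle(payload: bytes) -> List[bytes]:
    """Inverse of pack_bundle; raises ValueError on a truncated payload."""
    if not payload:
        raise ValueError("empty payload")
    count = payload[0] & 0x1F
    pos = 1
    packets = []
    for _ in range(count):
        if pos >= len(payload):
            raise ValueError("truncated payload")
        size = payload[pos] + 1
        pos += 1
        if pos + size > len(payload):
            raise ValueError("truncated payload")
        packets.append(payload[pos:pos + size])
        pos += size
    return packets
```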
  182. Channel mapping of the audio is in accordance with ITU-R
  183. BS.775-1.
  184. 2.4 Example RTP Packet
  185. Here is an example RTP packet containing two Vorbis packets.
  186. RTP Packet Header:
  187. 0 1 2 3
  188. 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  189. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  190. | 2 |0|0| 0 |0| PT | sequence number |
  191. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  192. | timestamp (in sample rate units) |
  193. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  194. | synchronisation source (SSRC) identifier |
  195. +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
  196. | contributing source (CSRC) identifiers |
  197. | ... |
  198. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  201. Payload Data:
  202. 0 1 2 3
  203. 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  204. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  205. |0|1|0| # pks: 2| len | vorbis data ... |
  206. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  207. | ...vorbis data... |
  208. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  209. | ... | len | next vorbis packet data... |
  210. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  211. 3 Frame Packetizing
  212. Each RTP packet contains either one complete Vorbis packet, one
  213. Vorbis packet fragment, or an integer number of complete Vorbis
  214. packets (up to a maximum of 31 packets, since the number of packets
  215. is defined by a 5 bit value).
  216. Any Vorbis packet that is larger than 256 octets and less than the
  217. path-MTU MUST be placed in an RTP packet by itself.
  218. Any Vorbis packet that is 256 bytes or less SHOULD be bundled in the
  219. RTP packet with as many Vorbis packets as will fit, up to a maximum
  220. of 31.
  221. If a Vorbis packet will not fit within the network MTU, it SHOULD be
  222. fragmented. A fragmented packet has a zero in the last five bits
  223. of the payload header. Each fragment after the first will also set
  224. the Continuation (C) bit to one in the payload header. The RTP packet
  225. containing the last fragment of the Vorbis packet will have the
  226. Fragmented (F) bit set to one. To maintain the correct sequence
  227. for fragmented packet reception the timestamp field of fragmented
  228. packets MUST be the same as the first packet sent, with the sequence
  229. number incremented as normal for the subsequent RTP packets. Path
  230. MTU is detailed in [9] and [10].
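The fragmentation rules above can be sketched as follows. The function name and the max_data parameter (the number of Vorbis data octets that fit in one RTP packet after the RTP and payload headers) are assumptions made for illustration. Every fragment would be sent with the same RTP timestamp and consecutive sequence numbers.

```python
from typing import List, Tuple


def fragment(packet: bytes, max_data: int) -> List[Tuple[int, bytes]]:
    """Split one oversized Vorbis packet into (payload header octet, data) pairs."""
    if max_data < 1:
        raise ValueError("max_data must be positive")
    chunks = [packet[i:i + max_data] for i in range(0, len(packet), max_data)]
    if len(chunks) < 2:
        raise ValueError("packet fits in one RTP packet; no fragmentation needed")
    result = []
    for index, chunk in enumerate(chunks):
        header = 0x00                   # packet count is 0 in every fragment
        if index > 0:
            header |= 0x80              # C: continuation of a fragmented packet
        if index == len(chunks) - 1:
            header |= 0x40              # F: last fragment of the packet
        result.append((header, chunk))
    return result
```

A 2500 octet packet split with max_data set to 1000 yields three fragments whose header octets are 0x00, 0x80 and 0xC0.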
  231. 3.1 Example Fragmented Vorbis Packet
  232. Here is an example fragmented Vorbis packet split over three RTP
  233. packets.
  236. Packet 1:
  237. 0 1 2 3
  238. 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  239. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  240. |V=2|P|X| CC |M| PT | 1000 |
  241. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  242. | xxxxx |
  243. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  244. | synchronization source (SSRC) identifier |
  245. +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
  246. | contributing source (CSRC) identifiers |
  247. | ... |
  248. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  249. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  250. |0|0|0| 0| len | vorbis data .. |
  251. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  252. | ..vorbis data.. |
  253. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  254. In this packet the initial sequence number is 1000 and the
  255. timestamp is xxxxx. The number of packets field is set to 0.
  256. Packet 2:
  257. 0 1 2 3
  258. 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  259. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  260. |V=2|P|X| CC |M| PT | 1001 |
  261. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  262. | xxxxx |
  263. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  264. | synchronization source (SSRC) identifier |
  265. +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
  266. | contributing source (CSRC) identifiers |
  267. | ... |
  268. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  269. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  270. |1|0|0| 0| len | vorbis data ... |
  271. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  272. | ..vorbis data.. |
  273. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  274. The C bit is set to 1 and the number of packets field is set to 0.
  275. For large Vorbis packets there can be several payload packets of
  276. this type. The maximum packet size SHOULD be no greater
  277. than the path MTU, including all RTP and payload headers. The
  278. sequence number has been incremented by one but the timestamp field
  279. remains the same as the initial packet.
  282. Packet 3:
  283. 0 1 2 3
  284. 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  285. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  286. |V=2|P|X| CC |M| PT | 1002 |
  287. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  288. | xxxxx |
  289. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  290. | synchronization source (SSRC) identifier |
  291. +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
  292. | contributing source (CSRC) identifiers |
  293. | ... |
  294. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  295. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  296. |1|1|0| 0| len | vorbis data .. |
  297. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  298. | ..vorbis data.. |
  299. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  300. This is the last Vorbis fragment packet. The C and F bits are
  301. set and the packet count remains set to 0. As in the previous
  302. packets the timestamp remains set to the first packet in the
  303. sequence and the sequence number has been incremented.
  304. 3.2 Packet Loss
  305. As there is no error correction within the Vorbis stream, packet
  306. loss will result in a loss of signal. Packet loss is more of an
  307. issue for fragmented Vorbis packets, as the client has to track
  308. the C and F flags. Taking the fragmented Vorbis packet example
  309. above, if the first packet is lost the client SHOULD detect that
  310. the next packet has the packet count field set to 0 and the C bit
  311. set, and MUST drop it. The next packet, which is the final
  312. fragment, SHOULD be dropped in the same manner, or buffered.
  313. Feedback reports on lost and dropped packets MUST be sent back
  314. via RTCP.
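One way a receiver might apply the dropping rule above is sketched below. The class name and interface are hypothetical, and the sketch takes the simpler option of dropping (rather than buffering) every fragment of a packet whose first or middle fragment was lost. RTCP reporting is left out.

```python
from typing import List, Optional


class FragmentReassembler:
    """Rebuilds fragmented Vorbis packets; drops runs with a missing fragment."""

    def __init__(self) -> None:
        self._parts: Optional[List[bytes]] = None  # None: not inside a fragment run
        self._next_seq = 0

    def feed(self, seq: int, header: int, data: bytes) -> Optional[bytes]:
        """Feed one RTP payload; return a whole packet once its last fragment arrives."""
        continuation = bool(header & 0x80)
        final = bool(header & 0x40)
        if header & 0x1F:                   # complete packets, not a fragment
            self._parts = None
            return None
        if not continuation:                # first fragment starts a new run
            self._parts = [data]
            self._next_seq = (seq + 1) & 0xFFFF
            return None
        if self._parts is None or seq != self._next_seq:
            self._parts = None              # an earlier fragment was lost: drop
            return None
        self._parts.append(data)
        self._next_seq = (seq + 1) & 0xFFFF
        if final:
            packet = b"".join(self._parts)
            self._parts = None
            return packet
        return None
```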
  315. 4 Configuration Headers
  316. To decode a Vorbis stream three configuration header blocks are
  317. needed. The first header indicates the sample and bitrates, the
  318. number of channels and the version of the Vorbis encoder used.
  319. The second header contains the decoder's probability model, or
  320. codebooks, and the third header details stream metadata.
  323. As the RTP stream may change certain configuration data mid-session
  324. there are two different methods for delivering this configuration
  325. data to a client, RTCP which is detailed below and SDP which is
  326. detailed in section 5. SDP delivery is used to set up an initial
  327. state for the client application and RTCP is used to change state
  328. during the session. The changes may be due to different metadata
  329. or codebooks as well as different bitrates of the stream.
  330. Unlike other mainstream audio codecs, Vorbis has no statically
  331. configured probability model. Instead, it packs all entropy decoding
  332. configuration, VQ and Huffman models into a self-contained codebook.
  333. This codebook block also requires additional identification
  334. information detailing the number of audio channels, bit rates and
  335. other information used to initialise the Vorbis stream.
  336. 4.1 RTCP Based Config Header Transmission
  337. The three header data blocks are sent out-of-band as an APP defined
  338. RTCP message with the 4 octet name field set to VORB.
  339. Synchronizing the configuration headers to the RTP stream is
  340. critical. A 32 bit timestamp field is used to indicate the
  341. timepoint when a VORB header MUST be applied to the RTP stream.
  342. VORB RTCP packets SHOULD be sent just ahead of the change in the
  343. RTP stream. Because loss of a VORB packet means the RTP stream
  344. will fail to decode properly, the frequency of periodic
  345. retransmission SHOULD be high enough to minimize the stream
  346. disturbance whilst remaining under the RTCP bandwidth
  347. allocation.
  348. 0 1 2 3
  349. 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
  350. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  351. |V=2|P| subtype | PT=APP=204 | Length |
  352. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  353. | SSRC/CSRC |
  354. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  355. | VORB |
  356. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  357. | Timestamp (in sample rate units) |
  358. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  359. | Vorbis Version |
  360. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  361. | Audio Sample Rate |
  362. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  363. | Bitrate Maximum |
  364. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  365. | Bitrate Nominal |
  366. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  372. | Bitrate Minimum |
  373. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  374. | bsz 0 | bsz 1 | Num Audio Channels |c|m|o|x|x|x|x|x|
  375. +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
  376. | Codebook length | Codebook checksum |
  377. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  378. .. Codebook |
  379. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  380. .. URI string |
  381. +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
  382. | Vendor string length |
  383. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  384. | Vendor string ..
  385. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  386. | User comments list length |
  387. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  388. .. User comment length / User comment |
  389. +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  390. The first Vorbis config header defines the Vorbis stream
  391. attributes. The Vorbis version MUST be set to zero to comply with
  392. this document. The fields Sample Rate, Bitrate Maximum/Nominal/
  393. Minimum and Num Audio Channels are set in accordance with [6] with
  394. the bsz fields above referring to the blocksize parameters. The
  395. framing bit is not used for RTP transportation and so applications
  396. constructing Vorbis files MUST take care to set this if required.
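The fixed start of a VORB message (RTCP APP header, SSRC, name and timestamp) could be assembled as in this sketch. It relies on the standard RTCP conventions that the APP packet type is 204 and that the length field counts 32 bit words minus one. The function name is hypothetical, and the subtype default of 0 is an assumption, since this document does not assign a subtype value. The remaining fields of the message are passed in as an already encoded body.

```python
import struct

RTCP_APP = 204  # RTCP packet type for application-defined (APP) packets


def pack_vorb_prefix(ssrc: int, timestamp: int, body: bytes, subtype: int = 0) -> bytes:
    """RTCP APP header, SSRC, name 'VORB' and apply-at timestamp, followed by body."""
    if len(body) % 4:
        raise ValueError("body must be padded to a multiple of 4 octets")
    total = 16 + len(body)               # header + SSRC + name + timestamp + body
    first = (2 << 6) | (subtype & 0x1F)  # V=2, P=0, 5 bit subtype
    length_words = total // 4 - 1        # RTCP length: 32 bit words minus one
    return struct.pack("!BBHI4sI", first, RTCP_APP, length_words,
                       ssrc, b"VORB", timestamp & 0xFFFFFFFF) + body
```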
  397. The next 8 bits are used to indicate the presence of the two
  398. other Vorbis stream config headers and the size overflow header.
  399. The c flag indicates the presence of a codebook header block, the
  400. m flag indicates the presence of a comment metadata block. The o
  401. flag indicates if the size of either of the c and m headers would
  402. make the VORB packet greater than that allowed for a RTCP message.
  403. The remaining five bits, indicated with an x, are reserved/unused
  404. and MUST be set to 0 for this version of the document.
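The flag octet described above might be decoded as in the following sketch, assuming bit 0 (the c flag) is the most significant bit as in the diagram. The names are hypothetical, and rejecting non-zero reserved bits is one possible receiver policy rather than a requirement stated here.

```python
from typing import NamedTuple


class ConfigFlags(NamedTuple):
    codebook: bool   # c: a codebook header block follows
    metadata: bool   # m: a comment metadata block follows
    overflow: bool   # o: headers too large for RTCP; an overflow URI is given


def parse_config_flags(octet: int) -> ConfigFlags:
    """Decode the c, m and o flags from the 8 bit flag field."""
    if octet & 0x1F:
        raise ValueError("reserved bits are expected to be 0")
    return ConfigFlags(bool(octet & 0x80), bool(octet & 0x40), bool(octet & 0x20))
```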
  405. If the c flag is set then the next header block will contain the
  406. codebook configuration data.
  407. The configuration information detailed above MUST be completely
  408. intact, as a client cannot decode a stream with an incomplete
  409. or corrupted codebook set.
  410. A 16 bit codebook length field and a 16 bit one's complement checksum
  411. of the codebook precede the codebook datablock. The length field
  412. allows for codebooks to be up to 64K in size. The checksum is used
  413. to detect a corrupted codebook.
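This document does not spell out the checksum procedure. The sketch below assumes the conventional Internet checksum: the one's complement of the one's complement sum of big-endian 16 bit words, with odd-length input padded by a zero octet. The function name is hypothetical.

```python
def ones_complement_checksum(data: bytes) -> int:
    """16 bit one's complement of the one's complement sum of 16 bit words."""
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]      # big-endian 16 bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)   # fold carries back in
    return ~total & 0xFFFF
```

A receiver would recompute this value over the received codebook and compare it with the checksum field.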
  416. If a checksum failure is detected then a new config header file
  417. SHOULD be obtained from SDP, if the codebook has not changed since
  418. the session has started. If no SDP value is set and no other method
  419. for obtaining the config headers exists then this is considered to
  420. be a failure and SHOULD be reported to the client application.
  421. If the m flag is set then the next header block will contain the
  422. comment metadata, such as artist name, track title and so on. These
  423. metadata messages are not intended to be fully descriptive but to
  424. offer basic track/song information. This message MUST be sent at
  425. the start of the stream, together with the setup and codebook
  426. headers, even if it contains no information. During a session the
  427. metadata associated with the stream may change from that specified
  428. at the start, e.g. a live concert broadcast changing acts/scenes, so
  429. clients MUST have the ability to receive m header blocks. Details
  430. on the format of the comments can be found in the Vorbis
  431. documentation [7].
  432. The format for the data takes the form of a 32 bit length field for
  433. the codec vendor's name, followed by the name encoded in UTF-8. The next
  434. field denotes the number of user comments and then the user comments
  435. length and text field pairs, up to the number indicated by the user
  436. comment list length.
  437. If the o, overflow, bit is set then the URI of a whole header block
  438. is specified in an overflow URI field, which is a null terminated
  439. UTF-8 string. The header file specified at the URI MUST NOT have
  440. the overflow flag set, otherwise a loop condition will occur.
  441. 4.2 Codebook Caching
  442. Codebook caching allows clients that have previously connected to a
  443. stream to re-use the codebooks and thus begin the playback of the
  444. session faster. When a client receives a codebook it may store
  445. it, together with the MD5 key, locally and can compare the MD5 key
  446. of locally cached codebooks with the key it receives via SDP, which
  447. is detailed in section 5.
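A client-side cache along these lines could look like the sketch below. It assumes the MD5 key is exchanged as a hexadecimal digest of the codebook octets, which this document does not specify; the class name is hypothetical and the store is an in-memory dictionary rather than persistent storage.

```python
import hashlib
from typing import Dict, Optional


class CodebookCache:
    """Maps the MD5 hex digest of a codebook to the codebook octets."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def add(self, codebook: bytes) -> str:
        """Store a received codebook and return its MD5 key."""
        key = hashlib.md5(codebook).hexdigest()
        self._store[key] = codebook
        return key

    def lookup(self, md5key: str) -> Optional[bytes]:
        """Return the cached codebook for a key from SDP, or None on a miss."""
        return self._store.get(md5key.lower())
```

On a miss the client would fall back to fetching the configuration headers named in the session description.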
  448. 5 Session Description for Vorbis RTP Streams
  449. Session description information concerning the Vorbis stream
  450. SHOULD be provided if possible and MUST be in accordance with [8].
  451. The SDP information is split into two sections, a mandatory
  452. section detailing the RTP stream and an optional section used to
  453. convey information needed for codebook caching.
  456. Below is an outline of the mandatory SDP attributes.
  457. c=IN IP4/6 <Vorbis stream>
  458. m=audio <port> RTP/AVP 98
  459. a=rtpmap:98 vorbis/<sample rate>
  460. a=fmtp:98 header=<URI of Vorbis codebooks>
  461. a=fmtp:98 md5key=<MD5 key of codebook>
  462. The port value is specified by the server application bound to
  463. the address specified in the c attribute. The sample rate value
  464. specified in the rtpmap attribute MUST match the Vorbis sample rate
  465. value.
  466. The Vorbis codebook specified in the header attribute MUST contain
  467. all of the configuration data. If the codebook MD5 attribute,
  468. md5key, is set, the key is compared against a locally held cache.
  469. If a match is found the associated local codebook is used;
  470. otherwise the client MUST use the configuration headers specified
  471. with the header attribute.
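The mandatory attributes outlined above might be generated as in this sketch. The function name, the example values in the usage note, and the rule of choosing IP6 when the address contains a colon are assumptions for illustration only.

```python
from typing import List


def vorbis_sdp_lines(address: str, port: int, sample_rate: int,
                     header_uri: str, md5key: str,
                     payload_type: int = 98) -> List[str]:
    """Return the mandatory SDP lines for one Vorbis RTP stream."""
    ip_version = "IP6" if ":" in address else "IP4"
    return [
        f"c=IN {ip_version} {address}",
        f"m=audio {port} RTP/AVP {payload_type}",
        # The rtpmap clock rate is the Vorbis sample rate.
        f"a=rtpmap:{payload_type} vorbis/{sample_rate}",
        f"a=fmtp:{payload_type} header={header_uri}",
        f"a=fmtp:{payload_type} md5key={md5key}",
    ]
```

For example, vorbis_sdp_lines("192.0.2.10", 5004, 44100, "http://example.com/headers", key) produces an rtpmap line of "a=rtpmap:98 vorbis/44100".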
  472. 5.1 SDP Based Config Header Transmission
  473. The optional SDP attributes are used to convey details of the
  474. Vorbis stream which are required for codebook caching. If the
  475. following attributes are set they take precedence over values
  476. specified in the header attribute detailed above. The maximum size
  477. of the mandatory and optional SDP attributes MUST be less than
  478. 1K in size to conform to section 4.1 of [8].
  479. a=fmtp:98 bitrate_min=<Bitrate Minimum>
  480. a=fmtp:98 bitrate_norm=<Bitrate Nominal>
  481. a=fmtp:98 bitrate_max=<Bitrate Maximum>
  482. a=fmtp:98 bsz0=<Block Size 0>
  483. a=fmtp:98 bsz1=<Block Size 1>
  484. a=fmtp:98 channels=<Num Audio Channels>
  485. a=fmtp:98 meta_vendor=<Vendor Name>
  486. The metadata attribute, meta_vendor, provides the bare minimum
  487. information required for decoding but does not convey any
  488. meaningful stream metadata information. As outlined in the Vorbis
  489. comment field and header specification documentation, [7], a number
  490. of predefined field names are available which SHOULD be used. An
  491. example would be:
  494. a=fmtp:98 meta_vendor=Xiph.Org libVorbis I 20020717
  495. a=fmtp:98 meta_artist=Honest Bob and the Factory-to-Dealer-Incentives
  496. a=fmtp:98 meta_title=I'm Still Around
  497. a=fmtp:98 meta_tracknumber=5
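A receiver could collect these one-parameter-per-line fmtp attributes as in the sketch below. It assumes each line carries exactly one name=value pair and that everything after the first "=" is the value, so spaces and apostrophes are kept. The function name is hypothetical.

```python
from typing import Dict, Iterable


def parse_vorbis_fmtp(sdp_lines: Iterable[str], payload_type: int = 98) -> Dict[str, str]:
    """Collect name=value pairs from the a=fmtp lines of one payload type."""
    prefix = f"a=fmtp:{payload_type} "
    params: Dict[str, str] = {}
    for line in sdp_lines:
        if not line.startswith(prefix):
            continue                     # other attributes are not parsed here
        name, sep, value = line[len(prefix):].partition("=")
        if sep:                          # skip lines without a name=value pair
            params[name] = value.strip()
    return params
```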
  498. 6 IANA Considerations
  499. MIME media type name: audio
  500. MIME subtype: vorbis
  501. Required Parameters:
  502. header indicates the URI of the decoding codebook.
  503. md5key indicates the MD5 key of the codebooks.
  504. Optional Parameters:
  505. bitrate_min, bitrate_norm and bitrate_max indicate the
  506. minimum, nominal and maximum bitrates. bsz0 and bsz1
  507. indicate the blocksize values. channels indicates the
  508. number of audio channels in the stream. meta_vendor
  509. indicates the encoding codec vendor.
  510. Encoding considerations:
  511. This type is only defined for transfer via RTP as specified
  512. in RFC XXXX.
  513. Security Considerations:
  514. See Section 6 of RFC 3047.
  515. Interoperability considerations: none
  516. Published specification:
  517. See the Vorbis documentation [6] for details.
  518. Applications which use this media type:
  519. Audio streaming and conferencing tools
  520. Additional information: none
  521. Person & email address to contact for further information:
  522. Phil Kerr
  523. philkerr@elec.gla.ac.uk/phil@plus24.com
  524. Intended usage: COMMON
  525. Author/Change controller:
  526. Author: Phil Kerr
  527. Change controller: IETF AVT Working Group
  528. 7 Congestion Control
  529. Vorbis clients SHOULD send regular receiver reports detailing
  530. congestion. A mechanism for dynamically downgrading the stream,
  531. known as bitrate peeling, will allow for a graceful backing off
  532. of the stream bitrate. This feature is not available at present
  533. so an alternative would be to redirect the client to a lower
  534. bitrate stream if one is available.
  537. 8 Security Considerations
  538. RTP packets using this payload format are subject to the security
  539. considerations discussed in the RTP specification [3]. This implies
  540. that the confidentiality of the media stream is achieved by using
  541. encryption. Because the data compression used with this payload
  542. format is applied end-to-end, encryption may be performed on the
  543. compressed data. Where the size of a data block is set, care MUST
  544. be taken to prevent buffer overflows in the client applications.
  545. 9 Acknowledgments
  546. This document is a continuation of draft-moffitt-vorbis-rtp-00.txt.
  547. The MIME type section is a continuation of
  548. draft-short-avt-rtp-vorbis-mime-00.txt.
  549. Thanks to the AVT, Ogg Vorbis Communities / Xiph.org including
  550. Steve Casner, Ramon Garcia, Pascal Hennequin, Ralph Jiles,
  551. Tor-Einar Jarnbjo, Colin Law, John Lazzaro, Jack Moffitt,
  552. Colin Perkins, Barry Short, Mike Smith, Magnus Westerlund.
  553. 10 Normative References
  554. 1. The Ogg Encapsulation Format Version 0 (RFC 3533), S. Pfeiffer.
  555. 2. Key words for use in RFCs to Indicate Requirement Levels
  556. (RFC 2119), S. Bradner.
  557. 3. RTP: A Transport Protocol for Real-Time Applications (RFC 1889),
  558. Schulzrinne, et al.
  559. 4. RTP: A transport protocol for real-time applications. Work
  560. in progress, draft-ietf-avt-rtp-new-11.txt.
  561. 5. RTP Profile for Audio and Video Conferences with Minimal Control.
  562. Work in progress, draft-ietf-avt-profile-new-12.txt.
  563. 6. Ogg Vorbis I spec: Codec setup and packet decode.
  564. http://www.xiph.org/ogg/vorbis/doc/vorbis-spec-ref.html
  565. 7. Ogg Vorbis I spec: Comment field and header specification.
  566. http://www.xiph.org/ogg/vorbis/doc/v-comment.html
  567. 8. SDP: Session Description Protocol (RFC 2327), Handley, M. and
  568. V. Jacobson.
  569. 9. Path MTU Discovery (RFC 1191), Mogul & Deering
  570. 10. Path MTU Discovery for IP version 6 (RFC 1981), McCann, J. et al.
  573. 10.1 Informative References
  574. 11. libvorbis: Available from the Xiph website, http://www.xiph.org
  575. 11 Full Copyright Statement
  576. Copyright (C) The Internet Society (2003). All Rights Reserved.
  577. This document and translations of it may be copied and furnished to
  578. others, and derivative works that comment on or otherwise explain it
  579. or assist in its implementation may be prepared, copied, published
  580. and distributed, in whole or in part, without restriction of any
  581. kind, provided that the above copyright notice and this paragraph are
  582. included on all such copies and derivative works. However, this
  583. document itself may not be modified in any way, such as by removing
  584. the copyright notice or references to the Internet Society or other
  585. Internet organizations, except as needed for the purpose of
  586. developing Internet standards in which case the procedures for
  587. copyrights defined in the Internet Standards process must be
  588. followed, or as required to translate it into languages other than
  589. English.
  590. The limited permissions granted above are perpetual and will not be
  591. revoked by the Internet Society or its successors or assigns.
  592. This document and the information contained herein is provided on an
  593. "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
  594. TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
  595. BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
  596. HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
  597. MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
  598. 11.1 IPR Statement
  599. "The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat."
  600. Further IPR details on the Vorbis bitstream may be found on the
  601. Xiph website: http://www.xiph.org
  602. 12 Authors Address
  603. Phil Kerr
  604. Centre for Music Technology
  605. University of Glasgow
  606. Glasgow, Scotland
  607. UK, G12 8LT
  608. Phone: +44 141 330 5740
  609. Email: philkerr@elec.gla.ac.uk
  610. phil@plus24.com
  611. WWW: http://www.xiph.org/