123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743744745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806807808809810811812813814815816817818819820821822823824825826827828829830831832833834835836837838839840841842843844845846847848849850851852853854855856857858859860861862863864865866867868869870871872873874875876877878879880881882883884885886887888889890891892893894895896897898899900901902903904905906907908909910911912913914915916917918919920921922923924925926927928929 |
- <?xml version='1.0'?>
- <!DOCTYPE rfc SYSTEM 'rfc2629.dtd'>
- <?rfc toc="yes" ?>
- <rfc ipr="full3667" docName="RTP Payload Format for Vorbis Encoded Audio">
- <front>
- <title>draft-kerr-avt-vorbis-rtp-04</title>
- <author initials="P" surname="Kerr" fullname="Phil Kerr">
- <organization>Xiph.Org</organization>
- <address>
- <email>phil@plus24.com</email>
- <uri>http://www.xiph.org/</uri>
- </address>
- </author>
- <date day="31" month="December" year="2004" />
- <area>General</area>
- <workgroup>AVT Working Group</workgroup>
- <keyword>I-D</keyword>
- <keyword>Internet-Draft</keyword>
- <keyword>Vorbis</keyword>
- <keyword>RTP</keyword>
- <abstract>
- <t>This document describes a RTP payload format for transporting
- Vorbis encoded audio. It details the RTP encapsulation mechanism
- for raw Vorbis data and details the delivery mechanisms for the
- decoder probability model, referred to as a codebook, metadata
- and other setup information.</t>
- </abstract>
- <note title="Editors Note">
- <t>
- All references to RFC XXXX are to be replaced by references to the RFC number of this memo, when published.
- </t>
- </note>
- </front>
- <middle>
- <section anchor="Introduction" title="Introduction">
- <t>
- Vorbis is a general purpose perceptual audio codec intended to allow
- maximum encoder flexibility, thus allowing it to scale competitively
- over an exceptionally wide range of bitrates. At the high
- quality/bitrate end of the scale (CD or DAT rate stereo,
- 16/24 bits), it is in the same league as MPEG-2 and MPC. Similarly,
- the 1.0 encoder can encode high-quality CD and DAT rate stereo at
- below 48k bits/sec without resampling to a lower rate. Vorbis is
- also intended for lower and higher sample rates (from 8kHz
- telephony to 192kHz digital masters) and a range of channel
- representations (monaural, polyphonic, stereo, quadraphonic, 5.1,
- ambisonic, or up to 255 discrete channels).
- Vorbis encoded audio is generally encapsulated within an Ogg format
- bitstream <xref target="rfc3533"></xref>, which provides framing and synchronization. For the
- purposes of RTP transport, this layer is unnecessary, and so raw
- Vorbis packets are used in the payload.
- </t>
- <section anchor="Terminology" title="Terminology">
- <t>
- The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
- "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
- document are to be interpreted as described in RFC 2119 <xref target="rfc2119"></xref>.
- </t>
- </section>
- </section>
- <section anchor="Payload Format" title="Payload Format">
- <t>
- For RTP based transportation of Vorbis encoded audio the standard
- RTP header is followed by a 5 octet payload header, then the payload
- data. The payload headers are used to associate the Vorbis data with
- its associated decoding codebooks as well as indicating if the following packet
- contains fragmented Vorbis data and/or the the number of whole Vorbis
- data frames. The payload data contains the raw Vorbis bitstream
- information.
- </t>
- <section anchor="RTP Header" title="RTP Header">
- <artwork><![CDATA[
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |V=2|P|X| CC |M| PT | sequence number |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | timestamp |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | synchronization source (SSRC) identifier |
- +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
- | contributing source (CSRC) identifiers |
- | ... |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- ]]></artwork>
- <t>
- The RTP header begins with an octet of fields (V, P, X, and CC) to
- support specialized RTP uses (see <xref target="rfc3550"></xref> and <xref target="rfc3551"></xref> for details). For Vorbis RTP, the following values are used.
- </t>
- <t>
- Version (V): 2 bits</t><t>
- This field identifies the version of RTP. The version
- used by this specification is two (2).
- </t>
- <t>
- Padding (P): 1 bit</t><t>
- Padding MAY be used with this payload format according to
- section 5.1 of <xref target="rfc3550"></xref>.
- </t>
- <t>
- Extension (X): 1 bit</t><t>
- Always set to 0, as audio silence suppression is not used by
- the Vorbis codec.
- </t>
- <t>
- CSRC count (CC): 4 bits</t><t>
- The CSRC count is used in accordance with <xref target="rfc3550"></xref>.
- </t>
- <t>
- Marker (M): 1 bit</t><t>
- Set to zero. Audio silence suppression not used. This conforms
- to section 4.1 of <xref target="vorbis-spec-ref"></xref>.
- </t>
- <t>
- Payload Type (PT): 7 bits</t><t>
- An RTP profile for a class of applications is expected to assign
- a payload type for this format, or a dynamically allocated
- payload type SHOULD be chosen which designates the payload as
- Vorbis.
- </t>
- <t>
- Sequence number: 16 bits</t><t>
- The sequence number increments by one for each RTP data packet
- sent, and may be used by the receiver to detect packet loss and
- to restore packet sequence. This field is detailed further in
- <xref target="rfc3550"></xref>.
- </t>
- <t>
- Timestamp: 32 bits</t><t>
- A timestamp representing the sampling time of the first sample of
- the first Vorbis packet in the RTP packet. The clock frequency
- MUST be set to the sample rate of the encoded audio data and is
- conveyed out-of-band as a SDP attribute.
- </t>
- <t>
- SSRC/CSRC identifiers: </t><t>
- These two fields, 32 bits each with one SSRC field and a maximum
- of 16 CSRC fields, are as defined in <xref target="rfc3550"></xref>.
- </t>
- </section>
- <section anchor="Payload Header" title="Payload Header">
- <t>
- After the RTP Header section the following five octets are the Payload Header.
- This header is split into a number of bitfields detailing the format
- of the following Payload Data packets.
- </t>
- <artwork><![CDATA[
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Codebook Ident |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |C|F| T |# pkts.|
- +-+-+-+-+-+-+-+-+
- ]]></artwork>
- <t>
- Codebook Ident: 32 bits</t><t>
- This 32 bit field is used to associate the Vorbis data to a decoding Codebook.
- It is created by making a CRC32 checksum of the codebook required to decode the
- particular Vorbis audio stream.
- </t>
- <t>
- Continuation (C): 1 bit</t><t>
- Set to one if this is a continuation of a fragmented packet.
- </t>
- <t>
- Fragmented (F): 1 bit</t><t>
- Set to one if the payload contains complete packets or if it
- contains the last fragment of a fragmented packet.
- </t>
- <t>
- Payload Type (T): 2 bits</t><t>
- This field sets the packet payload type. There are currently four type of packet payloads.
- </t>
- <vspace blankLines="1" />
- <list style="empty">
- <t> 0 = Raw Vorbis payload</t>
- <t> 1 = Configuration payload</t>
- <t> 2 = Codebook payload</t>
- <t> 3 = Metadata payload</t>
- </list>
- <t>
- The last 4 bits are the number of complete packets in this payload.
- This provides for a maximum number of 15 Vorbis packets in the
- payload. If the packet contains fragmented data the number of packets MUST be set to 0.
- </t>
- </section>
- <section anchor="Payload Data" title="Payload Data">
- <t>
- Raw Vorbis packets are unbounded in length currently, although at some future
- point there will likely be a practical limit placed on them.
- Typical Vorbis packet sizes are from very small (2-3 bytes) to
- quite large (8-12 kilobytes). The reference implementation <xref target="libvorbis"></xref>
- typically produces packets less than ~800 bytes, except for the
- codebook header packets which are ~4-12 kilobytes.
- Within an RTP context the maximum Vorbis packet size, including the RTP and payload
- headers, SHOULD be kept below the path MTU to avoid packet fragmentation.
- </t>
- <t>
- Each Vorbis payload packet starts with a one octet length header,
- which is used to represent the size of the following data payload, followed
- by the raw Vorbis data.
- </t>
- <t>
- For payloads which consist of multiple Vorbis packets the payload data
- consists of the packet length followed by the packet data for each of
- the Vorbis packets in the payload.
- </t>
- <t>
- The Vorbis packet length header is the length of the Vorbis data
- block only and does not count the length octet.
- </t>
- <t>
- The payload packing of the Vorbis data packets SHOULD follow the
- guidelines set-out in <xref target="rfc3551"></xref> where the oldest packet
- occurs immediately after the RTP packet header.
- </t>
- <t>
- Channel mapping of the audio is in accordance with BS. 775-1
- ITU-R.
- </t>
- </section>
- <section anchor="Example RTP Packet" title="Example RTP Packet">
- <t>
- Here is an example RTP packet containing two Vorbis packets.
- </t>
- <t>
- RTP Packet Header:
- </t>
- <artwork><![CDATA[
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | 2 |0|0| 0 |0| PT | sequence number |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | timestamp (in sample rate units) |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | synchronisation source (SSRC) identifier |
- +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
- | contributing source (CSRC) identifiers |
- | ... |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- ]]></artwork>
- <t>
- Payload Data:
- </t>
- <artwork><![CDATA[
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Codebook Ident |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |0|1| 0 | 2 pks | len | vorbis data ... |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- .. ...vorbis data... ..
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- .. data | len | next vorbis packet data... |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- ]]></artwork>
- </section>
- </section>
- <section anchor="Frame Packetizing" title="Frame Packetizing">
- <t>
- Each RTP packet contains either one complete Vorbis packet, one
- Vorbis packet fragment, or an integer number of complete Vorbis
- packets (up to a max of 15 packets, since the number of packets
- is defined by a 4 bit value).
- </t>
- <t>
- Any Vorbis data packet that is 256 octets or less SHOULD be bundled in the
- RTP packet with as many Vorbis packets as will fit, up to a maximum
- of 15.
- </t>
- <t>
- If a Vorbis packet is larger than 256 octets it MUST be
- fragmented. A fragmented packet has a zero in the last four bits
- of the payload header. Each fragment after the first will also set
- the Continued (C) bit to one in the payload header. The RTP packet
- containing the last fragment of the Vorbis packet will have the
- Fragmented (F) bit set to one. To maintain the correct sequence
- for fragmented packet reception the timestamp field of fragmented
- packets MUST be the same as the first packet sent, with the sequence
- number incremented as normal for the subsequent RTP packets. Path
- MTU is detailed in <xref target="rfc1063"></xref> and <xref target="rfc1981"></xref>.
- </t>
- <section anchor="Example Fragmented Vorbis Packet" title="Example Fragmented Vorbis Packet">
- <t>
- Here is an example fragmented Vorbis packet split over three RTP
- packets.
- </t>
- <artwork><![CDATA[
- Packet 1:
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |V=2|P|X| CC |M| PT | 1000 |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | xxxxx |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | synchronization source (SSRC) identifier |
- +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
- | contributing source (CSRC) identifiers |
- | ... |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Codebook Ident |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |0|0| 0 | 0| len | vorbis data .. |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | ..vorbis data.. |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- ]]></artwork>
- <t>
- In this packet the initial sequence number is 1000 and the
- timestamp is xxxxx. The number of packets field is set to 0.
- </t>
- <artwork><![CDATA[
- Packet 2:
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |V=2|P|X| CC |M| PT | 1001 |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | xxxxx |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | synchronization source (SSRC) identifier |
- +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
- | contributing source (CSRC) identifiers |
- | ... |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Codebook Ident |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |1|0| 0 | 0| len | vorbis data ... |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | ..vorbis data.. |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- ]]></artwork>
- <t>
- The C bit is set to 1 and the number of packets field is set to 0.
- For large Vorbis fragments there can be several of these type of
- payload packets. The maximum packet size SHOULD be no greater
- than the path MTU, including all RTP and payload headers. The
- sequence number has been incremented by one but the timestamp field
- remains the same as the initial packet.
- </t>
- <artwork><![CDATA[
- Packet 3:
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |V=2|P|X| CC |M| PT | 1002 |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | xxxxx |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | synchronization source (SSRC) identifier |
- +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
- | contributing source (CSRC) identifiers |
- | ... |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Codebook Ident |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |1|1| 0 | 0| len | vorbis data .. |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | ..vorbis data.. |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- ]]></artwork>
- <t>
- This is the last Vorbis fragment packet. The C and F bits are
- set and the packet count remains set to 0. As in the previous
- packets the timestamp remains set to the first packet in the
- sequence and the sequence number has been incremented.
- </t>
- </section>
- <section anchor="Packet Loss" title="Packet Loss">
- <t>
- As there is no error correction within the Vorbis stream, packet
- loss will result in a loss of signal. Packet loss is more of an
- issue for fragmented Vorbis packets as the client will have to
- cope with the handling of the C and F flags. If we use the
- fragmented Vorbis packet example above and the first packet is
- lost the client SHOULD detect that the next packet has the packet
- count field set to 0 and the C bit is set and MUST drop it. The
- next packet, which is the final fragmented packet, SHOULD be dropped
- in the same manner, or buffered. Feedback reports on lost and
- dropped packets MUST be sent back via RTCP.
- </t>
- </section>
- </section>
- <section anchor="Configuration Headers" title="Configuration Headers">
- <t>
- Unlike other mainstream audio codecs Vorbis has no statically
- configured probability model, instead it packs all entropy decoding
- configuration, VQ and Huffman models into a self-contained codebook.
- This codebook block also requires additional identification
- information detailing the number of audio channels, bitrates and
- other information used to initialise the Vorbis stream.
- </t>
- <t>
- To decode a Vorbis stream three configuration header blocks are
- needed. The first header indicates the sample and bitrates, the
- number of channels and the version of the Vorbis encoder used.
- The second header contains the decoders probability model, or
- codebook and the third header details stream metadata.
- </t>
- <t>
- As the RTP stream may change certain configuration data mid-session
- there are two different methods for delivering this configuration
- data to a client, in-band and SDP which is
- detailed below. SDP delivery is used to set-up an initial
- state for the client application and in-band is used to change state
- during the session. The changes may be due to different metadata
- or codebooks as well as different bitrates of the stream.
- </t>
- <t>
- Out of the two delivery vectors the use of an SDP attribute to indicate an URI
- where the configuration and codebook data can be obtained is preferred
- as they can be fetched reliably using TCP. The in-band codebook delivery SHOULD
- only be used in situations where the link between the client is unidirectional or if
- the SDP-based information is not available.
- </t>
- <t>
- Synchronizing the configuration and codebook headers to the RTP stream is
- critical. The 32 bit Codebook Ident field is used to indicate when a change in the stream has
- taken place. The client application MUST have in advance the correct configuration and codebook
- headers and if the client detects a change in the Ident value and does not have this information
- it MUST NOT decode the raw Vorbis data.
- </t>
- <section anchor="In-band Header Transmission" title="In-band Header Transmission">
- <t>
- The three header data blocks are sent in-band with the packet type bits set to
- match the payload type. Normally the codebook and configuration
- headers are sent once per session if the stream is an encoding of live audio, as typically
- the encoder state will not change, but the encoder state can change at the boundary
- of chained Vorbis audio files. Metadata can be sent at the start as well as any time during
- the life of the session. Clients MUST be capable of dealing with periodic re-transmission of the
- configuration headers.
- </t>
- <t>
- A Vorbis configuration header is indicated with the payload type field set to 1.
- The Vorbis version MUST be set to zero to comply with
- this document. The fields Sample Rate, Bitrate Maximum/Nominal/
- Minimum and Num Audio Channels are set in accordance with <xref target="vorbis-spec-ref"></xref> with
- the bsz fields above referring to the blocksize parameters. The
- framing bit is not used for RTP transportation and so applications
- constructing Vorbis files MUST take care to set this if required.
- </t>
- <artwork><![CDATA[
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |V=2|P|X| CC |M| PT | xxxx |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | xxxxx |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | synchronization source (SSRC) identifier |
- +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
- | contributing source (CSRC) identifiers |
- | ... |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Codebook Ident |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |0|1| 2 | 1| bsz 0 | bsz 1 | Num Audio Channels |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Vorbis Version |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Audio Sample Rate |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Bitrate Maximum |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Bitrate Nominal |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Bitrate Minimum |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- ]]></artwork>
- <t>
- If the payload type field is set to 2, this indicates the packet contains codebook data.
- </t>
- <t>
- The configuration information detailed below MUST be completely
- intact, as a client can not decode a stream with an incomplete
- or corrupted codebook set.
- </t>
- <t>
- A 16 bit codebook length field precedes the codebook datablock. The length field
- allows for codebooks to be up to 64K in size. Packet fragmentation,
- as per the Vorbis data, MUST be performed if the codebooks size exceeds
- path MTU. The Codebook Ident field MUST be set to match the associated codebook
- needed to decode the Vorbis stream.
- </t>
- <t>
- The Codebook Ident is the CRC32 checksum of the codebook and
- is used to detect a corrupted codebook as well as
- associating it with its Vorbis data stream. This Ident value
- MUST NOT be set to the value of the current stream if this header is
- being sent before the boundary of the chained file has been reached.
- If a checksum failure is detected then this is considered to
- be a failure and MUST be reported to the client application.
- </t>
- <artwork><![CDATA[
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |V=2|P|X| CC |M| PT | xxxx |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | xxxxx |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | synchronization source (SSRC) identifier |
- +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
- | contributing source (CSRC) identifiers |
- | ... |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Codebook Ident |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |0|1| 2 | 1| Codebook Length |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | length | Codebook ..
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- .. Codebook |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- ]]></artwork>
- <t>
- With the payload type flag set to 3, this indicates that the packet contain the
- comment metadata, such as artist name, track title and so on. These
- metadata messages are not intended to be fully descriptive but to
- offer basic track/song information. This message MUST be sent at
- the start of the stream, together with the setup and codebook
- headers, even if it contains no information. During a session the
- metadata associated with the stream may change from that specified
- at the start, e.g. a live concert broadcast changing acts/scenes, so
- clients MUST have the ability to receive header blocks. Details
- on the format of the comments can be found in the Vorbis
- documentation <xref target="v-comment"></xref>.
- </t>
- 1) [vendor_length] = read an unsigned integer of 32 bits
- 2) [vendor_string] = read a UTF-8 vector as [vendor_length] octets
- 3) [user_comment_list_length] = read an unsigned integer of 32 bits
- 4) iterate [user_comment_list_length] times {
- 5) [length] = read an unsigned integer of 32 bits
- 6) this iteration's user comment = read a UTF-8 vector as [length] octets
- }
- 7) [framing_bit] = read a single bit as boolean
- 8) if ( [framing_bit] unset or end of packet ) then ERROR
- 9) done.
- <t>
- The format for the data takes the form of a 32 bit codec vendors
- name length field followed by the name encoded in UTF-8. The next 32
- bit field denotes the number of user comments. Each of the user comments
- is prefixed by a 32 bit length field followed by the comment text.
- </t>
- <artwork><![CDATA[
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |V=2|P|X| CC |M| PT | xxxx |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | xxxxx |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | synchronization source (SSRC) identifier |
- +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
- | contributing source (CSRC) identifiers |
- | ... |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Codebook Ident |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |0|1| 3 | 1| Vendor string length |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | length | Vendor string ..
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | User comments list length |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- .. User comment length / User comment |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- ]]></artwork>
- </section>
- <section anchor="Session Description for Vorbis RTP Streams" title="Session Description for Vorbis RTP Streams">
- <t>
- Session description information concerning the Vorbis stream
- SHOULD be provided if possible and MUST be in accordance with <xref target="rfc2327"></xref>.
- </t>
- <t>
- If the stream comprises chained Vorbis files the configuration and codebook headers for each
- file SHOULD be packaged together and passed to the client using the headers attribute.
- </t>
- <t>
- Below is an outline of the mandatory SDP attributes.
- </t>
- <vspace blankLines="1" />
- <list style="empty">
- <t>c=IN IP4/6 </t>
- <t>m=audio RTP/AVP 98</t>
- <t>a=rtpmap:98 VORBIS/44100/2</t>
- <t>a=fmtp:98 header=<URI of configuration header> </t>
- </list>
- <t>
- The Vorbis configuration specified in the header attribute MUST contain
- all of the configuration data and codebooks needed for the life of the session.
- </t>
- <t>
- The port value is specified by the server application bound to
- the address specified in the c attribute. The bitrate value
- and channels specified in the m attribute MUST match the Vorbis
- sample rate value.
- </t>
- </section>
- <section anchor="Codebook Caching" title="Codebook Caching">
- <t>
- Codebook caching allows clients that have previously connected to a
- stream to re-use the associated codebooks and configuration data.
- When a client receives a codebook it may store it locally and can
- compare the CRC32 key with that of the new stream and begin decoding
- before it has received any of the headers.
- </t>
- </section>
- </section>
- <section anchor="IANA Considerations" title="IANA Considerations">
- <t>
- MIME media type name: audio
- </t>
- <t>
- MIME subtype: vorbis
- </t>
- <t>
- Required Parameters:</t><t>
- header indicates the URI of the decoding configuration headers.
- </t>
- <t>
- Optional Parameters: </t><t>
- None.
- </t>
- <t>
- Encoding considerations:</t><t>
- This type is only defined for transfer via RTP as specified
- in RFC XXXX.
- </t>
- <t>
- Security Considerations:</t><t>
- See Section 6 of RFC 3047.
- </t>
- <t>
- Interoperability considerations: none
- </t>
- <t>
- Published specification:</t>
- <t>See the Vorbis documentation <xref target="vorbis-spec-ref"></xref> for details.</t>
- <t>
- Applications which use this media type:</t><t>
- Audio streaming and conferencing tools
- </t>
- <t>
- Additional information: none
- </t>
- <t>
- Person & email address to contact for further information:</t><t>
- Phil Kerr: <phil@plus24.com>
- </t>
- <t>
- Intended usage: COMMON
- </t>
- <t>
- Author/Change controller:</t><t>
- Author: Phil Kerr
- Change controller: IETF AVT Working Group
- </t>
- </section>
- <section anchor="Congestion Control" title="Congestion Control">
- <t>
- Vorbis clients SHOULD send regular receiver reports detailing
- congestion. A mechanism for dynamically downgrading the stream,
- known as bitrate peeling, will allow for a graceful backing off
- of the stream bitrate. This feature is not available at present
- so an alternative would be to redirect the client to a lower
- bitrate stream if one is available.
- </t>
- </section>
- <section anchor="Security Considerations" title="Security Considerations">
- <t>
- RTP packets using this payload format are subject to the security
- considerations discussed in the RTP specification <xref target="rfc3550"></xref>. This implies
- that the confidentiality of the media stream is achieved by using
- encryption. Because the data compression used with this payload
- format is applied end-to-end, encryption may be performed on the
- compressed data. Where the size of a data block is set care MUST
- be taken to prevent buffer overflows in the client applications.
- </t>
- </section>
- <section anchor="Acknowledgments" title="Acknowledgments">
- <t>
- This document is a continuation of draft-moffitt-vorbis-rtp-00.txt.
- The MIME type section is a continuation of draft-short-avt-rtp-
- vorbis-mime-00.txt
- </t>
- <t>
- Thanks to the AVT, Ogg Vorbis Communities / Xiph.org including
- Steve Casner, Aaron Colwell, Ross Finlayson, Ramon Garcia, Pascal Hennequin, Ralph Giles,
- Tor-Einar Jarnbjo, Colin Law, John Lazzaro, Jack Moffitt, Christopher Montgomery,
- Colin Perkins, Barry Short, Mike Smith, Magnus Westerlund.
- </t>
- </section>
- </middle>
- <back>
- <references title="Normative References">
- <reference anchor="rfc3533">
- <front>
- <title>The Ogg Encapsulation Format Version 0</title>
- <author initials="S." surname="Pfeiffer" fullname="Silvia Pfeiffer"></author>
- </front>
- <seriesInfo name="RFC" value="3533" />
- </reference>
- <reference anchor="rfc2119">
- <front>
- <title>Key words for use in RFCs to Indicate Requirement Levels </title>
- <author initials="S." surname="Bradner" fullname="Scott Bradner"></author>
- </front>
- <seriesInfo name="RFC" value="2119" />
- </reference>
- <reference anchor="rfc3550">
- <front>
- <title>RTP: A Transport Protocol for real-time applications</title>
- <author initials="H." surname="Schulzrinne" fullname=""></author>
- <author initials="S." surname="Casner" fullname=""></author>
- <author initials="R." surname="Frederick" fullname=""></author>
- <author initials="V." surname="Jacobson" fullname=""></author>
- </front>
- <seriesInfo name="RFC" value="3550" />
- </reference>
- <reference anchor="rfc3551">
- <front>
- <title>RTP Profile for Audio and Video Conferences with Minimal Control.</title>
- <author initials="H." surname="Schulzrinne" fullname=""></author>
- <author initials="S." surname="Casner" fullname=""></author>
- </front>
- <date month="July" year="2003" />
- <seriesInfo name="RFC" value="3551" />
- </reference>
-
- <reference anchor="rfc2327">
- <front>
- <title>SDP: Session Description Protocol</title>
- <author initials="M." surname="Handley" fullname="Mark Handley"></author>
- <author initials="V." surname="Jacobson" fullname="Van Jacobson"></author>
- </front>
- <seriesInfo name="RFC" value="2327" />
- </reference>
- <reference anchor="rfc1063">
- <front>
- <title>Path MTU Discovery</title>
- <author initials="J." surname="Mogul et al." fullname="J. Mogul et al."></author>
- </front>
- <seriesInfo name="RFC" value="1063" />
- </reference>
- <reference anchor="rfc1981">
- <front>
- <title>Path MTU Discovery for IP version 6</title>
- <author initials="J." surname="McCann et al." fullname="J. McCann et al."></author>
- </front>
- <seriesInfo name="RFC" value="1981" />
- </reference>
- </references>
- <references title="Informative References">
- <reference anchor="libvorbis">
- <front>
- <title>libvorbis: Available from the Xiph website, http://www.xiph.org</title>
- </front>
- </reference>
- <reference anchor="vorbis-spec-ref">
- <front>
- <title>Ogg Vorbis I spec: Codec setup and packet decode. http://www.xiph.org/ogg/vorbis/doc/vorbis-spec-ref.html</title>
- </front>
- </reference>
-
- <reference anchor="v-comment">
- <front>
- <title>Ogg Vorbis I spec: Comment field and header specification. http://www.xiph.org/ogg/vorbis/doc/v-comment.html</title>
- </front>
- </reference>
-
- </references>
- </back>
- </rfc>
|