123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262 |
- <?xml version="1.0" encoding="US-ASCII"?>
- <!DOCTYPE rfc SYSTEM "rfc2629.dtd">
- <?rfc toc="yes"?>
- <?rfc tocompact="yes"?>
- <?rfc tocdepth="3"?>
- <?rfc tocindent="yes"?>
- <?rfc symrefs="yes"?>
- <?rfc sortrefs="yes"?>
- <?rfc comments="yes"?>
- <?rfc inline="yes"?>
- <?rfc compact="yes"?>
- <?rfc subcompact="no"?>
- <rfc category="std" docName="draft-ietf-codec-opus-update-01"
- ipr="trust200902">
- <front>
- <title abbrev="Opus Update">Updates to the Opus Audio Codec</title>
- <author initials="JM" surname="Valin" fullname="Jean-Marc Valin">
- <organization>Mozilla Corporation</organization>
- <address>
- <postal>
- <street>331 E. Evelyn Avenue</street>
- <city>Mountain View</city>
- <region>CA</region>
- <code>94041</code>
- <country>USA</country>
- </postal>
- <phone>+1 650 903-0800</phone>
- <email>jmvalin@jmvalin.ca</email>
- </address>
- </author>
- <author initials="K." surname="Vos" fullname="Koen Vos">
- <organization>vocTone</organization>
- <address>
- <postal>
- <street></street>
- <city></city>
- <region></region>
- <code></code>
- <country></country>
- </postal>
- <phone></phone>
- <email>koenvos74@gmail.com</email>
- </address>
- </author>
- <date day="4" month="September" year="2014" />
- <abstract>
- <t>This document addresses minor issues that were found in the specification
- of the Opus audio codec in <xref target="RFC6716">RFC 6716</xref>.</t>
- </abstract>
- </front>
- <middle>
- <section title="Introduction">
- <t>This document addresses minor issues that were discovered in the reference
- implementation of the Opus codec that serves as the specification in
- <xref target="RFC6716">RFC 6716</xref>. Only issues affecting the decoder are
- listed here. An up-to-date implementation of the Opus encoder can be found at
- http://opus-codec.org/. The updated specification remains fully compatible with
- the original specification and only one of the changes results in any difference
- in the audio output.
- </t>
- </section>
- <section title="Terminology">
- <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
- "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
- document are to be interpreted as described in <xref
- target="RFC2119">RFC 2119</xref>.</t>
- </section>
- <section title="Stereo State Reset in SILK">
- <t>The reference implementation does not reinitialize the stereo state
- during a mode switch. The old stereo memory can produce a brief impulse
- (i.e. single sample) in the decoded audio. This can be fixed by changing
- silk/dec_API.c at line 72:
- <figure>
- <artwork><![CDATA[
- for( n = 0; n < DECODER_NUM_CHANNELS; n++ ) {
- ret = silk_init_decoder( &channel_state[ n ] );
- }
- + silk_memset(&((silk_decoder *)decState)->sStereo, 0,
- + sizeof(((silk_decoder *)decState)->sStereo));
- + /* Not strictly needed, but it's cleaner that way */
- + ((silk_decoder *)decState)->prev_decode_only_middle = 0;
-
- return ret;
- }
- ]]></artwork>
- </figure>
- This change affects the normative part of the decoder. Fortunately,
- the modified decoder is still compliant with the original specification because
- it still easily passes the testvectors. For example, for the float decoder
- at 48 kHz, the opus_compare (arbitrary) "quality score" changes from
- from 99.9333% to 99.925%.
- </t>
- </section>
- <section anchor="padding" title="Parsing of the Opus Packet Padding">
- <t>It was discovered that some invalid packets of very large size could trigger
- an out-of-bounds read in the Opus packet parsing code responsible for padding.
- This is due to an integer overflow if the signaled padding exceeds 2^31-1 bytes
- (the actual packet may be smaller). The code can be fixed by applying the following
- changes at line 596 of src/opus_decoder.c:
- <figure>
- <artwork><![CDATA[
- /* Padding flag is bit 6 */
- if (ch&0x40)
- {
- - int padding=0;
- int p;
- do {
- if (len<=0)
- return OPUS_INVALID_PACKET;
- p = *data++;
- len--;
- - padding += p==255 ? 254: p;
- + len -= p==255 ? 254: p;
- } while (p==255);
- - len -= padding;
- }
- ]]></artwork>
- </figure>
- </t>
- <t>This packet parsing issue is limited to reading memory up
- to about 60 kB beyond the compressed buffer. This can only be triggered
- by a compressed packet more than about 16 MB long, so it's not a problem
- for RTP. In theory, it <spanx style="emph">could</spanx> crash a file
- decoder (e.g. Opus in Ogg) if the memory just after the incoming packet
- is out-of-range, but our attempts to trigger such a crash in a production
- application built using an affected version of the Opus decoder failed.</t>
- </section>
- <section anchor="resampler" title="Resampler buffer">
- <t>The SILK resampler had the following issues:
- <list style="numbers">
- <t>The calls to memcpy() were using sizeof(opus_int32), but the type of the
- local buffer was opus_int16.</t>
- <t>Because the size was wrong, this potentially allowed the source
- and destination regions of the memcpy() to overlap.
- We <spanx style="emph">believe</spanx> that nSamplesIn is at least fs_in_khZ,
- which is at least 8.
- Since RESAMPLER_ORDER_FIR_12 is only 8, that should not be a problem once
- the type size is fixed.</t>
- <t>The size of the buffer used RESAMPLER_MAX_BATCH_SIZE_IN, but the
- data stored in it was actually _twice_ the input batch size
- (nSamplesIn<<1).</t>
- </list></t>
- <t>
- The fact that the code never produced any error in testing (including when run under the
- Valgrind memory debugger), suggests that in practice
- the batch sizes are reasonable enough that none of the issues above
- was ever a problem. However, proving that is non-obvious.
- </t>
- <t>The code can be fixed by applying the following changes to line 70 of silk/resampler_private_IIR_FIR.c:
- <figure>
- <artwork><![CDATA[
- )
- {
- silk_resampler_state_struct *S = \
- (silk_resampler_state_struct *)SS;
- opus_int32 nSamplesIn;
- opus_int32 max_index_Q16, index_increment_Q16;
- - opus_int16 buf[ RESAMPLER_MAX_BATCH_SIZE_IN + \
- RESAMPLER_ORDER_FIR_12 ];
- + opus_int16 buf[ 2*RESAMPLER_MAX_BATCH_SIZE_IN + \
- RESAMPLER_ORDER_FIR_12 ];
-
- /* Copy buffered samples to start of buffer */
- - silk_memcpy( buf, S->sFIR, RESAMPLER_ORDER_FIR_12 \
- * sizeof( opus_int32 ) );
- + silk_memcpy( buf, S->sFIR, RESAMPLER_ORDER_FIR_12 \
- * sizeof( opus_int16 ) );
-
- /* Iterate over blocks of frameSizeIn input samples */
- index_increment_Q16 = S->invRatio_Q16;
- while( 1 ) {
- nSamplesIn = silk_min( inLen, S->batchSize );
-
- /* Upsample 2x */
- silk_resampler_private_up2_HQ( S->sIIR, &buf[ \
- RESAMPLER_ORDER_FIR_12 ], in, nSamplesIn );
-
- max_index_Q16 = silk_LSHIFT32( nSamplesIn, 16 + 1 \
- ); /* + 1 because 2x upsampling */
- out = silk_resampler_private_IIR_FIR_INTERPOL( out, \
- buf, max_index_Q16, index_increment_Q16 );
- in += nSamplesIn;
- inLen -= nSamplesIn;
-
- if( inLen > 0 ) {
- /* More iterations to do; copy last part of \
- filtered signal to beginning of buffer */
- - silk_memcpy( buf, &buf[ nSamplesIn << 1 ], \
- RESAMPLER_ORDER_FIR_12 * sizeof( opus_int32 ) );
- + silk_memmove( buf, &buf[ nSamplesIn << 1 ], \
- RESAMPLER_ORDER_FIR_12 * sizeof( opus_int16 ) );
- } else {
- break;
- }
- }
-
- /* Copy last part of filtered signal to the state for \
- the next call */
- - silk_memcpy( S->sFIR, &buf[ nSamplesIn << 1 ], \
- RESAMPLER_ORDER_FIR_12 * sizeof( opus_int32 ) );
- + silk_memcpy( S->sFIR, &buf[ nSamplesIn << 1 ], \
- RESAMPLER_ORDER_FIR_12 * sizeof( opus_int16 ) );
- }
- ]]></artwork>
- </figure>
- Note: due to RFC formatting conventions, lines exceeding the column width
- in the patch above are split using a backslash character. The backslashes
- at the end of a line and the white space at the beginning
- of the following line are not part of the patch. A properly formatted patch
- including the three changes above is available at
- <eref target="http://jmvalin.ca/misc_stuff/opus_update.patch"/>.
- </t>
- </section>
- <section title="Downmix to Mono">
- <t>The last issue is not strictly a bug, but it is an issue that has been reported
- when downmixing an Opus decoded stream to mono, whether this is done inside the decoder
- or as a post-processing step on the stereo decoder output. Opus intensity stereo allows
- optionally coding the two channels 180-degrees out of phase on a per-band basis.
- This provides better stereo quality than forcing the two channels to be in phase,
- but when the output is downmixed to mono, the energy in the affected bands is cancelled
- sometimes resulting in audible artefacts.
- </t>
- <t>As a work-around for this issue, the decoder MAY choose not to apply the 180-degree
- phase shift when the output is meant to be downmixed (inside or
- outside of the decoder).
- </t>
- </section>
- <section anchor="IANA" title="IANA Considerations">
- <t>This document makes no request of IANA.</t>
- <t>Note to RFC Editor: this section may be removed on publication as an
- RFC.</t>
- </section>
- <section anchor="Acknowledgements" title="Acknowledgements">
- <t>We would like to thank Juri Aedla for reporting the issue with the parsing of
- the Opus padding.</t>
- </section>
- </middle>
- <back>
- <references title="References">
- <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml"?>
- <?rfc include="http://xml.resource.org/public/rfc/bibxml/reference.RFC.6716.xml"?>
- </references>
- </back>
- </rfc>
|