<?xml version='1.0'?>
<!DOCTYPE rfc SYSTEM 'rfc2629.dtd'>
<?rfc toc="yes" symrefs="yes" ?>
<rfc ipr="trust200902" category="info" docName="draft-maxwell-videocodec-requirements-00">
<front>
<title abbrev="Video Codec Requirements">Requirements for an Internet Video Codec</title>
<author initials="G." surname="Maxwell" fullname="Gregory Maxwell">
<organization>Xiph.Org</organization>
<address>
<email>greg@xiph.org</email>
</address>
</author>
<author initials="K." surname="Walsh" fullname="Kat Walsh">
<organization></organization>
<address>
<postal>
<street></street>
<city></city>
<region></region>
<code></code>
<country></country>
</postal>
<email>kat@mindspillage.org</email>
</address>
</author>
<date day="15" month="October" year="2012" />
<area>RAI</area>
<keyword>video codec</keyword>
<keyword>Internet-Draft</keyword>
<abstract>
<t>
This document provides specific requirements for an Internet video codec.
These requirements address quality, bit-rate, loss robustness, and
application suitability, as well as other desirable properties.
</t>
</abstract>
</front>
<middle>
<section anchor="Introduction" title="Introduction">
<t>
This document provides requirements for a video codec designed specifically
for use over the Internet.
The requirements attempt to address the needs
of the most common Internet video transmission applications and
to ensure good quality when operating in conditions that are typical for the
Internet.
These requirements address quality, bit-rate, and loss robustness.
Other desirable codec properties are considered as well.
</t>
</section>
<section anchor="definitions" title="Definitions">
<t>
Codec bit-rates are expressed in bits per second (b/s) and exclude any
transport overhead (IP/UDP/RTP headers, padding, etc.).
</t>
</section>
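<figure>
<preamble>
As a minimal sketch of this definition, the following Python fragment compares the codec bit-rate with the bit-rate seen on the wire. The header sizes assume IPv4 without options (20 bytes), UDP (8 bytes), and the fixed RTP header without CSRC entries or extensions (12 bytes). The function names and the sample packet sizes are illustrative assumptions and are not part of this document's requirements.
</preamble>
<artwork><![CDATA[
```python
# Assumed per-packet header sizes in bytes (not specified by this document).
IPV4_HEADER = 20  # IPv4 header without options
UDP_HEADER = 8
RTP_HEADER = 12   # fixed RTP header, no CSRC list or extensions


def codec_bitrate(payload_sizes, duration_s):
    """Bit-rate in b/s counting only the codec payload bytes."""
    return 8 * sum(payload_sizes) / duration_s


def wire_bitrate(payload_sizes, duration_s):
    """Bit-rate in b/s including IP/UDP/RTP headers on every packet."""
    overhead = (IPV4_HEADER + UDP_HEADER + RTP_HEADER) * len(payload_sizes)
    return 8 * (sum(payload_sizes) + overhead) / duration_s


# Hypothetical stream: fifty 1000-byte payloads sent in one second.
payloads = [1000] * 50
print(codec_bitrate(payloads, 1.0))  # 400000.0
print(wire_bitrate(payloads, 1.0))   # 416000.0
```
]]></artwork>
<postamble>
Only the first figure is the codec bit-rate in the sense used by this document.
</postamble>
</figure>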
<section anchor="applications" title="Applications">
<t>
The following applications should be considered for Internet video codecs,
along with their requirements:
<list style="symbols">
<t>Live video streaming</t>
<t>Video on demand</t>
<t>Point to point video calls</t>
<t>Video Conferencing</t>
<t>Telepresence</t>
<t>Teleoperation</t>
<t>Remote software services</t>
<t>Other applications</t>
</list>
</t>
<section title="Point to point video calls">
<t>
Point to point calls are calls between two endpoints, such as "standard" (fixed
or mobile) phones, desktop or portable computers, or tablets, implemented in
hardware or software.
</t>
</section>
<section title="Video Conferencing">
<t>
Video Conferencing applications (which support multi-party calls) have additional
requirements on top of the requirements for point-to-point calls.
Conferencing systems often have greater network bandwidth available.
The ability to vary the bit-rate (VBR) is a desirable feature for the codec.
This not only saves bandwidth "on average", but it can also help conference servers
make more efficient use of the available bandwidth by using more bandwidth for
important video streams and less bandwidth for less important ones.
</t>
</section>
<section title="Telepresence">
<t>
Most telepresence applications can be considered to be essentially very
high quality video conferencing environments, so all of the conferencing
requirements also apply to telepresence.
</t>
</section>
<section title="Teleoperation and Remote Software Services">
<t>
Teleoperation applications are similar to telepresence, with the exception that
they involve remote physical interactions.
For example, the user may be
controlling a robot while receiving a real-time video feed from that robot.
The other requirements
of telepresence apply to teleoperation as well.
</t>
<t>
The requirements for remote software services are similar to those of teleoperation.
These applications include remote desktop applications, remote virtualization, and
interactive media applications being rendered remotely (e.g. video games rendered
on central servers).
</t>
</section>
<section title="Other applications">
<t>
The above list is by no means a complete list of all applications involving
interactive video transmission on the Internet.
However, it is believed that meeting the needs of all these different
applications should be sufficient to ensure that the needs of most applications
not listed will also be met.
</t>
</section>
</section>
<section anchor="constraints" title="Constraints Imposed by the Internet on the Codec">
<t>The bandwidth requirements of video are a significant obstacle for
Internet deployment.
A substantial portion of the hosts on the Internet have
connectivity sufficient to carry perceptually lossless audio, even in
inefficient uncompressed form.
However, a much smaller portion of hosts have connectivity
sufficient for the 200+ megabits per second required for uncompressed 30 fps
standard definition video, and expected resolutions for Internet video are
increasing.
Even the highest resolutions widely used off the Internet, where wide-area
bandwidth is not a constraint, have not yet reached perceptual losslessness.
In addition to increases in
resolution, operating models which are less broadcast-oriented (including video
on demand and video conferencing) limit the traffic mitigation effectiveness of
CDNs and multicast, though support for these technologies remains essential.
Because there are few applications where increases in bandwidth efficiency are
unimportant and many where improved efficiency is essential--such as delivering
HD video to bandwidth-constrained network edges--it is important that the codec
deliver competitive quality per bit-rate and support a wide range of bandwidths.
</t>
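<figure>
<preamble>
As a rough check on the uncompressed figure quoted above, the raw bit-rate of a video stream is the product of frame width, frame height, frame rate, and bits per pixel. The sketch below assumes a 720x480 standard definition frame. The frame size and the two pixel formats (24 bits per pixel for 8-bit samples without chroma subsampling, 12 bits per pixel for 8-bit 4:2:0) are assumptions chosen for illustration, since this document does not state which format its figure is based on.
</preamble>
<artwork><![CDATA[
```python
def raw_video_bitrate(width, height, fps, bits_per_pixel):
    """Uncompressed video bit-rate in bits per second."""
    return width * height * fps * bits_per_pixel


# Assumed SD frame size of 720x480 at 30 frames per second.
full_chroma = raw_video_bitrate(720, 480, 30, 24)  # 8-bit, no subsampling
subsampled = raw_video_bitrate(720, 480, 30, 12)   # 8-bit 4:2:0

print(full_chroma / 1e6)  # 248.832 (Mb/s)
print(subsampled / 1e6)   # 124.416 (Mb/s)
```
]]></artwork>
<postamble>
Under these assumptions, only the format without chroma subsampling exceeds 200 megabits per second; either way the result is far above what most hosts can sustain.
</postamble>
</figure>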
<t>Packet losses are inevitable on the Internet and dealing with them is one of
the requirements for an Internet video codec.
Efficient video compression typically
uses very high gain backward prediction, which can result in infinite error
propagation in the worst case if measures are not taken to mitigate it.
Error propagation is usually mitigated in traditional file- and broadcast-oriented
codecs through key-frames, periodic intra refresh, and constrained
back-reference structure.
While these techniques are important, and also enable random access, a codec
designed for the Internet should also be able to take advantage of bidirectional
communication to reduce the impact of loss when possible.
</t>
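<figure>
<preamble>
The effect of key-frames on error propagation can be illustrated with a toy model. The sketch below assumes a simple chain in which every non-key frame is predicted from the frame immediately before it, so a lost frame damages all following frames until the next key-frame arrives intact. The function name, the frame model, and the example values are hypothetical; real codecs use richer reference structures and concealment.
</preamble>
<artwork><![CDATA[
```python
def corrupted_frames(num_frames, lost, keyframe_interval=None):
    """Return indices of frames that are lost or depend on a damaged frame.

    Frame 0 is always a key-frame. If keyframe_interval is given, every
    frame whose index is a multiple of it is also a key-frame.
    """
    lost = set(lost)
    corrupted = []
    damaged = False
    for i in range(num_frames):
        is_key = i == 0 or (
            keyframe_interval is not None and i % keyframe_interval == 0
        )
        if i in lost:
            damaged = True    # the frame itself never arrived
        elif is_key:
            damaged = False   # an intact key-frame resets the prediction chain
        if damaged:
            corrupted.append(i)
    return corrupted


# Losing frame 3 out of 10, without and with a key-frame every 5 frames.
print(corrupted_frames(10, {3}))     # [3, 4, 5, 6, 7, 8, 9]
print(corrupted_frames(10, {3}, 5))  # [3, 4]
```
]]></artwork>
<postamble>
In this model a receiver that can report the loss to the sender would allow recovery without waiting for the next scheduled key-frame, which is the kind of bidirectional mechanism the text above refers to.
</postamble>
</figure>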
<t>In many high-latency and non-realtime applications, however, the relevant
transport is lossless.
While random access is still important, general error
tolerance is not, and the codec may support modes which have very low error
tolerance--including ones which prevent packet decode in the presence of loss,
if it results in efficiency gains for non-realtime applications.</t>
<t>For interactive applications, latency is an important codec performance metric
and many common input and output devices add frames of latency.
To avoid adding further delays, the codec must support operating in a mode that
adds no more delay than that from processing a single frame at a time.
Modes which permit sub-frame encoding may be useful but are hampered by the lack
of sub-frame support in existing input and output devices.</t>
<t>
Another important property of the Internet is that it is mostly a best-effort
network with no guaranteed bandwidth.
This means that the codec has to be able to vary its
output bit-rate dynamically (in real time), without requiring an out-of-band
signaling mechanism, and without causing artifacts at the bit-rate
change boundaries.
Because the complete range of useful bit-rates may not be
achievable at a single resolution, the codec may need to support changing
resolutions on the fly.
Additional desirable features are:
<list style="symbols">
<t>Supporting smooth bit-rate changes with fine-grained bit-rate resolution;</t>
<t>Making it possible for a codec to adapt its bit-rate based on the source
signal being encoded (source-controlled VBR) to maximize the quality for a
certain <spanx style="emph">average</spanx> bit-rate.</t>
</list>
Because the Internet transmits data in bytes, a codec should produce
compressed data in integer numbers of bytes.
In general, the codec design should take into consideration explicit congestion
notification (ECN) and multicast and may include features that would improve the
quality of an ECN- or multicast-enabled deployment.
</t>
<t>
The IETF has defined a set of application-layer protocols to be used for
real-time transport of multimedia data, including video.
It is thus important for the resulting codec to be easy to use with these
protocols.
For example, it must be possible to create an RTP <xref target="RTP"/> payload
format that conforms to BCP 36 <xref target="PAYLOADS"/>.
If any codec parameters need to be
negotiated between end-points, the negotiation should be as easy as
possible to carry over SIP <xref target="RFC3261"/>/SDP <xref target="RFC4566"/> or
alternatively over XMPP <xref target="RFC6120"/>/Jingle <xref target="XEP-0167"/>.
</t>
</section>
<section title="Detailed Basic Requirements">
<t>
This section summarizes all the constraints imposed by the target applications
and by the Internet into a set of actual requirements for codec development.
</t>
<section title="Quality and bit-rate">
<t>
The quality of a codec is directly linked to the bit-rate, so these
two must be considered jointly.
When comparing the bit-rate of codecs, the
overhead of IP/UDP/RTP headers should not be considered, but any additional
bits required in the RTP payload format after the header (e.g. required
signaling) should be considered.
In terms of quality vs bit-rate, the codec
to be developed must be better than the following codecs, which are generally
considered royalty-free:
<list style="symbols">
<t>VP8</t>
<t>Theora</t>
</list>
</t>
<t>It is desirable for the codec to support source-controlled variable
bit-rate (VBR) to take advantage of the fact that different inputs require
a different bit-rate to achieve the same quality.</t>
</section>
<section anchor="implementation" title="Computational resources">
<t>
The resulting codec should be implementable on a wide range of devices, and
should not have a design which gratuitously complicates low-power ASIC
implementations.
While the codec must not depend on special hardware features
or instructions, the codec design should allow implementations to take full
advantage of hardware accelerators and vector instructions where available.
Complexity should generally scale with resolution, and it is also desirable to
support multiple encoder and decoder complexity levels via mechanisms other than
resolution, in order to achieve the best possible bit-rate/quality trade-off
available across many kinds of devices without unduly constraining resolution.
The codec should also be able to take advantage of advances in computer speed
and the deployment of hardware accelerators which would allow the use of higher
complexity modes in a broader set of applications.
</t>
<t>
In addition to computational complexity, dynamic memory for reference storage
is a significant resource constraint for video codecs.
It is desirable that the
codec support different memory usage trade-offs to fit on more devices, and that
the codec not require implementations to utilize more memory without reasonable
efficiency gains.
</t>
</section>
</section>
<section title="Additional considerations">
<t>
There are additional features or characteristics that may be desirable under
some circumstances, but should not be part of the strict requirements.
The benefit of meeting these considerations should be weighed against the
associated cost.
</t>
</section>
<section title="Encoder side potential for improvement">
<t>
In most video codecs, it is possible to improve the quality by improving the encoder
without breaking compatibility (i.e. without changing the decoder).
Potential for improvement varies from one codec to another.
All things being equal, being able to improve a codec after the bit-stream
is frozen is a desirable property.
However, this should not be done at the expense of quality in a
straightforward encoder.
</t>
</section>
<section title="Bit error robustness">
<t>
The vast majority of Internet-based applications do not need to be robust to bit
errors because packets
either arrive unaltered, or do not arrive at all.
Considering that, the emphasis should be on packet loss robustness and packet loss concealment.
That being said, it is often the case that extra robustness to bit errors can be achieved
at no cost at all (i.e. no increase in size, complexity or bit-rate, no decrease
in quality or packet loss robustness, etc.).
In those cases, it is useful to
make a change that increases the robustness to bit errors.
This can be useful for
applications that use UDP-Lite transmission (e.g. over a wireless LAN).
Robustness to packet loss should <spanx style="strong">never</spanx> be
sacrificed to achieve higher bit error robustness.
</t>
</section>
<section title="Legacy compatibility">
<t>
In order to create the best possible codec for the Internet, there is no
general requirement for compatibility with legacy Internet codecs.
However, compatibility with commonly used video color formats is desirable.
</t>
</section>
<section anchor="Security Considerations" title="Security Considerations">
<t>
Although this document itself does not have security considerations,
this section describes the security requirements for the codec.
</t>
<t>
As with any protocol used over the Internet, security is a
very important aspect to consider.
This goes beyond the obvious
considerations of preventing buffer overflows and similar attacks that
can lead to denial-of-service or remote code execution.
One very important
security aspect is to make sure that the decoders have a bounded and reasonable
worst-case complexity.
This prevents an attacker from causing a
DoS by sending packets that are specially crafted to take a very long (or
infinite) time to decode.
</t>
</section>
<section title="IANA Considerations ">
<t>
This document has no actions for IANA.
</t>
</section>
</middle>
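<figure>
<preamble>
One way to keep decoder work bounded is to check every size declared by the input against a fixed budget before acting on it. The sketch below applies this to a toy run-length format; the format, the function and exception names, and the budget value are hypothetical and do not describe any real codec bit-stream.
</preamble>
<artwork><![CDATA[
```python
class DecodeLimitExceeded(Exception):
    """Raised when the input would exceed the decoder's work budget."""


def decode_runs(runs, max_symbols):
    """Expand (value, count) pairs, never producing more than max_symbols.

    The count is validated before any expansion happens, so a crafted
    packet cannot force an arbitrarily large allocation or loop.
    """
    out = []
    for value, count in runs:
        if count < 0 or len(out) + count > max_symbols:
            raise DecodeLimitExceeded("input exceeds decode budget")
        out.extend([value] * count)
    return out


print(decode_runs([(7, 3), (0, 2)], max_symbols=16))  # [7, 7, 7, 0, 0]
```
]]></artwork>
<postamble>
A packet declaring a run of a billion symbols is rejected immediately in this sketch, so decode time stays proportional to the budget rather than to values chosen by an attacker.
</postamble>
</figure>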
<back>
<references title="Informative References">
<reference anchor='RFC3261'>
<front>
<title>SIP: Session Initiation Protocol</title>
<author initials='J.' surname='Rosenberg' fullname='J. Rosenberg'>
<organization /></author>
<author initials='H.' surname='Schulzrinne' fullname='H. Schulzrinne'>
<organization /></author>
<author initials='G.' surname='Camarillo' fullname='G. Camarillo'>
<organization /></author>
<author initials='A.' surname='Johnston' fullname='A. Johnston'>
<organization /></author>
<author initials='J.' surname='Peterson' fullname='J. Peterson'>
<organization /></author>
<author initials='R.' surname='Sparks' fullname='R. Sparks'>
<organization /></author>
<author initials='M.' surname='Handley' fullname='M. Handley'>
<organization /></author>
<author initials='E.' surname='Schooler' fullname='E. Schooler'>
<organization /></author>
<date year='2002' month='June' />
<abstract>
<t>This document describes Session Initiation Protocol (SIP), an application-layer control (signaling) protocol for creating, modifying, and terminating sessions with one or more participants. These sessions include Internet telephone calls, multimedia distribution, and multimedia conferences. [STANDARDS-TRACK]</t></abstract></front>
<seriesInfo name='RFC' value='3261' />
<format type='TXT' octets='647976' target='http://www.rfc-editor.org/rfc/rfc3261.txt' />
</reference>
<reference anchor='RFC4566'>
<front>
<title>SDP: Session Description Protocol</title>
<author initials='M.' surname='Handley' fullname='M. Handley'>
<organization /></author>
<author initials='V.' surname='Jacobson' fullname='V. Jacobson'>
<organization /></author>
<author initials='C.' surname='Perkins' fullname='C. Perkins'>
<organization /></author>
<date year='2006' month='July' />
<abstract>
<t>This memo defines the Session Description Protocol (SDP). SDP is intended for describing multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation. [STANDARDS-TRACK]</t></abstract></front>
<seriesInfo name='RFC' value='4566' />
<format type='TXT' octets='108820' target='http://www.rfc-editor.org/rfc/rfc4566.txt' />
</reference>
<reference anchor='RFC6120'>
<front>
<title>Extensible Messaging and Presence Protocol (XMPP): Core</title>
<author initials='P.' surname='Saint-Andre' fullname='P. Saint-Andre'>
<organization /></author>
<date year='2011' month='March' />
<abstract>
<t>The Extensible Messaging and Presence Protocol (XMPP) is an application profile of the Extensible Markup Language (XML) that enables the near-real-time exchange of structured yet extensible data between any two or more network entities. This document defines XMPP's core protocol methods: setup and teardown of XML streams, channel encryption, authentication, error handling, and communication primitives for messaging, network availability ("presence"), and request-response interactions. This document obsoletes RFC 3920. [STANDARDS-TRACK]</t></abstract></front>
<seriesInfo name='RFC' value='6120' />
<format type='TXT' octets='451942' target='http://www.rfc-editor.org/rfc/rfc6120.txt' />
</reference>
<reference anchor="XEP-0167">
<front>
<title>Jingle RTP Sessions</title>
<author initials="S." surname="Ludwig" fullname="Scott Ludwig">
<organization/>
<address>
<email>scottlu@google.com</email>
</address>
</author>
<author initials="P." surname="Saint-Andre" fullname="Peter Saint-Andre">
<organization/>
<address>
<email>stpeter@jabber.org</email>
</address>
</author>
<author initials="S." surname="Egan" fullname="Sean Egan">
<organization/>
<address>
<email>seanegan@google.com</email>
</address>
</author>
<author initials="R." surname="McQueen" fullname="Robert McQueen">
<organization/>
<address>
<email>robert.mcqueen@collabora.co.uk</email>
</address>
</author>
<author initials="D." surname="Cionoiu" fullname="Diana Cionoiu">
<organization/>
<address>
<email>diana@null.ro</email>
</address>
</author>
<date day="23" month="December" year="2009"/>
</front>
<seriesInfo name="XSF XEP" value="0167"/>
<format type="HTML" target="http://xmpp.org/extensions/xep-0167.html"/>
</reference>
<reference anchor="PAYLOADS">
<front>
<title>Guidelines for Writers of RTP Payload Format Specifications</title>
<author initials="M." surname="Handley" fullname="Mark Handley">
<organization/></author>
<author initials="C." surname="Perkins" fullname="Colin Perkins">
<organization/></author>
</front>
<seriesInfo name="RFC" value="2736" />
<seriesInfo name="BCP" value="36" />
</reference>
<reference anchor="RTP">
<front>
<title>RTP: A Transport Protocol for Real-Time Applications</title>
<author initials="H." surname="Schulzrinne" fullname="Henning Schulzrinne"><organization/></author>
<author initials="S." surname="Casner" fullname="Stephen L. Casner"><organization/></author>
<author initials="R." surname="Frederick" fullname="Ron Frederick"><organization/></author>
<author initials="V." surname="Jacobson" fullname="Van Jacobson"><organization/></author>
</front>
<seriesInfo name="RFC" value="3550" />
</reference>
</references>
</back>
</rfc>