intra_prob.lyx 8.0 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344
  1. #LyX 2.1 created this file. For more info see http://www.lyx.org/
  2. \lyxformat 474
  3. \begin_document
  4. \begin_header
  5. \textclass article
  6. \use_default_options true
  7. \maintain_unincluded_children false
  8. \language english
  9. \language_package default
  10. \inputencoding auto
  11. \fontencoding global
  12. \font_roman default
  13. \font_sans default
  14. \font_typewriter default
  15. \font_math auto
  16. \font_default_family default
  17. \use_non_tex_fonts false
  18. \font_sc false
  19. \font_osf false
  20. \font_sf_scale 100
  21. \font_tt_scale 100
  22. \graphics default
  23. \default_output_format default
  24. \output_sync 0
  25. \bibtex_command default
  26. \index_command default
  27. \paperfontsize default
  28. \spacing onehalf
  29. \use_hyperref false
  30. \papersize default
  31. \use_geometry false
  32. \use_package amsmath 1
  33. \use_package amssymb 1
  34. \use_package cancel 1
  35. \use_package esint 1
  36. \use_package mathdots 1
  37. \use_package mathtools 1
  38. \use_package mhchem 1
  39. \use_package stackrel 1
  40. \use_package stmaryrd 1
  41. \use_package undertilde 1
  42. \cite_engine basic
  43. \cite_engine_type default
  44. \biblio_style plain
  45. \use_bibtopic false
  46. \use_indices false
  47. \paperorientation portrait
  48. \suppress_date false
  49. \justification true
  50. \use_refstyle 1
  51. \index Index
  52. \shortcut idx
  53. \color #008000
  54. \end_index
  55. \secnumdepth 3
  56. \tocdepth 3
  57. \paragraph_separation indent
  58. \paragraph_indentation default
  59. \quotes_language english
  60. \papercolumns 1
  61. \papersides 1
  62. \paperpagestyle default
  63. \tracking_changes false
  64. \output_changes false
  65. \html_math_output 0
  66. \html_css_as_file 0
  67. \html_be_strict false
  68. \end_header
  69. \begin_body
  70. \begin_layout Title
  71. Probability Modelling of Intra Prediction Modes
  72. \end_layout
  73. \begin_layout Author
  74. Jean-Marc Valin
  75. \end_layout
  76. \begin_layout Section
  77. Introduction
  78. \end_layout
  79. \begin_layout Standard
  80. Modern video codecs and still image codecs make use of intra prediction
  81. where a region (or block) of the image is predicted based on its surrounding.
  82. There are usually multiple intra predictor
  83. \emph on
  84. modes
  85. \emph default
  86. , each preforming a different kind of prediction.
  87. For example, some modes may predict along a particular direction, in which
  88. case the selected mode would typically represent the direction of the patterns
  89. in the region being coded.
  90. The mode is typically selected by the encoder and transmitted to the decoder.
  91. The cost of coding the mode can be large for small block sizes, so it is
  92. important to efficiently encode the information using entropy coding
  93. \begin_inset space ~
  94. \end_inset
  95. \begin_inset CommandInset citation
  96. LatexCommand cite
  97. key "MW98,SM98"
  98. \end_inset
  99. .
  100. The following describes an efficient way of modelling mode probabilities.
  101. \end_layout
  102. \begin_layout Section
  103. Probability Modelling
  104. \end_layout
  105. \begin_layout Standard
  106. Let
  107. \begin_inset Formula $m_{i,j}$
  108. \end_inset
  109. be the id of the intra prediction mode selected for block
  110. \begin_inset Formula $\left(i,j\right)$
  111. \end_inset
  112. .
  113. It is useful to know the probability
  114. \begin_inset Formula $p\left(m_{i,j}\right)$
  115. \end_inset
  116. to make optimal use of entropy coding when encoding the selected mode in
  117. the bitstream.
  118. In particular, coding can be made more efficient by making use of the context:
  119. modes selected for causal neighbors that are already encoded in the bitstream.
  120. For example, it is desirable to estimate
  121. \begin_inset Formula $p\left(m_{i,j}\left|m_{i-1,j-1},m_{i,j-1},m_{i-1,j}\right.\right)$
  122. \end_inset
  123. .
  124. Let
  125. \begin_inset Formula $M$
  126. \end_inset
  127. be the number of possible modes and
  128. \begin_inset Formula $N$
  129. \end_inset
  130. be the number of neighbors considered, then explicit modelling of the condition
  131. al probabilities using an explicit table requires
  132. \begin_inset Formula $M^{\left(N+1\right)}$
  133. \end_inset
  134. entries, which rapidly becomes prohibitive.
  135. For example, with 10 modes and using the left, up-left and up blocks as
  136. context requires a table with 10,000 entries.
  137. This is one reason why the VP8 codec only uses the left and up blocks as
  138. context.
  139. \end_layout
  140. \begin_layout Standard
  141. It is possible to reduce the size of the context by considering only whether
  142. the neighboring blocks use the same mode or a different mode, i.e.
  143. modelling the probability
  144. \begin_inset Formula $p\left(m_{i,j}\left|m_{i-1,j-1}=m_{i,j},m_{i,j-1}=m_{i,j},m_{i-1,j}=m_{i,j}\right.\right)$
  145. \end_inset
  146. instead of
  147. \begin_inset Formula $p\left(m_{i,j}\left|m_{i-1,j-1},m_{i,j-1},m_{i-1,j}\right.\right)$
  148. \end_inset
  149. .
  150. Because the conditional parameters are now binary (equal or not equal),
  151. then a lookup table only requires
  152. \begin_inset Formula $M\cdot2^{N}$
  153. \end_inset
  154. entries, e.g., only 80
  155. \begin_inset space ~
  156. \end_inset
  157. entries when using 3
  158. \begin_inset space ~
  159. \end_inset
  160. neighbours.
  161. One drawback of this approach is that
  162. \begin_inset Formula
  163. \begin{equation}
  164. S_{p}=\sum_{k=0}^{M}p\left(m_{i,j}=k\left|m_{i-1,j-1}=k,m_{i,j-1}=k,m_{i-1,j}=k\right.\right)\neq1
  165. \end{equation}
  166. \end_inset
  167. so some form of renormalization is required.
  168. \end_layout
  169. \begin_layout Standard
  170. The simplest way to renormalize probabilities is to multiply each of them
  171. by the same factor such that the sum equals 1.
  172. If the entropy coder, such as a classic non-binary arithmetic coder or
  173. a range coder, is set up to code symbols using cumulative frequency counts,
  174. then simply declaring
  175. \begin_inset Formula $S_{p}$
  176. \end_inset
  177. to be the
  178. \emph on
  179. total frequency count
  180. \emph default
  181. allows the entropy coder to perform the renormalization process automatically
  182. as part of the coding process.
  183. This method is the simplest, but sometimes does not accurately model probabilit
  184. ies when one of the probabilities is close to unity.
  185. \end_layout
  186. \begin_layout Standard
  187. A slightly more bit-efficient renormalization procedure is to first
  188. \begin_inset Quotes eld
  189. \end_inset
  190. amplify
  191. \begin_inset Quotes erd
  192. \end_inset
  193. probabilities that are close to unity:
  194. \begin_inset Formula
  195. \begin{equation}
  196. p_{k}^{'}=p_{k}\cdot\frac{\left(\sum_{j}p_{j}\right)-p_{k}}{1-p_{k}}\,,\label{eq:weighted-renorm}
  197. \end{equation}
  198. \end_inset
  199. followed by renormalizing the
  200. \begin_inset Formula $p_{k}^{'}$
  201. \end_inset
  202. to have a sum of 1.
  203. The step in
  204. \begin_inset CommandInset ref
  205. LatexCommand ref
  206. reference "eq:weighted-renorm"
  207. \end_inset
  208. results in probabilities close to 1 being less affected by the renormalization
  209. and slightly improves coding efficiency over the simple renormalization.
  210. \end_layout
  211. \begin_layout Subsection
  212. Adaptation
  213. \end_layout
  214. \begin_layout Standard
  215. There are two possible ways of adapting the probability model to individual
  216. images being coded.
  217. The first is to compute an online probability
  218. \begin_inset Formula $p_{0}\left(m\right)$
  219. \end_inset
  220. of mode
  221. \begin_inset Formula $m$
  222. \end_inset
  223. being selected when none of the neighbours use mode
  224. \begin_inset Formula $m$
  225. \end_inset
  226. .
  227. Then,
  228. \begin_inset Formula $p_{0}\left(m\right)$
  229. \end_inset
  230. is used as a floor probability for each mode
  231. \begin_inset Formula $m$
  232. \end_inset
  233. .
  234. The purpose of this floor is to ensure that when a particular mode is highly
  235. used in an image, its cost goes down.
  236. \end_layout
  237. \begin_layout Standard
  238. The second way of adapting the probability model is to compute a statistic
  239. on a large number of previously encoded modes.
  240. For example, this can be the most often used mode in the frame, the percentage
  241. of non-directional modes used, the most common direction.
  242. This statistic can be used as an additional condition on the probability.
  243. In that case, the storage requirement becomes
  244. \begin_inset Formula $M\cdot2^{N}S$
  245. \end_inset
  246. where
  247. \begin_inset Formula $S$
  248. \end_inset
  249. is the number of possible discrete value for the statistic.
  250. \end_layout
  251. \begin_layout Bibliography
  252. \begin_inset CommandInset bibitem
  253. LatexCommand bibitem
  254. key "MW98"
  255. \end_inset
  256. Moffat, A., Witten, I.H.,
  257. \begin_inset Quotes eld
  258. \end_inset
  259. Arithmetic coding revisited
  260. \begin_inset Quotes erd
  261. \end_inset
  262. ,
  263. \emph on
  264. ACM Transactions on Information Systems (TOIS)
  265. \emph default
  266. , Vol.
  267. 16, Issue 3, pp.
  268. 256-294 , July 1998.
  269. \end_layout
  270. \begin_layout Bibliography
  271. \begin_inset CommandInset bibitem
  272. LatexCommand bibitem
  273. key "SM98"
  274. \end_inset
  275. Stuiver, L.
  276. and Moffat, A., "Piecewise Integer Mapping for Arithmetic Coding",
  277. \emph on
  278. Proc.
  279. of the 17th IEEE Data Compression Conference (DCC)
  280. \emph default
  281. , pp.
  282. 1-10, March/April 1998.
  283. \end_layout
  284. \end_body
  285. \end_document