= audiowmark - Audio Watermarking

== Description

`audiowmark` is an Open Source solution for audio watermarking. It is
distributed under the terms of the GNU General Public License. A sound file is
read by the software, and a 128-bit message is stored in a watermark in the
output sound file. For human listeners, the files typically sound the same.

However, the 128-bit message can be retrieved from the output sound file. Our
tests show that even if the file is converted to mp3 or ogg (with a bitrate of
128 kbit/s or higher), the watermark can usually be retrieved without problems.
The process of retrieving the message does not need the original audio file
(blind decoding).

Internally, `audiowmark` uses the patchwork algorithm to hide the data in the
spectrum of the audio file. The signal is split into frames of 1024 samples.
For each frame, the amplitudes of some pseudo-randomly selected frequency bands
of a 1024-value FFT are increased or decreased slightly, which can be detected
later. The algorithm used here is inspired by:

Martin Steinebach: Digitale Wasserzeichen für Audiodaten.
Darmstadt University of Technology 2004, ISBN 3-8322-2507-2
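
The patchwork idea described above can be illustrated with a toy sketch. This is not the `audiowmark` implementation: it works on plain lists of band magnitudes instead of a real FFT, derives the band selection from Python's `random` module instead of an AES key, and stores one bit redundantly in many frames. The names `select_bands`, `embed_bit`, `detect_bit`, `delta` and `n_select` are assumptions made for this illustration.

```python
import random


def select_bands(key, frame_index, n_bands, n_select):
    """Pick two disjoint pseudo-random band sets ("up" and "down") for one frame."""
    rng = random.Random(f"{key}:{frame_index}")
    chosen = rng.sample(range(n_bands), 2 * n_select)
    return chosen[:n_select], chosen[n_select:]


def embed_bit(magnitudes, bit, key, frame_index, delta=0.1, n_select=30):
    """Return a copy of one frame's magnitudes with a single bit embedded."""
    up, down = select_bands(key, frame_index, len(magnitudes), n_select)
    if bit == 0:
        # A 0 bit swaps the roles of the two sets.
        up, down = down, up
    out = list(magnitudes)
    for i in up:
        out[i] *= 1 + delta
    for i in down:
        out[i] *= 1 - delta
    return out


def detect_bit(frames, key, n_select=30):
    """Blind detection: compare the summed "up" and "down" magnitudes over all frames."""
    total = 0.0
    for frame_index, magnitudes in enumerate(frames):
        up, down = select_bands(key, frame_index, len(magnitudes), n_select)
        total += sum(magnitudes[i] for i in up) - sum(magnitudes[i] for i in down)
    return 1 if total > 0 else 0


if __name__ == "__main__":
    rng = random.Random(1)
    # 100 frames with 513 bands each (a 1024-value real FFT has 513 bins).
    frames = [[rng.uniform(0.5, 1.5) for _ in range(513)] for _ in range(100)]
    marked = [embed_bit(f, 1, "secret", i) for i, f in enumerate(frames)]
    print(detect_bit(marked, "secret"))
```

Detection only needs the key and the marked frames, not the original signal, which is what makes the decoding blind. A single frame is usually not enough to decide reliably, so the sketch sums the difference over many frames.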

== Adding a Watermark

To add a watermark to the sound file `in.wav` with a 128-bit message (specified
as a hex string):

[subs=+quotes]
....
*$ audiowmark add in.wav out.wav 0123456789abcdef0011223344556677*
Input: in.wav
Output: out.wav
Message: 0123456789abcdef0011223344556677
Strength: 10
Time: 3:59
Sample Rate: 48000
Channels: 2
Data Blocks: 4
....

The most important options for adding a watermark are:

--key <filename>::
Use watermarking key from file <filename> (see <<key>>).

--strength <s>::
Set the watermarking strength (see <<strength>>).
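
A 128-bit message is written as 32 hex digits. As a minimal sketch, a fresh random message could be generated and checked as follows. The helper names `new_message` and `is_valid_message` are made up for this example, and the check only reflects the format shown in this document, not the validation `audiowmark` itself performs.

```python
import secrets
import string


def new_message():
    """Return a random 128-bit message as a 32-character lowercase hex string."""
    return secrets.token_hex(16)


def is_valid_message(message):
    """Check that a message is exactly 128 bits written as 32 hex digits."""
    return len(message) == 32 and all(c in string.hexdigits for c in message)


if __name__ == "__main__":
    message = new_message()
    print(message, is_valid_message(message))
```

The resulting string can be passed as the last argument of `audiowmark add`.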

== Retrieving a Watermark

To get the 128-bit message from the watermarked file, use:

[subs=+quotes]
....
*$ audiowmark get out.wav*
pattern 0:05 0123456789abcdef0011223344556677 1.324 0.059 A
pattern 0:57 0123456789abcdef0011223344556677 1.413 0.112 B
pattern 0:57 0123456789abcdef0011223344556677 1.368 0.086 AB
pattern 1:49 0123456789abcdef0011223344556677 1.302 0.098 A
pattern 2:40 0123456789abcdef0011223344556677 1.361 0.093 B
pattern 2:40 0123456789abcdef0011223344556677 1.331 0.096 AB
pattern all 0123456789abcdef0011223344556677 1.350 0.054
....

The output of `audiowmark get` is designed to be machine readable. Each line
that starts with `pattern` contains one decoded message. The fields are
separated by one or more space characters. The first field is a *timestamp*
indicating the position of the data block. The second field is the *decoded
message*. For most purposes this is all you need to know.

The software was designed under the assumption that you, the user, will be
able to decide whether a message is correct or not. To do this, when
watermarking song files, you could record each message you embedded in a
database. During retrieval, you should look up each pattern that
`audiowmark get` outputs in the database. If the message is not found, you
should assume that a decoding error occurred. In our example each pattern was
decoded correctly, because the watermark was not damaged at all. If you use
lossy compression with a low bitrate, for instance, it may happen that only
some of the decoded patterns are correct, or none if the watermark was damaged
too much.
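
The database lookup described above could be sketched with SQLite as follows. The table layout, the function names `open_registry`, `register` and `lookup`, and the label `customer-42` are assumptions chosen for illustration; any store that maps embedded messages to their meaning would serve the same purpose.

```python
import sqlite3


def open_registry(path=":memory:"):
    """Open (or create) a small registry of embedded messages."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS watermarks "
        "(message TEXT PRIMARY KEY, label TEXT)"
    )
    return db


def register(db, message, label):
    """Record a message at the time it is embedded."""
    db.execute("INSERT INTO watermarks VALUES (?, ?)", (message.lower(), label))
    db.commit()


def lookup(db, message):
    """Return the label for a decoded message, or None for an unknown message."""
    row = db.execute(
        "SELECT label FROM watermarks WHERE message = ?", (message.lower(),)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    db = open_registry()
    register(db, "0123456789abcdef0011223344556677", "customer-42")
    print(lookup(db, "0123456789abcdef0011223344556677"))
    print(lookup(db, "ffffffffffffffffffffffffffffffff"))
```

A decoded pattern for which `lookup` returns `None` would then be treated as a decoding error.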

The third field is the *sync score* (higher is better). The synchronization
algorithm tries to find valid data blocks in the audio file, which then become
candidates for decoding.

The fourth field is the *decoding error* (lower is better). During message
decoding, we use convolutional codes for error correction to make the
watermarking more robust.

The fifth field is the *block type*. There are two types of data blocks,
A blocks and B blocks. A single data block can be decoded alone, as it
contains a complete message. However, if an A block followed by a B block was
found during watermark detection, these two can be decoded together (this
field will then be AB), resulting in an even higher error correction capacity
than one block alone would have.

To improve the error correction capacity even further, the `all` pattern
combines all data blocks that are available. The combined decoded
message will often be the most reliable result (meaning that even if all
other patterns were incorrect, this could still be right).
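
Since the output is meant to be machine readable, a small parser is enough to work with it. The following sketch splits each `pattern` line into the fields described above. The `Pattern` class and the function names are assumptions made for this example, and the block type is treated as optional because the `all` line in the sample output has no fifth field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pattern:
    timestamp: str             # position of the data block, or "all"
    message: str               # decoded 128-bit message as hex string
    sync_score: float          # higher is better
    decoding_error: float      # lower is better
    block_type: Optional[str]  # "A", "B", "AB", or None for the "all" line


def parse_get_output(text):
    """Parse the text printed by `audiowmark get` into Pattern objects."""
    patterns = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] != "pattern":
            continue  # skip anything that is not a complete pattern line
        block_type = fields[5] if len(fields) > 5 else None
        patterns.append(
            Pattern(fields[1], fields[2], float(fields[3]), float(fields[4]), block_type)
        )
    return patterns


def combined_message(patterns):
    """Return the message of the `all` pattern, or None if there is none."""
    for pattern in patterns:
        if pattern.timestamp == "all":
            return pattern.message
    return None
```

The input text could come from `subprocess.run(["audiowmark", "get", "out.wav"], capture_output=True, text=True).stdout`, assuming `audiowmark` is installed and on the `PATH`.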

The most important options for getting a watermark are:

--key <filename>::
Use watermarking key from file <filename> (see <<key>>).

--strength <s>::
Set the watermarking strength (see <<strength>>).

[[key]]
== Watermark Key

Since the software is Open Source, a watermarking key should be used to ensure
that the message bits cannot be retrieved by somebody else (which would also
allow removing the watermark without loss of quality). The watermark key
controls all pseudo-random parameters of the algorithm. This means that
it determines which frequency bands are increased or decreased to store a
0 bit or a 1 bit. Without the key, it is impossible to decode the message
bits from the audio file alone.

Our watermarking key is a 128-bit AES key. A key can be generated using

....
audiowmark gen-key test.key
....

and can be used for the add/get commands as follows:

....
audiowmark add --key test.key in.wav out.wav 0123456789abcdef0011223344556677
audiowmark get --key test.key out.wav
....

[[strength]]
== Watermark Strength

The watermark strength parameter affects how much the watermarking algorithm
modifies the input signal. A stronger watermark is more audible, but also more
robust against modifications. The default strength is 10. A watermark with that
strength is recoverable after mp3/ogg encoding with 128 kbit/s or higher. In
our informal listening tests, this setting also has a very good subjective
quality.

A higher strength (for instance 15) is helpful if robustness against multiple
conversions or conversions to low bit rates (e.g. 64 kbit/s) is desired.

A lower strength (for instance 6) makes the watermark less audible, but also
less robust. Strengths below 5 are not recommended. To set the strength, the
same value has to be passed both when adding and when retrieving the
watermark. Fractional strengths (like 7.5) are possible.

....
audiowmark add --strength 15 in.wav out.wav 0123456789abcdef0011223344556677
audiowmark get --strength 15 out.wav
....
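
Because key and strength have to match between `add` and `get`, it can help to build both command lines from the same settings. The following sketch only assembles argument lists and does not run anything. The function name `watermark_commands` is made up for this example, and combining `--key` and `--strength` in one call is an assumption based on the option lists above.

```python
def watermark_commands(infile, outfile, message, key=None, strength=None):
    """Build matching `audiowmark add` and `audiowmark get` argument lists."""
    options = []
    if key is not None:
        options += ["--key", key]
    if strength is not None:
        options += ["--strength", str(strength)]
    add_cmd = ["audiowmark", "add", *options, infile, outfile, message]
    get_cmd = ["audiowmark", "get", *options, outfile]
    return add_cmd, get_cmd


if __name__ == "__main__":
    add_cmd, get_cmd = watermark_commands(
        "in.wav", "out.wav", "0123456789abcdef0011223344556677", strength=15
    )
    print(" ".join(add_cmd))
    print(" ".join(get_cmd))
```

Each list can be passed to `subprocess.run` once `audiowmark` is installed.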

== Video Files

`videowmark` can be used to add a watermark to the audio track of video files.
To add a watermark, use:

[subs=+quotes]
....
*$ videowmark add in.avi out.avi 0123456789abcdef0011223344556677*
Audio Codec: -c:a mp3 -ab 128000
Input: in.avi
Output: out.avi
Message: 0123456789abcdef0011223344556677
Strength: 10
Time: 3:53
Sample Rate: 44100
Channels: 2
Data Blocks: 4
....

To detect a watermark, use:

[subs=+quotes]
....
*$ videowmark get out.avi*
pattern 0:05 0123456789abcdef0011223344556677 1.294 0.142 A
pattern 0:57 0123456789abcdef0011223344556677 1.191 0.144 B
pattern 0:57 0123456789abcdef0011223344556677 1.242 0.145 AB
pattern 1:49 0123456789abcdef0011223344556677 1.215 0.120 A
pattern 2:40 0123456789abcdef0011223344556677 1.079 0.128 B
pattern 2:40 0123456789abcdef0011223344556677 1.147 0.126 AB
pattern all 0123456789abcdef0011223344556677 1.195 0.104
....

The key and strength can be set using the command line options:

--key <filename>::
Use watermarking key from file <filename> (see <<key>>).

--strength <s>::
Set the watermarking strength (see <<strength>>).

== Output as Stream

Usually, an input file is read, watermarked and an output file is written.
This means that it takes some time before the watermarked file can be used.

An alternative is to output the watermarked file as a stream to stdout. One use
case is sending the watermarked file to a user via network while the
watermarker is still working on the rest of the file. Here is an example of how
to watermark a wav file to stdout:

....
audiowmark add in.wav - 0123456789abcdef0011223344556677 | play -
....

In this case the file `in.wav` is read, watermarked, and the output is sent
to stdout. `play -` can start playing the watermarked stream while the
rest of the file is being watermarked.

If `-` is used as output, the output is a valid .wav file, so the programs
running after `audiowmark` will be able to determine sample rate, number of
channels, bit depth, encoding and so on from the wav header.

Note that all input formats supported by `audiowmark` can be used in this way,
for instance flac/mp3:

....
audiowmark add in.flac - 0123456789abcdef0011223344556677 | play -
audiowmark add in.mp3 - 0123456789abcdef0011223344556677 | play -
....
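
A program that consumes the stream, for instance a network server, can read the output chunk by chunk while the watermarker is still running. The following is a generic sketch using Python's `subprocess` module. The function name `stream_stdout` and the chunk size are assumptions, and the demo under `__main__` uses a stand-in command so that it runs without `audiowmark` installed.

```python
import subprocess
import sys


def stream_stdout(cmd, chunk_size=65536):
    """Run cmd and yield its stdout chunk by chunk while it is still running."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        while True:
            chunk = proc.stdout.read(chunk_size)
            if not chunk:
                break  # end of stream
            yield chunk
    # Leaving the with-block waits for the process, so returncode is set here.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)


if __name__ == "__main__":
    # Stand-in command; with audiowmark installed this could instead be
    # ["audiowmark", "add", "in.wav", "-", "0123456789abcdef0011223344556677"].
    demo = [sys.executable, "-c", "print('watermarked data would appear here')"]
    for chunk in stream_stdout(demo):
        sys.stdout.buffer.write(chunk)
```

Each chunk can be forwarded to the client as soon as it is read, so the first bytes go out before watermarking has finished.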

== Input from Stream

Similar to the output, the `audiowmark` input can be a stream. In this case,
the input must be a valid .wav file. The watermarker will be able to
start watermarking the input stream before all data is available. An
example would be:

....
cat in.wav | audiowmark add - out.wav 0123456789abcdef0011223344556677
....

It is possible to combine both, reading input from a stream and writing output
as a stream:

....
cat in.wav | audiowmark add - - 0123456789abcdef0011223344556677 | play -
....

Streaming input is also supported for watermark detection:

....
cat in.wav | audiowmark get -
....

== Raw Streams

So far, all streams described here are essentially wav streams, which means
that the wav header allows `audiowmark` to determine sample rate, number of
channels, bit depth, encoding and so forth from the stream itself, and that a
wav header is written for the program after `audiowmark`, so that it can
figure out the parameters of the stream.

There are two cases where this is problematic. The first case is if the full
length of the stream is not known at the time processing starts. Then a wav
header cannot be used, as the wav header contains the length of the stream. The
second case is that the program before or after `audiowmark` doesn't support
wav headers.
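
The length dependency of the wav header can be seen with Python's standard `wave` module. In this sketch, which is independent of `audiowmark`, a wav file is written into memory and the 32-bit size field at byte offset 4 of the RIFF header is read back. The helper names `wav_bytes` and `riff_size` are made up for this example.

```python
import io
import struct
import wave


def wav_bytes(n_frames, rate=48000, channels=2, sample_width=2):
    """Create an in-memory wav file containing n_frames frames of silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(rate)
        w.writeframes(bytes(n_frames * channels * sample_width))
    return buf.getvalue()


def riff_size(data):
    """Return the little-endian size field stored at byte offset 4 of a RIFF/WAVE file."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    return struct.unpack("<I", data[4:8])[0]


if __name__ == "__main__":
    for n_frames in (100, 200):
        data = wav_bytes(n_frames)
        print(n_frames, len(data), riff_size(data))
```

The size field changes with the amount of audio data, so a complete header can only be written once the total length is known.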

For these two cases, raw streams are available. The idea is to set all
information that is needed, like sample rate, number of channels and so on,
manually. Then, headerless data can be processed from stdin and/or sent to
stdout.

--input-format raw::
--output-format raw::
--format raw::
These can be used to set the input format or output format to raw. The
last variant sets both input and output format to raw.

--raw-rate <rate>::
This should be used to set the sample rate. The input sample rate and
the output sample rate will always be the same (no resampling is
done by the watermarker). There is no default for the sample rate,
so this parameter must always be specified for raw streams.

--raw-input-bits <bits>::
--raw-output-bits <bits>::
--raw-bits <bits>::
These options can be used to set the input number of bits, the output number
of bits or both. The number of bits can either be `16` or `24`. The default
number of bits is `16`.

--raw-input-endian <endian>::
--raw-output-endian <endian>::
--raw-endian <endian>::
These options can be used to set the input endianness, the output endianness
or both. The <endian> parameter can either be `little` or `big`. The default
endianness is `little`.

--raw-input-encoding <encoding>::
--raw-output-encoding <encoding>::
--raw-encoding <encoding>::
These options can be used to set the input encoding, the output encoding or
both. The <encoding> parameter can either be `signed` or `unsigned`. The
default encoding is `signed`.

--raw-channels <channels>::
This can be used to set the number of channels. Note that the number
of input channels and the number of output channels must always be the
same. The watermarker has been designed and tested for stereo files,
so the number of channels should really be `2`. This is also the
default.
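
To make the raw format more concrete, the following sketch packs stereo samples as headerless signed 16-bit PCM and lists options that describe such data. It assumes that the channels are interleaved sample by sample, which is the usual convention for raw PCM but is not stated in this document. The names `pack_raw_pcm` and `RAW_OPTIONS` as well as the sample rate of 48000 are illustrative choices.

```python
import struct

# Options describing the data produced below, using the flags listed above.
# Only --format and --raw-rate are strictly needed here, since the other
# values are the documented defaults.
RAW_OPTIONS = [
    "--format", "raw",
    "--raw-rate", "48000",
    "--raw-bits", "16",
    "--raw-endian", "little",
    "--raw-encoding", "signed",
    "--raw-channels", "2",
]


def pack_raw_pcm(frames, endian="little"):
    """Pack (left, right) integer pairs as interleaved signed 16-bit PCM, no header."""
    prefix = "<" if endian == "little" else ">"
    out = bytearray()
    for left, right in frames:
        out += struct.pack(prefix + "hh", left, right)
    return bytes(out)


if __name__ == "__main__":
    data = pack_raw_pcm([(0, 0), (1000, -1000), (32767, -32768)])
    print(len(data), data.hex())
```

Such data could then be piped into a call like `audiowmark add --format raw --raw-rate 48000 - - <message>`, following the option placement used in the examples of this document.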

== Dependencies

If you compile from source, `audiowmark` needs the following libraries:

* libfftw3
* libsndfile
* libgcrypt
* libzita-resampler
* libmpg123

== Building fftw

`audiowmark` needs the single precision variant of fftw3.

If you are building fftw3 from source, use the `--enable-float`
configure parameter to build it, e.g.:

....
cd ${FFTW3_SOURCE}
./configure --enable-float --enable-sse && \
make && \
sudo make install
....

or, when building from git:

....
cd ${FFTW3_GIT}
./bootstrap.sh --enable-shared --enable-sse --enable-float && \
make && \
sudo make install
....

== Docker Build

You should be able to execute `audiowmark` via Docker.
Example that outputs the usage message:

....
docker build -t audiowmark .
docker run -v <local-data-directory>:/data -it audiowmark -h
....