  1. = audiowmark - Audio Watermarking
  2. == Description
  3. `audiowmark` is an Open Source (GPL) solution for audio watermarking.
  4. A sound file is read by the software, and a 128-bit message is stored in a
  5. watermark in the output sound file. For human listeners, the files typically
  6. sound the same.
  7. However, the 128-bit message can be retrieved from the output sound file. Our
  8. tests show that even if the file is converted to mp3 or ogg (with bitrate 128
  9. kbit/s or higher), the watermark usually can be retrieved without problems. The
  10. process of retrieving the message does not need the original audio file (blind
  11. decoding).
  12. Internally, audiowmark uses the patchwork algorithm to hide the data in the
  13. spectrum of the audio file. The signal is split into 1024-sample frames. For
  14. each frame, some pseudo-randomly selected amplitudes of the frequency bands of
  15. a 1024-value FFT are increased or decreased slightly, which can be detected
  16. later. The algorithm used here is inspired by
  17. Martin Steinebach: Digitale Wasserzeichen für Audiodaten.
  18. Darmstadt University of Technology 2004, ISBN 3-8322-2507-2
  19. If you are interested in the details of how `audiowmark` works, there is
  20. a separate
  21. https://uplex.de/audiowmark/audiowmark-developer.pdf[*documentation for developers*].
  22. == Open Source License
  23. `audiowmark` is *open source* software available under the *GPLv3
  24. or later* license.
  25. Copyright (C) 2018-2020 Stefan Westerfeld
  26. This program is free software: you can redistribute it and/or modify
  27. it under the terms of the GNU General Public License as published by
  28. the Free Software Foundation, either version 3 of the License, or
  29. (at your option) any later version.
  30. This program is distributed in the hope that it will be useful,
  31. but WITHOUT ANY WARRANTY; without even the implied warranty of
  32. MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  33. GNU General Public License for more details.
  34. You should have received a copy of the GNU General Public License
  35. along with this program. If not, see <http://www.gnu.org/licenses/>.
  36. == Adding a Watermark
  37. To add a watermark to the soundfile in.wav with a 128-bit message (which is
  38. specified as hex-string):
  39. [subs=+quotes]
  40. ....
  41. *$ audiowmark add in.wav out.wav 0123456789abcdef0011223344556677*
  42. Input: in.wav
  43. Output: out.wav
  44. Message: 0123456789abcdef0011223344556677
  45. Strength: 10
  46. Time: 3:59
  47. Sample Rate: 48000
  48. Channels: 2
  49. Data Blocks: 4
  50. ....
  51. If you want to use `audiowmark` in any serious application, please read the
  52. section <<rec-payload>> on how to generate the 128-bit message. Typically these
  53. bits should be a *hash* or *HMAC* of some sort.
  54. The most important options for adding a watermark are:
  55. --key <filename>::
  56. Use watermarking key from file <filename> (see <<key>>).
  57. --strength <s>::
  58. Set the watermarking strength (see <<strength>>).
  59. == Retrieving a Watermark
  60. To get the 128-bit message from the watermarked file, use:
  61. [subs=+quotes]
  62. ....
  63. *$ audiowmark get out.wav*
  64. pattern 0:05 0123456789abcdef0011223344556677 1.324 0.059 A
  65. pattern 0:57 0123456789abcdef0011223344556677 1.413 0.112 B
  66. pattern 0:57 0123456789abcdef0011223344556677 1.368 0.086 AB
  67. pattern 1:49 0123456789abcdef0011223344556677 1.302 0.098 A
  68. pattern 2:40 0123456789abcdef0011223344556677 1.361 0.093 B
  69. pattern 2:40 0123456789abcdef0011223344556677 1.331 0.096 AB
  70. pattern all 0123456789abcdef0011223344556677 1.350 0.054
  71. ....
  72. The output of `audiowmark get` is designed to be machine readable. Each line
  73. that starts with `pattern` contains one decoded message. The fields are
  74. separated by one or more space characters. The first field is a *timestamp*
  75. indicating the position of the data block. The second field is the *decoded
  76. message*. For most purposes this is all you need to know.
  77. The software was designed under the assumption that the message is a *hash* or
  78. *HMAC* of some sort. Before you start using `audiowmark` in any serious
  79. application, please read the section <<rec-payload>>. You - the user - should
  80. be able to decide whether a message is correct or not. To do this, on
  81. watermarking song files, you could *create a database entry* for each message
  82. you embedded in a watermark. During retrieval, you should perform a *database
  83. lookup* for each pattern `audiowmark get` outputs. If the message is not found,
  84. then you should assume that a decoding error occurred. In our example each
  85. pattern was decoded correctly, because the watermark was not damaged at all,
  86. but if you for instance use lossy compression (with a low bitrate), it may
  87. happen that only some of the decoded patterns are correct. Or none, if the
  88. watermark was damaged too much.
  89. The third field is the *sync score* (higher is better). The synchronization
  90. algorithm tries to find valid data blocks in the audio file that become
  91. candidates for decoding.
  92. The fourth field is the *decoding error* (lower is better). During message
  93. decoding, we use convolutional codes for error correction, to make the
  94. watermarking more robust.
  95. The fifth field is the *block type*. There are two types of data blocks,
  96. A blocks and B blocks. A single data block can be decoded alone, as it
  97. contains a complete message. However, if during watermark detection an
  98. A block followed by a B block was found, these two can be decoded
  99. together (then this field will be AB), resulting in even higher error
  100. correction capacity than one block alone would have.
  101. To improve the error correction capacity even further, the `all` pattern
  102. combines all data blocks that are available. The combined decoded
  103. message will often be the most reliable result (meaning that even if all
  104. other patterns were incorrect, this could still be right).
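Because the fields are separated by whitespace, the output can be post-processed with standard tools. The following is a minimal sketch and not part of `audiowmark`: the function names `wm_messages` and `wm_all_message` are made up for this example, and the code assumes the line layout shown above.

```shell
# Print "<timestamp> <message>" for every pattern line read from stdin.
# Assumed layout: pattern <time> <message> <sync score> <decoding error> [<type>]
wm_messages() {
  awk '$1 == "pattern" { print $2, $3 }'
}

# Print only the message of the combined "all" pattern, if there is one.
wm_all_message() {
  awk '$1 == "pattern" && $2 == "all" { print $3 }'
}
```

A possible use is `audiowmark get out.wav | wm_all_message`, which prints nothing if no `all` pattern was reported.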
  105. The most important options for getting a watermark are:
  106. --key <filename>::
  107. Use watermarking key from file <filename> (see <<key>>).
  108. --strength <s>::
  109. Set the watermarking strength (see <<strength>>).
  110. --detect-speed::
  111. --detect-speed-patient::
  112. Detect and correct replay speed difference (see <<speed>>).
  113. --json <file>::
  114. Write results to <file> in machine readable JSON format.
  115. [[key]]
  116. == Watermark Key
  117. Since the software is Open Source, a watermarking key should be used to ensure
  118. that the message bits cannot be retrieved by somebody else (which would also
  119. allow removing the watermark without loss of quality). The watermark key
  120. controls all pseudo-random parameters of the algorithm. This means that
  121. it determines which frequency bands are increased or decreased to store a
  122. 0 bit or a 1 bit. Without the key, it is impossible to decode the message
  123. bits from the audio file alone.
  124. Our watermarking key is a 128-bit AES key. A key can be generated using
  125. audiowmark gen-key test.key
  126. and can be used for the add/get commands as follows:
  127. audiowmark add --key test.key in.wav out.wav 0123456789abcdef0011223344556677
  128. audiowmark get --key test.key out.wav
  129. Keys can be named using the `gen-key --name` option, and the key name will be
  130. reported for each match:
  131. audiowmark gen-key oct23.key --name "October 2023"
  132. Finally, it is possible to use the `--key` option more than once for watermark
  133. detection. In this case, all keys that are specified will be tried. This is
  134. useful if you change keys on a regular basis, and passing multiple keys is
  135. more efficient than performing watermark detection multiple times with one
  136. key.
  137. audiowmark get --key oct23.key --key nov23.key --key dec23.key out.wav
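If the keys are rotated regularly, the list of `--key` options can be built from a directory. This is only a sketch under the assumption that all key files live in one directory and end in `.key`; the helper name `get_with_all_keys` is hypothetical.

```shell
# Run watermark detection with every *.key file found in a directory.
# Usage: get_with_all_keys <key-directory> <audio-file>
get_with_all_keys() (
  dir=$1
  file=$2
  set --                        # start with an empty option list
  for k in "$dir"/*.key; do
    [ -e "$k" ] || continue     # skip the unexpanded glob if no key exists
    set -- "$@" --key "$k"      # append one --key option per key file
  done
  audiowmark get "$@" "$file"
)
```

For instance, `get_with_all_keys keys out.wav` would try every key stored in the `keys` directory in a single detection run.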
  138. [[strength]]
  139. == Watermark Strength
  140. The watermark strength parameter affects how much the watermarking algorithm
  141. modifies the input signal. A stronger watermark is more audible, but also more
  142. robust against modifications. The default strength is 10. A watermark with that
  143. strength is recoverable after mp3/ogg encoding with 128kbit/s or higher. In our
  144. informal listening tests, this setting also has a very good subjective quality.
  145. A higher strength (for instance 15) is helpful if robustness against
  146. multiple conversions or conversions to low bit rates (e.g. 64 kbit/s) is
  147. desired.
  148. A lower strength (for instance 6) makes the watermark less audible, but also
  149. less robust. Strengths below 5 are not recommended. To set the strength, the
  150. same value has to be passed both when generating and when retrieving the
  151. watermark. Fractional strengths (like 7.5) are possible.
  152. audiowmark add --strength 15 in.wav out.wav 0123456789abcdef0011223344556677
  153. audiowmark get --strength 15 out.wav
  154. [[rec-payload]]
  155. == Recommendations for the Watermarking Payload
  156. Although `audiowmark` does not specify what the 128-bit message stored in the
  157. watermark should be, it was designed under the assumption that the message
  158. should be a *hash* or *HMAC* of some sort.
  159. Let's look at a typical use case. We have a song called *Dreams* by an artist
  160. called *Alice*. A user called *John Smith* downloads a watermarked copy.
  161. Later, we find this file somewhere on the internet. Typically we want to answer
  162. the questions:
  163. * is this one of the files we previously watermarked?
  164. * what song/artist is this?
  165. * which user shared it?
  166. _When the user downloads a watermarked copy_, we construct a string that
  167. contains all information we need to answer our questions, for example
  168. like this:
  169. Artist:Alice|Title:Dreams|User:John Smith
  170. To obtain the 128-bit message, we can hash this string, for instance by
  171. using the first 128 bits of a SHA-256 hash like this:
  172. $ STRING='Artist:Alice|Title:Dreams|User:John Smith'
  173. $ MSG=`echo -n "$STRING" | sha256sum | head -c 32`
  174. $ echo $MSG
  175. ecd057f0d1fbb25d6430b338b5d72eb2
  176. This 128-bit message can be used as watermark:
  177. $ audiowmark add --key my.key song.wav song.wm.wav $MSG
  178. At this point, we should also *create a database entry* consisting of the
  179. hash value `$MSG` and the corresponding string `$STRING`.
  180. The shell commands for creating the hash are listed here to provide a
  181. simplified example. Fields (like the song title) can contain the characters `'`
  182. and `|`, so these cases need to be dealt with.
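One way to deal with these cases is to escape the separator inside each field and to pass the fields as quoted shell arguments. The sketch below is one possible convention, not something `audiowmark` prescribes: the function names and the choice of backslash escaping are assumptions made for this example.

```shell
# Escape "\" and "|" in one field, so that "|" can be used as separator.
escape_field() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/|/\\|/g'
}

# Build the payload string from artist, title and user.
payload_string() {
  printf 'Artist:%s|Title:%s|User:%s' \
    "$(escape_field "$1")" "$(escape_field "$2")" "$(escape_field "$3")"
}

# Hash the payload string and keep the first 128 bits (32 hex digits).
payload_message() {
  payload_string "$@" | sha256sum | head -c 32
}
```

For fields without special characters, `payload_string Alice Dreams "John Smith"` produces the same string as in the example above. Since the fields are passed as arguments and never re-parsed by the shell, a `'` in a title only needs the usual quoting at the call site.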
  183. _If we find a watermarked copy of the song on the net_, the first step is to
  184. detect the watermark message using
  185. $ audiowmark get --key my.key song.wm.wav
  186. pattern 0:05 ecd057f0d1fbb25d6430b338b5d72eb2 1.377 0.068 A
  187. pattern 0:57 ecd057f0d1fbb25d6430b338b5d72eb2 1.392 0.109 B
  188. [...]
  189. The second step is to perform a *database lookup* for each result returned by
  190. `audiowmark`. If we find a matching entry in our database, this is one of the
  191. files we previously watermarked.
  192. As a last step, we can use the string stored in the database, which contains
  193. the song/artist and the user that shared it.
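The database can be anything that maps a message to its string. As a minimal sketch, the following uses a plain text file with one `message<TAB>string` entry per line; the file format and the function names `db_add` and `db_lookup` are assumptions for this example, and a real service would probably use a proper database.

```shell
# Append one entry to the database file.
# Usage: db_add <db-file> <message> <string>
db_add() {
  printf '%s\t%s\n' "$2" "$3" >> "$1"
}

# Read `audiowmark get` output from stdin and print the stored string for
# each distinct decoded message that has a database entry.
# Usage: audiowmark get ... | db_lookup <db-file>
db_lookup() {
  awk -v db="$1" '
    BEGIN {
      # load the database: text before the first tab is the message
      while ((getline line < db) > 0) {
        tab = index(line, "\t")
        if (tab > 0)
          known[substr(line, 1, tab - 1)] = substr(line, tab + 1)
      }
    }
    # report every known message once, ignore everything else
    $1 == "pattern" && ($3 in known) && !seen[$3]++ { print known[$3] }
  '
}
```

Patterns that are not in the database produce no output, which matches the recommendation to treat them as decoding errors.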
  194. _The advantages of using a hash as message are:_
  195. 1. Although `audiowmark` sometimes produces *false positives*, this doesn't
  196. matter, because it is extremely unlikely that a false positive will match an
  197. existing database entry.
  198. 2. Even if a few *bit errors* occur, it is extremely unlikely that a song
  199. watermarked for user A will be attributed to user B, simply because all hash
  200. bits depend on the user. So this is a much better payload than storing a user
  201. ID, artist ID and song ID in the message bits directly.
  202. 3. It is *easy to extend*, because we can add any fields we need to the hash
  203. string. For instance, if we want to store the name of the album, we can simply
  204. add it to the string.
  205. 4. If the hash matches exactly, it is really *hard to deny* that it was this
  206. user who shared the song. How else could all 128 bits of the hash match the
  207. message bits decoded by `audiowmark`?
  208. [[speed]]
  209. == Speed Detection
  210. If a watermarked audio signal is played back a little faster or slower than the
  211. original speed, watermark detection will fail. This could happen by accident if
  212. the digital watermark was converted to an analog signal and back and the
  213. original speed was not (exactly) preserved. It could also be done intentionally
  214. as an attack to prevent the watermark from being detected.
  215. In order to be able to find the watermark in these cases, `audiowmark` can try
  216. to figure out the speed difference to the original audio signal and correct the
  217. replay speed before detecting the watermark. The search range for the replay
  218. speed is approximately *[0.8..1.25]*.
  219. Example: add a watermark to `in.wav` and increase the replay speed by 5% using
  220. `sox`.
  221. [subs=+quotes]
  222. ....
  223. *$ audiowmark add in.wav out.wav 0123456789abcdef0011223344556677*
  224. [...]
  225. *$ sox out.wav out1.wav speed 1.05*
  226. ....
  227. Without speed detection, we get no results. With speed detection the speed
  228. difference is detected and corrected so we get results.
  229. [subs=+quotes]
  230. ....
  231. *$ audiowmark get out1.wav*
  232. *$ audiowmark get out1.wav --detect-speed*
  233. speed 1.049966
  234. pattern 0:05 0123456789abcdef0011223344556677 1.209 0.147 A-SPEED
  235. pattern 0:57 0123456789abcdef0011223344556677 1.301 0.143 B-SPEED
  236. pattern 0:57 0123456789abcdef0011223344556677 1.255 0.145 AB-SPEED
  237. pattern 1:49 0123456789abcdef0011223344556677 1.380 0.173 A-SPEED
  238. pattern all 0123456789abcdef0011223344556677 1.297 0.130 SPEED
  239. ....
  240. The speed detection algorithm is not enabled by default because it is
  241. relatively slow (total CPU time required) and needs a lot of memory. However,
  242. the search is automatically run in parallel using many threads on systems with
  243. many CPU cores. So on good hardware it makes sense to always enable this option
  244. to be robust to replay speed attacks.
  245. There are two versions of the speed detection algorithm, `--detect-speed` and
  246. `--detect-speed-patient`. The difference is that the patient version takes
  247. more CPU time to detect the speed, but produces more accurate results.
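On hardware where running speed detection every time is too expensive, one option is to try plain detection first and retry with `--detect-speed` only if nothing was found. This is a sketch of that idea, not a built-in feature; the wrapper name `get_with_speed_fallback` is hypothetical, and it assumes that a failed detection prints no `pattern` lines, as in the example above.

```shell
# Try detection at the original speed first, then fall back to speed detection.
# Usage: get_with_speed_fallback [options] <audio-file>
get_with_speed_fallback() (
  result=$(audiowmark get "$@")
  if ! printf '%s\n' "$result" | grep -q '^pattern'; then
    # nothing decoded at the original speed: retry with speed detection
    result=$(audiowmark get --detect-speed "$@")
  fi
  if [ -n "$result" ]; then printf '%s\n' "$result"; fi
)
```

The wrapper costs one extra detection run for files with a changed speed, in exchange for skipping the speed search on files that decode directly.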
  248. == Short Payload (experimental)
  249. By default, the watermark will store a 128-bit message. In this mode, we
  250. recommend using a 128-bit hash (or HMAC) as payload. No error checking is
  251. performed; the user needs to test patterns that the watermarker decodes to
  252. ensure that they really are one of the expected patterns, not a decoding
  253. error.
  254. As an alternative, an experimental short payload option is available, for very
  255. short payloads (12, 16 or 20 bits). It is enabled using the `--short <bits>`
  256. command line option, for instance for 16 bits:
  257. audiowmark add --short 16 in.wav out.wav abcd
  258. audiowmark get --short 16 out.wav
  259. Internally, a larger set of bits is sent to ensure that decoded short patterns
  260. are really valid, so in this mode, error checking is performed after decoding,
  261. and only valid patterns are reported.
  262. Besides error checking, the advantage of a short payload is that fewer bits
  263. need to be sent, so decoding is more likely to be successful on shorter
  264. clips.
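In the 16-bit example above the message `abcd` has four hex digits, one per four bits. Assuming the same convention holds for the other sizes (this is an assumption, not something stated here), a caller could check a short message before passing it on. The helper name `short_payload_ok` is made up for this sketch.

```shell
# Succeed if <message> is a hex string with exactly <bits>/4 digits.
# Usage: short_payload_ok <bits> <message>
short_payload_ok() {
  case "$1" in
    12|16|20) ;;                  # payload sizes listed above
    *) return 1 ;;
  esac
  case "$2" in
    ''|*[!0-9a-fA-F]*) return 1 ;; # empty or contains a non-hex character
  esac
  [ "${#2}" -eq $(( $1 / 4 )) ]
}
```

For example, `short_payload_ok 16 abcd` succeeds, while `short_payload_ok 16 abc` fails because only three digits were given.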
  265. == Video Files
  266. For video files, `videowmark` can be used to add a watermark to the audio track
  267. of video files. To add a watermark, use
  268. [subs=+quotes]
  269. ....
  270. *$ videowmark add in.avi out.avi 0123456789abcdef0011223344556677*
  271. Audio Codec: -c:a mp3 -ab 128000
  272. Input: in.avi
  273. Output: out.avi
  274. Message: 0123456789abcdef0011223344556677
  275. Strength: 10
  276. Time: 3:53
  277. Sample Rate: 44100
  278. Channels: 2
  279. Data Blocks: 4
  280. ....
  281. To detect a watermark, use
  282. [subs=+quotes]
  283. ....
  284. *$ videowmark get out.avi*
  285. pattern 0:05 0123456789abcdef0011223344556677 1.294 0.142 A
  286. pattern 0:57 0123456789abcdef0011223344556677 1.191 0.144 B
  287. pattern 0:57 0123456789abcdef0011223344556677 1.242 0.145 AB
  288. pattern 1:49 0123456789abcdef0011223344556677 1.215 0.120 A
  289. pattern 2:40 0123456789abcdef0011223344556677 1.079 0.128 B
  290. pattern 2:40 0123456789abcdef0011223344556677 1.147 0.126 AB
  291. pattern all 0123456789abcdef0011223344556677 1.195 0.104
  292. ....
  293. The key and strength can be set using the command line options
  294. --key <filename>::
  295. Use watermarking key from file <filename> (see <<key>>).
  296. --strength <s>::
  297. Set the watermarking strength (see <<strength>>).
  298. Videos can be watermarked on-the-fly using <<hls>>.
  299. == Output as Stream
  300. Usually, an input file is read, watermarked and an output file is written.
  301. This means that it takes some time before the watermarked file can be used.
  302. An alternative is to output the watermarked file as stream to stdout. One use
  303. case is sending the watermarked file to a user via network while the
  304. watermarker is still working on the rest of the file. Here is an example of how to
  305. watermark a wav file to stdout:
  306. audiowmark add in.wav - 0123456789abcdef0011223344556677 | play -
  307. In this case the file in.wav is read, watermarked, and the output is sent
  308. to stdout. The `play -` command can start playing the watermarked stream while the
  309. rest of the file is being watermarked.
  310. If `-` is used as output, the output is a valid .wav file, so the programs
  311. running after `audiowmark` will be able to determine sample rate, number of
  312. channels, bit depth, encoding and so on from the wav header.
  313. Note that all input formats supported by audiowmark can be used in this way,
  314. for instance flac/mp3:
  315. audiowmark add in.flac - 0123456789abcdef0011223344556677 | play -
  316. audiowmark add in.mp3 - 0123456789abcdef0011223344556677 | play -
  317. == Input from Stream
  318. Similar to the output, the `audiowmark` input can be a stream. In this case,
  319. the input must be a valid .wav file. The watermarker will be able to
  320. start watermarking the input stream before all data is available. An
  321. example would be:
  322. cat in.wav | audiowmark add - out.wav 0123456789abcdef0011223344556677
  323. It is possible to do both, input from stream and output as stream.
  324. cat in.wav | audiowmark add - - 0123456789abcdef0011223344556677 | play -
  325. Streaming input is also supported for watermark detection.
  326. cat in.wav | audiowmark get -
  327. == Wav Pipe Format
  328. In some cases, the length of the streaming input is not known by the program
  329. that produces the stream. For instance, consider an mp3 that is being decoded by
  330. madplay.
  331. cat in.mp3 |
  332. madplay -o wave:- - |
  333. audiowmark add - out.wav f0
  334. Since madplay doesn't know the length of the output when it starts decoding the
  335. mp3, the best it can do is to fill the wav header with a big number. And
  336. indeed, audiowmark will watermark the stream, but also print a warning like
  337. this:
  338. audiowmark: warning: unexpected EOF; input frames (1073741823) != output frames (8316288)
  339. This may sound harmless, but for very long input streams, this will also
  340. truncate the audio input after this length. If you already know that you need
  341. to input a wav file from a pipe (without correct length in the header) and
  342. simply want to watermark all of it, it is better to use the `wav-pipe` format:
  343. cat in.mp3 |
  344. madplay -o wave:- - |
  345. audiowmark add --input-format wav-pipe --output-format rf64 - out.wav f0
  346. This will not print a warning, and it also works correctly for long input
  347. streams. Note that using `rf64` as output format is necessary for huge output
  348. files (larger than 4G).
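Whether the output will exceed that size can be estimated from the duration and the sample format, since uncompressed audio takes `seconds * rate * channels * bytes per sample`. The following sketch does this arithmetic; the function names are made up for this example, and the wav header size is ignored.

```shell
# Size in bytes of uncompressed PCM audio.
# Usage: pcm_bytes <seconds> <sample-rate> <channels> <bits-per-sample>
pcm_bytes() {
  echo $(( $1 * $2 * $3 * ($4 / 8) ))
}

# Succeed if the audio data alone reaches 4 GiB (4294967296 bytes).
# Usage: needs_rf64 <seconds> <sample-rate> <channels> <bits-per-sample>
needs_rf64() {
  [ "$(pcm_bytes "$@")" -ge 4294967296 ]
}
```

One hour of 16-bit stereo audio at 48000 Hz is `pcm_bytes 3600 48000 2 16`, which gives 691200000 bytes, so in this format the limit is reached after a little more than six hours.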
  349. Similar to pipe input, audiowmark can write a wav header with a huge number (in
  350. cases where it does not know the length in advance) if the output format is set
  351. to `wav-pipe`.
  352. cat in.mp3 |
  353. madplay -o wave:- - |
  354. audiowmark add --input-format wav-pipe --output-format wav-pipe - - f0 |
  355. lame - > out.mp3
  356. If you need both, `wav-pipe` input and output, a shorter way to write it is
  357. using `--format wav-pipe`, like this:
  358. cat in.mp3 |
  359. madplay -o wave:- - |
  360. audiowmark add --format wav-pipe - - f0 |
  361. lame - > out.mp3
  362. == Raw Streams
  363. So far, all streams described here are essentially wav streams, which means
  364. that the wav header allows `audiowmark` to determine sample rate, number of
  365. channels, bit depth, encoding and so forth from the stream itself, and that a
  366. wav header is written for the program after `audiowmark`, so that it can
  367. figure out the parameters of the stream.
  368. If the program before or after `audiowmark` doesn't support wav headers, raw
  369. streams can be used instead. The idea is to set all information that is needed
  370. like sample rate, number of channels,... manually. Then, headerless data can
  371. be processed from stdin and/or sent to stdout.
  372. --input-format raw::
  373. --output-format raw::
  374. --format raw::
  375. These can be used to set the input format or output format to raw. The
  376. last version sets both, input and output format to raw.
  377. --raw-rate <rate>::
  378. This should be used to set the sample rate. The input sample rate and
  379. the output sample rate will always be the same (no resampling is
  380. done by the watermarker). There is no default for the sampling rate,
  381. so this parameter must always be specified for raw streams.
  382. --raw-input-bits <bits>::
  383. --raw-output-bits <bits>::
  384. --raw-bits <bits>::
  385. These options can be used to set the input number of bits, the output number
  386. of bits or both. The number of bits can either be `16` or `24`. The default
  387. number of bits is `16`.
  388. --raw-input-endian <endian>::
  389. --raw-output-endian <endian>::
  390. --raw-endian <endian>::
  391. These options can be used to set the input/output endianness or both.
  392. The <endian> parameter can either be `little` or `big`. The default
  393. endianness is `little`.
  394. --raw-input-encoding <encoding>::
  395. --raw-output-encoding <encoding>::
  396. --raw-encoding <encoding>::
  397. These options can be used to set the input/output encoding or both.
  398. The <encoding> parameter can either be `signed` or `unsigned`. The
  399. default encoding is `signed`.
  400. --raw-channels <channels>::
  401. This can be used to set the number of channels. Note that the number
  402. of input channels and the number of output channels must always be the
  403. same. The watermarker has been designed and tested for stereo files,
  404. so the number of channels should really be `2`. This is also the
  405. default.
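Putting these options together, a raw stdin-to-stdout filter could be wrapped as follows. This is a sketch that only uses the options described above; the wrapper name `wm_raw` is hypothetical, and endianness and encoding are left at their defaults (`little`, `signed`).

```shell
# Watermark a headerless PCM stream from stdin and write it to stdout.
# Usage: wm_raw <rate> <bits> <channels> <message>
wm_raw() (
  rate=$1
  bits=$2
  channels=$3
  msg=$4
  case "$bits" in
    16|24) ;;                    # the only bit depths listed above
    *) echo "wm_raw: bits must be 16 or 24" >&2; exit 1 ;;
  esac
  audiowmark add --format raw --raw-rate "$rate" --raw-bits "$bits" \
    --raw-channels "$channels" - - "$msg"
)
```

It would be used in a pipeline such as `[...] | wm_raw 44100 16 2 f0 | [...]`, where the programs before and after must agree on the same rate, bit depth and channel count.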
  406. == Other Command Line Options
  407. --output-format rf64::
  408. Regular wav files are limited to 4GB in size. By using this option,
  409. `audiowmark` will write RF64 wave files, which do not have this size limit.
  410. This is not the default because not all programs might be able to read RF64
  411. wave files.
  412. --q, --quiet::
  413. Disable all information messages generated by `audiowmark`.
  414. --strict::
  415. This option will enable strict error checking, which may in some situations
  416. make `audiowmark` return an error, where it could continue.
  417. [[hls]]
  418. == HTTP Live Streaming
  419. === Introduction for HLS
  420. HTTP Live Streaming (HLS) is a protocol to deliver audio or video streams via
  421. HTTP. One example for using HLS in practice would be: a user watches a video
  422. in a web browser with a player like `hls.js`. The user is free to
  423. play/pause/seek the video as they want. `audiowmark` can watermark the audio
  424. content while it is being transmitted to the user.
  425. HLS splits the contents of each stream into small segments. For the watermarker
  426. this means that if the user seeks to a position far ahead in the stream, the
  427. server needs to start sending segments from where the new play position is, but
  428. everything in between can be ignored.
  429. Another important property of HLS is that it allows separate segments for the
  430. video and audio stream of a video. Since we watermark only the audio track of a
  431. video, the video segments can be sent as they are (and different users can get
  432. the same video segments). What is watermarked are the audio segments only, so
  433. here instead of sending the original audio segments to the user, the audio
  434. segments are watermarked individually for each user, and then transmitted.
  435. Everything necessary to watermark HLS audio segments is available within
  436. `audiowmark`. The server side support which is necessary to send the right
  437. watermarked segment to the right user is not included.
  438. [[hls-requirements]]
  439. === HLS Requirements
  440. HLS support requires some headers/libraries from ffmpeg:
  441. * libavcodec
  442. * libavformat
  443. * libavutil
  444. * libswresample
  445. To enable these as dependencies and build `audiowmark` with HLS support, use the
  446. `--with-ffmpeg` configure option:
  447. [subs=+quotes]
  448. ....
  449. *$ ./configure --with-ffmpeg*
  450. ....
  451. In addition to the libraries, `audiowmark` also uses the two command line
  452. programs from ffmpeg, so they need to be installed:
  453. * ffmpeg
  454. * ffprobe
  455. === Preparing HLS segments
  456. The first step for preparing content for streaming with HLS would be splitting
  457. a video into segments. For this documentation, we use a very simple example
  458. using ffmpeg. No matter what the original codec was, at this point we force
  459. transcoding to AAC with our target bit rate, because during delivery the stream
  460. will be in AAC format.
  461. [subs=+quotes]
  462. ....
  463. *$ ffmpeg -i video.mp4 -f hls -master_pl_name replay.m3u8 -c:a aac -ab 192k \
  464. -var_stream_map "a:0,agroup:aud v:0,agroup:aud" \
  465. -hls_playlist_type vod -hls_list_size 0 -hls_time 10 vs%v/out.m3u8*
  466. ....
  467. This splits the `video.mp4` file into an audio stream of segments in the `vs0`
  468. directory and a video stream of segments in the `vs1` directory. Each segment
  469. is approximately 10 seconds long, and a master playlist is written to
  470. `replay.m3u8`.
  471. Now we can add the relevant audio context to each audio ts segment. This is
  472. necessary so that when the segment is watermarked in order to be transmitted to
  473. the user, `audiowmark` will have enough context available before and after the
  474. segment to create a watermark which sounds correct over segment boundaries.
  475. [subs=+quotes]
  476. ....
  477. *$ audiowmark hls-prepare vs0 vs0prep out.m3u8 video.mp4*
  478. AAC Bitrate: 195641 (detected)
  479. Segments: 18
  480. Time: 2:53
  481. ....
  482. This step reads the audio playlist `vs0/out.m3u8` and writes all segments
  483. contained in this audio playlist to a new directory `vs0prep` which
  484. contains the audio segments prepared for watermarking.
  485. The last argument in this command line is `video.mp4` again. All audio
  486. that is watermarked is taken from this audio master. It could also be
  487. supplied in `wav` format. This makes a difference if you use lossy
  488. compression as target format (for instance AAC), but your original
  489. video has an audio stream with higher quality (i.e. lossless).
  490. === Watermarking HLS segments
  491. So with all preparations made, what would the server have to do to send a
  492. watermarked version of the 6th audio segment `vs0prep/out5.ts`?
  493. [subs=+quotes]
  494. ....
  495. *$ audiowmark hls-add vs0prep/out5.ts send5.ts 0123456789abcdef0011223344556677*
  496. Message: 0123456789abcdef0011223344556677
  497. Strength: 10
  498. Time: 0:15
  499. Sample Rate: 44100
  500. Channels: 2
  501. Data Blocks: 0
  502. AAC Bitrate: 195641
  503. ....
  504. So instead of sending out5.ts (which has no watermark) to the user, we would
  505. send send5.ts, which is watermarked.
  506. In a real-world use case, it is likely that the server would supply the input
  507. segment on stdin and send the output segment as written to stdout, like this
  508. [subs=+quotes]
  509. ....
  510. *$ [...] | audiowmark hls-add - - 0123456789abcdef0011223344556677 | [...]*
  511. [...]
  512. ....
  513. The usual parameters are supported in `audiowmark hls-add`, like
  514. --key <filename>::
  515. Use watermarking key from file <filename> (see <<key>>).
  516. --strength <s>::
  517. Set the watermarking strength (see <<strength>>).
  518. The AAC bitrate for the output segment can be set using:
  519. --bit-rate <bit_rate>::
  520. Set the AAC bit-rate for the generated watermarked segment.
  521. The rules for the AAC bit-rate of the newly encoded watermarked segment are:
  522. * if the `--bit-rate` option is used during `hls-add`, this bit-rate will be used
  523. * otherwise, if the `--bit-rate` option is used during `hls-prepare`, this bit-rate will be used
  524. * otherwise, the bit-rate of the input material is detected during `hls-prepare`
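The rules above amount to a simple order of precedence. As an illustration only (this is not how `audiowmark` is implemented, and the function name is made up), a server-side script that wants to predict the resulting bit-rate could express it like this:

```shell
# Print the bit-rate that applies according to the rules above.
# Pass an empty string for an option that was not given.
# Usage: effective_bit_rate <hls-add rate> <hls-prepare rate> <detected rate>
effective_bit_rate() {
  if [ -n "$1" ]; then
    echo "$1"       # --bit-rate given to hls-add wins
  elif [ -n "$2" ]; then
    echo "$2"       # otherwise --bit-rate given to hls-prepare
  else
    echo "$3"       # otherwise the rate detected during hls-prepare
  fi
}
```

With the values from the examples above, `effective_bit_rate "" "" 195641` prints `195641`, the detected rate.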
  525. == Compiling from Source
  526. Stable releases are available from http://uplex.de/audiowmark
  527. The steps to compile the source code are:
  528. ./configure
  529. make
  530. make install
  531. If you build from git (which doesn't include `configure`), the first
  532. step is `./autogen.sh`. In this case, you need to ensure that (besides
  533. the dependencies listed below) the `autoconf-archive` package is
  534. installed.
  535. == Compiling from Source on Windows/Cygwin
  536. Windows is not an officially supported platform. However, if you want to
  537. build audiowmark (and videowmark) from source on Windows, one way to do
  538. so is to use Cygwin. Andreas Strohmeier provided
  539. https://raw.githubusercontent.com/swesterfeld/audiowmark/master/docs/win-x64-build-guide.txt[*build instructions for Cygwin*].
  540. == Dependencies
  541. If you compile from source, `audiowmark` needs the following libraries:
  542. * libfftw3
  543. * libsndfile
  544. * libgcrypt
  545. * libzita-resampler
  546. * libmpg123
  547. If you want to build with HTTP Live Streaming support, see also
  548. <<hls-requirements>>.
  549. == Building fftw
  550. `audiowmark` needs the single precision variant of fftw3.
  551. If you are building fftw3 from source, use the `--enable-float`
  552. configure parameter to build it, e.g.:
  553. cd ${FFTW3_SOURCE}
  554. ./configure --enable-float --enable-sse && \
  555. make && \
  556. sudo make install
  557. or, when building from git
  558. cd ${FFTW3_GIT}
  559. ./bootstrap.sh --enable-shared --enable-sse --enable-float && \
  560. make && \
  561. sudo make install
  562. == Docker Build
  563. You should be able to execute `audiowmark` via Docker.
  564. Example that outputs the usage message:
  565. docker build -t audiowmark .
  566. docker run -v <local-data-directory>:/data --rm -i audiowmark -h