- = audiowmark - Audio Watermarking
- == Description
- `audiowmark` is an Open Source (GPL) solution for audio watermarking.
- A sound file is read by the software, and a 128-bit message is stored in a
- watermark in the output sound file. For human listeners, the files typically
- sound the same.
- However, the 128-bit message can be retrieved from the output sound file. Our
- tests show that even if the file is converted to mp3 or ogg (with bitrate 128
- kbit/s or higher), the watermark usually can be retrieved without problems. The
- process of retrieving the message does not need the original audio file (blind
- decoding).
- Internally, audiowmark uses the patchwork algorithm to hide the data in the
- spectrum of the audio file. The signal is split into 1024-sample frames. For
- each frame, some pseudo-randomly selected amplitudes of the frequency bands of
- a 1024-value FFT are increased or decreased slightly, which can be detected
- later. The algorithm used here is inspired by
- Martin Steinebach: Digitale Wasserzeichen für Audiodaten.
- Darmstadt University of Technology 2004, ISBN 3-8322-2507-2
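The patchwork idea described above can be illustrated with a toy sketch. This is not audiowmark's implementation: the function names, the `key_seed` string, the `delta` value and the number of selected bands are all assumptions made for illustration, and Python's `random` module stands in for the key-driven pseudo-random selection. It operates on a plain list of band magnitudes (513 values, as a real 1024-point FFT would give) rather than on real audio.

```python
import random


def select_bands(key_seed, frame_index, n_bands, n_selected):
    """Pick two disjoint pseudo-random sets of band indices ('up' and 'down')."""
    rng = random.Random(f"{key_seed}:{frame_index}")  # hypothetical keying scheme
    picked = rng.sample(range(n_bands), 2 * n_selected)
    return picked[:n_selected], picked[n_selected:]


def embed_bit(magnitudes, bit, key_seed, frame_index, delta=0.05, n_selected=30):
    """Slightly raise one set of bands and lower the other; the bit decides which."""
    up, down = select_bands(key_seed, frame_index, len(magnitudes), n_selected)
    if bit == 0:
        up, down = down, up
    out = list(magnitudes)
    for i in up:
        out[i] *= 1 + delta
    for i in down:
        out[i] *= 1 - delta
    return out


def detect_bit(magnitudes, key_seed, frame_index, n_selected=30):
    """Blind detection: compare the two band sets without the original signal."""
    up, down = select_bands(key_seed, frame_index, len(magnitudes), n_selected)
    diff = sum(magnitudes[i] for i in up) - sum(magnitudes[i] for i in down)
    return 1 if diff > 0 else 0


flat_spectrum = [1.0] * 513  # idealized frame with equal energy in every band
marked = embed_bit(flat_spectrum, 1, "demo-key", frame_index=0)
print(detect_bit(marked, "demo-key", frame_index=0))
```

On a flat spectrum a single frame is enough to recover the bit. On real audio the two band sets differ in energy before any watermark is applied, so a practical detector has to accumulate evidence over many frames.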
- == Open Source License
- `audiowmark` is *open source* software available under the *GPLv3
- or later* license.
- Copyright (C) 2018-2020 Stefan Westerfeld
- This program is free software: you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
- You should have received a copy of the GNU General Public License
- along with this program. If not, see <http://www.gnu.org/licenses/>.
- == Adding a Watermark
- To add a watermark to the soundfile in.wav with a 128-bit message (which is
- specified as hex-string):
- [subs=+quotes]
- ....
- *$ audiowmark add in.wav out.wav 0123456789abcdef0011223344556677*
- Input: in.wav
- Output: out.wav
- Message: 0123456789abcdef0011223344556677
- Strength: 10
- Time: 3:59
- Sample Rate: 48000
- Channels: 2
- Data Blocks: 4
- ....
- The most important options for adding a watermark are:
- --key <filename>::
- Use watermarking key from file <filename> (see <<key>>).
- --strength <s>::
- Set the watermarking strength (see <<strength>>).
- == Retrieving a Watermark
- To get the 128-bit message from the watermarked file, use:
- [subs=+quotes]
- ....
- *$ audiowmark get out.wav*
- pattern 0:05 0123456789abcdef0011223344556677 1.324 0.059 A
- pattern 0:57 0123456789abcdef0011223344556677 1.413 0.112 B
- pattern 0:57 0123456789abcdef0011223344556677 1.368 0.086 AB
- pattern 1:49 0123456789abcdef0011223344556677 1.302 0.098 A
- pattern 2:40 0123456789abcdef0011223344556677 1.361 0.093 B
- pattern 2:40 0123456789abcdef0011223344556677 1.331 0.096 AB
- pattern all 0123456789abcdef0011223344556677 1.350 0.054
- ....
- The output of `audiowmark get` is designed to be machine readable. Each line
- that starts with `pattern` contains one decoded message. The fields are
- separated by one or more space characters. The first field is a *timestamp*
- indicating the position of the data block. The second field is the *decoded
- message*. For most purposes this is all you need to know.
- The software was designed under the assumption that you - the user - will be
- able to decide whether a message is correct or not. To do this, when watermarking
- song files, you could list each message you embedded in a database. During
- retrieval, you should look up each pattern `audiowmark get` outputs in the
- database. If the message is not found, then you should assume that a decoding
- error occurred. In our example each pattern was decoded correctly, because
- the watermark was not damaged at all, but if you for instance use lossy
- compression (with a low bitrate), it may happen that only some of the decoded
- patterns are correct. Or none, if the watermark was damaged too much.
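The lookup described above could be sketched as follows. The function name `verified_messages` is hypothetical, and a plain Python collection stands in for the database; the code only assumes the `pattern <timestamp> <message> ...` line layout shown in the example output.

```python
def verified_messages(get_output, known_messages):
    """Return decoded messages that also appear in the set of embedded messages."""
    known = {m.lower() for m in known_messages}
    found = []
    for line in get_output.splitlines():
        fields = line.split()
        # Expect at least: "pattern", timestamp, message
        if len(fields) >= 3 and fields[0] == "pattern":
            message = fields[2].lower()
            if message in known and message not in found:
                found.append(message)
    return found
```

If the returned list is empty, none of the decoded patterns matched an embedded message, and the result should be treated as a decoding error.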
- The third field is the *sync score* (higher is better). The synchronization
- algorithm tries to find valid data blocks in the audio file, that become
- candidates for decoding.
- The fourth field is the *decoding error* (lower is better). During message
- decoding, we use convolutional codes for error correction, to make the
- watermarking more robust.
- The fifth field is the *block type*. There are two types of data blocks,
- A blocks and B blocks. A single data block can be decoded alone, as it
- contains a complete message. However, if during watermark detection an
- A block followed by a B block was found, these two can be decoded
- together (then this field will be AB), resulting in even higher error
- correction capacity than one block alone would have.
- To improve the error correction capacity even further, the `all` pattern
- combines all data blocks that are available. The combined decoded
- message will often be the most reliable result (meaning that even if all
- other patterns were incorrect, this could still be right).
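Since the output is meant to be machine readable, a small parser may be useful. The following is a minimal sketch based only on the field layout described above; the names `parse_patterns` and `combined_result` and the dictionary keys are assumptions chosen for this example.

```python
def parse_patterns(get_output):
    """Parse `audiowmark get` output into a list of dictionaries."""
    results = []
    for line in get_output.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] != "pattern":
            continue
        results.append({
            "timestamp": fields[1],            # position of the block, or "all"
            "message": fields[2],              # decoded message as hex string
            "sync_score": float(fields[3]),    # higher is better
            "decoding_error": float(fields[4]),  # lower is better
            # The "all" line has no block type field.
            "block_type": fields[5] if len(fields) > 5 else None,
        })
    return results


def combined_result(patterns):
    """Return the entry that combines all data blocks, or None if absent."""
    for entry in patterns:
        if entry["timestamp"] == "all":
            return entry
    return None
```

A caller would typically check the message of the combined entry first and fall back to the individual A, B and AB entries if it is not one of the expected messages.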
- The most important options for getting a watermark are:
- --key <filename>::
- Use watermarking key from file <filename> (see <<key>>).
- --strength <s>::
- Set the watermarking strength (see <<strength>>).
- [[key]]
- == Watermark Key
- Since the software is Open Source, a watermarking key should be used to ensure
- that the message bits cannot be retrieved by somebody else (which would also
- allow removing the watermark without loss of quality). The watermark key
- controls all pseudo-random parameters of the algorithm. This means that
- it determines which frequency bands are increased or decreased to store a
- 0 bit or a 1 bit. Without the key, it is impossible to decode the message
- bits from the audio file alone.
- Our watermarking key is a 128-bit AES key. A key can be generated using
- audiowmark gen-key test.key
- and can be used for the add/get commands as follows:
- audiowmark add --key test.key in.wav out.wav 0123456789abcdef0011223344556677
- audiowmark get --key test.key out.wav
- [[strength]]
- == Watermark Strength
- The watermark strength parameter affects how much the watermarking algorithm
- modifies the input signal. A stronger watermark is more audible, but also more
- robust against modifications. The default strength is 10. A watermark with that
- strength is recoverable after mp3/ogg encoding with 128kbit/s or higher. In our
- informal listening tests, this setting also has a very good subjective quality.
- A higher strength (for instance 15) is helpful if robustness against
- multiple conversions or conversions to low bit rates (e.g. 64kbit/s) is
- desired.
- A lower strength (for instance 6) makes the watermark less audible, but also
- less robust. Strengths below 5 are not recommended. To set the strength, the
- same value has to be passed both when adding and when retrieving the
- watermark. Fractional strengths (like 7.5) are possible.
- audiowmark add --strength 15 in.wav out.wav 0123456789abcdef0011223344556677
- audiowmark get --strength 15 out.wav
- == Short Payload (experimental)
- By default, the watermark will store a 128-bit message. In this mode, we
- recommend using a 128-bit hash (or HMAC) as payload. No error checking is
- performed, so the user needs to test patterns that the watermarker decodes to
- ensure that they really are one of the expected patterns, not a decoding
- error.
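One way to obtain such a payload is sketched below. It assumes HMAC-SHA256 truncated to its first 16 bytes, which is a choice made for this example and not something audiowmark prescribes; the names `payload_for`, `is_expected_payload`, `secret` and `record_id` are hypothetical.

```python
import hashlib
import hmac


def payload_for(secret, record_id):
    """Derive a 128-bit payload (32 hex digits) from a secret and a record id."""
    digest = hmac.new(secret, record_id.encode("utf-8"), hashlib.sha256).digest()
    return digest[:16].hex()  # assumption: keep the first 128 bits


def is_expected_payload(decoded_hex, secret, record_id):
    """Check whether a decoded pattern is the payload for this record."""
    return hmac.compare_digest(decoded_hex.lower(), payload_for(secret, record_id))


message = payload_for(b"server-side secret", "customer-42/track-7")
print(message)  # 32 hex digits, usable as the message argument of `audiowmark add`
```

With this scheme the list of expected patterns does not need to be stored separately, because the payload for each candidate record can be recomputed and compared with the decoded pattern.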
- As an alternative, an experimental short payload option is available, for very
- short payloads (12, 16 or 20 bits). It is enabled using the `--short <bits>`
- command line option, for instance for 16 bits:
- audiowmark add --short 16 in.wav out.wav abcd
- audiowmark get --short 16 out.wav
- Internally, a larger set of bits is sent to ensure that decoded short patterns
- are really valid, so in this mode, error checking is performed after decoding,
- and only valid patterns are reported.
- Besides error checking, the advantage of a short payload is that fewer bits
- need to be sent, so decoding is more likely to be successful on shorter
- clips.
- == Video Files
- For video files, `videowmark` can be used to add a watermark to the audio track
- of video files. To add a watermark, use
- [subs=+quotes]
- ....
- *$ videowmark add in.avi out.avi 0123456789abcdef0011223344556677*
- Audio Codec: -c:a mp3 -ab 128000
- Input: in.avi
- Output: out.avi
- Message: 0123456789abcdef0011223344556677
- Strength: 10
- Time: 3:53
- Sample Rate: 44100
- Channels: 2
- Data Blocks: 4
- ....
- To detect a watermark, use
- [subs=+quotes]
- ....
- *$ videowmark get out.avi*
- pattern 0:05 0123456789abcdef0011223344556677 1.294 0.142 A
- pattern 0:57 0123456789abcdef0011223344556677 1.191 0.144 B
- pattern 0:57 0123456789abcdef0011223344556677 1.242 0.145 AB
- pattern 1:49 0123456789abcdef0011223344556677 1.215 0.120 A
- pattern 2:40 0123456789abcdef0011223344556677 1.079 0.128 B
- pattern 2:40 0123456789abcdef0011223344556677 1.147 0.126 AB
- pattern all 0123456789abcdef0011223344556677 1.195 0.104
- ....
- The key and strength can be set using the command line options
- --key <filename>::
- Use watermarking key from file <filename> (see <<key>>).
- --strength <s>::
- Set the watermarking strength (see <<strength>>).
- == Output as Stream
- Usually, an input file is read, watermarked and an output file is written.
- This means that it takes some time before the watermarked file can be used.
- An alternative is to output the watermarked file as stream to stdout. One use
- case is sending the watermarked file to a user via network while the
- watermarker is still working on the rest of the file. Here is an example of how to
- watermark a wav file to stdout:
- audiowmark add in.wav - 0123456789abcdef0011223344556677 | play -
- In this case the file in.wav is read, watermarked, and the output is sent
- to stdout. The `play -` command can start playing the watermarked stream while the
- rest of the file is being watermarked.
- If - is used as output, the output is a valid .wav file, so the programs
- running after `audiowmark` will be able to determine sample rate, number of
- channels, bit depth, encoding and so on from the wav header.
- Note that all input formats supported by audiowmark can be used in this way,
- for instance flac/mp3:
- audiowmark add in.flac - 0123456789abcdef0011223344556677 | play -
- audiowmark add in.mp3 - 0123456789abcdef0011223344556677 | play -
- == Input from Stream
- Similar to the output, the `audiowmark` input can be a stream. In this case,
- the input must be a valid .wav file. The watermarker will be able to
- start watermarking the input stream before all data is available. An
- example would be:
- cat in.wav | audiowmark add - out.wav 0123456789abcdef0011223344556677
- It is possible to combine input from a stream with output as a stream:
- cat in.wav | audiowmark add - - 0123456789abcdef0011223344556677 | play -
- Streaming input is also supported for watermark detection.
- cat in.wav | audiowmark get -
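When `audiowmark` is driven from another program instead of a shell pipeline, the same streaming setup might look like the sketch below. The helper names `add_command` and `watermark_stream` are hypothetical; the command line itself uses only the arguments and options shown in this document, and the `audiowmark` binary is assumed to be on the `PATH`.

```python
import subprocess


def add_command(message, source="-", target="-", key=None, strength=None):
    """Build the argument list for `audiowmark add`; "-" selects stdin/stdout."""
    cmd = ["audiowmark", "add"]
    if key is not None:
        cmd += ["--key", key]
    if strength is not None:
        cmd += ["--strength", str(strength)]
    cmd += [source, target, message]
    return cmd


def watermark_stream(wav_in, wav_out, message):
    """Pipe a wav stream through audiowmark.

    wav_in and wav_out must be binary file objects backed by real file
    descriptors (for instance opened files, pipes or sockets).
    """
    return subprocess.run(add_command(message), stdin=wav_in, stdout=wav_out, check=True)
```

For example, `watermark_stream(open("in.wav", "rb"), open("out.wav", "wb"), "0123456789abcdef0011223344556677")` corresponds to `cat in.wav | audiowmark add - - ... > out.wav`.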
- == Raw Streams
- So far, all streams described here are essentially wav streams, which means
- that the wav header allows `audiowmark` to determine sample rate, number of
- channels, bit depth, encoding and so forth from the stream itself, and that a
- wav header is written for the program after `audiowmark`, so that it can
- figure out the parameters of the stream.
- There are two cases where this is problematic. The first case is if the full
- length of the stream is not known at the time processing starts. Then a wav
- header cannot be used, as the wav header contains the length of the stream. The
- second case is that the program before or after `audiowmark` doesn't support wav
- headers.
- For these two cases, raw streams are available. The idea is to set all
- information that is needed, like sample rate and number of channels, manually.
- Then, headerless data can be processed from stdin and/or sent to stdout.
- --input-format raw::
- --output-format raw::
- --format raw::
- These can be used to set the input format or output format to raw. The
- last version sets both input and output format to raw.
- --raw-rate <rate>::
- This should be used to set the sample rate. The input sample rate and
- the output sample rate will always be the same (no resampling is
- done by the watermarker). There is no default for the sampling rate,
- so this parameter must always be specified for raw streams.
- --raw-input-bits <bits>::
- --raw-output-bits <bits>::
- --raw-bits <bits>::
- These options can be used to set the input number of bits, the output number
- of bits or both. The number of bits can either be `16` or `24`. The default
- number of bits is `16`.
- --raw-input-endian <endian>::
- --raw-output-endian <endian>::
- --raw-endian <endian>::
- These options can be used to set the input/output endianness or both.
- The <endian> parameter can either be `little` or `big`. The default
- endianness is `little`.
- --raw-input-encoding <encoding>::
- --raw-output-encoding <encoding>::
- --raw-encoding <encoding>::
- These options can be used to set the input/output encoding or both.
- The <encoding> parameter can either be `signed` or `unsigned`. The
- default encoding is `signed`.
- --raw-channels <channels>::
- This can be used to set the number of channels. Note that the number
- of input channels and the number of output channels must always be the
- same. The watermarker has been designed and tested for stereo files,
- so the number of channels should really be `2`. This is also the
- default.
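To make the defaults concrete, the following sketch produces headerless test data in the default raw layout (16 bits, little endian, signed, 2 channels) and builds a matching command line. The helper names are hypothetical, and the code assumes that the channels of a raw stream are interleaved sample by sample, as is usual for PCM data.

```python
import math
import struct


def sine_raw_pcm(frames, rate=44100, channels=2, freq=440.0, amplitude=0.5):
    """Return `frames` sample frames of a sine tone as raw PCM bytes.

    Layout: signed 16-bit little-endian samples, channels interleaved
    (interleaving is an assumption, see the text above).
    """
    out = bytearray()
    for n in range(frames):
        value = amplitude * 32767 * math.sin(2 * math.pi * freq * n / rate)
        out += struct.pack("<h", int(value)) * channels  # same sample on every channel
    return bytes(out)


def raw_add_command(message, rate):
    """Build `audiowmark add` arguments for raw input and raw output."""
    # --raw-rate has no default and must always be given for raw streams.
    return ["audiowmark", "add", "--format", "raw", "--raw-rate", str(rate),
            "-", "-", message]
```

The bytes returned by `sine_raw_pcm` could be written to the stdin of the process started with `raw_add_command`, and the watermarked raw data read back from its stdout.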
- == Dependencies
- If you compile from source, `audiowmark` needs the following libraries:
- * libfftw3
- * libsndfile
- * libgcrypt
- * libzita-resampler
- * libmpg123
- == Building fftw
- `audiowmark` needs the single precision variant of fftw3.
- If you are building fftw3 from source, use the `--enable-float`
- configure parameter to build it, e.g.:
- cd ${FFTW3_SOURCE}
- ./configure --enable-float --enable-sse && \
- make && \
- sudo make install
- or, when building from git:
- cd ${FFTW3_GIT}
- ./bootstrap.sh --enable-shared --enable-sse --enable-float && \
- make && \
- sudo make install
- == Docker Build
- You should be able to execute `audiowmark` via Docker.
- Example that outputs the usage message:
- docker build -t audiowmark .
- docker run -v <local-data-directory>:/data -it audiowmark -h