123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743744745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806807 |
- .. SPDX-License-Identifier: GPL-2.0
- .. _fsverity:
- =======================================================
- fs-verity: read-only file-based authenticity protection
- =======================================================
- Introduction
- ============
- fs-verity (``fs/verity/``) is a support layer that filesystems can
- hook into to support transparent integrity and authenticity protection
- of read-only files. Currently, it is supported by the ext4 and f2fs
- filesystems. Like fscrypt, not too much filesystem-specific code is
- needed to support fs-verity.
- fs-verity is similar to `dm-verity
- <https://www.kernel.org/doc/Documentation/device-mapper/verity.txt>`_
- but works on files rather than block devices. On regular files on
- filesystems supporting fs-verity, userspace can execute an ioctl that
- causes the filesystem to build a Merkle tree for the file and persist
- it to a filesystem-specific location associated with the file.
- After this, the file is made readonly, and all reads from the file are
- automatically verified against the file's Merkle tree. Reads of any
- corrupted data, including mmap reads, will fail.
- Userspace can use another ioctl to retrieve the root hash (actually
- the "fs-verity file digest", which is a hash that includes the Merkle
- tree root hash) that fs-verity is enforcing for the file. This ioctl
- executes in constant time, regardless of the file size.
- fs-verity is essentially a way to hash a file in constant time,
- subject to the caveat that reads which would violate the hash will
- fail at runtime.
- Use cases
- =========
- By itself, the base fs-verity feature only provides integrity
- protection, i.e. detection of accidental (non-malicious) corruption.
- However, because fs-verity makes retrieving the file hash extremely
- efficient, it's primarily meant to be used as a tool to support
- authentication (detection of malicious modifications) or auditing
- (logging file hashes before use).
- Trusted userspace code (e.g. operating system code running on a
- read-only partition that is itself authenticated by dm-verity) can
- authenticate the contents of an fs-verity file by using the
- `FS_IOC_MEASURE_VERITY`_ ioctl to retrieve its hash, then verifying a
- digital signature of it.
- A standard file hash could be used instead of fs-verity. However,
- this is inefficient if the file is large and only a small portion may
- be accessed. This is often the case for Android application package
- (APK) files, for example. These typically contain many translations,
- classes, and other resources that are infrequently or even never
- accessed on a particular device. It would be slow and wasteful to
- read and hash the entire file before starting the application.
- Unlike an ahead-of-time hash, fs-verity also re-verifies data each
- time it's paged in. This ensures that malicious disk firmware can't
- undetectably change the contents of the file at runtime.
- fs-verity does not replace or obsolete dm-verity. dm-verity should
- still be used on read-only filesystems. fs-verity is for files that
- must live on a read-write filesystem because they are independently
- updated and potentially user-installed, so dm-verity cannot be used.
- The base fs-verity feature is a hashing mechanism only; actually
- authenticating the files is up to userspace. However, to meet some
- users' needs, fs-verity optionally supports a simple signature
- verification mechanism where users can configure the kernel to require
- that all fs-verity files be signed by a key loaded into a keyring; see
- `Built-in signature verification`_. Support for fs-verity file hashes
- in IMA (Integrity Measurement Architecture) policies is also planned.
- User API
- ========
- FS_IOC_ENABLE_VERITY
- --------------------
- The FS_IOC_ENABLE_VERITY ioctl enables fs-verity on a file. It takes
- in a pointer to a :c:type:`struct fsverity_enable_arg`, defined as
- follows::
- struct fsverity_enable_arg {
- __u32 version;
- __u32 hash_algorithm;
- __u32 block_size;
- __u32 salt_size;
- __u64 salt_ptr;
- __u32 sig_size;
- __u32 __reserved1;
- __u64 sig_ptr;
- __u64 __reserved2[11];
- };
- This structure contains the parameters of the Merkle tree to build for
- the file, and optionally contains a signature. It must be initialized
- as follows:
- - ``version`` must be 1.
- - ``hash_algorithm`` must be the identifier for the hash algorithm to
- use for the Merkle tree, such as FS_VERITY_HASH_ALG_SHA256. See
- ``include/uapi/linux/fsverity.h`` for the list of possible values.
- - ``block_size`` must be the Merkle tree block size. Currently, this
- must be equal to the system page size, which is usually 4096 bytes.
- Other sizes may be supported in the future. This value is not
- necessarily the same as the filesystem block size.
- - ``salt_size`` is the size of the salt in bytes, or 0 if no salt is
- provided. The salt is a value that is prepended to every hashed
- block; it can be used to personalize the hashing for a particular
- file or device. Currently the maximum salt size is 32 bytes.
- - ``salt_ptr`` is the pointer to the salt, or NULL if no salt is
- provided.
- - ``sig_size`` is the size of the signature in bytes, or 0 if no
- signature is provided. Currently the signature is (somewhat
- arbitrarily) limited to 16128 bytes. See `Built-in signature
- verification`_ for more information.
- - ``sig_ptr`` is the pointer to the signature, or NULL if no
- signature is provided.
- - All reserved fields must be zeroed.
- FS_IOC_ENABLE_VERITY causes the filesystem to build a Merkle tree for
- the file and persist it to a filesystem-specific location associated
- with the file, then mark the file as a verity file. This ioctl may
- take a long time to execute on large files, and it is interruptible by
- fatal signals.
- FS_IOC_ENABLE_VERITY checks for write access to the inode. However,
- it must be executed on an O_RDONLY file descriptor and no processes
- can have the file open for writing. Attempts to open the file for
- writing while this ioctl is executing will fail with ETXTBSY. (This
- is necessary to guarantee that no writable file descriptors will exist
- after verity is enabled, and to guarantee that the file's contents are
- stable while the Merkle tree is being built over it.)
- On success, FS_IOC_ENABLE_VERITY returns 0, and the file becomes a
- verity file. On failure (including the case of interruption by a
- fatal signal), no changes are made to the file.
- FS_IOC_ENABLE_VERITY can fail with the following errors:
- - ``EACCES``: the process does not have write access to the file
- - ``EBADMSG``: the signature is malformed
- - ``EBUSY``: this ioctl is already running on the file
- - ``EEXIST``: the file already has verity enabled
- - ``EFAULT``: the caller provided inaccessible memory
- - ``EINTR``: the operation was interrupted by a fatal signal
- - ``EINVAL``: unsupported version, hash algorithm, or block size; or
- reserved bits are set; or the file descriptor refers to neither a
- regular file nor a directory.
- - ``EISDIR``: the file descriptor refers to a directory
- - ``EKEYREJECTED``: the signature doesn't match the file
- - ``EMSGSIZE``: the salt or signature is too long
- - ``ENOKEY``: the fs-verity keyring doesn't contain the certificate
- needed to verify the signature
- - ``ENOPKG``: fs-verity recognizes the hash algorithm, but it's not
- available in the kernel's crypto API as currently configured (e.g.
- for SHA-512, missing CONFIG_CRYPTO_SHA512).
- - ``ENOTTY``: this type of filesystem does not implement fs-verity
- - ``EOPNOTSUPP``: the kernel was not configured with fs-verity
- support; or the filesystem superblock has not had the 'verity'
- feature enabled on it; or the filesystem does not support fs-verity
- on this file. (See `Filesystem support`_.)
- - ``EPERM``: the file is append-only; or, a signature is required and
- one was not provided.
- - ``EROFS``: the filesystem is read-only
- - ``ETXTBSY``: someone has the file open for writing. This can be the
- caller's file descriptor, another open file descriptor, or the file
- reference held by a writable memory map.
- FS_IOC_MEASURE_VERITY
- ---------------------
- The FS_IOC_MEASURE_VERITY ioctl retrieves the digest of a verity file.
- The fs-verity file digest is a cryptographic digest that identifies
- the file contents that are being enforced on reads; it is computed via
- a Merkle tree and is different from a traditional full-file digest.
- This ioctl takes in a pointer to a variable-length structure::
- struct fsverity_digest {
- __u16 digest_algorithm;
- __u16 digest_size; /* input/output */
- __u8 digest[];
- };
- ``digest_size`` is an input/output field. On input, it must be
- initialized to the number of bytes allocated for the variable-length
- ``digest`` field.
- On success, 0 is returned and the kernel fills in the structure as
- follows:
- - ``digest_algorithm`` will be the hash algorithm used for the file
- digest. It will match ``fsverity_enable_arg::hash_algorithm``.
- - ``digest_size`` will be the size of the digest in bytes, e.g. 32
- for SHA-256. (This can be redundant with ``digest_algorithm``.)
- - ``digest`` will be the actual bytes of the digest.
- FS_IOC_MEASURE_VERITY is guaranteed to execute in constant time,
- regardless of the size of the file.
- FS_IOC_MEASURE_VERITY can fail with the following errors:
- - ``EFAULT``: the caller provided inaccessible memory
- - ``ENODATA``: the file is not a verity file
- - ``ENOTTY``: this type of filesystem does not implement fs-verity
- - ``EOPNOTSUPP``: the kernel was not configured with fs-verity
- support, or the filesystem superblock has not had the 'verity'
- feature enabled on it. (See `Filesystem support`_.)
- - ``EOVERFLOW``: the digest is longer than the specified
- ``digest_size`` bytes. Try providing a larger buffer.
- FS_IOC_READ_VERITY_METADATA
- ---------------------------
- The FS_IOC_READ_VERITY_METADATA ioctl reads verity metadata from a
- verity file. This ioctl is available since Linux v5.12.
- This ioctl allows writing a server program that takes a verity file
- and serves it to a client program, such that the client can do its own
- fs-verity compatible verification of the file. This only makes sense
- if the client doesn't trust the server and if the server needs to
- provide the storage for the client.
- This is a fairly specialized use case, and most fs-verity users won't
- need this ioctl.
- This ioctl takes in a pointer to the following structure::
- #define FS_VERITY_METADATA_TYPE_MERKLE_TREE 1
- #define FS_VERITY_METADATA_TYPE_DESCRIPTOR 2
- #define FS_VERITY_METADATA_TYPE_SIGNATURE 3
- struct fsverity_read_metadata_arg {
- __u64 metadata_type;
- __u64 offset;
- __u64 length;
- __u64 buf_ptr;
- __u64 __reserved;
- };
- ``metadata_type`` specifies the type of metadata to read:
- - ``FS_VERITY_METADATA_TYPE_MERKLE_TREE`` reads the blocks of the
- Merkle tree. The blocks are returned in order from the root level
- to the leaf level. Within each level, the blocks are returned in
- the same order that their hashes are themselves hashed.
- See `Merkle tree`_ for more information.
- - ``FS_VERITY_METADATA_TYPE_DESCRIPTOR`` reads the fs-verity
- descriptor. See `fs-verity descriptor`_.
- - ``FS_VERITY_METADATA_TYPE_SIGNATURE`` reads the signature which was
- passed to FS_IOC_ENABLE_VERITY, if any. See `Built-in signature
- verification`_.
- The semantics are similar to those of ``pread()``. ``offset``
- specifies the offset in bytes into the metadata item to read from, and
- ``length`` specifies the maximum number of bytes to read from the
- metadata item. ``buf_ptr`` is the pointer to the buffer to read into,
- cast to a 64-bit integer. ``__reserved`` must be 0. On success, the
- number of bytes read is returned. 0 is returned at the end of the
- metadata item. The returned length may be less than ``length``, for
- example if the ioctl is interrupted.
- The metadata returned by FS_IOC_READ_VERITY_METADATA isn't guaranteed
- to be authenticated against the file digest that would be returned by
- `FS_IOC_MEASURE_VERITY`_, as the metadata is expected to be used to
- implement fs-verity compatible verification anyway (though absent a
- malicious disk, the metadata will indeed match). E.g. to implement
- this ioctl, the filesystem is allowed to just read the Merkle tree
- blocks from disk without actually verifying the path to the root node.
- FS_IOC_READ_VERITY_METADATA can fail with the following errors:
- - ``EFAULT``: the caller provided inaccessible memory
- - ``EINTR``: the ioctl was interrupted before any data was read
- - ``EINVAL``: reserved fields were set, or ``offset + length``
- overflowed
- - ``ENODATA``: the file is not a verity file, or
- FS_VERITY_METADATA_TYPE_SIGNATURE was requested but the file doesn't
- have a built-in signature
- - ``ENOTTY``: this type of filesystem does not implement fs-verity, or
- this ioctl is not yet implemented on it
- - ``EOPNOTSUPP``: the kernel was not configured with fs-verity
- support, or the filesystem superblock has not had the 'verity'
- feature enabled on it. (See `Filesystem support`_.)
- FS_IOC_GETFLAGS
- ---------------
- The existing ioctl FS_IOC_GETFLAGS (which isn't specific to fs-verity)
- can also be used to check whether a file has fs-verity enabled or not.
- To do so, check for FS_VERITY_FL (0x00100000) in the returned flags.
- The verity flag is not settable via FS_IOC_SETFLAGS. You must use
- FS_IOC_ENABLE_VERITY instead, since parameters must be provided.
- statx
- -----
- Since Linux v5.5, the statx() system call sets STATX_ATTR_VERITY if
- the file has fs-verity enabled. This can perform better than
- FS_IOC_GETFLAGS and FS_IOC_MEASURE_VERITY because it doesn't require
- opening the file, and opening verity files can be expensive.
- Accessing verity files
- ======================
- Applications can transparently access a verity file just like a
- non-verity one, with the following exceptions:
- - Verity files are readonly. They cannot be opened for writing or
- truncate()d, even if the file mode bits allow it. Attempts to do
- one of these things will fail with EPERM. However, changes to
- metadata such as owner, mode, timestamps, and xattrs are still
- allowed, since these are not measured by fs-verity. Verity files
- can also still be renamed, deleted, and linked to.
- - Direct I/O is not supported on verity files. Attempts to use direct
- I/O on such files will fall back to buffered I/O.
- - DAX (Direct Access) is not supported on verity files, because this
- would circumvent the data verification.
- - Reads of data that doesn't match the verity Merkle tree will fail
- with EIO (for read()) or SIGBUS (for mmap() reads).
- - If the sysctl "fs.verity.require_signatures" is set to 1 and the
- file is not signed by a key in the fs-verity keyring, then opening
- the file will fail. See `Built-in signature verification`_.
- Direct access to the Merkle tree is not supported. Therefore, if a
- verity file is copied, or is backed up and restored, then it will lose
- its "verity"-ness. fs-verity is primarily meant for files like
- executables that are managed by a package manager.
- File digest computation
- =======================
- This section describes how fs-verity hashes the file contents using a
- Merkle tree to produce the digest which cryptographically identifies
- the file contents. This algorithm is the same for all filesystems
- that support fs-verity.
- Userspace only needs to be aware of this algorithm if it needs to
- compute fs-verity file digests itself, e.g. in order to sign files.
- .. _fsverity_merkle_tree:
- Merkle tree
- -----------
- The file contents is divided into blocks, where the block size is
- configurable but is usually 4096 bytes. The end of the last block is
- zero-padded if needed. Each block is then hashed, producing the first
- level of hashes. Then, the hashes in this first level are grouped
- into 'blocksize'-byte blocks (zero-padding the ends as needed) and
- these blocks are hashed, producing the second level of hashes. This
- proceeds up the tree until only a single block remains. The hash of
- this block is the "Merkle tree root hash".
- If the file fits in one block and is nonempty, then the "Merkle tree
- root hash" is simply the hash of the single data block. If the file
- is empty, then the "Merkle tree root hash" is all zeroes.
- The "blocks" here are not necessarily the same as "filesystem blocks".
- If a salt was specified, then it's zero-padded to the closest multiple
- of the input size of the hash algorithm's compression function, e.g.
- 64 bytes for SHA-256 or 128 bytes for SHA-512. The padded salt is
- prepended to every data or Merkle tree block that is hashed.
- The purpose of the block padding is to cause every hash to be taken
- over the same amount of data, which simplifies the implementation and
- keeps open more possibilities for hardware acceleration. The purpose
- of the salt padding is to make the salting "free" when the salted hash
- state is precomputed, then imported for each hash.
- Example: in the recommended configuration of SHA-256 and 4K blocks,
- 128 hash values fit in each block. Thus, each level of the Merkle
- tree is approximately 128 times smaller than the previous, and for
- large files the Merkle tree's size converges to approximately 1/127 of
- the original file size. However, for small files, the padding is
- significant, making the space overhead proportionally more.
- .. _fsverity_descriptor:
- fs-verity descriptor
- --------------------
- By itself, the Merkle tree root hash is ambiguous. For example, it
- can't a distinguish a large file from a small second file whose data
- is exactly the top-level hash block of the first file. Ambiguities
- also arise from the convention of padding to the next block boundary.
- To solve this problem, the fs-verity file digest is actually computed
- as a hash of the following structure, which contains the Merkle tree
- root hash as well as other fields such as the file size::
- struct fsverity_descriptor {
- __u8 version; /* must be 1 */
- __u8 hash_algorithm; /* Merkle tree hash algorithm */
- __u8 log_blocksize; /* log2 of size of data and tree blocks */
- __u8 salt_size; /* size of salt in bytes; 0 if none */
- __le32 __reserved_0x04; /* must be 0 */
- __le64 data_size; /* size of file the Merkle tree is built over */
- __u8 root_hash[64]; /* Merkle tree root hash */
- __u8 salt[32]; /* salt prepended to each hashed block */
- __u8 __reserved[144]; /* must be 0's */
- };
- Built-in signature verification
- ===============================
- With CONFIG_FS_VERITY_BUILTIN_SIGNATURES=y, fs-verity supports putting
- a portion of an authentication policy (see `Use cases`_) in the
- kernel. Specifically, it adds support for:
- 1. At fs-verity module initialization time, a keyring ".fs-verity" is
- created. The root user can add trusted X.509 certificates to this
- keyring using the add_key() system call, then (when done)
- optionally use keyctl_restrict_keyring() to prevent additional
- certificates from being added.
- 2. `FS_IOC_ENABLE_VERITY`_ accepts a pointer to a PKCS#7 formatted
- detached signature in DER format of the file's fs-verity digest.
- On success, this signature is persisted alongside the Merkle tree.
- Then, any time the file is opened, the kernel will verify the
- file's actual digest against this signature, using the certificates
- in the ".fs-verity" keyring.
- 3. A new sysctl "fs.verity.require_signatures" is made available.
- When set to 1, the kernel requires that all verity files have a
- correctly signed digest as described in (2).
- fs-verity file digests must be signed in the following format, which
- is similar to the structure used by `FS_IOC_MEASURE_VERITY`_::
- struct fsverity_formatted_digest {
- char magic[8]; /* must be "FSVerity" */
- __le16 digest_algorithm;
- __le16 digest_size;
- __u8 digest[];
- };
- fs-verity's built-in signature verification support is meant as a
- relatively simple mechanism that can be used to provide some level of
- authenticity protection for verity files, as an alternative to doing
- the signature verification in userspace or using IMA-appraisal.
- However, with this mechanism, userspace programs still need to check
- that the verity bit is set, and there is no protection against verity
- files being swapped around.
- Filesystem support
- ==================
- fs-verity is currently supported by the ext4 and f2fs filesystems.
- The CONFIG_FS_VERITY kconfig option must be enabled to use fs-verity
- on either filesystem.
- ``include/linux/fsverity.h`` declares the interface between the
- ``fs/verity/`` support layer and filesystems. Briefly, filesystems
- must provide an ``fsverity_operations`` structure that provides
- methods to read and write the verity metadata to a filesystem-specific
- location, including the Merkle tree blocks and
- ``fsverity_descriptor``. Filesystems must also call functions in
- ``fs/verity/`` at certain times, such as when a file is opened or when
- pages have been read into the pagecache. (See `Verifying data`_.)
- ext4
- ----
- ext4 supports fs-verity since Linux v5.4 and e2fsprogs v1.45.2.
- To create verity files on an ext4 filesystem, the filesystem must have
- been formatted with ``-O verity`` or had ``tune2fs -O verity`` run on
- it. "verity" is an RO_COMPAT filesystem feature, so once set, old
- kernels will only be able to mount the filesystem readonly, and old
- versions of e2fsck will be unable to check the filesystem. Moreover,
- currently ext4 only supports mounting a filesystem with the "verity"
- feature when its block size is equal to PAGE_SIZE (often 4096 bytes).
- ext4 sets the EXT4_VERITY_FL on-disk inode flag on verity files. It
- can only be set by `FS_IOC_ENABLE_VERITY`_, and it cannot be cleared.
- ext4 also supports encryption, which can be used simultaneously with
- fs-verity. In this case, the plaintext data is verified rather than
- the ciphertext. This is necessary in order to make the fs-verity file
- digest meaningful, since every file is encrypted differently.
- ext4 stores the verity metadata (Merkle tree and fsverity_descriptor)
- past the end of the file, starting at the first 64K boundary beyond
- i_size. This approach works because (a) verity files are readonly,
- and (b) pages fully beyond i_size aren't visible to userspace but can
- be read/written internally by ext4 with only some relatively small
- changes to ext4. This approach avoids having to depend on the
- EA_INODE feature and on rearchitecturing ext4's xattr support to
- support paging multi-gigabyte xattrs into memory, and to support
- encrypting xattrs. Note that the verity metadata *must* be encrypted
- when the file is, since it contains hashes of the plaintext data.
- Currently, ext4 verity only supports the case where the Merkle tree
- block size, filesystem block size, and page size are all the same. It
- also only supports extent-based files.
- f2fs
- ----
- f2fs supports fs-verity since Linux v5.4 and f2fs-tools v1.11.0.
- To create verity files on an f2fs filesystem, the filesystem must have
- been formatted with ``-O verity``.
- f2fs sets the FADVISE_VERITY_BIT on-disk inode flag on verity files.
- It can only be set by `FS_IOC_ENABLE_VERITY`_, and it cannot be
- cleared.
- Like ext4, f2fs stores the verity metadata (Merkle tree and
- fsverity_descriptor) past the end of the file, starting at the first
- 64K boundary beyond i_size. See explanation for ext4 above.
- Moreover, f2fs supports at most 4096 bytes of xattr entries per inode
- which wouldn't be enough for even a single Merkle tree block.
- Currently, f2fs verity only supports a Merkle tree block size of 4096.
- Also, f2fs doesn't support enabling verity on files that currently
- have atomic or volatile writes pending.
- Implementation details
- ======================
- Verifying data
- --------------
- fs-verity ensures that all reads of a verity file's data are verified,
- regardless of which syscall is used to do the read (e.g. mmap(),
- read(), pread()) and regardless of whether it's the first read or a
- later read (unless the later read can return cached data that was
- already verified). Below, we describe how filesystems implement this.
- Pagecache
- ~~~~~~~~~
- For filesystems using Linux's pagecache, the ``->readpage()`` and
- ``->readpages()`` methods must be modified to verify pages before they
- are marked Uptodate. Merely hooking ``->read_iter()`` would be
- insufficient, since ``->read_iter()`` is not used for memory maps.
- Therefore, fs/verity/ provides a function fsverity_verify_page() which
- verifies a page that has been read into the pagecache of a verity
- inode, but is still locked and not Uptodate, so it's not yet readable
- by userspace. As needed to do the verification,
- fsverity_verify_page() will call back into the filesystem to read
- Merkle tree pages via fsverity_operations::read_merkle_tree_page().
- fsverity_verify_page() returns false if verification failed; in this
- case, the filesystem must not set the page Uptodate. Following this,
- as per the usual Linux pagecache behavior, attempts by userspace to
- read() from the part of the file containing the page will fail with
- EIO, and accesses to the page within a memory map will raise SIGBUS.
- fsverity_verify_page() currently only supports the case where the
- Merkle tree block size is equal to PAGE_SIZE (often 4096 bytes).
- In principle, fsverity_verify_page() verifies the entire path in the
- Merkle tree from the data page to the root hash. However, for
- efficiency the filesystem may cache the hash pages. Therefore,
- fsverity_verify_page() only ascends the tree reading hash pages until
- an already-verified hash page is seen, as indicated by the PageChecked
- bit being set. It then verifies the path to that page.
- This optimization, which is also used by dm-verity, results in
- excellent sequential read performance. This is because usually (e.g.
- 127 in 128 times for 4K blocks and SHA-256) the hash page from the
- bottom level of the tree will already be cached and checked from
- reading a previous data page. However, random reads perform worse.
- Block device based filesystems
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Block device based filesystems (e.g. ext4 and f2fs) in Linux also use
- the pagecache, so the above subsection applies too. However, they
- also usually read many pages from a file at once, grouped into a
- structure called a "bio". To make it easier for these types of
- filesystems to support fs-verity, fs/verity/ also provides a function
- fsverity_verify_bio() which verifies all pages in a bio.
- ext4 and f2fs also support encryption. If a verity file is also
- encrypted, the pages must be decrypted before being verified. To
- support this, these filesystems allocate a "post-read context" for
- each bio and store it in ``->bi_private``::
- struct bio_post_read_ctx {
- struct bio *bio;
- struct work_struct work;
- unsigned int cur_step;
- unsigned int enabled_steps;
- };
- ``enabled_steps`` is a bitmask that specifies whether decryption,
- verity, or both is enabled. After the bio completes, for each needed
- postprocessing step the filesystem enqueues the bio_post_read_ctx on a
- workqueue, and then the workqueue work does the decryption or
- verification. Finally, pages where no decryption or verity error
- occurred are marked Uptodate, and the pages are unlocked.
- Files on ext4 and f2fs may contain holes. Normally, ``->readpages()``
- simply zeroes holes and sets the corresponding pages Uptodate; no bios
- are issued. To prevent this case from bypassing fs-verity, these
- filesystems use fsverity_verify_page() to verify hole pages.
- ext4 and f2fs disable direct I/O on verity files, since otherwise
- direct I/O would bypass fs-verity. (They also do the same for
- encrypted files.)
- Userspace utility
- =================
- This document focuses on the kernel, but a userspace utility for
- fs-verity can be found at:
- https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/fsverity-utils.git
- See the README.md file in the fsverity-utils source tree for details,
- including examples of setting up fs-verity protected files.
- Tests
- =====
- To test fs-verity, use xfstests. For example, using `kvm-xfstests
- <https://github.com/tytso/xfstests-bld/blob/master/Documentation/kvm-quickstart.md>`_::
- kvm-xfstests -c ext4,f2fs -g verity
- FAQ
- ===
- This section answers frequently asked questions about fs-verity that
- weren't already directly answered in other parts of this document.
- :Q: Why isn't fs-verity part of IMA?
- :A: fs-verity and IMA (Integrity Measurement Architecture) have
- different focuses. fs-verity is a filesystem-level mechanism for
- hashing individual files using a Merkle tree. In contrast, IMA
- specifies a system-wide policy that specifies which files are
- hashed and what to do with those hashes, such as log them,
- authenticate them, or add them to a measurement list.
- IMA is planned to support the fs-verity hashing mechanism as an
- alternative to doing full file hashes, for people who want the
- performance and security benefits of the Merkle tree based hash.
- But it doesn't make sense to force all uses of fs-verity to be
- through IMA. As a standalone filesystem feature, fs-verity
- already meets many users' needs, and it's testable like other
- filesystem features e.g. with xfstests.
- :Q: Isn't fs-verity useless because the attacker can just modify the
- hashes in the Merkle tree, which is stored on-disk?
- :A: To verify the authenticity of an fs-verity file you must verify
- the authenticity of the "fs-verity file digest", which
- incorporates the root hash of the Merkle tree. See `Use cases`_.
- :Q: Isn't fs-verity useless because the attacker can just replace a
- verity file with a non-verity one?
- :A: See `Use cases`_. In the initial use case, it's really trusted
- userspace code that authenticates the files; fs-verity is just a
- tool to do this job efficiently and securely. The trusted
- userspace code will consider non-verity files to be inauthentic.
- :Q: Why does the Merkle tree need to be stored on-disk? Couldn't you
- store just the root hash?
- :A: If the Merkle tree wasn't stored on-disk, then you'd have to
- compute the entire tree when the file is first accessed, even if
- just one byte is being read. This is a fundamental consequence of
- how Merkle tree hashing works. To verify a leaf node, you need to
- verify the whole path to the root hash, including the root node
- (the thing which the root hash is a hash of). But if the root
- node isn't stored on-disk, you have to compute it by hashing its
- children, and so on until you've actually hashed the entire file.
- That defeats most of the point of doing a Merkle tree-based hash,
- since if you have to hash the whole file ahead of time anyway,
- then you could simply do sha256(file) instead. That would be much
- simpler, and a bit faster too.
- It's true that an in-memory Merkle tree could still provide the
- advantage of verification on every read rather than just on the
- first read. However, it would be inefficient because every time a
- hash page gets evicted (you can't pin the entire Merkle tree into
- memory, since it may be very large), in order to restore it you
- again need to hash everything below it in the tree. This again
- defeats most of the point of doing a Merkle tree-based hash, since
- a single block read could trigger re-hashing gigabytes of data.
- :Q: But couldn't you store just the leaf nodes and compute the rest?
- :A: See previous answer; this really just moves up one level, since
- one could alternatively interpret the data blocks as being the
- leaf nodes of the Merkle tree. It's true that the tree can be
- computed much faster if the leaf level is stored rather than just
- the data, but that's only because each level is less than 1% the
- size of the level below (assuming the recommended settings of
- SHA-256 and 4K blocks). For the exact same reason, by storing
- "just the leaf nodes" you'd already be storing over 99% of the
- tree, so you might as well simply store the whole tree.
- :Q: Can the Merkle tree be built ahead of time, e.g. distributed as
- part of a package that is installed to many computers?
- :A: This isn't currently supported. It was part of the original
- design, but was removed to simplify the kernel UAPI and because it
- wasn't a critical use case. Files are usually installed once and
- used many times, and cryptographic hashing is somewhat fast on
- most modern processors.
- :Q: Why doesn't fs-verity support writes?
- :A: Write support would be very difficult and would require a
- completely different design, so it's well outside the scope of
- fs-verity. Write support would require:
- - A way to maintain consistency between the data and hashes,
- including all levels of hashes, since corruption after a crash
- (especially of potentially the entire file!) is unacceptable.
- The main options for solving this are data journalling,
- copy-on-write, and log-structured volume. But it's very hard to
- retrofit existing filesystems with new consistency mechanisms.
- Data journalling is available on ext4, but is very slow.
- - Rebuilding the the Merkle tree after every write, which would be
- extremely inefficient. Alternatively, a different authenticated
- dictionary structure such as an "authenticated skiplist" could
- be used. However, this would be far more complex.
- Compare it to dm-verity vs. dm-integrity. dm-verity is very
- simple: the kernel just verifies read-only data against a
- read-only Merkle tree. In contrast, dm-integrity supports writes
- but is slow, is much more complex, and doesn't actually support
- full-device authentication since it authenticates each sector
- independently, i.e. there is no "root hash". It doesn't really
- make sense for the same device-mapper target to support these two
- very different cases; the same applies to fs-verity.
- :Q: Since verity files are immutable, why isn't the immutable bit set?
- :A: The existing "immutable" bit (FS_IMMUTABLE_FL) already has a
- specific set of semantics which not only make the file contents
- read-only, but also prevent the file from being deleted, renamed,
- linked to, or having its owner or mode changed. These extra
- properties are unwanted for fs-verity, so reusing the immutable
- bit isn't appropriate.
- :Q: Why does the API use ioctls instead of setxattr() and getxattr()?
- :A: Abusing the xattr interface for basically arbitrary syscalls is
- heavily frowned upon by most of the Linux filesystem developers.
- An xattr should really just be an xattr on-disk, not an API to
- e.g. magically trigger construction of a Merkle tree.
- :Q: Does fs-verity support remote filesystems?
- :A: Only ext4 and f2fs support is implemented currently, but in
- principle any filesystem that can store per-file verity metadata
- can support fs-verity, regardless of whether it's local or remote.
- Some filesystems may have fewer options of where to store the
- verity metadata; one possibility is to store it past the end of
- the file and "hide" it from userspace by manipulating i_size. The
- data verification functions provided by ``fs/verity/`` also assume
- that the filesystem uses the Linux pagecache, but both local and
- remote filesystems normally do so.
- :Q: Why is anything filesystem-specific at all? Shouldn't fs-verity
- be implemented entirely at the VFS level?
- :A: There are many reasons why this is not possible or would be very
- difficult, including the following:
- - To prevent bypassing verification, pages must not be marked
- Uptodate until they've been verified. Currently, each
- filesystem is responsible for marking pages Uptodate via
- ``->readpages()``. Therefore, currently it's not possible for
- the VFS to do the verification on its own. Changing this would
- require significant changes to the VFS and all filesystems.
- - It would require defining a filesystem-independent way to store
- the verity metadata. Extended attributes don't work for this
- because (a) the Merkle tree may be gigabytes, but many
- filesystems assume that all xattrs fit into a single 4K
- filesystem block, and (b) ext4 and f2fs encryption doesn't
- encrypt xattrs, yet the Merkle tree *must* be encrypted when the
- file contents are, because it stores hashes of the plaintext
- file contents.
- So the verity metadata would have to be stored in an actual
- file. Using a separate file would be very ugly, since the
- metadata is fundamentally part of the file to be protected, and
- it could cause problems where users could delete the real file
- but not the metadata file or vice versa. On the other hand,
- having it be in the same file would break applications unless
- filesystems' notion of i_size were divorced from the VFS's,
- which would be complex and require changes to all filesystems.
- - It's desirable that FS_IOC_ENABLE_VERITY uses the filesystem's
- transaction mechanism so that either the file ends up with
- verity enabled, or no changes were made. Allowing intermediate
- states to occur after a crash may cause problems.
|