.. Eric Biggers | f4f864c | 2017-10-29 06:30:14 -0400

=====================================
Filesystem-level encryption (fscrypt)
=====================================

Introduction
============

fscrypt is a library which filesystems can hook into to support
transparent encryption of files and directories.

Note: "fscrypt" in this document refers to the kernel-level portion,
implemented in ``fs/crypto/``, as opposed to the userspace tool
`fscrypt <https://github.com/google/fscrypt>`_. This document only
covers the kernel-level portion. For command-line examples of how to
use encryption, see the documentation for the userspace tool `fscrypt
<https://github.com/google/fscrypt>`_. Also, it is recommended to use
the fscrypt userspace tool, or other existing userspace tools such as
`fscryptctl <https://github.com/google/fscryptctl>`_ or `Android's key
management system
<https://source.android.com/security/encryption/file-based>`_, over
using the kernel's API directly. Using existing tools reduces the
chance of introducing your own security bugs. (Nevertheless, for
completeness this documentation covers the kernel's API anyway.)

Unlike dm-crypt, fscrypt operates at the filesystem level rather than
at the block device level. This allows it to encrypt different files
with different keys and to have unencrypted files on the same
filesystem. This is useful for multi-user systems where each user's
data-at-rest needs to be cryptographically isolated from the others.
However, except for filenames, fscrypt does not encrypt filesystem
metadata.

Unlike eCryptfs, which is a stacked filesystem, fscrypt is integrated
directly into supported filesystems --- currently ext4, F2FS, and
UBIFS. This allows encrypted files to be read and written without
caching both the decrypted and encrypted pages in the pagecache,
thereby nearly halving the memory used and bringing it in line with
unencrypted files. Similarly, half as many dentries and inodes are
needed. eCryptfs also limits encrypted filenames to 143 bytes,
causing application compatibility issues; fscrypt allows the full 255
bytes (NAME_MAX). Finally, unlike eCryptfs, the fscrypt API can be
used by unprivileged users, with no need to mount anything.

fscrypt does not support encrypting files in-place. Instead, it
supports marking an empty directory as encrypted. Then, after
userspace provides the key, all regular files, directories, and
symbolic links created in that directory tree are transparently
encrypted.

Threat model
============

Offline attacks
---------------

Provided that userspace chooses a strong encryption key, fscrypt
protects the confidentiality of file contents and filenames in the
event of a single point-in-time permanent offline compromise of the
block device content. fscrypt does not protect the confidentiality of
non-filename metadata, e.g. file sizes, file permissions, file
timestamps, and extended attributes. Also, the existence and location
of holes (unallocated blocks which logically contain all zeroes) in
files is not protected.

fscrypt is not guaranteed to protect confidentiality or authenticity
if an attacker is able to manipulate the filesystem offline prior to
an authorized user later accessing the filesystem.

Online attacks
--------------

fscrypt (and storage encryption in general) can only provide limited
protection, if any at all, against online attacks. In detail:

fscrypt is only resistant to side-channel attacks, such as timing or
electromagnetic attacks, to the extent that the underlying Linux
Cryptographic API algorithms are. If a vulnerable algorithm is used,
such as a table-based implementation of AES, it may be possible for an
attacker to mount a side-channel attack against the online system.
Side-channel attacks may also be mounted against applications
consuming decrypted data.

After an encryption key has been provided, fscrypt is not designed to
hide the plaintext file contents or filenames from other users on the
same system, regardless of the visibility of the keyring key.
Instead, existing access control mechanisms such as file mode bits,
POSIX ACLs, LSMs, or mount namespaces should be used for this purpose.
Also note that as long as the encryption keys are *anywhere* in
memory, an online attacker can necessarily compromise them by mounting
a physical attack or by exploiting any kernel security vulnerability
which provides an arbitrary memory read primitive.

While it is ostensibly possible to "evict" keys from the system,
recently accessed encrypted files will remain accessible at least
until the filesystem is unmounted or the VFS caches are dropped, e.g.
using ``echo 2 > /proc/sys/vm/drop_caches``. Even after that, if the
RAM is compromised before being powered off, it will likely still be
possible to recover portions of the plaintext file contents, if not
some of the encryption keys as well. (Since Linux v4.12, all
in-kernel keys related to fscrypt are sanitized before being freed.
However, userspace would need to do its part as well.)

Currently, fscrypt does not prevent a user from maliciously providing
an incorrect key for another user's existing encrypted files. A
protection against this is planned.

Key hierarchy
=============

Master Keys
-----------

Each encrypted directory tree is protected by a *master key*. Master
keys can be up to 64 bytes long, and must be at least as long as the
greater of the key length needed by the contents and filenames
encryption modes being used. For example, if AES-256-XTS is used for
contents encryption, the master key must be 64 bytes (512 bits). Note
that the XTS mode is defined to require a key twice as long as that
required by the underlying block cipher.
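The length rule above can be sketched as a small helper. This is only an illustration, not kernel code: the enum and function names are hypothetical, and the per-mode key sizes assume that XTS takes a key twice the cipher's key size (as noted above) while CBC and CTS-CBC take the cipher's own key size.

```c
#include <stddef.h>

/* Hypothetical mode identifiers for illustration only; these are NOT the
 * FS_ENCRYPTION_MODE_* constants from <linux/fs.h>. */
enum example_mode {
	EX_AES_256_XTS,	/* contents  */
	EX_AES_256_CTS,	/* filenames */
	EX_AES_128_CBC,	/* contents  */
	EX_AES_128_CTS,	/* filenames */
};

/* Key size in bytes needed by one mode, or 0 if the mode is unknown. */
static size_t example_mode_key_size(enum example_mode mode)
{
	switch (mode) {
	case EX_AES_256_XTS: return 64;	/* 2 x 32-byte AES-256 key */
	case EX_AES_256_CTS: return 32;
	case EX_AES_128_CBC: return 16;
	case EX_AES_128_CTS: return 16;
	}
	return 0;
}

/* Minimum master key size: the greater of the two per-mode key sizes. */
static size_t example_min_master_key_size(enum example_mode contents,
					   enum example_mode filenames)
{
	size_t c = example_mode_key_size(contents);
	size_t f = example_mode_key_size(filenames);

	return c > f ? c : f;
}
```

Under these assumptions, the AES-256-XTS / AES-256-CTS-CBC pair needs a 64-byte master key, while the AES-128-CBC / AES-128-CTS-CBC pair needs only 16 bytes.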

To "unlock" an encrypted directory tree, userspace must provide the
appropriate master key. There can be any number of master keys, each
of which protects any number of directory trees on any number of
filesystems.

Userspace should generate master keys either using a cryptographically
secure random number generator, or by using a KDF (Key Derivation
Function). Note that whenever a KDF is used to "stretch" a
lower-entropy secret such as a passphrase, it is critical that a KDF
designed for this purpose be used, such as scrypt, PBKDF2, or Argon2.

Per-file keys
-------------

Master keys are not used to encrypt file contents or names directly.
Instead, a unique key is derived for each encrypted file, including
each regular file, directory, and symbolic link. This has several
advantages:

- In cryptosystems, the same key material should never be used for
  different purposes. Using the master key as both an XTS key for
  contents encryption and as a CTS-CBC key for filenames encryption
  would violate this rule.
- Per-file keys simplify the choice of IVs (Initialization Vectors)
  for contents encryption. Without per-file keys, to ensure IV
  uniqueness both the inode and logical block number would need to be
  encoded in the IVs. This would make it impossible to renumber
  inodes, which e.g. ``resize2fs`` can do when resizing an ext4
  filesystem. With per-file keys, it is sufficient to encode just the
  logical block number in the IVs.
- Per-file keys strengthen the encryption of filenames, where IVs are
  reused out of necessity. With a unique key per directory, IV reuse
  is limited to within a single directory.
- Per-file keys allow individual files to be securely erased simply by
  securely erasing their keys. (Not yet implemented.)

A KDF (Key Derivation Function) is used to derive per-file keys from
the master key. This is done instead of wrapping a randomly-generated
key for each file because it reduces the size of the encryption xattr,
which for some filesystems makes the xattr more likely to fit in-line
in the filesystem's inode table. With a KDF, only a 16-byte nonce is
required --- long enough to make key reuse extremely unlikely. A
wrapped key, on the other hand, would need to be up to 64 bytes ---
the length of an AES-256-XTS key. Furthermore, currently there is no
requirement to support unlocking a file with multiple alternative
master keys or to support rotating master keys. Instead, the master
keys may be wrapped in userspace, e.g. as done by the `fscrypt
<https://github.com/google/fscrypt>`_ tool.

The current KDF encrypts the master key using the 16-byte nonce as an
AES-128-ECB key. The output is used as the derived key. If the
output is longer than needed, then it is truncated to the needed
length. Truncation is the norm for directories and symlinks, since
those use the CTS-CBC encryption mode which requires a key half as
long as that required by the XTS encryption mode.

Note: this KDF meets the primary security requirement, which is to
produce unique derived keys that preserve the entropy of the master
key, assuming that the master key is already a good pseudorandom key.
However, it is nonstandard and has some problems such as being
reversible, so it is generally considered to be a mistake! It may be
replaced with HKDF or another more standard KDF in the future.

Encryption modes and usage
==========================

fscrypt allows one encryption mode to be specified for file contents
and one encryption mode to be specified for filenames. Different
directory trees are permitted to use different encryption modes.
Currently, the following pairs of encryption modes are supported:

- AES-256-XTS for contents and AES-256-CTS-CBC for filenames
- AES-128-CBC for contents and AES-128-CTS-CBC for filenames
- Speck128/256-XTS for contents and Speck128/256-CTS-CBC for filenames

It is strongly recommended to use AES-256-XTS for contents encryption.
AES-128-CBC was added only for low-powered embedded devices with
crypto accelerators such as CAAM or CESA that do not support XTS.

Similarly, Speck128/256 support was only added for older or low-end
CPUs which cannot do AES fast enough -- especially ARM CPUs which have
NEON instructions but not the Cryptography Extensions -- and for which
it would not otherwise be feasible to use encryption at all. It is
not recommended to use Speck on CPUs that have AES instructions.
Speck support is only available if it has been enabled in the crypto
API via CONFIG_CRYPTO_SPECK. Also, on ARM platforms, to get
acceptable performance CONFIG_CRYPTO_SPECK_NEON must be enabled.

New encryption modes can be added relatively easily, without changes
to individual filesystems. However, authenticated encryption (AE)
modes are not currently supported because of the difficulty of dealing
with ciphertext expansion.

For file contents, each filesystem block is encrypted independently.
Currently, only the case where the filesystem block size is equal to
the system's page size (usually 4096 bytes) is supported. With the
XTS mode of operation (recommended), the logical block number within
the file is used as the IV. With the CBC mode of operation (not
recommended), ESSIV is used; specifically, the IV for CBC is the
logical block number encrypted with AES-256, where the AES-256 key is
the SHA-256 hash of the inode's data encryption key.
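As a rough sketch of the XTS case, the helper below builds a 16-byte IV from a logical block number. The function and macro names are hypothetical, and the byte layout (a 64-bit little-endian block number followed by zero bytes) is an assumption for illustration; check ``fs/crypto/`` for the layout the kernel actually uses. For CBC, the resulting block would additionally be encrypted as described above to form the ESSIV.

```c
#include <stdint.h>
#include <string.h>

#define EXAMPLE_IV_SIZE 16	/* one AES block; the name is hypothetical */

/* Fill iv with the logical block number, assuming a little-endian 64-bit
 * encoding in the first 8 bytes and zero padding in the remaining 8. */
static void example_contents_iv(uint64_t lblk_num,
				uint8_t iv[EXAMPLE_IV_SIZE])
{
	memset(iv, 0, EXAMPLE_IV_SIZE);
	for (int i = 0; i < 8; i++)
		iv[i] = (uint8_t)(lblk_num >> (8 * i));
}
```

Because each file has its own key, two files may use the same IV for the same logical block number without reusing a (key, IV) pair.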

For filenames, the full filename is encrypted at once. Because of the
requirements to retain support for efficient directory lookups and
filenames of up to 255 bytes, a constant initialization vector (IV) is
used. However, each encrypted directory uses a unique key, which
limits IV reuse to within a single directory. Note that IV reuse in
the context of CTS-CBC encryption means that when the original
filenames share a common prefix at least as long as the cipher block
size (16 bytes for AES), the corresponding encrypted filenames will
also share a common prefix. This is undesirable; it may be fixed in
the future by switching to an encryption mode that is a strong
pseudorandom permutation on arbitrary-length messages, e.g. the HEH
(Hash-Encrypt-Hash) mode.

Since filenames are encrypted with the CTS-CBC mode of operation, the
plaintext and ciphertext filenames need not be multiples of the AES
block size, i.e. 16 bytes. However, the minimum size that can be
encrypted is 16 bytes, so shorter filenames are NUL-padded to 16 bytes
before being encrypted. In addition, to reduce leakage of filename
lengths via their ciphertexts, all filenames are NUL-padded to the
next 4, 8, 16, or 32-byte boundary (configurable). 32 is recommended
since this provides the best confidentiality, at the cost of making
directory entries consume slightly more space. Note that since NUL
(``\0``) is not otherwise a valid character in filenames, the padding
will never produce duplicate plaintexts.
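The padding rule can be written out as a short calculation. The sketch below is illustrative only: the function name is hypothetical, ``pad`` is the configured boundary in bytes (4, 8, 16, or 32), and capping the result at 255 bytes (NAME_MAX) is an assumption about how a name close to the limit is handled.

```c
#include <stddef.h>

/* Length in bytes of the NUL-padded filename that gets encrypted.
 * pad must be 4, 8, 16, or 32. */
static size_t example_padded_name_len(size_t name_len, size_t pad)
{
	/* Names shorter than one cipher block are first padded to 16 bytes. */
	size_t len = name_len < 16 ? 16 : name_len;

	/* Round up to the next multiple of the configured padding. */
	len = (len + pad - 1) / pad * pad;

	/* Assumed cap: never exceed NAME_MAX. */
	return len > 255 ? 255 : len;
}
```

With 32-byte padding, every name of 1 to 32 bytes produces a 32-byte ciphertext, so an observer cannot distinguish their lengths; with 4-byte padding, a 5-byte name is still padded to the 16-byte minimum.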

Symbolic link targets are considered a type of filename and are
encrypted in the same way as filenames in directory entries. Each
symlink also uses a unique key; hence, the hardcoded IV is not a
problem for symlinks.

User API
========

Setting an encryption policy
----------------------------

The FS_IOC_SET_ENCRYPTION_POLICY ioctl sets an encryption policy on an
empty directory or verifies that a directory or regular file already
has the specified encryption policy. It takes in a pointer to a
:c:type:`struct fscrypt_policy`, defined as follows::

    #define FS_KEY_DESCRIPTOR_SIZE 8

    struct fscrypt_policy {
            __u8 version;
            __u8 contents_encryption_mode;
            __u8 filenames_encryption_mode;
            __u8 flags;
            __u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
    };

This structure must be initialized as follows:

- ``version`` must be 0.

- ``contents_encryption_mode`` and ``filenames_encryption_mode`` must
  be set to constants from ``<linux/fs.h>`` which identify the
  encryption modes to use. If unsure, use
  FS_ENCRYPTION_MODE_AES_256_XTS (1) for ``contents_encryption_mode``
  and FS_ENCRYPTION_MODE_AES_256_CTS (4) for
  ``filenames_encryption_mode``.

- ``flags`` must be set to a value from ``<linux/fs.h>`` which
  identifies the amount of NUL-padding to use when encrypting
  filenames. If unsure, use FS_POLICY_FLAGS_PAD_32 (0x3).

- ``master_key_descriptor`` specifies how to find the master key in
  the keyring; see `Adding keys`_. It is up to userspace to choose a
  unique ``master_key_descriptor`` for each master key. The e4crypt
  and fscrypt tools use the first 8 bytes of
  ``SHA-512(SHA-512(master_key))``, but this particular scheme is not
  required. Also, the master key need not be in the keyring yet when
  FS_IOC_SET_ENCRYPTION_POLICY is executed. However, it must be added
  before any files can be created in the encrypted directory.
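The initialization rules above could be applied as in the following sketch. To keep it self-contained it uses a local mirror of the structure (``struct example_policy``, a hypothetical name) and writes the "if unsure" values as the numeric literals quoted in the list; a real program would include ``<linux/fs.h>``, use :c:type:`struct fscrypt_policy` and the named constants, and pass the result to the FS_IOC_SET_ENCRYPTION_POLICY ioctl on a file descriptor for the empty directory.

```c
#include <stdint.h>
#include <string.h>

#define EX_KEY_DESCRIPTOR_SIZE 8

/* Local mirror of struct fscrypt_policy as shown above, for illustration. */
struct example_policy {
	uint8_t version;
	uint8_t contents_encryption_mode;
	uint8_t filenames_encryption_mode;
	uint8_t flags;
	uint8_t master_key_descriptor[EX_KEY_DESCRIPTOR_SIZE];
};

/* Fill in the recommended defaults for a given 8-byte key descriptor. */
static void example_init_policy(struct example_policy *p,
				const uint8_t desc[EX_KEY_DESCRIPTOR_SIZE])
{
	memset(p, 0, sizeof(*p));
	p->version = 0;
	p->contents_encryption_mode = 1;	/* FS_ENCRYPTION_MODE_AES_256_XTS */
	p->filenames_encryption_mode = 4;	/* FS_ENCRYPTION_MODE_AES_256_CTS */
	p->flags = 0x3;				/* FS_POLICY_FLAGS_PAD_32 */
	memcpy(p->master_key_descriptor, desc, EX_KEY_DESCRIPTOR_SIZE);
}
```

How the descriptor bytes are chosen is left to the caller, since the text only requires that each master key gets a unique descriptor.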

If the file is not yet encrypted, then FS_IOC_SET_ENCRYPTION_POLICY
verifies that the file is an empty directory. If so, the specified
encryption policy is assigned to the directory, turning it into an
encrypted directory. After that, and after providing the
corresponding master key as described in `Adding keys`_, all regular
files, directories (recursively), and symlinks created in the
directory will be encrypted, inheriting the same encryption policy.
The filenames in the directory's entries will be encrypted as well.

Alternatively, if the file is already encrypted, then
FS_IOC_SET_ENCRYPTION_POLICY validates that the specified encryption
policy exactly matches the actual one. If they match, then the ioctl
returns 0. Otherwise, it fails with EEXIST. This works on both
regular files and directories, including nonempty directories.

Note that the ext4 filesystem does not allow the root directory to be
encrypted, even if it is empty. Users who want to encrypt an entire
filesystem with one key should consider using dm-crypt instead.

FS_IOC_SET_ENCRYPTION_POLICY can fail with the following errors:

- ``EACCES``: the file is not owned by the process's uid, nor does the
  process have the CAP_FOWNER capability in a namespace with the file
  owner's uid mapped
- ``EEXIST``: the file is already encrypted with an encryption policy
  different from the one specified
- ``EINVAL``: an invalid encryption policy was specified (invalid
  version, mode(s), or flags)
- ``ENOTDIR``: the file is unencrypted and is a regular file, not a
  directory
- ``ENOTEMPTY``: the file is unencrypted and is a nonempty directory
- ``ENOTTY``: this type of filesystem does not implement encryption
- ``EOPNOTSUPP``: the kernel was not configured with encryption
  support for this filesystem, or the filesystem superblock has not
  had encryption enabled on it. (For example, to use encryption on an
  ext4 filesystem, CONFIG_EXT4_ENCRYPTION must be enabled in the
  kernel config, and the superblock must have had the "encrypt"
  feature flag enabled using ``tune2fs -O encrypt`` or ``mkfs.ext4 -O
  encrypt``.)
- ``EPERM``: this directory may not be encrypted, e.g. because it is
  the root directory of an ext4 filesystem
- ``EROFS``: the filesystem is readonly
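A caller might turn these error codes into user-facing messages along the following lines. This is a sketch only: the function name and the message wording are hypothetical, and the mapping simply restates the list above for the ``errno`` value left by a failed FS_IOC_SET_ENCRYPTION_POLICY call.

```c
#include <errno.h>

/* Map an errno value from FS_IOC_SET_ENCRYPTION_POLICY to a short
 * explanation, following the error list above. */
static const char *example_explain_set_policy_error(int err)
{
	switch (err) {
	case EACCES:     return "not the file owner and no CAP_FOWNER";
	case EEXIST:     return "already encrypted with a different policy";
	case EINVAL:     return "invalid policy version, mode(s), or flags";
	case ENOTDIR:    return "unencrypted regular file, not a directory";
	case ENOTEMPTY:  return "unencrypted directory is not empty";
	case ENOTTY:     return "filesystem type does not implement encryption";
	case EOPNOTSUPP: return "encryption not enabled in kernel or superblock";
	case EPERM:      return "this directory may not be encrypted";
	case EROFS:      return "filesystem is read-only";
	default:         return "unexpected error";
	}
}
```

Distinguishing ``ENOTTY`` from ``EOPNOTSUPP`` is useful in practice: the former means the filesystem type can never be used, while the latter can often be fixed by enabling the feature.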

Getting an encryption policy
----------------------------

The FS_IOC_GET_ENCRYPTION_POLICY ioctl retrieves the :c:type:`struct
fscrypt_policy`, if any, for a directory or regular file. See above
for the struct definition. No additional permissions are required
beyond the ability to open the file.

FS_IOC_GET_ENCRYPTION_POLICY can fail with the following errors:

- ``EINVAL``: the file is encrypted, but it uses an unrecognized
  encryption context format
- ``ENODATA``: the file is not encrypted
- ``ENOTTY``: this type of filesystem does not implement encryption
- ``EOPNOTSUPP``: the kernel was not configured with encryption
  support for this filesystem

Note: if you only need to know whether a file is encrypted or not, on
most filesystems it is also possible to use the FS_IOC_GETFLAGS ioctl
and check for FS_ENCRYPT_FL, or to use the statx() system call and
check for STATX_ATTR_ENCRYPTED in stx_attributes.

Getting the per-filesystem salt
-------------------------------

Some filesystems, such as ext4 and F2FS, also support the deprecated
ioctl FS_IOC_GET_ENCRYPTION_PWSALT. This ioctl retrieves a randomly
generated 16-byte value stored in the filesystem superblock. This
value is intended to be used as a salt when deriving an encryption key
from a passphrase or other low-entropy user credential.

FS_IOC_GET_ENCRYPTION_PWSALT is deprecated. Instead, prefer to
generate and manage any needed salt(s) in userspace.

Adding keys
-----------

To provide a master key, userspace must add it to an appropriate
keyring using the add_key() system call (see:
``Documentation/security/keys/core.rst``). The key type must be
"logon"; keys of this type are kept in kernel memory and cannot be
read back by userspace. The key description must be "fscrypt:"
followed by the 16-character lower case hex representation of the
``master_key_descriptor`` that was set in the encryption policy. The
key payload must conform to the following structure::

    #define FS_MAX_KEY_SIZE 64

    struct fscrypt_key {
            u32 mode;
            u8 raw[FS_MAX_KEY_SIZE];
            u32 size;
    };

``mode`` is ignored; just set it to 0. The actual key is provided in
``raw`` with ``size`` indicating its size in bytes. That is, the
bytes ``raw[0..size-1]`` (inclusive) are the actual key.
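Preparing the two arguments for add_key() could look like the sketch below. The struct is a local mirror of the payload layout shown above and all ``example_``/``EX_`` names are hypothetical; the code only builds the description string and the payload, and leaves the actual add_key() call with key type "logon" to the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define EX_KEY_DESCRIPTOR_SIZE 8
#define EX_MAX_KEY_SIZE 64
#define EX_DESCRIPTION_SIZE 25	/* "fscrypt:" + 16 hex digits + NUL */

/* Local mirror of struct fscrypt_key as shown above, for illustration. */
struct example_fscrypt_key {
	uint32_t mode;
	uint8_t raw[EX_MAX_KEY_SIZE];
	uint32_t size;
};

/* Build "fscrypt:" followed by the lower-case hex of the descriptor. */
static void example_key_description(const uint8_t desc[EX_KEY_DESCRIPTOR_SIZE],
				    char out[EX_DESCRIPTION_SIZE])
{
	int n = snprintf(out, EX_DESCRIPTION_SIZE, "fscrypt:");

	for (int i = 0; i < EX_KEY_DESCRIPTOR_SIZE; i++)
		n += snprintf(out + n, (size_t)(EX_DESCRIPTION_SIZE - n),
			      "%02x", desc[i]);
}

/* Fill the payload; returns 0 on success or -1 if the key is too long. */
static int example_fill_key(struct example_fscrypt_key *key,
			    const uint8_t *raw, size_t size)
{
	if (size > EX_MAX_KEY_SIZE)
		return -1;
	memset(key, 0, sizeof(*key));
	key->mode = 0;			/* ignored by the kernel */
	memcpy(key->raw, raw, size);
	key->size = (uint32_t)size;
	return 0;
}
```

A real program should also wipe its own copies of the raw key once the key has been added, since the kernel only sanitizes the in-kernel copy.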

The key description prefix "fscrypt:" may alternatively be replaced
with a filesystem-specific prefix such as "ext4:". However, the
filesystem-specific prefixes are deprecated and should not be used in
new programs.

There are several different types of keyrings in which encryption keys
may be placed, such as a session keyring, a user session keyring, or a
user keyring. Each key must be placed in a keyring that is "attached"
to all processes that might need to access files encrypted with it, in
the sense that request_key() will find the key. Generally, if only
processes belonging to a specific user need to access a given
encrypted directory and no session keyring has been installed, then
that directory's key should be placed in that user's user session
keyring or user keyring. Otherwise, a session keyring should be
installed if needed, and the key should be linked into that session
keyring, or in a keyring linked into that session keyring.

Note: introducing the complex visibility semantics of keyrings here
was arguably a mistake --- especially given that by design, after any
process successfully opens an encrypted file (thereby setting up the
per-file key), possessing the keyring key is not actually required for
any process to read/write the file until its in-memory inode is
evicted. In the future there probably should be a way to provide keys
directly to the filesystem instead, which would make the intended
semantics clearer.

Access semantics
================

With the key
------------

With the encryption key, encrypted regular files, directories, and
symlinks behave very similarly to their unencrypted counterparts ---
after all, the encryption is intended to be transparent. However,
astute users may notice some differences in behavior:

- Unencrypted files, or files encrypted with a different encryption
  policy (i.e. different key, modes, or flags), cannot be renamed or
  linked into an encrypted directory; see `Encryption policy
  enforcement`_. Attempts to do so will fail with EPERM. However,
  encrypted files can be renamed within an encrypted directory, or
  into an unencrypted directory.

- Direct I/O is not supported on encrypted files. Attempts to use
  direct I/O on such files will fall back to buffered I/O.

- The fallocate operations FALLOC_FL_COLLAPSE_RANGE,
  FALLOC_FL_INSERT_RANGE, and FALLOC_FL_ZERO_RANGE are not supported
  on encrypted files and will fail with EOPNOTSUPP.

- Online defragmentation of encrypted files is not supported. The
  EXT4_IOC_MOVE_EXT and F2FS_IOC_MOVE_RANGE ioctls will fail with
  EOPNOTSUPP.

- The ext4 filesystem does not support data journaling with encrypted
  regular files. It will fall back to ordered data mode instead.

- DAX (Direct Access) is not supported on encrypted files.

- The st_size of an encrypted symlink will not necessarily give the
  length of the symlink target as required by POSIX. It will actually
  give the length of the ciphertext, which will be slightly longer
  than the plaintext due to NUL-padding and an extra 2-byte overhead.

- The maximum length of an encrypted symlink is 2 bytes shorter than
  the maximum length of an unencrypted symlink. For example, on an
  ext4 filesystem with a 4K block size, unencrypted symlinks can be up
  to 4095 bytes long, while encrypted symlinks can only be up to 4093
  bytes long (both lengths excluding the terminating null).

Note that mmap *is* supported. This is possible because the pagecache
for an encrypted file contains the plaintext, not the ciphertext.

Without the key
---------------

Some filesystem operations may be performed on encrypted regular
files, directories, and symlinks even before their encryption key has
been provided:

- File metadata may be read, e.g. using stat().

- Directories may be listed, in which case the filenames will be
  listed in an encoded form derived from their ciphertext. The
  current encoding algorithm is described in `Filename hashing and
  encoding`_. The algorithm is subject to change, but it is
  guaranteed that the presented filenames will be no longer than
  NAME_MAX bytes, will not contain the ``/`` or ``\0`` characters, and
  will uniquely identify directory entries.

  The ``.`` and ``..`` directory entries are special. They are always
  present and are not encrypted or encoded.

- Files may be deleted. That is, nondirectory files may be deleted
  with unlink() as usual, and empty directories may be deleted with
  rmdir() as usual. Therefore, ``rm`` and ``rm -r`` will work as
  expected.

- Symlink targets may be read and followed, but they will be presented
  in encrypted form, similar to filenames in directories. Hence, they
  are unlikely to point to anywhere useful.

Without the key, regular files cannot be opened or truncated.
Attempts to do so will fail with ENOKEY. This implies that any
regular file operations that require a file descriptor, such as
read(), write(), mmap(), fallocate(), and ioctl(), are also forbidden.

Also without the key, files of any type (including directories) cannot
be created or linked into an encrypted directory, nor can a name in an
encrypted directory be the source or target of a rename, nor can an
O_TMPFILE temporary file be created in an encrypted directory. All
such operations will fail with ENOKEY.

It is not currently possible to back up and restore encrypted files
without the encryption key. This would require special APIs which
have not yet been implemented.

Encryption policy enforcement
=============================

After an encryption policy has been set on a directory, all regular
files, directories, and symbolic links created in that directory
(recursively) will inherit that encryption policy. Special files ---
that is, named pipes, device nodes, and UNIX domain sockets --- will
not be encrypted.

Except for those special files, it is forbidden to have unencrypted
files, or files encrypted with a different encryption policy, in an
encrypted directory tree. Attempts to link or rename such a file into
an encrypted directory will fail with EPERM. This is also enforced
during ->lookup() to provide limited protection against offline
attacks that try to disable or downgrade encryption in known locations
where applications may later write sensitive data. It is recommended
that systems implementing a form of "verified boot" take advantage of
this by validating all top-level encryption policies prior to access.

Implementation details
======================

Encryption context
------------------

An encryption policy is represented on-disk by a :c:type:`struct
fscrypt_context`. It is up to individual filesystems to decide where
to store it, but normally it would be stored in a hidden extended
attribute. It should *not* be exposed by the xattr-related system
calls such as getxattr() and setxattr() because of the special
semantics of the encryption xattr. (In particular, there would be
much confusion if an encryption policy were to be added to or removed
from anything other than an empty directory.) The struct is defined
as follows::

    #define FS_KEY_DESCRIPTOR_SIZE 8
    #define FS_KEY_DERIVATION_NONCE_SIZE 16

    struct fscrypt_context {
            u8 format;
            u8 contents_encryption_mode;
            u8 filenames_encryption_mode;
            u8 flags;
            u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
            u8 nonce[FS_KEY_DERIVATION_NONCE_SIZE];
    };

Note that :c:type:`struct fscrypt_context` contains the same
information as :c:type:`struct fscrypt_policy` (see `Setting an
encryption policy`_), except that :c:type:`struct fscrypt_context`
also contains a nonce. The nonce is randomly generated by the kernel
and is used to derive the inode's encryption key as described in
`Per-file keys`_.

Data path changes
-----------------

For the read path (->readpage()) of regular files, filesystems can
read the ciphertext into the page cache and decrypt it in-place. The
page lock must be held until decryption has finished, to prevent the
page from becoming visible to userspace prematurely.

For the write path (->writepage()) of regular files, filesystems
cannot encrypt data in-place in the page cache, since the cached
plaintext must be preserved. Instead, filesystems must encrypt into a
temporary buffer or "bounce page", then write out the temporary
buffer. Some filesystems, such as UBIFS, already use temporary
buffers regardless of encryption. Other filesystems, such as ext4 and
F2FS, have to allocate bounce pages specially for encryption.

Filename hashing and encoding
-----------------------------

Modern filesystems accelerate directory lookups by using indexed
directories. An indexed directory is organized as a tree keyed by
filename hashes. When a ->lookup() is requested, the filesystem
normally hashes the filename being looked up so that it can quickly
find the corresponding directory entry, if any.

With encryption, lookups must be supported and efficient both with and
without the encryption key. Clearly, it would not work to hash the
plaintext filenames, since the plaintext filenames are unavailable
without the key. (Hashing the plaintext filenames would also make it
impossible for the filesystem's fsck tool to optimize encrypted
directories.) Instead, filesystems hash the ciphertext filenames,
i.e. the bytes actually stored on-disk in the directory entries. When
asked to do a ->lookup() with the key, the filesystem just encrypts
the user-supplied name to get the ciphertext.

Lookups without the key are more complicated. The raw ciphertext may
contain the ``\0`` and ``/`` characters, which are illegal in
filenames. Therefore, readdir() must base64-encode the ciphertext for
presentation. For most filenames, this works fine; on ->lookup(), the
filesystem just base64-decodes the user-supplied name to get back to
the raw ciphertext.

However, for very long filenames, base64 encoding would cause the
filename length to exceed NAME_MAX. To prevent this, readdir()
actually presents long filenames in an abbreviated form which encodes
a strong "hash" of the ciphertext filename, along with the optional
filesystem-specific hash(es) needed for directory lookups. This
allows the filesystem to still, with a high degree of confidence, map
the filename given in ->lookup() back to a particular directory entry
that was previously listed by readdir(). See :c:type:`struct
fscrypt_digested_name` in the source for more details.
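The length problem is simple arithmetic, sketched below. The helper names are hypothetical, and the calculation assumes an unpadded base64-style encoding that emits one character per 6 bits of ciphertext. It only shows where a plain encoding stops fitting in NAME_MAX; it says nothing about the length at which the kernel actually switches to the abbreviated form.

```c
#include <stddef.h>

#define EX_NAME_MAX 255

/* Characters needed to encode n bytes at 6 bits per character,
 * without padding: ceil(4n / 3). */
static size_t example_base64_len(size_t n)
{
	return (4 * n + 2) / 3;
}

/* Nonzero if an n-byte ciphertext name still fits in NAME_MAX once encoded. */
static int example_encoded_name_fits(size_t n)
{
	return example_base64_len(n) <= EX_NAME_MAX;
}
```

Under this assumption a 191-byte ciphertext encodes to exactly 255 characters, while a full 255-byte ciphertext would need 340, which is why long names cannot be presented in plain encoded form.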

Note that the precise way that filenames are presented to userspace
without the key is subject to change in the future. It is only meant
as a way to temporarily present valid filenames so that commands like
``rm -r`` work as expected on encrypted directories.