Problem Statement
App Instances deployed by EVE customers receive and process business-sensitive information from sensors and the Cloud. Data collected and processed by these App Instances is stored in their virtual storage, which is backed by the hardware storage on the EVE platform. The data must remain secure even if the secondary storage drive is stolen, so it should be stored in encrypted form. Here we explore the tools available in Linux that support disk encryption and compare their performance on the EVE platform.
Native EXT4 Encryption (e4crypt/fscrypt)
Starting with Linux kernel 4.1, the EXT4 filesystem supports native encryption: the encryption functionality is built into the filesystem and works closely with the internal implementation of EXT4. The e4crypt tool is used to manage EXT4 encryption. More recently, Google introduced the fscrypt tool as a higher-level alternative to e4crypt, with additional functionality such as better key management (PAM plugins) and support for changing the encryption keys periodically. While e4crypt is available as a package in Alpine Linux, fscrypt is yet to be validated officially on Alpine Linux, so it has to be built and packaged explicitly in EVE. Native encryption applies only to a given directory under a partition, not to the partition as a whole. The kernel needs to be configured with CONFIG_EXT4_FS_ENCRYPTION to use this feature.
Overlay Encryption (eCryptfs)
Unlike native encryption, eCryptfs does not assume any particular filesystem. Instead, it works on top of an existing filesystem and provides a virtual mount point. Files written to this virtual mount point are encrypted and stored in the underlying filesystem. Like native encryption, it supports encryption only for a given directory under a partition. ecryptfs-utils is the official package supported by Alpine Linux and contains the commands to manage eCryptfs. The kernel needs to be configured with CONFIG_ECRYPT_FS to use this feature.
Disk or Partition level Encryption (dm-crypt/LUKS)
dm-crypt with LUKS encrypts at the level of a whole disk partition. Through the kernel device mapper, it exposes a virtual block device on top of a physical partition (or an LVM logical volume). This mapped device must be set up for encryption before a filesystem is created on it and mounted. The user writes to the mapped device, and the data is encrypted before it is stored on the underlying physical disk partition. cryptsetup is the widely used userspace tool for configuring LUKS encryption, and comes by default on Alpine. dm-crypt is enabled by default in the EVE kernel.
Adiantum
Adiantum is a new encryption method from Google, used on Android to improve disk encryption performance on low-end ARM processors that lack dedicated CPU instructions for encryption. Adiantum is not a new filesystem but a new cipher, intended as a replacement for the CPU-intensive AES cipher. An option to use Adiantum has been added to dm-crypt/cryptsetup and fscrypt. The Adiantum cipher is available in the Linux kernel from version 5.0 onwards.
The current EVE kernel version is 4.19, so we need to upgrade to kernel 5.0 or later to experiment with Adiantum on EVE and measure its performance. This effort is underway: we still need to run fscrypt in a DomU with a 5.2.2 kernel on EVE and collect the performance numbers.
Performance Measurement
Given the choices available for file encryption, it was decided to prototype each of them on one of the EVE platforms and measure how they fare against each other. Flexible IO Tester (FIO) is a widely used Linux tool for measuring disk IO performance. One set of tests was run with FIO on Dom0 (i.e. the Base OS), and another set was run with FIO inside a DomU (VM) whose image is served from an encrypted disk on Dom0. The details of the experiment are as follows:
EVE Model used: Advantech ARK 1124 (Dom0 tests) and Supermicro E100-9APP (DomU tests)
FIO test specification: 60% read, 40% write, 1GB file, 4 worker threads, 90-second test duration
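The test specification above can be written down as an fio job file. The following is a minimal sketch that renders such a job file from Python. The source does not say whether the IO pattern was random or sequential, so `randrw` is an assumption, as are the job name `encryption-overhead` and the target file path, which are purely illustrative.

```python
def fio_job(filename, read_pct=60, size="1g", jobs=4, runtime_s=90, pattern="randrw"):
    """Render an fio job file for a mixed read/write test.

    pattern="randrw" (random mixed IO) is an assumption; use "rw" for
    sequential mixed IO if that matches the original experiment.
    """
    lines = [
        "[encryption-overhead]",      # hypothetical job name
        f"filename={filename}",       # file under test (illustrative path)
        f"rw={pattern}",              # mixed read/write workload
        f"rwmixread={read_pct}",      # 60% reads, remainder writes
        f"size={size}",               # 1 GB test file
        f"numjobs={jobs}",            # 4 worker threads
        f"runtime={runtime_s}",       # run for 90 seconds
        "time_based",                 # keep running until runtime expires
        "group_reporting",            # aggregate results across workers
    ]
    return "\n".join(lines) + "\n"


# Example: a job file targeting a file in the directory being measured.
print(fio_job("/persist/vault/fio-test.dat"))
```

The rendered text can be saved to a file and passed to `fio` as its job file argument. Running the same job against an unencrypted directory gives the baseline for comparison.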
Performance reports of fscrypt, eCryptfs and LUKS have been uploaded as attachments.
Performance Overhead Comparison (on Dom0)
Metric | No Encryption | fscrypt/e4crypt | eCryptfs | dm-crypt (LUKS) |
Read rate (MB/s) | 23.7 | 20.4 | 19 | 19.9 |
Write rate (MB/s) | 15.9 | 13.6 | 12.7 | 13.3 |
Read overhead % | 0 | 13.9% | 19.80% | 16.00% |
Write overhead % | 0 | 14.40% | 20.10% | 16.35% |
Performance Overhead Comparison (on DomU)
Metric | No Encryption | fscrypt/e4crypt | eCryptfs | dm-crypt (LUKS) |
Read rate (MB/s) | 15.4 | 15.2 | 12.3 | 13.8 |
Write rate (MB/s) | 10.3 | 10.1 | 8.19 | 9.19 |
Read overhead % | 0 | 1.20% | 20.10% | 10.30% |
Write overhead % | 0 | 1.90% | 20.40% | 10.70% |
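The overhead rows in the tables above appear to be the relative drop in throughput against the unencrypted baseline. A minimal sketch of that calculation follows. The function name `overhead_pct` is ours, and a few table entries differ from this formula by about 0.1 percentage point, presumably because the published rates are rounded.

```python
def overhead_pct(baseline_mbps, encrypted_mbps):
    """Relative throughput loss versus the unencrypted baseline, in percent."""
    return (baseline_mbps - encrypted_mbps) / baseline_mbps * 100.0


# Dom0 read rates taken from the first table (baseline: 23.7 MB/s).
dom0_read = {"fscrypt/e4crypt": 20.4, "eCryptfs": 19.0, "dm-crypt (LUKS)": 19.9}
for name, rate in dom0_read.items():
    print(f"{name}: {overhead_pct(23.7, rate):.1f}%")
```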
Based on the above performance comparison, fscrypt (preferably with Adiantum) will be used to implement data encryption on EVE.
For detailed design aspects of fscrypt, the reader is strongly advised to read this fscrypt architecture document, and get familiar with the nomenclature used, more importantly the policy and protector constructs.
Implementation Details
Directory layout
Currently most of the persistent and sensitive data is stored under /dev/sda9 partition, which is mounted at /persist.
E.g.
/persist/img - Stores the mutated DomU image disks
/persist/config - Stores the device configuration
/persist/checkpoint - Stores last known working device configuration
However, there are other directories which are not considered sensitive:
E.g.
/persist/IMGA/ - Logs from IMGA
/persist/IMGB/ - Logs from IMGB
Therefore the following approach is proposed:
Create a new directory /persist/vault and set it up for encryption. All sensitive information can be moved to a subdirectory under /persist/vault.
E.g.
/persist/img to move to /persist/vault/img
/persist/config to move to /persist/vault/config
Also, /dev/sda9 will move to EXT4 from EXT3, to take advantage of EXT4 native file system encryption.
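The relocation of sensitive directories can be sketched as a simple path mapping. This is only an illustration: the set `SENSITIVE_DIRS` contains just the two directories named in the move list above, and the helper `vault_path` is a hypothetical name, not part of EVE.

```python
from pathlib import PurePosixPath

PERSIST = PurePosixPath("/persist")
VAULT = PERSIST / "vault"

# Only the directories named in the move list above; extend as needed.
SENSITIVE_DIRS = {"img", "config"}


def vault_path(path):
    """Map a path under /persist to its location after the vault is introduced.

    Paths under a sensitive top-level directory move below /persist/vault;
    everything else (e.g. the IMGA/IMGB logs) stays where it is.
    Raises ValueError for paths outside /persist.
    """
    p = PurePosixPath(path)
    rel = p.relative_to(PERSIST)
    if rel.parts and rel.parts[0] in SENSITIVE_DIRS:
        return VAULT / rel
    return p


print(vault_path("/persist/img/app1.qcow2"))  # moved under the vault
print(vault_path("/persist/IMGA/log.txt"))    # left in place
```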
Setting up Encryption
At the time of first use, the /dev/sda9 partition will be prepared for encryption following the “fscrypt setup” instructions. fscrypt stores its global configuration in /etc/fscrypt.conf. Since /etc/ will be read-only at runtime, a static /etc/fscrypt.conf will be packaged into the image.
Specifically, /dev/sda9 will be prepared for EXT4 native encryption with the following two steps:
mkfs.ext4 -O encrypt /dev/sda9
and post mount,
fscrypt setup /persist
Finally, the vault directory will be enabled for encryption (fscrypt encrypt /persist/vault), with a randomly generated hex key passed as the passphrase for the protector used for /persist/vault. This passphrase can be treated as the key that unwraps the protector key, which in turn unwraps the policy key. The policy key is the key used for the actual encryption.
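The setup sequence above can be sketched as a list of commands plus a random hex passphrase. This is a sketch under assumptions: the `mount` and `mkdir` steps, the protector name `vault`, and the use of `--source=custom_passphrase`, `--name` and `--quiet` are our choices for a non-interactive run and should be checked against the fscrypt version packaged in EVE. The commands are only built here, not executed.

```python
import secrets


def new_passphrase(nbytes=32):
    """Random hex passphrase for the protector (2 hex characters per byte)."""
    return secrets.token_hex(nbytes)


def setup_commands(device="/dev/sda9", mountpoint="/persist",
                   vault="/persist/vault", protector_name="vault"):
    """First-use command sequence, in order. protector_name is hypothetical."""
    return [
        ["mkfs.ext4", "-O", "encrypt", device],   # enable the encrypt feature
        ["mount", device, mountpoint],            # assumed mount step
        ["fscrypt", "setup", mountpoint],         # create fscrypt metadata
        ["mkdir", "-p", vault],                   # directory must exist and be empty
        ["fscrypt", "encrypt", vault,
         "--source=custom_passphrase",            # passphrase-based protector
         f"--name={protector_name}",
         "--quiet"],                              # passphrase is read from stdin
    ]


for argv in setup_commands():
    print(" ".join(argv))
```

The passphrase from `new_passphrase()` would be written to the standard input of the final `fscrypt encrypt` command, and then handed to whatever mechanism protects it across reboots.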
Unlocking on Genuine Boot-up
On every boot, /persist/vault must be unlocked before its files and subdirectories can be accessed. To do this, we feed the passphrase to fscrypt, which unlocks the policy keys and adds them to the kernel keyring. On platforms with a TPM, the TPM can be used to protect this passphrase: it unseals the passphrase only if a set of PCRs holds the expected values, which provides a way to unlock the directory only on a legitimate, untampered boot.
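The boot-time unlock can be sketched as follows. The callable `get_passphrase` is a hypothetical hook standing in for whatever releases the passphrase (for example a TPM unseal), and the `run` parameter exists only so the sketch can be exercised without fscrypt installed. Passing the passphrase on standard input with `--quiet` is an assumption about how fscrypt is driven non-interactively.

```python
import subprocess


def unlock_vault(get_passphrase, vault="/persist/vault", run=subprocess.run):
    """Unlock the encrypted directory; returns True on success.

    get_passphrase: hypothetical callable returning the passphrase string.
    run: injectable runner with the same interface as subprocess.run.
    """
    passphrase = get_passphrase()
    result = run(
        ["fscrypt", "unlock", vault, "--quiet"],
        input=passphrase + "\n",   # fed on stdin, never placed on the command line
        text=True,
        capture_output=True,
    )
    return result.returncode == 0
```

Keeping the passphrase off the command line avoids exposing it through the process list.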
Protecting the Master Passphrase
For disk encryption to serve its purpose, the keys used to encrypt the data must be kept secure. On end-user devices, user input is usually requested during bootup to unlock the keys (e.g. a BitLocker password, or a PIN/fingerprint/pattern on smartphones). For Edge devices, this is not an option. On the other hand, we can’t store the keys on the hard drive, because anyone who steals the drive would then have the keys as well, which defeats the purpose.
Therefore it is proposed that TPM is used for sealing the master passphrase against a set of PCR values. With this mechanism, if the hard disk is lost, the keys are still not available on the hard disk, so there will not be a way to decrypt the data on the disk.
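To illustrate why sealing against PCR values ties the passphrase to an untampered boot, here is a toy model in plain Python. It is not a TPM API: a real TPM keeps the secret inside the hardware and enforces the policy itself. The names `extend`, `boot_pcr` and `ToySealedBlob` are ours, and the boot components are made-up placeholders.

```python
import hashlib
import hmac


def extend(pcr, measurement):
    """PCR extend: new value = SHA-256(old value || SHA-256(measurement))."""
    return hashlib.sha256(pcr + hashlib.sha256(measurement).digest()).digest()


def boot_pcr(components):
    """Fold a sequence of boot components into one PCR value, starting from zeros."""
    pcr = bytes(32)
    for component in components:
        pcr = extend(pcr, component)
    return pcr


class ToySealedBlob:
    """Toy stand-in for a TPM-sealed object bound to an expected PCR value."""

    def __init__(self, secret, expected_pcr):
        self._secret = secret
        self._expected = expected_pcr

    def unseal(self, current_pcr):
        # Release the secret only if the boot measurements match.
        if not hmac.compare_digest(self._expected, current_pcr):
            raise PermissionError("PCR mismatch: refusing to release passphrase")
        return self._secret


good_boot = [b"firmware-v1", b"bootloader-v1", b"kernel-v1"]  # placeholder data
blob = ToySealedBlob("deadbeef", boot_pcr(good_boot))
print(blob.unseal(boot_pcr(good_boot)))  # same measurements: passphrase released
```

Because each extend hashes the previous value, changing any component, or the order of components, yields a different final PCR value and the unseal is refused.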
On platforms without TPM support, we would need to store the key on the hard disk. TBD: Should we even support disk encryption on platforms without TPM?
Suggestion: If we encrypt, we always encrypt, TPM or not.
- TPM device: as above, depends on PCR registers (given the constraints in the earlier comments)
- Non-TPM: Encryption key stored in Controller.
At initial startup, the device registers, and the cloud generates an encryption key and sends it to the device. The device does not persist this key; it keeps it in memory only. On the next boot, the device retrieves the key from the cloud via an API call. At first blush, this doesn't do much, since the device's identity for authenticating to the cloud is still based on something on the local disk/flash. It does, however, have the following advantages:
- Stealing the disk/device is of little use by itself, since what is stored on it is useful only for authenticating against the Controller.
- We can strengthen the device-to-controller auth by adding the serial (we already do that at registration), which won't be known unless someone steals the entire device, or by adding something hardware-specific that is read from the firmware on every boot and auth, not just at first boot/registration.
- Most importantly, we have a "kill switch". It doesn't prevent the disk from being stolen, but it adds an extra step that goes through the Controller, which gives us a place to deactivate a device (or even a whole class of devices).
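The controller-held key flow, including the kill switch, can be sketched with a toy in-memory model. All names here (`ToyController`, `register`, `fetch_key`, `deactivate`) are hypothetical and do not describe an existing Controller API; device authentication is reduced to presenting a serial for brevity.

```python
import secrets


class ToyController:
    """In-memory stand-in for the Controller side of the key flow."""

    def __init__(self):
        self._keys = {}         # serial -> encryption key
        self._disabled = set()  # serials hit by the kill switch

    def register(self, serial):
        """First startup: generate a key for the device and return it."""
        self._keys[serial] = secrets.token_hex(32)
        return self._keys[serial]

    def fetch_key(self, serial):
        """Every later boot: hand out the key unless the device is deactivated."""
        if serial in self._disabled or serial not in self._keys:
            raise PermissionError("device unknown or deactivated")
        return self._keys[serial]

    def deactivate(self, serial):
        """Kill switch: the key is no longer released to this device."""
        self._disabled.add(serial)


controller = ToyController()
key_at_registration = controller.register("SN-0001")  # kept in memory only
key_at_next_boot = controller.fetch_key("SN-0001")    # fetched again after reboot
print(key_at_registration == key_at_next_boot)
```

After `controller.deactivate("SN-0001")`, a later `fetch_key` call raises `PermissionError`, so a stolen device can no longer obtain the key needed to unlock its data.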
Master Key Rotation
fscrypt supports changing a protector's passphrase without re-encrypting the files in the encrypted directory. If we suspect that a protector passphrase has been compromised, we can change it. In the future, we can consider rotating the protector passphrases periodically (say, every week) as a precautionary measure.
Recovery Procedure
It is possible to obtain a recovery key for the protector that protects the policy. Given the recovery key, we can reconstruct the protector key, unwrap the policy key with it, and thereby decrypt the contents of the encrypted directory. fscrypt has yet to implement the “fscrypt recover” series of commands that would expose this on the command line. Until then, we can implement an offline tool, based on modified fscrypt code, that decrypts the directory given the recovery key.
After setting up encryption, the device would send the recovery key to the Cloud Controller, which stores it in the vault in the Cloud against the device certificate. If data needs to be recovered from a corrupted asset, this recovery key can be used.
Image Compatibility
New Installation
/dev/sda9 will be formatted with EXT4, and /persist/vault will be created and marked for encryption.
Upgrade of existing Installation to Image with Encryption
/dev/sda9 will be reformatted from EXT3 to EXT4, and the existing contents of /dev/sda9 will be lost. In this scenario, the device will behave as if it had been booted after a USB installation. All the DomU images and the device configuration will be downloaded again from the Controller.
However, the /persist/config/tpm_in_use file will be preserved across the EXT3 to EXT4 transition. This is done so that device software continues to see the TPM mode, and hence cloud connectivity is not lost during the transition.
Downgrade Implications
If the device is downgraded to an old image that does not support disk encryption, /dev/sda9 will be reformatted with EXT3 again, and the contents of /persist will need to be repopulated by downloading the configuration and images from the Controller. However, the /persist/config/tpm_in_use file may be lost, in which case connectivity to the cloud Controller may not come up. Recovering from this requires manual intervention: either populate the /persist/config/tpm_in_use file by hand, or delete the device.cert.pem file and reboot, which sets the tpm_in_use file automatically.
If this downgrade behavior is not acceptable, we need to explore options for doing disk encryption while keeping /dev/sda9 on an EXT3 filesystem. However, the performance of fscrypt on an EXT3 filesystem is not as good as that of native encryption on EXT4.