Release Notes
EVE is Edge Virtualization Engine
EVE aims to develop an open, agnostic, and standardized architecture unifying the approach to developing and orchestrating cloud-native applications across the enterprise on-premises edge. It offers users new levels of control through hardware-assisted virtualization of on-prem edge devices.
LATEST 8.10.0 Release https://github.com/lf-edge/eve/releases/tag/8.10.0
NEW:
wwan: Explicitly request IPv4 connection
Without explicitly asking for IPv4 (the only IP version we support for wwan), the connect request may fail on some LTE networks with:
error: couldn't start network: QMI protocol error (14): 'CallFailed'
call end reason (1): generic-unspecified
verbose call end reason (6,50): [3gpp] ipv4-only-allowed
The network returns this error to indicate that only PDN type IPv4 is allowed for the requested PDN connectivity (which is what we want, but we need to request it explicitly).
Support for ISO format
This PR adds support for the ISO format, so that ISO images can be attached to a VM to boot from or to install required data from. The PR also contains a small refactoring of volumemgr that makes zvol usage configurable for the zfs persist type.
Add new signing service at http://169.254.169.254/eve/v1/tpm/signer
Applications might want to get some application-specific data signed by EVE-OS so that they can verify it was indeed generated by an app instance running on a particular device.
Bootstrap config protobuf message + config timestamp
Also some high-level documentation is included. However, later there will be a separate markdown document with a detailed description of the newly proposed bootstrapping mechanism (once we figure out all the details).
Enable draid feature for persist pool
We had a bug caused by a mismatch between the libzfs and zfs module versions. Let's enable the draid feature, since we started with zfs 2.1.x, which supports it. With the feature disabled, we see errors from zpool status.
Build zfs libs and binaries in dom0-ztools
We should use the same version of libzfs as we use in the kernel module. Let's add a build of binaries for zfs into dom0-ztools.
Pillar with zfs files from dom0-ztools
We use zpool and zfs in pillar. Let's use binaries we built inside dom0-ztools.
Update functions in the ZFS package that use base.Exec() to get information
This commit changes the functions in the ZFS package where we used base.Exec() to get information.
After this commit, data will be collected through the go-libzfs.
FIX:
Fix PCIe BAR allocation on HPE m750
By default, Linux reassigns BAR addresses if there are devices with 64-bit addresses. However, on m750 it fails to assign BAR registers in some HW configurations, e.g. when a P1000 NVIDIA GPU is installed in slot 1. We just force using UEFI assignments in this case. We set it only for m750 to be on the safe side.
Use an explicit version of strongswan in Dockerfile and local file
pkg/strongswan: explicitly specify a version rather than just downloading from a link to the latest one. Note that we checked which version we are currently using via the md5 hash and used the same one. This PR does not change the version used, it only references it explicitly.
Use explicit busybox commit version in pkg/fw
pkg/fw: use an explicit FROM busybox@sha256:<hash> instead of just FROM busybox
Fix dom0-ztools version
It seems the version of dom0-ztools changed before PR #2746 was merged.
Fix live gcp target
The live-gcp target had a problem because it looked for disks in the wrong directory.
Some Edgeview enhancements and fixes
STATS:
GitHub: 359 (+1) stars DockerHub: 300k (+6397) pulls
Changelog: https://github.com/lf-edge/eve/compare/8.9.0...8.10.0
8.9.0 Release https://github.com/lf-edge/eve/releases/tag/8.9.0
NEW:
Do not defer on subsequent boot
On the first boot we want to defer until the EdgeNodeCerts have been published to the controller, but on a subsequent boot we need to proceed and use a checkpointed config. As part of this we make sure we do not attempt to restart the attestation if we have not yet tried.
Increase default turbo-mode clock to 1.8GHz
According to https://www.raspberrypi.com/documentation/computers/config_txt.html#arm_boost-raspberry-pi-4-only newer revisions of the Raspberry Pi 4B are equipped with a second switch-mode power supply for the SoC voltage rail, and this allows the default turbo-mode clock to be increased from 1.5GHz to 1.8GHz. This change should be safe for all such boards.
Remove alpine edge usage
We use alpine:edge in pkg/fw, which is suboptimal for controlling software versions. Let's switch to defined versions of the upstream repositories to grab blobs from. This reproduces the same logic as we had with alpine:edge.
Enable frequency control support for RPi4
It seems that without CONFIG_ARM_RASPBERRYPI_CPUFREQ we use 600MHz in all cases on RPi4.
Refactor verification of AuthContainer
This prepares for being able to checkpoint the received configuration with its AuthContainer wrapper.
Checkpoint EdgeDevConfig with AuthContainer
This means we can verify the signature when using the checkpoint.
Populate meta data API with edge node info
Returns enterprise, project and device information in http://169.254.169.254/eve/v1/network.json
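A minimal sketch of how an app instance might read this endpoint, assuming it answers a plain HTTP GET with a JSON body. The helper name `fetch_edge_node_info` and the injectable `opener` parameter are illustrative only; the JSON field names are not listed in these notes, so the sketch returns the decoded document unchanged.

```python
import json
import urllib.request

# URL taken from the release note above.
NETWORK_JSON_URL = "http://169.254.169.254/eve/v1/network.json"


def fetch_edge_node_info(opener=urllib.request.urlopen, timeout=5):
    """Fetch and decode the metadata document.

    Assumption: the endpoint serves JSON over a plain HTTP GET.
    `opener` is injectable so the helper can be exercised without a device.
    """
    with opener(NETWORK_JSON_URL, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Called without arguments from inside an app instance, the helper would query the metadata server directly.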
STATS:
GitHub: 358 (+0) stars DockerHub: 294394 (+4620) pulls
Changelog: https://github.com/lf-edge/eve/compare/8.8.0...8.9.0
8.8.0 Release https://github.com/lf-edge/eve/releases/tag/8.8.0
NEW:
Update of Eden tests
1. Sync version of eclient across tests
2. Add json format for info and metrics
3. Reduce metrics and config intervals to reduce load
4. Reduce load in switch_net_vlans by reducing apps and dnsmasq
5. Add root certificate to v2tlsbaseroot
6. Expand volumes test to check no space and recovery from no space
7. Update ROL to support logs
8. Update EVE-OS versions
Run only one EdenGCP at a time
We are limited in ROL devices, so we should limit concurrent runs of EdenGCP workflows. Using a concurrency group we will run one workflow at a time. According to GitHub’s docs: "When a concurrent job or workflow is queued, if another job or workflow using the same concurrency group in the repository is in progress, the queued job or workflow will be pending. Any previously pending job or workflow in the concurrency group will be canceled." But we use a snapshot version of EVE-OS, so it is expected behavior.
Use lock for PoolOpenAll
It seems the namespace_reload function is expected not to run concurrently on the same handler pointer, so we should use a lock for the iteration functions (PoolOpenAll and DatasetOpenAll).
wwan: configure MTU requested by the network
Currently, we leave the default MTU=1500 configured on the wwan interface. However, we should respect the MTU settings required by the network to which the modem has connected.
Remove chroot for zpool and zfs commands
We have zpool and zfs binaries in pillar, because they are required by snapshotter of the user containerd, so no need to chroot into hostfs.
Update device config API
Fetch DeviceName, DeviceId, ProjectId, ProjectName, EnterpriseName and EnterpriseId
FIX:
Fix IP subnet obtained for wwan interface
During netlink.AddrList() usage, we should check if the interface address is reported with a Peer and if it is, use the subnet mask from the peer. This is the case with Point-to-Point interfaces.
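The rule above can be sketched with Python's `ipaddress` module. This is not the Go code from the fix; `interface_subnet` and its CIDR-string arguments are hypothetical stand-ins for the address and Peer fields reported by netlink.

```python
import ipaddress


def interface_subnet(local, peer=None):
    """Return the subnet to report for an interface address.

    local: interface address in CIDR form, e.g. "10.64.1.5/32"
    peer:  optional peer address in CIDR form (point-to-point links)
    """
    local_if = ipaddress.ip_interface(local)
    if peer is None:
        # Ordinary interface: the address carries its own prefix length.
        return local_if.network
    # Point-to-point interface: take the prefix length from the peer entry.
    prefix_len = ipaddress.ip_interface(peer).network.prefixlen
    return ipaddress.ip_interface(f"{local_if.ip}/{prefix_len}").network
```

With a peer of `10.64.1.6/30`, a local address reported as `10.64.1.5/32` resolves to the `10.64.1.4/30` subnet instead of a single-host network.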
workflows: edenGCP: fix getting console log from RoL
This PR fixes an issue with getting logs from RoL.
STATS:
GitHub: 358 (+1) stars DockerHub: 294394 (+3076) pulls
Changelog: https://github.com/lf-edge/eve/compare/8.7.0...8.8.0
EVE 8.7.0 Release https://github.com/lf-edge/eve/releases/tag/8.7.0
NEW:
Check zvols existence in zfsmanager
We can see "Error converting ... to zfs zvol ...: qemu-img failed: exit status 1, qemu-img: Could not open '...': Could not open '...': No such file or directory". We rely on fsnotify and mdev to properly create symlinks to zvols, but it seems we should check for existence explicitly and publish only if the zvol exists. The mdev log indicates an add-remove-add sequence, so it seems we should enforce sequence handling for mdev. Also, we should not use a persistent publisher, as devices may change on reboot, and we should adjust the Key so that it does not contain '/', which leads to problems with publishing.
Update containerd runtime to v2
Containerd runtime v1 is deprecated and we use the v2 runtime for linuxkit services. We moved to runtime v2 in the pillar and replaced the v1 shim with an empty file to reduce space usage. Also, version 1 of config.toml is deprecated, so we moved to version 2.
Docs and errors about keys with slashes
We must not use keys with slashes or with a length more than allowed for the file name. We made errors more readable and added notes in docs.
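A minimal sketch of such a check, assuming keys end up as file names. The function name `validate_key` and the 255-byte limit are assumptions for illustration; the notes do not state the limit EVE enforces.

```python
# Assumed limit: 255 bytes is a common maximum file name length on Linux
# filesystems; the limit enforced by EVE is not stated in these notes.
MAX_FILE_NAME_BYTES = 255


def validate_key(key):
    """Return key unchanged if it is usable as a file name, else raise ValueError."""
    if not key:
        raise ValueError("key must not be empty")
    if "/" in key:
        raise ValueError(f"key {key!r} must not contain '/'")
    size = len(key.encode("utf-8"))
    if size > MAX_FILE_NAME_BYTES:
        raise ValueError(f"key is {size} bytes, limit is {MAX_FILE_NAME_BYTES}")
    return key
```

Raising a descriptive error at this point is one way to get the more readable messages the note mentions.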
Force unmount container rootfs
We occasionally see "device or resource busy" errors when we try to restart container-based app instances. We now wait longer for the rootfs unmount to succeed and try a forced unmount the second time.
Build all HV in one worker and prune
Usage of distinct workers to build separate HV is sub-optimal as they are different only in rootfs and we spend time building the same set of packages. We will re-use workers. Also, we are possibly limited in space and should remove dangling images if they are not needed.
Reduce eve-alpine image size
We build eve-alpine using layers of the previous eve-alpine. In that case, we end up with the sum of the layers, as they are different. We separated the steps to merge all layers into one.
dom0-ztools: Add version command
Users might not be aware of /run/eve-release file in order to check EVE's running version. This commit adds a "version" option to the eve command line, so users can use the "eve version" to easily retrieve the version from /run/eve-release.
Use pid from start of newlogd and wait
We start newlogd as a long-running application and do not expect its PID to change. Let's check for the touch file as we do for another agent and increase the wait time.
Add error collection and metrics for vdev from ZFS
Collection of I/O metrics for zpool and physical disks included in the zpool (for all types of operations) from ZFS
FIX:
Fix handling of a disabled lastresort config
Even when lastresort is disabled by config, we might want to use it in case there is no network configuration available. Recently [1], a PR was merged, which enables lastresort config if persisted DPCL is empty. However, with the first boot of a device, DPCL is obviously empty, but a DPC can be submitted via config partition (override.json) or through a specially formatted USB stick (usb.json). These DPCs are submitted with some small delay, so they do not appear immediately inside DPCL. The idea of this PR is to change the current logic and wait for a minute for any DPC. If no config is available even after one minute then it will forcefully enable lastresort.
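The decision described above can be sketched as a pure function. The name `should_force_lastresort` and the representation of DPC arrivals as seconds since boot are hypothetical; only the one-minute wait comes from the note.

```python
DPC_WAIT_SECONDS = 60  # "wait for a minute for any DPC" from the note above


def should_force_lastresort(dpc_arrival_times, elapsed_seconds,
                            wait_seconds=DPC_WAIT_SECONDS):
    """Return True when lastresort should be forcefully enabled.

    dpc_arrival_times: seconds since boot at which each DPC was submitted
                       (e.g. from override.json or usb.json)
    elapsed_seconds:   seconds since boot at the time of the check
    """
    # Any DPC that has already arrived means lastresort is not needed.
    if any(t <= elapsed_seconds for t in dpc_arrival_times):
        return False
    # No DPC so far: only force lastresort once the wait has expired.
    return elapsed_seconds >= wait_seconds
```

The short delay before override.json or usb.json DPCs appear is why the check does not fire immediately on an empty DPCL.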
Fix diag output when proxy is configured
With explicit proxy configured, the resolution of the controller's domain name is done by the proxy, not EVE. In some scenarios, it is possible that local DNS server(s) configured for uplink interfaces are (intentionally) unable to translate external endpoints. In such cases, it is expected for the DNS check performed by diag to fail. Therefore it makes sense to skip this check if the explicit proxy is being used.
Fix object provided as argument of DeleteLogObject
We can see errors like "DeleteLogObject: LogObject with mapKey verifyimage_status-... not found in internal map". It seems we must pass the same logBase object that we provide when calling EnsureLogObject and NewLogObject. Otherwise, we will not delete objects from logObjectMap.
Fix workerpool test
We seem to have a problem with the worker pool test, as GC may or may not run in the background at the time the workers’ count is checked. We spread the test work out in time and add sleeps according to GC intervals.
Fix upgradeconverter test
We should not use the current directory for pubsub in the test, as it will be mounted and we potentially (for example on macOS) will not be able to use sockets there.
pkg/mkimage-raw-efi/make-raw: Fix comments
Fix some typos and misspellings in the comments.
DOC:
Document how to do custom builds
This document describes how to modify parts of eve-os so you can build a custom image for your purposes.
STATS:
GitHub: 357 (+2) stars DockerHub: 291318 (+1893) pulls
Changelog: https://github.com/lf-edge/eve/compare/8.6.0...8.7.0
EVE 8.6.0 Release https://github.com/lf-edge/eve/releases/tag/8.6.0
NEW:
Use xz compression for squashfs
Changing compression from default (gzip) to xz will reduce rootfs size from 247.90 Mbytes down to 208.57 Mbytes. The squashfs compression time increases by about 3 times (from 11 seconds to 33 on my PC), however, this is a one-time image build operation.
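As a quick check of the trade-off, the figures quoted above work out as follows (only the numbers from the note are used; the variable names are illustrative):

```python
GZIP_MB = 247.90   # rootfs size with the default gzip compression
XZ_MB = 208.57     # rootfs size with xz compression
GZIP_SECONDS = 11  # squashfs build time with gzip (author's PC)
XZ_SECONDS = 33    # squashfs build time with xz (author's PC)

saved_mb = GZIP_MB - XZ_MB
saved_pct = 100 * saved_mb / GZIP_MB
slowdown = XZ_SECONDS / GZIP_SECONDS

print(f"saved {saved_mb:.2f} MB ({saved_pct:.1f}%), build step {slowdown:.0f}x slower")
# prints "saved 39.33 MB (15.9%), build step 3x slower"
```

So the image shrinks by roughly a sixth in exchange for about 22 extra seconds per image build.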
Adjust squashfs decompressor options for arm64
We can use the parallel squashfs decompressor (CONFIG_SQUASHFS_DECOMP_MULTI) and decompress directly into the page cache (CONFIG_SQUASHFS_FILE_DIRECT) to speed up squashfs reads.
Simplify GRUB development process
If the pkg/grub/grub folder exists, use it instead of cloning. All necessary patches should be applied manually before making changes, and either ./bootstrap or ./autogen.sh must be executed.
Reduce Process and NI metric tickers
CPU profiling indicates that gatherProcessMetricList and getNetworkMetrics are not as cheap as they might seem. We reduced the number of calls and made the tickers aware of the global metric publishing options.
Increase ctrd memory limit
Moving to the second instance of containerd requires raising the cgroups limit; it seems we missed raising it for /eve/containerd.
Enable TPM in UEFI
To use TPM devices with QEMU, we enabled support inside UEFI.
Add support for AX210 WiFi
Contributed by Zeljko.Misic@o-s.de
FIX:
No disk led blinking on VMware
We have no LEDs in a virtual machine, so we will not stress the disk.
Fix usage calculation for zfs
We should use information from zfs to calculate the usage of /persist/vault/volumes as we use zvol devices not placed in that directory.
DOC:
Document pubsub
pubsub is the library that handles object in-memory storage, the persistence of such data, and notification to other interested processes of changes to the data.
STATS:
GitHub: 355 (+1) stars DockerHub: 291318 (+2154) pulls
Changelog: https://github.com/lf-edge/eve/compare/8.5.0...8.6.0
8.5.0 Release https://github.com/lf-edge/eve/releases/tag/8.5.0
NEW:
Support Edge-View container on EVE
This runs as an 'edgeview' container inside EVE, for access into EVE, the Apps and its site routable Apps.
- it offers EVE side log search, network-related debugging, system related debugging, and show pubsub data
- it offers copying files from a device, such as log files, onto e.g. operator's laptop
- it offers 'virtual port mapping' which allows TCP services into device or apps
- this patch does not include the session token/dispatcher ip,port provisioning, for which one can use e.g. temp config for edge-view with configitem #2269 to build a private image and use 'zcli' to supply those configures for testing
- with the patch, one can build a client-side 'edge-view' container for query or local virtual-port mapping endpoints, the container can also be used for ssh-mode access into the EVE for device debugging
- this patch has the Golang example program for 'edge-view' dispatcher which is needed for non-ssh mode with a device running behind fw/nat/lte/proxy
- going to generate an EVE wiki doc to describe 'edge-view' in more detail
Use sparse write while rolling out image to zvol
Writing a huge amount of data to the disk takes a very long time. The limit for the qemu-img operation was 16 minutes, which was not even close to being enough for flashing 2TiB of data. Assuming that the zvol is created with all zeroes, we can instruct qemu-img to skip writing zeroes. This drastically speeds up the process of rolling out the image, as a qcow2 almost never contains enough data to fill the target volume entirely. Also, in case we do have to flash a lot of data to the disk, this patch increases the timeout to 2 hours.
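The idea of skipping zero blocks can be illustrated independently of qemu-img. The sketch below is not what qemu-img does internally; `sparse_copy` and its chunk size are hypothetical, and it relies on the same assumption as the note, namely that the target already reads back as zeroes.

```python
import io

CHUNK_SIZE = 64 * 1024  # illustrative chunk size


def sparse_copy(src, dst, chunk_size=CHUNK_SIZE):
    """Copy src to dst, seeking over all-zero chunks instead of writing them.

    Assumes dst already reads back as zeroes (as a freshly created zvol is
    assumed to). Returns the number of bytes actually written.
    """
    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        if not any(chunk):
            # All-zero chunk: the target is already zero here, so skip ahead.
            dst.seek(len(chunk), io.SEEK_CUR)
        else:
            dst.write(chunk)
            written += len(chunk)
    return written
```

The return value shows how much work was saved: for a mostly empty image it is far smaller than the size of the target volume.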
Properly report NTP server used for network with static config
For networks with static IP configurations, there is a different field inside DNS used to store the address of the assigned NTP server. GetNTPServers should therefore differentiate between statically configured and DHCP-based networks, and report the assigned NTP server(s) accordingly.
pillar: introduce debug build
With this series, one can build an EVE image with some extra debug capabilities. For now, this includes a pillar with debug symbols, and delve is included in the pillar container.
FIX:
Eden tests cleanup
It seems testing takes longer than the time limit, so:
- Reduce the size of the image for the mount test
- Ensure that we download the same eclient per test
- Merge reboot and shutdown tests
- Move update eve image test from HTTP to large and GCP tests
Fix dhcpcd arguments for static IP config
With static IP configuration, dhcpcd should directly assign a given IP address to an interface, instead of requesting it from a DHCP server (which may not even be present in the network or may give a different address). This is actually a new bug introduced by NIM refactoring, which slipped through manual testing. This further reinforces the need to enhance eden so that all kinds of network scenarios can be tested automatically for regressions (currently we only test DHCP).
DOCS:
Update the docs with instructions on how to use dev builds
EVE can be built in "development" mode by specifying the `DEV=y` flag. Currently, this affects only the pillar package. Specifically, the pillar is built with debug symbols and includes https://github.com/go-delve/delve.
STATS:
GitHub: 354 (+2) stars DockerHub: 289164 (+3132) pulls
Changelog: https://github.com/lf-edge/eve/compare/8.4.0...8.5.0
8.3.0 Release https://github.com/lf-edge/eve/releases/tag/8.3.0
NEW:
Expose WWAN-related status information to apps via metadata server
Provided that the device has at least one cellular modem visible to EVE (i.e. not assigned directly to an application), JSON-formatted WWAN status information is made available to all applications on the `/eve/v1/network/wwan/status.json` endpoint. This includes information about the installed cellular equipment (modem(s) and SIM card(s)), identity information (IMEI, IMSI, ICCID), available network providers (PLMNs), and more.
Use kernel command line to deploy zfs in live img
Currently, we are forced to rebuild EVE with zfs in the version in order to use zfs with live images. To generalize the extraction of images from an existing docker image of EVE, we should use options that can be configured through grub. This PR deploys the zfs persist partition when the live image is started with eve_install_zfs_with_raid_level defined in the kernel cmdline.
Expose cellular metrics for LPS
Added cellular metrics (packet counters and signal strength) to the "/api/v1/radio" endpoint of the Local Profile Server (LPS).
FIX:
Fix TestWireless from dpcmanager package
TestWireless may fail if the reading of DNS content from inside of the test overlaps with updates made by DPCManager. This can be prevented by reading published DNS (via pubsub) as opposed to accessing DNS directly in memory. Additionally, the test should wait for RadioSilence.ChangeInProgress to turn false before checking values of other RadioSilence attributes.
Fix documentation for AppInfo LPS endpoint
Minor changes in documentation
STATS:
GitHub: 350 (+3) stars DockerHub: 286032 (+2665) pulls
Changelog: https://github.com/lf-edge/eve/compare/8.2.0...8.3.0
EVENTS:
The EVE Design Summit is taking place on June 23 in Berlin, Germany. Top EVE developers and contributors will collaborate with industry users and community members to plan the way forward. An open-source community keynote will be followed by lunch, then we’ll complete the remaining technical expert talks and break into groups to strategize for the future. We’ll wrap the day with an evening social event.
https://www.lfedge.org/event/eve-design-summit-2/
8.2.0 Release https://github.com/lf-edge/eve/releases/tag/8.2.0
NEW:
wwan: Replace uqmi with qmicli
This is due to a limitation of the QMI protocol and the linux driver qmi_wwan, which require that there is at most one client talking to the modem. For this reason, qmicli provides a proxy that multiplexes multiple requests under one session. But uqmi is not able to use this proxy, therefore we have to completely migrate from uqmi to qmicli.
Send empty escrow key if TPM not enabled
To complete the attestation sequence we should send EncryptedVaultKeyFromDevice. Let’s leave the data empty to indicate that we do not want to send it to the controller (if no TPM is enabled).
DPC verify: more sensitive handling of DNS errors
In this case, the DPC manager should wait instead of falling back to the previous DPC. But to detect this, Send* functions inside the cloud had to be improved to return an error value that allows unwrapping errors from all the send attempts.
Add shutdown command to EdgeDevConfig
We added support for gracefully shutting down the node via EdgeDevConfig.
DOCS:
Add shutdown/shutdown-poweroff commands to LPS API
We added DevInfo to the Local Profile API documentation. Publish the current state of the device to the local server and optionally obtain a command to execute.
Add document to list supported RAID configuration
ZFS provides a rich set of functionality but at a cost of extra resource usage. We added a table of RAID configurations for GRUB.
FIX:
Fix acl test to handle multiple IPs from dig
The dig command may return several IPs. To handle them properly during the creation of the network, we should join the output lines into one, separated by commas.
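A minimal sketch of that joining step, assuming the dig output holds one IP address per line. The helper name `join_dig_output` is illustrative and not taken from the test code.

```python
def join_dig_output(output):
    """Join multi-line dig output (one IP per line) into one comma-separated value."""
    lines = [line.strip() for line in output.splitlines()]
    # Drop blank lines so a trailing newline does not produce an empty entry.
    return ",".join(line for line in lines if line)
```

For two addresses on separate lines, the result is a single value such as `192.0.2.1,192.0.2.2`.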
Fix accidental loop for poweroff command
This was introduced in PR #2609 and discovered as part of the review for PR #2610
Fix attestation restart on 403 config response
We should restart attestation when the config response is a 403. This code is treated as an error, so we should move the handling to the proper place, inside the error handling for SendOnAllIntf.
Do not restart a crashed VM within the handleModify function.
We added a check for `status.HasError()` in the condition in the `handleModify` function to eliminate the restart of crashed VM earlier than in `config.timer.reboot` seconds.
STATS:
GitHub: 347 (+3) stars DockerHub: 282591 (+1585) pulls
Changelog: https://github.com/lf-edge/eve/compare/8.0.0...8.1.0
8.1.0 Released https://github.com/lf-edge/eve/releases/tag/8.1.0
NEW:
Geographic coordinates reported by EVE #2600
EVE is able to obtain location information from a GNSS receiver integrated into an LTE modem.
This information is then propagated to 3 destinations:
- to the controller as ZInfoMsg with newly added ZInfoLocation into InfoContent
- to the Local profile server
- to any locally deployed application via meta-data service
By default, location reporting is disabled and has to be explicitly enabled under the cellular configuration.
Learn more https://lf-lfedge.atlassian.net/wiki/display/EVE/GPS+coordinates+exposed+by+EVE
DOCS:
Add edgeview container/api doc
Edge-View runs as a service on EVE; it needs to receive/update user configurations from the controller, and it needs to send/update the Edge-View running status to the controller.
FIX:
Proto changes for app delay start interval && application instance staggered start built proto files
The crux of this feature is to introduce a delay between the time EVE is ready to process application instance configuration and the time the application is started. This delay interval should be added per application instance.
Remove /config/uuid compatibility for downgrade to older than 5.21
This includes removing some old unneeded code and relying on the 5-second timer when we have a UUID. Also, remove the /config/uuid compatibility for a downgrade to older than 5.21.
Fix leaking locked mutex in netmonitor
The watcher of LinuxNetworkMonitor would leave the mutex in a locked state in some cases, causing DPC Manager to deadlock.
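This class of bug can be illustrated with a small sketch. It is not the netmonitor code: `EventWatcher` and both methods are hypothetical, and the early return stands in for whichever code path skipped the unlock.

```python
import threading


class EventWatcher:
    """Hypothetical watcher showing how an early return can leak a held lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.events = []

    def handle_leaky(self, event):
        # Buggy pattern: the early return below never releases the lock,
        # so the next caller blocks forever.
        self.lock.acquire()
        if event is None:
            return False
        self.events.append(event)
        self.lock.release()
        return True

    def handle(self, event):
        # Safe pattern: the context manager releases the lock on every path.
        with self.lock:
            if event is None:
                return False
            self.events.append(event)
            return True
```

In Go, deferring the unlock right after taking the lock gives the same guarantee as the `with` block here.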
Add missing iptables rule to allow local VNC traffic
STATS:
GitHub: 344 (+2) stars DockerHub: 282591 (+861) pulls
Changelog: https://github.com/lf-edge/eve/compare/8.0.0...8.1.0
NEW:
- Implement configurators for VLANs and Bonds
This is the last missing piece needed to enable support for VLANs and LAGs for EVE management and local network instances. This functionality is described and designed in this document: https://lf-lfedge.atlassian.net/wiki/display/EVE/EVE+VLAN+support+-+create+VLAN+and+bond+interfaces . A part of this commit is also a fix for Switch network VLANs, where we forgot to configure the network uplink port as the trunk, therefore all tagged packets coming in and out of it would be dropped by bridge VLAN filtering. As a result, VLANs were working correctly only for air-gapped switch networks.
- Handle EVE-OS reinstall with TPM + ZFS
If we have previously installed EVE-OS on this device and there was no TPM clear done in the BIOS, then we need to clear the keys associated with persist/vault as part of creating persist/vault.
- Do not skip DNS State check in client
We do not update networkState in the client if only DNS State changes. It may block the onboarding process.
- Add zinfo type for edgeview and zedagent publish to controller
The zinfo type was missed in the proto definition for edgeview in the PR "Edge-View API and JWT" #2427. This change adds it and makes zedagent publish the edgeviewStatus to the controller.
- Reduce memory usage for DownloadedParts Hash
We can use a JSON encoder to stream checksum calculations instead of using all data in place to reduce memory usage. Around 25KB per GB of image.
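For the DownloadedParts hash item above, the streaming idea can be sketched in Python. This is an illustration rather than the Go change itself; `streaming_json_sha256` is a hypothetical name and SHA-256 is an assumed checksum, since the note only says "checksum".

```python
import hashlib
import json


def streaming_json_sha256(obj):
    """Hash the JSON encoding of obj chunk by chunk.

    The full JSON string is never built in memory; each encoded chunk is
    fed straight into the digest.
    """
    digest = hashlib.sha256()  # assumed algorithm, for illustration only
    for chunk in json.JSONEncoder().iterencode(obj):
        digest.update(chunk.encode("utf-8"))
    return digest.hexdigest()
```

The digest matches hashing the output of `json.dumps(obj)` in one go, so callers see the same checksum while peak memory stays independent of the document size.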
DOCS:
- Update more tunables to the document
Added Minimum recommended system requirements and tunables.
- Document how onboarding certificates can be generated
In some cases one might want different certificates than the default in conf/onboard.cert.pem
FIX:
- Fix ntpd.pid watchdog after server change
We sometimes see a watchdog due to ntpd.pid not running in the case when EVE-OS picks up an NTP server from DHCP, thus we see a log line of the form: NTP server changed from poll.ntp.org to $ns. The attempt to track this down has been to record the exit value from starting ntpd after it was killed, but that returns zero even when ntpd was not actually started. So we add a wait for the killed process to go away.
- Fix filtering of unnecessary info messages
Functions *HasRealChange() in zedagent that try to filter uninteresting changes from being logged have unintended side effects and cause zedagent to lose all timestamps and other information. This is because internally they clear these frequently-changing fields before comparing new and previous values, but do so without properly deep-copying values and thus changing the originals.
- Fix multiple DNS servers configured for network instance
Multiple DNS servers (to advertise) should be configured for dnsmasq as a one-line "dns-server" DHCP option, with comma-separated DNS server IP addresses. Putting DNS servers each on a separate line is not correct; dnsmasq will only advertise the last entry.
- Fix verification of persisted DPCs
With persisted DPCs from the previous run but with last-resort DPC disabled, there is only dpcTestTimer that will trigger DPC verification. And since this timer is set (by default) to 5 minutes, there is quite a delay until the device applies a working DPC after a reboot. This commit makes sure that persisted DPCs are tested as soon as possible after a reboot. DPC manager only waits for the global configuration before it starts verification.
- Fix locally triggered purge
The PR introduces a local generation counter for volumes, which is added to the remote generation counter (from the controller) to form a volume key that changes on a remotely as well as a locally issued purge. Similarly, the PR adds separate purge and restart counters for locally triggered operations to the application config. Note that most of the logic for local operations is currently handled by zedagent. Later, this could be refactored and moved to zedmanager to keep only config parsing and info/metrics publishing inside zedagent.
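For the dnsmasq fix in the list above, a minimal sketch of building the single-line option. The helper name `dns_server_option` is hypothetical and the sketch covers IPv4 only; the `dhcp-option=option:dns-server,...` form is standard dnsmasq configuration syntax.

```python
import ipaddress


def dns_server_option(servers):
    """Build one dnsmasq dhcp-option line advertising all DNS servers (IPv4 only)."""
    if not servers:
        raise ValueError("at least one DNS server is required")
    # Validate and normalize each address before joining.
    addrs = [str(ipaddress.IPv4Address(server)) for server in servers]
    return "dhcp-option=option:dns-server," + ",".join(addrs)
```

Emitting one line per server instead would reproduce the bug, because only the last entry would be advertised.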
7.11.0 Released https://github.com/lf-edge/eve/releases/tag/7.11.0
NEW:
- NIM refactoring
This PR substantially refactors the NIM microservice. It contains several commits to improve the code of NIM, split code into several files, and avoid files with 2000+ lines.
- Support zfs raid levels during install
Provides grub parameters to explicitly install zfs and pick the raid level. Support for single disk installation of ZFS is added.
- Show string status for zpool in case of not online
Show string status for zpool in case of not online (i.e. One or more devices have been taken offline by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.). The library we use has no support for showing this, so I parse the output of zpool status here.
- Support for multiple top-level vdevs in pool
We will fill children info for multiple top-level vdevs and fill CurrentRaid as the lowest redundancy of all included vdevs.
- Support to install EVE and ZFS on same disk
Installed with grub parameters eve_install_zfs_with_raid_level=none eve_install_disk=nvme0n1
- Pass grub config file for iso installer
It seems we still need to be able to pass options from the grub.cfg that comes with the config.img file when using the iso installer.
The solution to sort disks (#2309), discussed in #2303, is not enough, and we still need to explicitly define the disk to install EVE onto with the eve_install_disk option.
FIX:
- Use fixed device/partition UUIDs with ZFS
For unknown reasons, ZFS creates partition 1 and partition 9 when adding to the pool. As we later add to the pool, we want to ensure that otherwise identical hardware ends up with identical PCR5 measurements from the TPM.
- potentialUUIDUpdate multiple access fix
Check that the zedclient process is not running before starting a new one. Add a mutex to handle multiple function calls from goroutines.
- Need longer wait for zboot reset
On a RPI4 running ZFS with lots of USB sticks as drives and one of them in a failed/hosted state the zboot reset (and other zboot commands) takes more than 20 seconds.
STATS:
GitHub: 340 (+1) stars 121 (+3) forks DockerHub: 280472 (+1351) pulls
Changelog: https://github.com/lf-edge/eve/compare/7.11.0...8.0.0
7.10.0 Released https://github.com/lf-edge/eve/releases/tag/7.10.0
NEW:
- Extract raw filename from mime
This is necessary to handle the new version of the mime package and to allow creating the CDROM directory layout for cloud-init. Note that the handling guards against the directory/filename escaping from the target directory.
- Use 32byte TPM keys only for vault protection
Starting with this commit, a new install of EVE-OS will create a vault config file on systems with TPM support. That file will be used to determine whether to use only the TPM key or to merge the TPM and controller keys. This applies to both ext4 and zfs filesystems.
- Add functionality to send information about disks
Add functionality to send information about disks via a separate HardwareInfo message at a low sending rate. Add serial number assembly for disks in ZFS. Add information about disks from which information could not be retrieved. Also rewrote the GetSerialNumberForDisk function because an error occurred if the input disk name was a partition (e.g. /dev/sda1).
- Run tests against zfs-kvm
We can use the zfs-kvm HV to run the single-disk zfs mode of EVE and use it in our tests. We changed the version of EVE in eveupdate tests to recent ones.
- Add S.M.A.R.T data collector for disks
This PR adds features for collecting disk information, including SMART attributes. Also, a package has been added that reads the file system to obtain information about the available disks.
- API update to send more disks information for storage system to EVE
Update the API to send more disk information (including S.M.A.R.T data) for the storage system in EVE.
- Add possibility to define nested structures in DisksConfig
To use a stripe of two pairs of mirrored disks, we should define a DisksConfig without disks, with array_type DISKS_ARRAY_TYPE_RAID0 and two children, each with array_type DISKS_ARRAY_TYPE_RAID1, properly defined disks inside, and empty children.
- Set max_sectors explicitly to run Windows VM with vhost-scsi-pci
We can see "[ 259.573575] vhost_scsi_calc_sgls: requested sgl_count: 2649 exceeds pre-allocated max_sgls: 2048" in kernel messages, and Windows VMs do not boot with zfs/vhost-scsi-pci. As discussed in https://edk2.groups.io/g/discuss/topic/windows_2019_vm_fails_to_boot/74465994, the I/O size exceeds the max SCSI I/O limitation (8M) of vhost-scsi in KVM, so we should adjust options to run Windows VMs with vhost-scsi-pci.
- Split bucket and path from ds config for AWS
Files can be located in directories inside the bucket, but currently the path from the datastore is assumed to be the bucket name. We should split the path into a bucket name and a file path if there is a '/' inside the bucket.
- Have installer default to fixed disk/partition UUIDs
This is needed to make PCR5 in the TPM measured boot be the same for otherwise identical hardware and firmware/software. The new eve_install_random_disk_uuids option can be set to get the old behavior. storage-init recreates as fixed if IMGA has the fixed UUID.
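As an illustration of the "Split bucket and path from ds config for AWS" entry, here is a minimal sketch of splitting a datastore path at the first '/'. The helper names split_bucket_and_path and object_key are hypothetical and are not taken from the EVE code base.

```python
from typing import Tuple


def split_bucket_and_path(ds_path: str) -> Tuple[str, str]:
    """Split 'bucket/dir/sub' into ('bucket', 'dir/sub').

    Hypothetical helper: name and return shape are assumptions,
    not the actual EVE implementation.
    """
    # Everything before the first '/' is the bucket, the rest is a prefix.
    bucket, _, prefix = ds_path.strip("/").partition("/")
    return bucket, prefix


def object_key(ds_path: str, filename: str) -> Tuple[str, str]:
    """Return (bucket, key) for a file stored under the datastore path."""
    bucket, prefix = split_bucket_and_path(ds_path)
    key = f"{prefix}/{filename}" if prefix else filename
    return bucket, key


print(object_key("images/eve/x86", "rootfs.img"))
```

A path without '/' keeps the old meaning: the whole string is the bucket name and the key is just the file name.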
STATS:
Github: 339 stars 119(+1) forks DockerHub: 279121(+1606) pulls
Changelog: https://github.com/lf-edge/eve/compare/7.9.0...7.10.0
7.9.0 Released https://github.com/lf-edge/eve/releases/tag/7.9.0
NEW:
- Config api to work with disks in zfs
We expect information about disks to be filled in via the config API and will try to adjust disk states accordingly. To change a disk's state to online/offline, define its state in the config; progress will be available in information messages.
- Update API for sending status for storage system
We created informational messages for the disk statuses.
- Log install steps
Capture the installation status at various stages and save it to the installer.log file on USB.
- Check hash of verified images after reboot
Check the SHA of the files in the verified directory to confirm that they have not changed after a reboot, in case of an unexpected reboot during volume creation.
FIX:
- Need sane upper bounds for some global timers
If someone accidentally sets the timers that affect nim and the connection to the controller to infinite values and then reboots, the device will never get IP addresses or connect to the controller. So we add sane upper bounds of one hour for these timers.
- update docker usage text for lfedge/eve
Updated help message for docker run lfedge/eve
- Fix nil pointer assignment in StorageDiskState
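The upper bound described in "Need sane upper bounds for some global timers" amounts to clamping a configured value. A minimal sketch follows; the one-hour maximum comes from the note, while the function name clamp_timer and the one-second minimum are assumptions made for illustration.

```python
MAX_TIMER_SECONDS = 60 * 60  # one-hour upper bound, as stated in the note


def clamp_timer(seconds: int, minimum: int = 1,
                maximum: int = MAX_TIMER_SECONDS) -> int:
    """Keep a configured timer within [minimum, maximum] seconds.

    The lower bound of 1 second is an assumption for this sketch.
    """
    return max(minimum, min(seconds, maximum))


print(clamp_timer(2**32))  # an effectively "infinite" setting is capped
```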
STATS:
Github: 339(+2) stars 118 forks DockerHub: 277515(+1190) pulls
Changelog: https://github.com/lf-edge/eve/compare/7.8.0...7.9.0
7.8.0 Released https://github.com/lf-edge/eve/releases/tag/7.8.0
NEW:
- Run potentialUUIDUpdate on 400 and on attest problems
As described in the APIv2 documentation, we should assume that the device does not exist in the controller if the controller returns 400.
It also seems that we do not run potentialUUIDUpdate before successful attestation, but we should. Also, we must remove the old attest message on change and push a new one.
- CONFIG_IGC for Intel Ethernet Controller
CONFIG_IGC for Intel Ethernet Controller I225-LM/I225-V/I225-IT
- Use TLS with S3
Some old code had this disabled, thus we relied on the image SHA256 for verification. However, this means that firewalls need to open up outbound port 80 when port 443 should be sufficient. Verified that the S3 downloads work correctly even when a TLS MiTM proxy is in use, since the proxy certificate is passed into the S3 download code.
- Implement appinfo extension for purge/restart command requests
This commit implements the extension to the /api/v1/appinfo local profile endpoint, which allows the server to submit purge/restart commands for locally running application instances. This functionality is already documented in api/PROFILE.md under "AppInfo". Plus test lf-edge/eden#744
- Rework ECO to show information to log and VNC
Currently we cannot see information from the app in logs if VNC is enabled; with this change we output information to both places.
- Allow /30 subnets for local network instance
The current MinSubnetSize of > 8 is too restrictive. With these changes we can handle a /30 subnet, which means that there is one IP address available for an app instance (and one for "zedrouter").
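The address count behind the /30 entry can be checked with Python's standard ipaddress module. This is only a sketch: the subnet 10.1.0.0/30 is an arbitrary example, and which of the two usable addresses goes to "zedrouter" is not stated in the note.

```python
import ipaddress
from typing import List


def usable_hosts(cidr: str) -> List[str]:
    """List host addresses, excluding the network and broadcast addresses."""
    net = ipaddress.ip_network(cidr)
    return [str(host) for host in net.hosts()]


hosts = usable_hosts("10.1.0.0/30")
# A /30 has four addresses; two remain once network and broadcast are
# excluded: one for "zedrouter" and one for a single app instance.
print(hosts)
```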
FIX:
- Wait for no qmp socket in cleanup for kvm
In EdenGCP tests for KVM we can see errors like PURGING: [description:"Qmp not found: error dial unix /run/hypervisor/kvm/ae84d4b7-9634-4abd-83ef-56f04ec01a27.1.5/qmp: connect: no such file or directory"...]. In our experiments the errors occur after a retry. We should wait for the qmp socket to be gone after killing the containerd task.
- Fix unlock vault on controller key for zfs
We should unlock the zfs vault in a different way when the controller key is received.
STATS:
Github: 337 stars 118 forks DockerHub: 276325 pulls
Changelog: https://github.com/lf-edge/eve/compare/7.7.0...7.8.0
7.7.0 Released https://github.com/lf-edge/eve/releases/tag/7.7.0
NEW:
- add RoL Rpi4 test steps to eden GCP workflow
The first implementation of the CLI and REST API of the Rack of Labs system is now ready. But it is still under active development and testing, and at this stage we think it is inappropriate to include it as part of EDEN. RoL was added to the Github Actions together with the Google Cloud Platform.
- Add debug.enable.vga support
We only return VGA devices that were marked as boot devices. Console output won't be visible on others anyway. It also allows us to debug issues using the VGA console while other GPUs are still assigned to applications.
- Implement ZFS minimum requirement check
The minimum supported system requirements to install ZFS storage are 64GB memory and 3 physical disks set in eve_persist_disk. eve_install_skip_zfs_checks should be set in the installation config to override the requirement check for experimental installs.
- Use context with timeout in sending to the cloud functions
Sending information to the cloud can take a long time with long delays or a lot of interfaces. We add a context with a timeout so as not to hit the watchdog.
- zfs: performance patches
The patches significantly reduce write amplification, so less data needs to be written and fewer unnecessary syncs are issued. Another feature introduced in the patches is the write bandwidth smoothing algorithm, which prevents huge latency spikes under heavy load on the storage.
FIX:
- Rework zfs_arc_max get logic
Zedmanager asks volumemgr to create the new volume, and when that is done it halts the application and restarts it.
- Fix dynamic IPs allocation for several interfaces in the same network
If we allocate several interfaces on the same network instance, all of them receive the same IP. This happens because we save and check only networkInstance-app pairs, so we can allocate only one IP. We changed UUIDPairToNum to UUIDPairAndIfIdxToNumKey and now allow allocating a different appNum (and IP) for each interface on the same networkInstance.
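The effect of adding the interface index to the allocation key can be shown with a small sketch. NumAllocator below is a hypothetical stand-in written for this note, not the EVE UUIDPairAndIfIdxToNumKey code: it hands out the lowest free number per key and returns the existing number when the key repeats.

```python
class NumAllocator:
    """Hypothetical allocator: one number per distinct key."""

    def __init__(self, max_num: int):
        self.max_num = max_num
        self.by_key = {}

    def allocate(self, key) -> int:
        # A repeated key gets the number it already holds.
        if key in self.by_key:
            return self.by_key[key]
        used = set(self.by_key.values())
        for num in range(self.max_num):
            if num not in used:
                self.by_key[key] = num
                return num
        raise RuntimeError("no free numbers left")


# Old behaviour: key is (network instance, app), so both interfaces collide.
old = NumAllocator(8)
print(old.allocate(("net1", "app1")), old.allocate(("net1", "app1")))

# New behaviour: the interface index is part of the key.
new = NumAllocator(8)
print(new.allocate(("net1", "app1", 0)), new.allocate(("net1", "app1", 1)))
```

With the pair-only key both interfaces map to the same number and therefore the same IP; with the interface index included, each interface gets its own.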
Changelog: https://github.com/lf-edge/eve/compare/7.6.0...7.7.0
EVE 7.6.0 https://github.com/lf-edge/eve/releases/tag/7.6.0
NEW:
- limit dhcp range with BitMapMax
We cannot allocate more than BitMapMax dynamic IPs, so we can reduce the capacity of the dhcp range down to it. In that case, for the default network instance (/16), we will not hit the problem with static IPs allocated by the controller from the end of (and inside) the dhcp range.
- kernel: make patches am-able
Only a few of our patches can be applied using `git am`. This commit turns all the patches into a proper format. This makes it a bit easier to kick-start new kernel activity.
- appinfo for local profile server implementation
Implementation for sending compact info (uuid, name, state and error) about apps on EdgeNode to the api/v1/appinfo endpoint on the local profile server.
- Add ipxe.efi to release artifacts
ipxe.efi will be published with ipxe.efi.cfg and ipxe.efi.ip.cfg
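For the "limit dhcp range with BitMapMax" entry, a rough sketch of capping a range to a fixed number of addresses. The function name limit_dhcp_range, the example addresses, and the value 256 used for bitmap_max are all assumptions; the actual BitMapMax value and range layout in EVE are not given in this note.

```python
import ipaddress
from typing import Tuple


def limit_dhcp_range(start: str, end: str, bitmap_max: int) -> Tuple[str, str]:
    """Shrink [start, end] so it holds at most bitmap_max addresses."""
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end)
    # The range is inclusive, so the last allowed address is
    # first + (bitmap_max - 1).
    capped = min(last, first + (bitmap_max - 1))
    return str(first), str(capped)


# Example with an assumed limit of 256 dynamic addresses.
print(limit_dhcp_range("10.1.0.2", "10.1.255.254", 256))
```

Addresses above the capped end stay outside the dhcp range, so static IPs assigned there no longer overlap with it.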
FIX:
- fix error handling in decrypt of cipherBlock
- fix Sha256FromECPoint calculation to be aligned with controller
Fixes for two problems with error propagation:
in case of no Edge Node certificate, we return nil err and jump into the "Data Validation Failed" error
in case of error from ParseECPrivateKey, we return nil error because of shadowed variable
- fix broken calculation of downloaded parts for azure
After successfully downloading the last part (a special one, which in general is not equal to SingleMB in size), a retry can hit "The range specified is invalid for the current size of the resource". This happens because of the wrong check for the last partNum (the check was mistakenly not for partNum == partsCount - 1 in #2420), so we end up downloading 0 bytes starting from the end of the file. This PR removes the complex logic and just checks whether the range for download is greater than 0.
- fix make-raw to properly handle stdin and to not adjust partitions for usbconf
The problem comes from the lines where we check for /parts and, if it does not exist, extract the files coming from tar on stdin. But /parts does exist, since we create it in the Dockerfile. So this change adds a check for a tty connected to stdin; if there is none, we assume that it is a pipe from tar.
- Remove unused package
There is a CVE flagged against a dependency in pkg/lisp and since we no longer use it the easiest resolution is to remove the code.
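The simplified check described in the azure entry ("is the range for download greater than 0") could look roughly like this. part_range is a hypothetical helper written for this note; only the idea of a fixed part size with a shorter last part comes from the entry.

```python
from typing import Optional, Tuple

SINGLE_MB = 1024 * 1024  # assumed part size for this sketch


def part_range(part_num: int, total_size: int,
               part_size: int = SINGLE_MB) -> Optional[Tuple[int, int]]:
    """Return (offset, length) for a part, or None if nothing is left.

    A length of zero or less means the offset is at or past the end of
    the file, so no request should be issued.
    """
    offset = part_num * part_size
    length = min(part_size, total_size - offset)
    if length <= 0:
        return None
    return offset, length


total = 2 * SINGLE_MB + 512 * 1024  # 2.5 MB file: the last part is shorter
print(part_range(2, total))  # the short last part
print(part_range(3, total))  # past the end: nothing to download
```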
Full Changelog: https://github.com/lf-edge/eve/compare/7.5.0...7.6.0
EVE 7.5.0 https://github.com/lf-edge/eve/releases/tag/7.5.0
NEW:
- Added riscv64 support
Add necessary patches to boot eve on riscv64
- Reserve space to maintain performance in ZFS
It is recommended that storage usage not go above 80% of available space, because pool performance can degrade when a pool is very full and file systems are updated frequently, such as on a busy mail server.
- parse-pkgs generates all known; docker-compose standardized
The primary benefit of this change is that it lets the compose file be standard. This makes it easier for editors to see and parse, and for people onboarding to find. It also allows a normal docker-compose up to simply execute it, while complaining that the env vars were not set. In short, we take advantage of standard tooling without changing functionality.
- sort disks in mount_disk.sh
In case of multiple disks, we will have unordered output from find and can have swapped mounts, as we expect this order from the VM config (we rely on the same order between lines in the mountPoints file and block device enumeration).
- pkg: grub: arm: moving to 2.06
Use grub tag instead of commit for arm64 and riscv64. Grub verify module is no longer available for build under arm with grub version 2.06. Now coreutils package is used for arm too.
- pkg: kernel: moving to 5.10.76 version
As a result of testing, updating the kernel to 5.10.76 helps to eliminate the error "Synchronous exception at 0x000000005EAED180" on the RPi4 with the kvm hypervisor. Previously, the error occurred when starting, restarting, and shutting down EVE App instances with a 5% chance.
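The 80% guideline from the "Reserve space to maintain performance in ZFS" entry translates into a simple budget. The sketch below only shows the arithmetic; the function names are hypothetical and how EVE actually enforces the reservation is not described in this note.

```python
MAX_USAGE_PERCENT = 80  # recommended ceiling from the note


def usable_bytes(pool_size: int, percent: int = MAX_USAGE_PERCENT) -> int:
    """Bytes that may be used before crossing the recommended ceiling."""
    return pool_size * percent // 100


def reserved_bytes(pool_size: int, percent: int = MAX_USAGE_PERCENT) -> int:
    """Bytes to leave free so the pool stays below the ceiling."""
    return pool_size - usable_bytes(pool_size, percent)


pool = 500 * 10**9  # example: a 500 GB pool
print(usable_bytes(pool), reserved_bytes(pool))
```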
FIX:
- block sending stacks in metrics message
There was a bug that caused the process stack collection to always be null, which was fixed in ec37884. But that fix caused the metrics message size to more than double, e.g. on 'zc1' from 470k/hour before to 1.23M/hour after the fix, an increase of about 20Mbytes/day for that device. This patch skips uploading the process stack in the metrics message to bring the metrics message size back down to the pre-6.12 release level.
- Euresys Frame Grabber Full XR doesn't work under VM
The PCIe card works fine on bare metal but doesn't work under a VM, with neither Linux nor Windows.
- Fix wrong permission with initrd.img
When running docker run lfedge/eve:tag installer_net to create the network installer, the output file initrd.img gets mode 600 instead of 644.
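The "about 20Mbytes/day" figure in the metrics entry can be checked from the two hourly rates. A quick sketch, assuming "k" and "M" mean decimal bytes (the note does not say which units are meant):

```python
BEFORE_PER_HOUR = 470_000    # 470k/hour before the fix
AFTER_PER_HOUR = 1_230_000   # 1.23M/hour after the fix


def daily_increase(before_per_hour: int, after_per_hour: int) -> int:
    """Extra bytes per day caused by the larger hourly message volume."""
    return (after_per_hour - before_per_hour) * 24


increase = daily_increase(BEFORE_PER_HOUR, AFTER_PER_HOUR)
print(f"{increase / 1_000_000:.2f} MB/day")  # prints "18.24 MB/day"
```

Under that assumption the increase is a little over 18 MB/day, which the entry rounds to about 20.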
DOCS:
Full Changelog: https://github.com/lf-edge/eve/compare/7.4.0...7.5.0
EVE 7.4.0 https://github.com/lf-edge/eve/releases/tag/7.4.0
NEW:
- Do not allow to impose radio silence during EVE update testing
If the edge node is going through an EVE update and radio silence is imposed during the 10-minute testing period, then access to the controller may be lost and the device will fall back to the previous release. This violates the radio silence requirements, which state that the edge node should not trigger a port config or EVE image fallback during a (temporarily) imposed radio silence. To prevent the EVE fallback from happening, zedagent will simply return an error to the Local profile server if radio silence is requested during the EVE update testing period.
- azure partial download
This allows us to keep information about downloaded parts for the s3 datastore in memory and resume the download from the previous try. New updates added Azure support. Also added a file with a .progress suffix to keep information about downloaded parts across reboots.
FIX:
- configure network broadcast address on container interfaces
The network bcast address on container interfaces is currently not being set. It shows as 0.0.0.0 in ifconfig output from inside the container.
This fix makes sure that the ip command, while setting the ucast address, also computes the bcast address and adds it to the container interfaces.
- Fix publishedEdgeNodeCerts set too early.
Even in case messages are deferred due to failures, we should not set publishedEdgeNodeCerts until after the ZAttestReqType_ATTEST_REQ_CERT message has been sent.
- fix an issue of tlsconfig initialization in diag.go
This crash was due to a change in PR #2333 that, at diag start, initialized tlsConfig with session resume but without the caroot.
This fix removes that init and instead gets the tlsConfig normally during tryPing, adding the session-resume option.
DOCS:
Full Changelog: https://github.com/lf-edge/eve/compare/7.3.0...7.4.0
EVE 6.10.0 https://github.com/lf-edge/eve/releases/tag/6.10.0
NEW:
- retry eve image update
Possibility to retry the eve image update by incrementing the counter (EdgeDevConfig->Baseos->RetryUpdate->Counter). If the currently configured image is in FAILED state in the other partition, retry the image upgrade. Otherwise do nothing: just update the BaseosUpdateCounter counter in the Info message and send an Info message to zedcloud.
Warning! If you try this from the UI and the device is running <6.10 it will silently be a no-op.
- run selfRegister to overcome node re-creation
To keep the connection with the controller after node re-creation with the same onboard certificate, we should run selfRegister on reboot.
- support zfs zvols as volumes
This is still a work in progress due to a lack of accounting features and refactoring needs.
One drawback that cannot be solved here for now: we cannot convert qcow2 (or another format) to raw on the fly, so we need to use a temporary file for this.
- generated vendor files for PR #2163
- Add display print example plus + qemu test
- Avoid setting freeUplink in model; better handling of wwan and wlan
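The decision described in the "retry eve image update" entry can be sketched as a small function. Everything here is illustrative: the function name, the string "FAILED" as a state value, and the returned tuple are assumptions made for this note, not the EVE API.

```python
from typing import Tuple


def handle_retry_counter(config_counter: int, last_seen_counter: int,
                         other_partition_state: str) -> Tuple[bool, int]:
    """Decide whether to retry the image update.

    Returns (retry, counter_to_report). The counter to report in the
    Info message is always the configured one, whether or not a retry
    is started.
    """
    counter_changed = config_counter != last_seen_counter
    # Retry only when the counter moved and the other partition holds
    # the configured image in FAILED state; otherwise do nothing.
    retry = counter_changed and other_partition_state == "FAILED"
    return retry, config_counter


print(handle_retry_counter(2, 1, "FAILED"))   # counter bumped, image failed
print(handle_retry_counter(2, 1, "ACTIVE"))   # counter bumped, nothing to retry
```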
FIX:
Full Changelog: https://github.com/lf-edge/eve/compare/6.9.0...6.10.0