...
The zedagent module will be broken up. Base validation, overall connectivity, and device health will be managed by DevAgent. DevAgent will be one of the first modules to be spawned, along with ledmanager, and will persist for the whole lifetime of the EVE node. ZedAgent will be responsible only for cloud connectivity, configuration parsing, and status/metrics publication. Baseos upgrade validation will be handled by the DevAgent module, covering all the intermediate states of device boot-up.
EVE Node Health Monitor Function
EVE node health check functionality consists of the following:
Pillar agent(s) run state and responsiveness
Each agent's health is monitored through a watchdog timer.
Controller connectivity
The controller connectivity of the EVE node is evaluated as follows:
Reset Timer Function
In normal operation, on loss of controller connectivity, the EVE node is rebooted after the reset timer interval.
Fallback Timer Function
During the validation phase of a baseos upgrade, on loss of controller connectivity, the EVE node falls back to the fallback image after the fallback timer interval.
Current Implementation
The EVE node reset timer and fallback timer functions are currently part of the ZedAgent module.
Proposal for Refactoring
Baseosmgr Module
ZedAgent Module
DevAgent Module
DevAgent will listen to the following:
- ledBlinker Status: for EVE node registration and controller connectivity change events
- Zboot Status
- ZedAgent Status
DevAgent will publish the following:
- Zboot Config
- DevAgent Status
ZedAgent will additionally listen to the following:
- DevAgent Status
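The proposed publish/subscribe relationships can be written down as a small table and queried. This is a sketch for illustration only: the Wiring type, the topic strings, and the Listeners helper are hypothetical and are not the pillar pubsub API. Only the relationships listed above are encoded.

```go
package main

import (
	"fmt"
	"sort"
)

// Wiring lists the topics an agent publishes and subscribes to.
// Hypothetical type; topic names are illustrative strings.
type Wiring struct {
	Publishes  []string
	Subscribes []string
}

// proposed encodes only the relationships listed in this proposal.
var proposed = map[string]Wiring{
	"devagent": {
		Publishes:  []string{"ZbootConfig", "DevAgentStatus"},
		Subscribes: []string{"LedBlinkerStatus", "ZbootStatus", "ZedAgentStatus"},
	},
	"zedagent": {
		Subscribes: []string{"DevAgentStatus"},
	},
}

// Listeners returns, in sorted order, the agents subscribed to topic.
func Listeners(topic string) []string {
	var agents []string
	for agent, w := range proposed {
		for _, s := range w.Subscribes {
			if s == topic {
				agents = append(agents, agent)
				break
			}
		}
	}
	sort.Strings(agents)
	return agents
}

func main() {
	fmt.Println(Listeners("DevAgentStatus")) // [zedagent]
	fmt.Println(Listeners("ZbootStatus"))    // [devagent]
}
```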
PS.
Currently, the scope of device health, as defined above, does not include the following:
- CPU usage health
- disk space usage health
- network usage health
- each agent's basic functionality check