Block service restart until update complete
PSIT Sandin
Our RMM, N-able N-central, auto-heals the service when it updates, which causes updates to fail. The longest N-central will delay auto-heal after detecting that the service has stopped is only 2 minutes 30 seconds. We had heard the team may be working on a way to prevent the service from being restarted during an update, which would hopefully prevent these failed updates from occurring.
Tyron McLachlan
We experienced the same issue, and Huntress support also pointed to the self-heal being the cause. We disabled self-heal, and now Huntress service failures log a PSA ticket instead. A lot less noise.
Mike Timko
Tyron McLachlan
Don't you get more noise now that the service stays off instead of being healed, which is silent?
Tyron McLachlan
Mike Timko In our case the self-healing was interfering with the Rio service upgrade process and leaving it stuck in a failed state, which triggered a bunch of alerts from our Huntress service monitor.
Now that we've removed the 'interference' of the self-heal, the Huntress upgrades behave for the most part and the noise is gone.
Mike Timko
Tyron McLachlan We use N-sight, so maybe that's the difference, but we have something similar:
"RESTART service if "stopped" (requires agent v7.1 or above)". We don't get to configure how long it's down for, just "number of consecutive restarts before alert: X" and "alert if service is restarted X times in X hours". We monitor the "Huntress Agent Service" and the "Huntress Updater Service".
We haven't really seen any alerts across our fleet. What symptoms do you see when the update fails, so I can check whether it's happening to us silently? What should I look for to see if we're affected: something in the Huntress dashboard, another service, or something else?
Tyron McLachlan
Mike Timko We monitor all 3 services, and it was the Rio service that gave us a lot of noisy alerts: during an upgrade, our self-heal interfered while the service was down, so the upgrade never completed and the Rio service couldn't start back up.
Extract from our support ticket below
"I have uninstalled the RIO agent and the base agent has now installed a clean instance of RIO. The problem seems to arise from some partners having self healing provisions in place. During the update process the service is stopped to allow the update to process without interference. It has been found that if partners have self healing processes in place that restart the RIO service mid update, this causes corruption within the Huntmon DLL which then presents as the RIO service repeatedly stopping."
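For anyone who would rather keep some form of self-heal than disable it outright, the behaviour described in the ticket suggests one possible guard: only restart the service once it has been stopped for longer than a grace window that comfortably exceeds an update. Below is a rough sketch of that decision logic only. The names `should_restart` and `GRACE` are hypothetical, and the 15-minute window is an illustrative guess, not a value from Huntress or N-able; how you detect when the service stopped depends on your RMM.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumption: illustrative grace window, not a figure documented by Huntress or N-able.
GRACE = timedelta(minutes=15)


def should_restart(stopped_since: Optional[datetime], now: datetime,
                   grace: timedelta = GRACE) -> bool:
    """Return True only if the service has been stopped for at least `grace`.

    `stopped_since` is None while the service is running.
    """
    if stopped_since is None:
        return False
    return now - stopped_since >= grace


if __name__ == "__main__":
    stopped_at = datetime(2024, 1, 1, 12, 0, 0)
    # Stopped for 2m30s (the N-central maximum delay): leave it alone.
    print(should_restart(stopped_at, stopped_at + timedelta(minutes=2, seconds=30)))
    # Stopped for 20 minutes: likely a real failure, so restart.
    print(should_restart(stopped_at, stopped_at + timedelta(minutes=20)))
```

The point of the design is simply that a short stop during an update is left untouched, while a service that stays down well past the window still gets healed or ticketed.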