The script in this article scans a Hyper-V host to find its oldest checkpoint. It is an active check tied to the host, not to any particular virtual machine. To use it, you must have a functioning Nagios environment and NSClient++ operating as configured in our main Nagios article. It does not directly require any of the base scripts, but it does rely on the configuration sections set up in that article.
Updated May 2, 2018: Version 2.0
- Using the CIM cmdlets instead of WMI cmdlets for speed
- Improved performance by reducing number of CIM calls
- The checkpoint report properly identifies the owning virtual machine
- Ignores checkpoints created by a pooled VDI collection
NSClient++ Configuration
These changes are to be made to the NSClient++ files on all Hyper-V hosts to be monitored.
C:\Program Files\NSClient++\nsclient.ini
If the indicated INI section does not exist, create it. Otherwise, just add the second line to the existing section.
[/settings/external scripts/wrapped scripts]
check_checkpointage=check_hvcheckpointage.ps1 $ARG1$ $ARG2$
C:\Program Files\NSClient++\scripts\check_hvcheckpointage.ps1
This script scans a Hyper-V host for its oldest existing checkpoint and reports back to Nagios. This file does not exist and must be created.
<#
check_hvcheckpointage.ps1
Written by Eric Siron
(c) Altaro Software 2018
Version 2.0 May 2, 2018
Intended for use with the NSClient++ module from http://nsclient.org
Checks a Hyper-V host for its oldest checkpoint and returns the status to Nagios.
#>
param(
    [Parameter(Position=1)][String]$WarningLevel = '2d',
    [Parameter(Position=2)][String]$CriticalLevel = '3d'
)
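$OldestCheckpoint = $null
# Split each threshold into a unit letter (m, h, d, w) and a whole number; the two may appear in either order
if($WarningLevel -match '[mhdwMHDW]')
{
    # Lower-case the unit so that 'D' and 'd' are treated the same
    $WarnMeasurement = $Matches[0].ToLower()
    # \d+ requires at least one digit; \d* would match an empty string when the letter comes first
    if($WarningLevel -match '\d+')
    {
        $WarnLength = [int]$Matches[0]
    }
}
if($CriticalLevel -match '[mhdwMHDW]')
{
    $CriticalMeasurement = $Matches[0].ToLower()
    if($CriticalLevel -match '\d+')
    {
        $CriticalLength = [int]$Matches[0]
    }
}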
$OldestCheckpointCreationTime = [DateTime]::Now
$RawCheckpointIDs = Get-CimInstance -Namespace root/virtualization/v2 -Property Dependent -Class Msvm_SnapshotOfVirtualSystem
foreach ($RawCheckpointID in $RawCheckpointIDs)
{
    $Checkpoints = Get-CimInstance -Namespace root/virtualization/v2 -Property VirtualSystemIdentifier, CreationTime, ElementName -Class Msvm_VirtualSystemSettingData -Filter ('InstanceID="{0}" AND VirtualSystemType="Microsoft:Hyper-V:Snapshot:Realized" AND NOT ElementName LIKE "%RDV_ROLLBACK%"' -f $RawCheckpointID.Dependent.InstanceID)
    foreach($Checkpoint in $Checkpoints)
    {
        $CheckpointCreationDate = $Checkpoint.CreationTime
        if($CheckpointCreationDate -lt $OldestCheckpointCreationTime)
        {
            # Remember the new minimum so that only an older checkpoint can replace it
            $OldestCheckpointCreationTime = $CheckpointCreationDate
            $VM = Get-CimInstance -Namespace root/virtualization/v2 -Property ElementName -Class Msvm_ComputerSystem -Filter ('Name="{0}"' -f $Checkpoint.VirtualSystemIdentifier)
            $OldestCheckpoint = @($Checkpoint.ElementName, $VM.ElementName, $CheckpointCreationDate)
        }
    }
}
if($OldestCheckpoint)
{
    [TimeSpan]$CheckpointAge = [DateTime]::Now - $OldestCheckpoint[2]
    $AgeString = '{0} minutes' -f $CheckpointAge.Minutes
    if($CheckpointAge.Hours)
    {
        $AgeString = '{0} hours, {1}' -f $CheckpointAge.Hours, $AgeString
    }
    if($CheckpointAge.Days)
    {
        $AgeString = '{0} days, {1}' -f $CheckpointAge.Days, $AgeString
    }
    Write-Host ('Checkpoint "{0}" for VM "{1}" is {2} old. Created: {3}.' -f $OldestCheckpoint[0], $OldestCheckpoint[1], $AgeString, $OldestCheckpoint[2])
    # Thresholds are compared against the total elapsed time; the Minutes, Hours,
    # and Days properties each hold only one component of the TimeSpan
    $ComparisonLength = 0
    switch($CriticalMeasurement)
    {
        'm' {
            $ComparisonLength = $CheckpointAge.TotalMinutes
        }
        'h' {
            $ComparisonLength = $CheckpointAge.TotalHours
        }
        'd' {
            $ComparisonLength = $CheckpointAge.TotalDays
        }
        default {
            # weeks
            $ComparisonLength = $CheckpointAge.TotalDays / 7
        }
    }
    if($ComparisonLength -gt $CriticalLength)
    {
        Exit 2
    }
    $ComparisonLength = 0
    switch($WarnMeasurement)
    {
        'm' {
            $ComparisonLength = $CheckpointAge.TotalMinutes
        }
        'h' {
            $ComparisonLength = $CheckpointAge.TotalHours
        }
        'd' {
            $ComparisonLength = $CheckpointAge.TotalDays
        }
        default {
            # weeks
            $ComparisonLength = $CheckpointAge.TotalDays / 7
        }
    }
    if($ComparisonLength -gt $WarnLength)
    {
        Exit 1
    }
    Exit 0
}
else
{
    Write-Host 'No checkpoints'
    Exit 0
}
Restart the NSClient++ service.
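Before relying on Nagios to report results, it can help to run the script by hand on the Hyper-V host and look at its exit code. The following is a minimal sketch, assuming NSClient++ is installed in the default location used above. Querying the Hyper-V CIM namespace typically needs an elevated PowerShell session.

```powershell
# Assumption: default NSClient++ install path, as used elsewhere in this article
$ScriptPath = 'C:\Program Files\NSClient++\scripts\check_hvcheckpointage.ps1'

if (Test-Path -LiteralPath $ScriptPath)
{
    # Warn at 2 days, go critical at 3 days (the script's own defaults)
    & $ScriptPath '2d' '3d'

    # Nagios convention: 0 = OK, 1 = Warning, 2 = Critical
    Write-Host ('Exit code: {0}' -f $LASTEXITCODE)
}
else
{
    Write-Warning ('Script not found at {0}' -f $ScriptPath)
}
```

If the text output and the exit code look right here, any remaining problems are more likely to be in the NSClient++ or Nagios configuration than in the script itself.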
Nagios Configuration
These changes are to be made on the Nagios host. I recommend using WinSCP as outlined in our main Nagios and Ubuntu Server articles.
/usr/local/nagios/etc/objects/commands.cfg
The Hyper-V Host Commands section should already exist if you followed our main Nagios article. Add this command there.
################################################################################
#
# Hyper-V Host Commands
#
################################################################################
# $ARG1$: age that triggers a warning condition. Use one letter (m = minute, h = hour, d = day, w = week) and one number, e.g. 3d for 3 days. Order does not matter.
# $ARG2$: age that triggers a critical condition
define command{
    command_name    check-checkpoint-age
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -p 5666 -c check_checkpointage -a $ARG1$ $ARG2$
}
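The threshold format described in the comments above (one unit letter and one number, in either order) can be parsed roughly as follows. This is a minimal sketch for illustration, not the code from check_hvcheckpointage.ps1; the function name ConvertTo-ThresholdTimeSpan is hypothetical.

```powershell
# Hypothetical helper: turns a threshold such as '3d' or 'H2' into a TimeSpan
function ConvertTo-ThresholdTimeSpan
{
    param([Parameter(Mandatory=$true)][String]$Threshold)

    # -match is case-insensitive, so 'D' and 'd' both match
    if ($Threshold -match '[mhdw]')
    {
        $Unit = $Matches[0].ToLower()
    }
    else
    {
        throw ('No unit letter (m, h, d, w) found in "{0}"' -f $Threshold)
    }

    # \d+ requires at least one digit
    if ($Threshold -match '\d+')
    {
        $Count = [int]$Matches[0]
    }
    else
    {
        throw ('No number found in "{0}"' -f $Threshold)
    }

    switch ($Unit)
    {
        'm' { return New-TimeSpan -Minutes $Count }
        'h' { return New-TimeSpan -Hours $Count }
        'd' { return New-TimeSpan -Days $Count }
        'w' { return New-TimeSpan -Days ($Count * 7) }
    }
}

ConvertTo-ThresholdTimeSpan '3d'    # a TimeSpan of 3 days
ConvertTo-ThresholdTimeSpan 'H2'    # a TimeSpan of 2 hours
```

Converting both thresholds to a TimeSpan up front means the later comparison does not need to care which unit each threshold was written in.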
/usr/local/nagios/etc/objects/hypervhost.cfg
This file and section were created in the required base scripts article.
###############################################################################
###############################################################################
#
# HYPER-V SERVICE DEFINITIONS
#
###############################################################################
###############################################################################
# check hosts individually for oldest checkpoint
define service{
    use                     generic-service
    hostgroup_name          hyper-v-servers
    service_description     All VMs: Max Checkpoint Age
    check_command           check-checkpoint-age!2h!3d
}
As shown, each host in “hyper-v-servers” will be checked at the default interval. If a host's oldest checkpoint is older than 2 hours, the check triggers a Warning. If it is older than 3 days, the check triggers a Critical alert. You can modify the above as needed. You can also duplicate this service and apply it to a specific “host_name” instead of “hostgroup_name” to set per-host warning and critical levels.
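The way the two thresholds map to a Nagios status can be sketched as a small function. The name Get-CheckpointStatus and its parameters are hypothetical and exist only for this illustration; the return values follow the usual Nagios convention of 0 for OK, 1 for Warning, and 2 for Critical.

```powershell
# Hypothetical helper: maps a checkpoint age to a Nagios status code
function Get-CheckpointStatus
{
    param(
        [Parameter(Mandatory=$true)][TimeSpan]$Age,
        [Parameter(Mandatory=$true)][TimeSpan]$Warning,
        [Parameter(Mandatory=$true)][TimeSpan]$Critical
    )

    # Test the critical threshold first so it wins when both are exceeded
    if ($Age -gt $Critical) { return 2 }
    if ($Age -gt $Warning)  { return 1 }
    return 0
}

# Thresholds matching the service definition above: 2h warning, 3d critical
$Warning  = New-TimeSpan -Hours 2
$Critical = New-TimeSpan -Days 3

Get-CheckpointStatus -Age (New-TimeSpan -Minutes 90) -Warning $Warning -Critical $Critical   # 0 (OK)
Get-CheckpointStatus -Age (New-TimeSpan -Hours 5) -Warning $Warning -Critical $Critical      # 1 (Warning)
Get-CheckpointStatus -Age (New-TimeSpan -Days 4) -Warning $Warning -Critical $Critical       # 2 (Critical)
```

Because the comparison uses the full TimeSpan, a checkpoint that is 1 day and 1 hour old counts as 25 hours against an hour-based threshold, not 1 hour.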
You must restart Nagios to apply this configuration.
sudo service nagios checkconfig
sudo service nagios restart
