Monitoring Windows Services with PowerShell and Python

Background:
I work in the technical department of a brokerage company in Toronto, Canada. We also have an office in Calgary. After a planned installation of Windows updates on the only domain controller in the remote office, the W32Time service, which is responsible for synchronizing time with an external source, did not start. As a result, within about a week the server clock drifted by about 20 seconds. At that time our workstations received their time from the domain controller by default. You can guess what happened. In trading, time is very important, and a difference of a few seconds can decide a lot. Unfortunately, the time difference was first noticed by our brokers. Our technical support department, which essentially consists of 3 people, was held responsible for this. We had to do something urgently. The solution was to apply a group policy that pointed all machines to an internal NTP server running CentOS.
There were also problems with Barracuda DC Agent, the service responsible for connecting the domain controllers to our web filter, and a couple of other services occasionally worried us. So we decided to come up with something to keep track of a few services. I googled a little and found that there are many solutions to this problem, mostly commercial, but since I wanted to learn a scripting language, I volunteered to write a script in Python with the help of our local Linux guru. In the end it turned into a script that checks all services, comparing their presence and status against a list of desired services, which unfortunately has to be compiled manually for each machine.
Solution:
On one of the Windows servers, I created a PowerShell script of this kind:
echo "Servername" > C:\Software\Services\Servername.txt
get-date >> C:\Software\Services\Servername.txt
Get-Service -ComputerName Servername | Format-Table -Property status, name >> C:\Software\Services\Servername.txt
In my case there were 10 such blocks, one for each server.
In the Task Scheduler, I added the following batch file (it seemed easier to me than trying to run the PowerShell script directly from there):
powershell.exe C:\Software\Services\cal01script.ps1
Now every day I received a separate file for each server, listing all of its services in a format like this:
Servername
Friday, October 26, 2012 1:24:03 PM
Status Name
------ ----
Stopped Acronis VSS Provider
Running AcronisAgent
Running AcronisFS
Running AcronisPXE
Running AcrSch2Svc
Running ADWS
Running AeLookupSvc
Stopped ALG
Stopped AppIDSvc
Running Appinfo
Running AppMgmt
Stopped aspnet_state
Stopped AudioEndpointBuilder
Stopped AudioSrv
Running Barracuda DC Agent
Running BFE
Stopped BITS
Stopped Browser
Running CertPropSvc
Running WinRM
Stopped wmiApSrv
Stopped WPDBusEnum
Running wuauserv
Stopped wudfsvc
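A dump in the format above can be turned into a name-to-status mapping with a few lines of Python. The following is only a minimal sketch, not the script used in this article: the function name `parse_dump` is hypothetical, and instead of counting fixed line numbers it assumes that the service rows start right after the dashed rule under the `Status Name` header.

```python
def parse_dump(text):
    """Parse a Get-Service dump into (server, services).

    services maps each service name to its status string.
    Assumption: the first non-blank line is the server name and the
    service rows follow the dashed rule under the table header.
    """
    lines = [ln.rstrip() for ln in text.splitlines() if ln.strip()]
    server = lines[0] if lines else ""
    services = {}
    in_table = False
    for ln in lines:
        if ln.startswith("------"):
            in_table = True  # rows begin after the dashed header rule
            continue
        if not in_table:
            continue
        # Split only once: service names may contain spaces
        parts = ln.split(None, 1)
        if len(parts) == 2:
            status, name = parts
            services[name] = status
    return server, services


SAMPLE = """Servername
Friday, October 26, 2012 1:24:03 PM
Status Name
------ ----
Stopped Acronis VSS Provider
Running AcronisAgent
Stopped ALG
"""

server, services = parse_dump(SAMPLE)
print(server)
print(services["Acronis VSS Provider"])
```

Splitting each row only once keeps names such as `Acronis VSS Provider` intact, which is the same reason the script below uses `split(None,1)`.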
Now the most important part. On a separate machine running CentOS, I wrote this script:
import sys
import smtplib
import string
from sys import argv
import os, time
import optparse
import glob
# Builds the status email, sends it, and exits the script
def message(subjectMessage, msg):
    SUBJECT = subjectMessage
    FROM = "address@domain.com"
    TO = 'address@domain.com'
    BODY = "\r\n".join((
        "From: %s" % FROM,
        "To: %s" % TO,
        "Subject: %s" % SUBJECT,
        "",
        msg
    ))
    s = smtplib.SMTP('mail.domain.com')
    #s.set_debuglevel(True)
    s.sendmail(FROM, TO, BODY)
    s.quit()
    sys.exit(0)
def processing(runningServicesFileName, desiredServicesFileName):
    try:
        desiredServicesFile = open(desiredServicesFileName, 'r')
    except (IOError, NameError, TypeError):
        print("The list with the desired state of services either does not exist or the name has been typed incorrectly. Please check it again.")
        sys.exit(0)
    try:
        runningServicesFile = open(runningServicesFileName, 'r')
    except (IOError, NameError, TypeError):
        print("The dump with services either does not exist or the name has been typed incorrectly. Please check it again.")
        sys.exit(0)
    # Defining variables
    readtxt = desiredServicesFile.readlines()
    desiredServices = []
    nLines = 0
    nRunning = 0
    nDesiredServices = len(readtxt)
    faultyServices = []
    missingServices = []
    currentServices = []
    serverName = ''
    dumpdate = ''
    errorCount = 0
    # Trimming the file to get a list of desired services. readlines alone leaves \n at the end of each line
    for line in readtxt:
        line = line.rstrip()
        desiredServices.append(line)
    # Finding the number of currently running services and those that failed to start
    for line in runningServicesFile:
        nLines += 1
        # 1 is the line where I append the name of each server
        if nLines == 1:
            serverName = line.rstrip()
        # 3 is the line in the dump that contains the date
        if nLines == 3:
            dumpdate = line.rstrip()
        # 7 is the first line that contains valuable data. It is just the way we get these dumps from Microsoft servers.
        if nLines < 7:
            continue
        # The last line in these dumps seems to have a blank character that we have to ignore while iterating.
        if len(line) < 3:
            break
        line = line.rstrip()
        # Split only once: service names may contain spaces
        serviceStatusPair = line.split(None, 1)
        currentServices.append(serviceStatusPair[1])
        if serviceStatusPair[1] in desiredServices and serviceStatusPair[0] == 'Running':
            nRunning += 1
        if serviceStatusPair[1] in desiredServices and serviceStatusPair[0] != 'Running':
            faultyServices.append(serviceStatusPair[1])
    # Checking if there are any missing services
    for i in range(nDesiredServices):
        if desiredServices[i] not in currentServices:
            missingServices.append(desiredServices[i])
    # Preparing the status and details texts for the email
    if nLines == 0:
        # An empty dump has no server name line, so the file name is reported instead
        statusText = 'Dumps are empty on %s' % (runningServicesFileName)
        detailsText = 'Dumps are empty'
        errorCount = errorCount + 1
    elif nRunning == nDesiredServices:
        statusText = '%s: OK' % (serverName)
        detailsText = '%s: OK\nEverything works correctly\nLast dump of running services was taken at:\n%s\nThe list of desired services:\n%s\n' % (serverName, dumpdate, '\n'.join(desiredServices))
    else:
        statusText = '%s: Errors' % (serverName)
        detailsText = '%s: Errors\n%s out of %s services are running.\nServices failed to start:%s\nMissing services:%s\nLast dump of the running services was taken at:\n%s\n' % (serverName, nRunning, nDesiredServices, faultyServices, missingServices, dumpdate)
        errorCount = errorCount + 1
    return (statusText, detailsText, errorCount)
# Defining switches that can be passed to the script
usage = "type -h or --help for help"
parser = optparse.OptionParser(usage, add_help_option=False)
parser.add_option("-h", "--help", action="store_true", dest="help", default=False, help="this is help")
parser.add_option("-d", "--desired", action="store", dest="desiredServicesFileName", help="list of desired services")
parser.add_option("-r", "--running", action="store", dest="runningServicesFileName", help="dump of currently running services")
parser.add_option("-c", "--config", action="store", dest="configServicesDirectoryName", help="directory with desired services lists")
(opts, args) = parser.parse_args()
# Outputting a help message and exiting in case -h switch was passed
if opts.help:
    print("""
This script checks all services on selected Windows machines and sends out a report.
checkServices.py [argument 1] [/argument 2] [/argument 3]
Arguments: Description:
-c, --config - specifies the location of the directory with desired list of services and finds dumps automatically
-d, --desired - specifies the location of the file with the desired list of services.
-r, --running - specifies the location of the file with a dump of running services.
""")
    sys.exit(0)
statusMessage = []
detailsMessage = []
body = []
errorCheck = 0
directory = '%s/*' % opts.configServicesDirectoryName
# Directory mode: check every desired-services list found in the directory
if opts.configServicesDirectoryName:
    check = glob.glob(directory)
    check.sort()
    if len(check) == 0:
        message('Server status check:Error', 'The directory has not been found. Please check its location and spelling.')
        sys.exit(0)
    for i in check:
        desiredServicesFileName = i
        # The dump has the same file name but lives under runningServices
        runningServicesFileName = i.replace('desiredServices', 'runningServices')
        #print runningServicesFileName
        status, details, errors = processing(runningServicesFileName, desiredServicesFileName)
        errorCheck = errorCheck + errors
        statusMessage.append(status)
        detailsMessage.append(details)
    body = '%s\n\n%s' % ('\n'.join(statusMessage), '\n'.join(detailsMessage))
    if errorCheck == 0:
        message('Server status check:OK', body)
    else:
        message('Server status check:Errors', body)
# Single-file mode: check one dump against one desired-services list
if opts.desiredServicesFileName or opts.runningServicesFileName:
    status, details, errors = processing(opts.runningServicesFileName, opts.desiredServicesFileName)
    message(status, details)
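The way the script above combines the per-server results into one email could also be sketched with the standard `email` package, which takes care of the header formatting. This is a hedged illustration rather than the code I actually run: it assumes Python 3, the function name `build_report` and the server name `cal02` are made up for the example, and it only builds the message so that it can be inspected without a mail server.

```python
from email.message import EmailMessage


def build_report(results, sender="address@domain.com", recipient="address@domain.com"):
    """Build one summary email from a list of (server, faulty, missing) tuples."""
    status_lines = []
    detail_lines = []
    errors = 0
    for server, faulty, missing in results:
        if faulty or missing:
            errors += 1
            status_lines.append("%s: Errors" % server)
            detail_lines.append("%s: failed to start: %s; missing: %s" % (
                server,
                ", ".join(faulty) or "none",
                ", ".join(missing) or "none",
            ))
        else:
            status_lines.append("%s: OK" % server)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # The subject alone tells whether any server needs attention
    msg["Subject"] = "Server status check:" + ("Errors" if errors else "OK")
    msg.set_content("\n".join(status_lines + [""] + detail_lines))
    return msg


# Illustrative data only: cal02 is a hypothetical server name
report = build_report([("cal01", [], []), ("cal02", ["W32Time"], [])])
print(report["Subject"])
```

To actually send it, the message could be passed to `smtplib.SMTP('mail.domain.com').send_message(report)`.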
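The dump file and the file with the list of desired services should have the same name, because with the -c switch the script derives the dump path by replacing desiredServices with runningServices in the path of the list. The list of services we are monitoring (desiredServices) should look like this: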
Acronis VSS Provider
AcronisAgent
AcronisFS
AcrSch2Svc
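The comparison between such a list and a dump boils down to two questions: which desired services are present but not running, and which are absent from the dump altogether. A minimal sketch of that logic follows; the function name `check_services` is hypothetical, and the statuses in the example are illustrative data, not a real dump.

```python
def check_services(desired, services):
    """Return (faulty, missing) for a list of desired service names.

    faulty:  desired services present in the dump but not Running
    missing: desired services that do not appear in the dump at all
    """
    faulty = [name for name in desired
              if name in services and services[name] != "Running"]
    missing = [name for name in desired if name not in services]
    return faulty, missing


desired = ["Acronis VSS Provider", "AcronisAgent", "AcronisFS", "AcrSch2Svc"]
# Illustrative statuses; AcrSch2Svc is deliberately left out of the dump
services = {
    "Acronis VSS Provider": "Stopped",
    "AcronisAgent": "Running",
    "AcronisFS": "Running",
}

faulty, missing = check_services(desired, services)
print(faulty)   # ['Acronis VSS Provider']
print(missing)  # ['AcrSch2Svc']
```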
The script checks the services and combines everything into a single email. Depending on the result, the subject line says either that everything is in order or that there are errors, and the body lists what the errors are. One check a day is enough for us, so early each morning we receive a notification about the status of our Windows servers. To copy the files from the Windows server to the Linux machine, my colleague helped me with the following bash script:
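#!/bin/bash
# Fetch the dumps from the Windows share into ./runningServices
mkdir -p runningServices
smbclient --user="user%password" "//ServerName.domain.com/software" -c "lcd runningServices; prompt; cd services; mget *.txt"
cd runningServices || exit 1
# Convert every dump from UTF-16 to ASCII in place
for X in *.txt; do
    iconv -f utf16 -t ascii "$X" > "$X.asc"
    mv "$X.asc" "$X"
done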
This script also converts the encoding, because Linux on my machine did not handle UTF-16 well. Finally, to clear the dumps out of the folder, I added a batch file to the Task Scheduler that runs a PowerShell script deleting them.
Batch file:
powershell.exe C:\Software\Services\delete.ps1
PowerShell script:
remove-item C:\Software\Services\ServerName.txt
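The encoding conversion that the bash script does with iconv could in principle be done inside Python instead. The sketch below is an assumption-laden illustration, not part of my setup: the function name `decode_dump` is hypothetical, and it assumes the dumps begin with a UTF-16 byte order mark, as files written with `>` in Windows PowerShell typically do.

```python
import codecs


def decode_dump(raw):
    """Decode the raw bytes of a dump file to text.

    Assumption: PowerShell dumps start with a UTF-16 byte order mark;
    anything without one is treated as plain ASCII.
    """
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        # The utf-16 codec reads the BOM to pick the byte order and drops it
        return raw.decode("utf-16")
    return raw.decode("ascii", errors="replace")


# Simulate a dump line as PowerShell would write it (UTF-16 with BOM)
raw = "Running WinRM\r\n".encode("utf-16")
print(decode_dump(raw).strip())
```

With something like this, a dump would be opened in binary mode (`open(path, 'rb')`) and decoded before parsing, so the iconv step becomes optional.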
The project had two goals: monitoring services and learning Python. This is my first post on Habr, so I am expecting a wave of criticism. If you have any comments, especially on how to improve this system, you are welcome to share them. I hope this article is useful to someone, because I could not find a free solution with email notification. Maybe I just did not search well enough.