Monitoring Windows Services with PowerShell and Python

Background:
I work in the technical department of a brokerage company in Toronto, Canada; we also have an office in Calgary. After a scheduled installation of Windows updates on the only domain controller in the remote office, the W32Time service, which synchronizes time with an external source, failed to start. Within about a week the server's clock had drifted by about 20 seconds. At the time, our workstations took their time from the domain controller by default, so you can guess what happened. In trading, time matters a great deal, and a difference of a few seconds can decide a lot. Unfortunately, the first to notice the discrepancy were our brokers. Our technical support department, which essentially consists of 3 people, was held responsible for it. Something had to be done urgently. The fix was a group policy that pointed all machines at an internal NTP server running CentOS.
We had also had problems with Barracuda DC Agent, the service that connects the domain controllers to our web filter, and a couple of other services worried us from time to time. So we decided to come up with something to keep track of them. I googled a little and found that there are many solutions to this problem, mostly commercial, but since I wanted to learn a scripting language, I volunteered to write a script in Python with the help of our local Linux guru. It ended up as a script that checks all services, comparing their presence and status against a list of desired services, which unfortunately has to be compiled by hand separately for each machine.

Solution:

On one of the Windows servers, I created a PowerShell script of this kind:
echo "Servername" > C:\Software\Services\Servername.txt
get-date >> C:\Software\Services\Servername.txt
Get-Service -ComputerName Servername | Format-Table -Property status, name >> C:\Software\Services\Servername.txt


In my case, there were 10 such blocks, one for each server.

In the Task Scheduler, I added the following batch file (it seemed to me easier than trying to run the PowerShell script directly from there):

powershell.exe C:\Software\Services\cal01script.ps1


Now every day I get a separate file for each server listing all of its services, in a format like this:

Servername
Friday, October 26, 2012 1:24:03 PM
                                 Status Name                                   
                                 ------ ----                                   
                                Stopped Acronis VSS Provider                   
                                Running AcronisAgent                           
                                Running AcronisFS                              
                                Running AcronisPXE                             
                                Running AcrSch2Svc                             
                                Running ADWS                                   
                                Running AeLookupSvc                            
                                Stopped ALG                                    
                                Stopped AppIDSvc                               
                                Running Appinfo                                
                                Running AppMgmt                                
                                Stopped aspnet_state                           
                                Stopped AudioEndpointBuilder                   
                                Stopped AudioSrv                               
                                Running Barracuda DC Agent                     
                                Running BFE                                    
                                Stopped BITS                                   
                                Stopped Browser                                
                                Running CertPropSvc                                  
                                Running WinRM                                  
                                Stopped wmiApSrv                               
                                Stopped WPDBusEnum                             
                                Running wuauserv                               
                                Stopped wudfsvc    
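
A dump in this format can be parsed by splitting each table row on its first run of whitespace, which keeps service names that contain spaces (such as "Acronis VSS Provider") intact. Below is a minimal Python 3 sketch of that idea. The name `parse_dump` is hypothetical, and locating the table by its separator row instead of by fixed line numbers is an assumption made for illustration; it is not how the script in this article does it.

```python
def parse_dump(text):
    """Parse a Format-Table dump into (server_name, {service_name: status})."""
    lines = text.splitlines()
    # The first line of the dump holds the server name.
    server = lines[0].strip() if lines else ""
    services = {}
    in_table = False
    for line in lines:
        stripped = line.strip()
        if not in_table:
            # The table body starts right after the "------ ----" separator row.
            if stripped.startswith("------"):
                in_table = True
            continue
        # Split on the first run of whitespace only, so names with spaces survive.
        parts = stripped.split(None, 1)
        if len(parts) != 2:
            continue  # blank or malformed row
        status, name = parts
        services[name] = status
    return server, services


# Sample input shortened from the dump shown above.
SAMPLE = """Servername
Friday, October 26, 2012 1:24:03 PM
                                 Status Name
                                 ------ ----
                                Stopped Acronis VSS Provider
                                Running AcronisAgent
                                Stopped wudfsvc
"""

server, services = parse_dump(SAMPLE)
print(server)
for name, status in services.items():
    print(status, name)
```

Keying the result by service name makes the later lookup ("is this desired service present, and is it running?") a plain dictionary access.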


Now the most important part. On a separate machine running CentOS, I wrote this script:

import sys
import smtplib
import string
from sys import argv
import os, time
import optparse
import glob
# function message that defines the email we get about the status
def message(subjectMessage,msg):
  SUBJECT = subjectMessage
  FROM = "address@domain.com"
  TO = 'address@domain.com'
  BODY =  string.join((
  "From: %s" % FROM,
  "To: %s" % TO,
  "Subject: %s" % SUBJECT ,
  "",
  msg
  ), "\r\n")
  s = smtplib.SMTP('mail.domain.com')
  #s.set_debuglevel(True)
  s.sendmail(FROM, TO, BODY)
  s.quit()
  sys.exit(0)
def processing(runningServicesFileName,desiredServicesFileName):
  try:
    desiredServicesFile=open(desiredServicesFileName,'r')
  except (IOError,NameError,TypeError):
    print "The list with the desired state of services either does not exist or the name has been typed incorrectly. Please check it again."
    sys.exit(0)
  try:
    runningServicesFile=open(runningServicesFileName,'r')
  except (IOError,NameError,TypeError):
    print "The dump with services either does not exist or the name has been typed incorrectly. Please check it again."
    sys.exit(0)
  #Defining variables
  readtxt = desiredServicesFile.readlines()
  desiredServices = []
  nLines = 0
  nRunning = 0
  nDesiredServices = len(readtxt)
  faultyServices = []
  missingServices = []
  currentServices = []
  serverName = ''
  dumpdate=''
  errorCount=0
  # Strip each line to get the list of desired services; readlines() keeps the \n at the end of each line
  for line in readtxt:
    line = line.rstrip()
    desiredServices.append(line)
  # Finding the number of currently running services and those that failed to start
  for line in runningServicesFile:
    nLines+=1
  # 1 is the line where I append the name of each server
    if nLines==1:
      serverName = line.rstrip()
  # 3 is the line in the dump that contains date
    if nLines==3:
      dumpdate=line.rstrip()
  # 7 is the first line that contains valuable data. It is just the way we get these dumps from Microsoft servers.
    if nLines<7:
      continue
  # The last line in these dumps seems to have a blank character that we have to ignore while iterating.
    if len(line)<3:
      break
    line = line.rstrip();
    serviceStatusPair = line.split(None,1)
    currentServices.append(serviceStatusPair[1])
    if serviceStatusPair[1] in desiredServices and serviceStatusPair[0] == 'Running':
      nRunning+=1
    if serviceStatusPair[1] in desiredServices and serviceStatusPair[0] != 'Running':
      faultyServices.append(serviceStatusPair[1])
  if nLines==0:
    statusText='Dumps are empty on %s' % (serverName)
    detailsText='Dumps are empty'
  # Checking if there are any missing services
  for i in range(nDesiredServices):
    if desiredServices[i] not in currentServices:
       missingServices.append(desiredServices[i])
  # Sending the email with results
  if nRunning == nDesiredServices:
    statusText='%s: OK' % (serverName)
    detailsText='%s: OK\nEverything works correctly\nLast dump of running services was taken at:\n%s\nThe list of desired services:\n%s\n' % (serverName,dumpdate,'\n'.join(desiredServices))
  else:
    statusText='%s: Errors' % (serverName)
    detailsText='%s: Errors\n%s out of %s services are running.\nServices failed to start:%s\nMissing services:%s\nLast dump of the running services was taken at:\n%s\n' % (serverName,nRunning,nDesiredServices,faultyServices,missingServices,dumpdate)
    errorCount=errorCount+1
  return (statusText,detailsText,errorCount)
# Defining switches that can be passed to the script
usage = "type -h or --help for help"
parser = optparse.OptionParser(usage,add_help_option=False)
parser.add_option("-h","--help",action="store_true", dest="help",default=False, help="this is help")
parser.add_option("-d","--desired",action="store", dest="desiredServicesFileName", help="list of desired services")
parser.add_option("-r","--running",action="store", dest="runningServicesFileName", help="dump of currently running services")
parser.add_option("-c","--config",action="store", dest="configServicesDirectoryName", help="directory with desired services lists")
(opts, args) = parser.parse_args()
# Outputting a help message and exiting in case -h switch was passed
if opts.help:
  print """
  This script checks all services on selected Windows machines and sends out a report.
  checkServices.py [argument 1] [/argument 2] [/argument 3]
  Arguments:      Description:
  -c, --config - specifies the location of the directory with desired list of services and finds dumps automatically
  -d, --desired - specifies the location of the file with the desired list of services.
  -r, --running - specifies the location of the file with a dump of running services.
  """
  sys.exit(0)
statusMessage = []
detailsMessage = []
body = []
errorCheck=0
directory='%s/*' % opts.configServicesDirectoryName
if opts.configServicesDirectoryName:
  check=glob.glob(directory)
  check.sort()
  if len(check)==0:
    message('Server status check:Error','The directory has not been found. Please check its location and spelling.')
    sys.exit(0)
  for i in check:
    desiredServicesFileName=i
    runningServicesFileName=i.replace('desiredServices', 'runningServices')
    #print runningServicesFileName
    status,details,errors=processing(runningServicesFileName,desiredServicesFileName)
    errorCheck=errorCheck+errors
    statusMessage.append(status)
    detailsMessage.append(details)
  body='%s\n\n%s' % ('\n'.join(statusMessage),'\n'.join(detailsMessage))
  if errorCheck==0:
    message('Server status check:OK',body)
  else:
    message('Server status check:Errors',body)
if opts.desiredServicesFileName or opts.runningServicesFileName:
  status,details,errors=processing(opts.runningServicesFileName,opts.desiredServicesFileName)
  message(status,details)
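
The reporting step (one combined email whose subject reflects the overall result) can also be sketched in Python 3 with the standard `email.message.EmailMessage` class. This is only an illustration under assumptions: `build_report` and `send_report` are hypothetical names, the addresses and mail host are the placeholders used in the script above, and the server name `cal02` in the example data is invented.

```python
import smtplib
from email.message import EmailMessage


def build_report(results, sender="address@domain.com", recipient="address@domain.com"):
    """Build one email from (status_text, details_text, error_count) tuples, one per server."""
    total_errors = sum(errors for _, _, errors in results)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # The subject says at a glance whether any server reported a problem.
    if total_errors == 0:
        msg["Subject"] = "Server status check:OK"
    else:
        msg["Subject"] = "Server status check:Errors"
    summary = "\n".join(status for status, _, _ in results)
    details = "\n".join(detail for _, detail, _ in results)
    msg.set_content(summary + "\n\n" + details)
    return msg


def send_report(msg, host="mail.domain.com"):
    """Send the message through the SMTP host (placeholder host name)."""
    with smtplib.SMTP(host) as smtp:
        smtp.send_message(msg)


# Example data: cal02 is a made-up server name used only for illustration.
report = build_report([
    ("cal01: OK", "cal01: OK\nEverything works correctly", 0),
    ("cal02: Errors", "cal02: Errors\n3 out of 4 services are running.", 1),
])
print(report["Subject"])
print(report.get_content())
```

Keeping message construction separate from sending means the report can be inspected or printed without a mail server; `send_report(report)` would be called only once the content looks right.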


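The dump file and the list of desired services for a given server must have the same file name; the script derives the dump path by replacing desiredServices with runningServices in the path of the list. The list of services we are monitoring (desiredServices) should look like this: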

Acronis VSS Provider
AcronisAgent
AcronisFS    
AcrSch2Svc      
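
The comparison between such a list and a dump boils down to three groups: desired services that are running, desired services that are present but not running, and desired services that are missing from the dump altogether. Here is a minimal Python 3 sketch of that classification. The names `load_desired` and `compare` are hypothetical, and the `current` dictionary in the example is hand-written for illustration (AcrSch2Svc is left out on purpose to show the "missing" case).

```python
def load_desired(text):
    """Return the desired service names, one per line, with padding and blank lines removed."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def compare(desired, current):
    """Classify desired services against a {service_name: status} mapping.

    Returns three lists: running, faulty (present but not running), missing.
    """
    running = [name for name in desired if current.get(name) == "Running"]
    faulty = [name for name in desired if name in current and current[name] != "Running"]
    missing = [name for name in desired if name not in current]
    return running, faulty, missing


# The desired list as shown above, including trailing spaces on some lines.
DESIRED_TEXT = "Acronis VSS Provider\nAcronisAgent\nAcronisFS    \nAcrSch2Svc      \n"

# Hand-written example statuses; AcrSch2Svc is deliberately absent.
current = {
    "Acronis VSS Provider": "Stopped",
    "AcronisAgent": "Running",
    "AcronisFS": "Running",
}

desired = load_desired(DESIRED_TEXT)
running, faulty, missing = compare(desired, current)
print("running:", running)
print("faulty:", faulty)
print("missing:", missing)
```

A server is healthy only when both the faulty and missing lists are empty, which is equivalent to the number of running desired services matching the length of the desired list.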


The script checks the services and compiles everything into a single email. Depending on the result, the subject line says either that everything is in order or that there are errors, and the message body lists the errors. One check a day is enough for us, so early each morning we receive a notification about the status of our Windows servers. To copy the files from the Windows server to the Linux machine, a colleague helped me with the following bash script:

#!/bin/bash
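# Fetch the service dumps from the Windows share into runningServices/
mkdir -p runningServices
smbclient --user="user%password" "//ServerName.domain.com/software" -c "lcd runningServices; prompt; cd services; mget *.txt"
cd runningServices || exit 1
# Convert each dump from UTF-16 to ASCII in place
for X in *.txt; do
  [ -e "$X" ] || continue
  iconv -f utf16 -t ascii "$X" > "$X.asc"
  mv "$X.asc" "$X"
done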


This script also converts the encoding, because Linux on my machine did not handle UTF-16 well. To clear the old dumps out of the folder, I added another batch file to the Task Scheduler that runs a PowerShell script to delete them.
Batch file:
powershell.exe C:\Software\Services\delete.ps1


PowerShell script:
remove-item C:\Software\Services\ServerName.txt


The project had 2 goals: monitoring services and learning Python. This is my first post on Habr, so I am already expecting a wave of criticism. If you have any comments, especially on how to improve this system, please share them. I hope this article is useful to someone, because I could not find a free solution like this with email notification. Maybe I just did not search well enough.
