Tracking file deletion with PowerShell

Hello, Habr! This topic has come up here before, but I have something to add.

When our file storage passed into its third terabyte, our department began receiving more and more requests to find out who had deleted an important document, or an entire folder of documents. Quite often this happens with malicious intent. Backups are all well and good, but the country should know its heroes. And the milk tastes twice as good when you get it with PowerShell.

While figuring it out, I decided to write everything down for my colleagues, and then thought it might be useful to others as well. The material is a mixed bag: some will find a ready-made solution, some will pick up a few non-obvious techniques for working with PowerShell or the Task Scheduler, and some will benchmark their own scripts for speed.

In the process of finding a solution to the problem, I read the article by Deks. I decided to take it as a basis, but a few things did not suit me.
  • Firstly, generating a report covering four hours on a 2-terabyte storage system, used by about 200 people simultaneously, took about five minutes — and that is despite the fact that we do not log all that much. This is faster than in Deks's case, but slower than I would like, because ...
  • Secondly, the same thing had to be deployed on another twenty servers, much less powerful than the main one.
  • Thirdly, the schedule for launching the report generation raised questions.
  • And fourthly, I wanted to take myself out of the process of delivering the collected information to end users (read: automate it so that they stop calling me about it).

But I liked Deks's train of thought ...

A brief digression: with file system auditing enabled, deleting a file creates two events in the Security log: 4663 followed by 4660. The first records the request for delete access, the user, and the path of the file being deleted; the second records the fact of the deletion itself. Each event has a unique EventRecordID, and for this pair the IDs differ by exactly one.
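All of this assumes that object-access auditing is already enabled on the file server. If it is not, here is a rough setup sketch; the share path and the "Everyone" identity are placeholders (on localized Windows versions both the account name and the subcategory name differ):

```powershell
# Turn on Success auditing for the "File System" subcategory (run elevated)
auditpol /set /subcategory:"File System" /success:enable

# Add a SACL entry so that deletions under the share generate 4660/4663 events.
# D:\Shares is a placeholder path - substitute your storage root.
$acl = Get-Acl -Path 'D:\Shares' -Audit
$rule = [System.Security.AccessControl.FileSystemAuditRule]::new(
    'Everyone',
    [System.Security.AccessControl.FileSystemRights]'Delete, DeleteSubdirectoriesAndFiles',
    [System.Security.AccessControl.InheritanceFlags]'ContainerInherit, ObjectInherit',
    [System.Security.AccessControl.PropagationFlags]::None,
    [System.Security.AccessControl.AuditFlags]::Success)
$acl.AddAuditRule($rule)
Set-Acl -Path 'D:\Shares' -AclObject $acl
```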

Below is the original script that collects information about deleted files and the users who deleted them.

$time =  (get-date) - (new-timespan -min 240)
$Events = Get-WinEvent -FilterHashtable @{LogName="Security";ID=4660;StartTime=$time} | Select TimeCreated,@{n="Запись";e={([xml]$_.ToXml()).Event.System.EventRecordID}} |sort Запись
$BodyL = ""
$TimeSpan = new-TimeSpan -sec 1
foreach($event in $events){
    $PrevEvent = $Event.Запись
    $PrevEvent = $PrevEvent - 1
    $TimeEvent = $Event.TimeCreated
    $TimeEventEnd = $TimeEvent+$TimeSpan
    $TimeEventStart = $TimeEvent- (new-timespan -sec 1)
    $Body = Get-WinEvent -FilterHashtable @{LogName="Security";ID=4663;StartTime=$TimeEventStart;EndTime=$TimeEventEnd} |where {([xml]$_.ToXml()).Event.System.EventRecordID -match "$PrevEvent"}|where{ ([xml]$_.ToXml()).Event.EventData.Data |where {$_.name -eq "ObjectName"}|where {($_.'#text') -notmatch ".*tmp"} |where {($_.'#text') -notmatch ".*~lock*"}|where {($_.'#text') -notmatch ".*~$*"}} |select TimeCreated, @{n="Файл_";e={([xml]$_.ToXml()).Event.EventData.Data | ? {$_.Name -eq "ObjectName"} | %{$_.'#text'}}},@{n="Пользователь_";e={([xml]$_.ToXml()).Event.EventData.Data | ? {$_.Name -eq "SubjectUserName"} | %{$_.'#text'}}} 
    if ($Body -match ".*Secret*"){
        $BodyL=$BodyL+$Body.TimeCreated+"`t"+$Body.Файл_+"`t"+$Body.Пользователь_+"`n"
    }
}
$Month = $Time.Month
$Year = $Time.Year
$name = "DeletedFiles-"+$Month+"-"+$Year+".txt"
$Outfile = "\\server\ServerLogFiles\DeletedFilesLog\"+$name
$BodyL | out-file $Outfile -append

Using Measure-Command, we got the following:

Measure-Command {
	...
} | Select-Object TotalSeconds | Format-List
...
TotalSeconds : 313,6251476

That is too long, and on the less powerful secondary file servers it would take even longer. What I really disliked was the ten-story pipeline, so for a start I structured it:

Get-WinEvent -FilterHashtable @{
    LogName="Security";ID=4663;StartTime=$TimeEventStart;EndTime=$TimeEventEnd
} `
    | Where-Object {([xml]$_.ToXml()).Event.System.EventRecordID -match "$PrevEvent"} `
    | Where-Object {([xml]$_.ToXml()).Event.EventData.Data  `
        | Where-Object {$_.name -eq "ObjectName"} `
        | Where-Object {($_.'#text') -notmatch ".*tmp"} `
        | Where-Object {($_.'#text') -notmatch ".*~lock*"} `
        | Where-Object {($_.'#text') -notmatch ".*~$*"}
    } `
	| Select-Object TimeCreated,
        @{
            n="Файл_";
            e={([xml]$_.ToXml()).Event.EventData.Data `
                | Where-Object {$_.Name -eq "ObjectName"} `
                | ForEach-Object {$_.'#text'}
            }
        },
        @{
            n="Пользователь_";
            e={([xml]$_.ToXml()).Event.EventData.Data `
                | Where-Object {$_.Name -eq "SubjectUserName"} `
                | ForEach-Object {$_.'#text'}
            }
        }

This reduced the number of stories in the pipeline, removed the Foreach enumerations, and made the code more readable, but it had little effect; the difference is within the margin of error:

Measure-Command {
    $time =  (Get-Date) - (New-TimeSpan -min 240)
    $Events = Get-WinEvent -FilterHashtable @{LogName="Security";ID=4660;StartTime=$time}`
        | Select TimeCreated,@{n="EventID";e={([xml]$_.ToXml()).Event.System.EventRecordID}}`
        | Sort-Object EventID
    $DeletedFiles = @()
    $TimeSpan = new-TimeSpan -sec 1
    foreach($Event in $Events){
        $PrevEvent = $Event.EventID
        $PrevEvent = $PrevEvent - 1
        $TimeEvent = $Event.TimeCreated
        $TimeEventEnd = $TimeEvent+$TimeSpan
        $TimeEventStart = $TimeEvent- (New-TimeSpan -sec 1)
        $DeletedFiles += Get-WinEvent -FilterHashtable @{LogName="Security";ID=4663;StartTime=$TimeEventStart;EndTime=$TimeEventEnd} `
            | Where-Object {`
                ([xml]$_.ToXml()).Event.System.EventRecordID -match "$PrevEvent" `
                -and (([xml]$_.ToXml()).Event.EventData.Data `
					| where {$_.name -eq "ObjectName"}).'#text' `
						-notmatch ".*tmp$|.*~lock$|.*~$*"
            } `
            | Select-Object TimeCreated,
            @{n="FilePath";e={
                (([xml]$_.ToXml()).Event.EventData.Data `
                | Where-Object {$_.Name -eq "ObjectName"}).'#text'
                }
            },
            @{n="UserName";e={
                (([xml]$_.ToXml()).Event.EventData.Data `
                | Where-Object {$_.Name -eq "SubjectUserName"}).'#text'
                }
            }
    }
} | Select-Object TotalSeconds | Format-List
$DeletedFiles | Format-Table UserName,FilePath -AutoSize
...
TotalSeconds : 302,6915627

This called for a bit of actual thinking. Which operations take the most time? I could have sprinkled in another dozen Measure-Command calls, but in this case it is already obvious: most of the time goes to querying the log (not the fastest operation even in MMC) and to the repeated conversions to XML (which, in the case of EventRecordID, are not needed at all). Let's try to do each of them only once, and get rid of the intermediate variables while we're at it:

Measure-Command {
	$time =  (Get-Date) - (New-TimeSpan -min 240)
	$Events = Get-WinEvent -FilterHashtable @{LogName="Security";ID=4660,4663;StartTime=$time}`
		| Select TimeCreated,ID,RecordID,@{n="EventXML";e={([xml]$_.ToXml()).Event.EventData.Data}}`
		| Sort-Object RecordID
	$DeletedFiles = @()
	foreach($Event in ($Events | Where-Object {$_.Id -EQ 4660})){
		$DeletedFiles += $Events `
		| Where-Object {`
			$_.Id -eq 4663 `
				-and $_.RecordID -eq ($Event.RecordID - 1) `
				-and ($_.EventXML | where Name -eq "ObjectName").'#text'`
					-notmatch ".*tmp$|.*~lock$|.*~$"
		} `
		| Select-Object `
		@{n="RecordID";e={$Event.RecordID}}, TimeCreated,
		@{n="ObjectName";e={($_.EventXML | where Name -eq "ObjectName").'#text'}},
		@{n="UserName";e={($_.EventXML | where Name -eq "SubjectUserName").'#text'}}
	}
} | Select-Object TotalSeconds | Format-List
$DeletedFiles | Sort-Object UserName,TimeCreated | Format-Table -AutoSize -HideTableHeaders
...
TotalSeconds : 167,7099384

Now that's a result: almost a twofold speedup!
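The core of the speedup is the pairing logic: each 4660 record is matched with the 4663 record whose RecordID is one less, against an in-memory list instead of re-querying the log. That logic is easy to verify on synthetic data; the sketch below uses made-up event objects (and, as an extra refinement beyond the article's Where-Object scan, an indexing hashtable):

```powershell
# Synthetic stand-ins for log records (made-up data, not real events):
# ID is the event code, RecordID is the position in the log
$Events = @(
    [pscustomobject]@{ ID = 4663; RecordID = 101; ObjectName = '\\srv\docs\report.docx' },
    [pscustomobject]@{ ID = 4660; RecordID = 102; ObjectName = $null },
    [pscustomobject]@{ ID = 4663; RecordID = 105; ObjectName = '\\srv\docs\draft.tmp' },
    [pscustomobject]@{ ID = 4660; RecordID = 106; ObjectName = $null }
)

# Index the 4663 records by RecordID once, so every 4660 costs one hashtable lookup
$ByRecord = @{}
$Events | Where-Object { $_.ID -eq 4663 } | ForEach-Object { $ByRecord[$_.RecordID] = $_ }

$Deleted = foreach ($e in ($Events | Where-Object { $_.ID -eq 4660 })) {
    $Pair = $ByRecord[$e.RecordID - 1]          # the 4663 sits one record earlier
    if ($Pair -and $Pair.ObjectName -notmatch '\.tmp$|~lock$|~\$$') {
        $Pair.ObjectName                        # skip temp/lock files, keep the rest
    }
}
$Deleted   # -> \\srv\docs\report.docx
```

On real data, building the index once turns the inner loop from repeated full scans into constant-time lookups.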

Automate


We could rejoice and stop here. Three minutes is better than five, but how should the script be run? Once an hour? Then entries created right as the script launches could slip away. Query not one hour but 65 minutes back? Then entries may be duplicated. And besides, digging through thousands of log lines for the record about the right file is a mess. Write once a day? Log rotation will lose half of it. Something more reliable is needed. In the comments to Deks's article, someone mentioned a self-written application running as a service, but that, you know, is from the category of "there are 14 competing standards" ...

In the Windows Task Scheduler, you can create a trigger for an event in the system log. Like this:



Excellent! The script will be launched exactly when a file is deleted, and our log will be built in real time! But our joy would be incomplete if, at launch, we could not tell which event we need to record. We need a trick. And we have one. A bit of googling showed that with an "On an event" trigger, the scheduler can pass information about the event to the executable. But the way this is done is, to put it mildly, not obvious. The sequence of steps is as follows:

  1. Create a task with a trigger of type “Event”;
  2. Export the task in XML format (via the MMC console);
  3. Add a new “ValueQueries” element, with child elements describing the variables, to the “EventTrigger” element:

    
    <EventTrigger>
        ...
        <ValueQueries>
            <Value name="eventRecordID">Event/System/EventRecordID</Value>
        </ValueQueries>
    </EventTrigger>

    where “eventRecordID” is the name of the variable that will be passed to the script, and “Event/System/EventRecordID” is an element of the Windows event log schema, which can be found at the link at the bottom of the article. In this case, it is the element holding the unique record number of the event.
  4. Import the task back into the scheduler.

But we don't want to click through all of this with a mouse on 20 servers, right? Time to automate. Unfortunately, PowerShell is not omnipotent: the New-ScheduledTaskTrigger cmdlet cannot yet create triggers of the Event type. So we use a cheat code and create the task through a COM object (for now we still have to resort to COM fairly often, although the regular cmdlets can do more with every new version of PowerShell):

$scheduler = New-Object -ComObject "Schedule.Service"
$scheduler.Connect("localhost")
$rootFolder = $scheduler.GetFolder("\")
$taskDefinition = $scheduler.NewTask(0)

We need to allow several instances to run simultaneously; it also seems worthwhile to prohibit manual launch and to set a time limit:

$taskDefinition.Settings.Enabled = $True
$taskDefinition.Settings.Hidden = $False
$taskDefinition.Principal.RunLevel = 0 # 0 - normal privileges, 1 - elevated privileges
$taskDefinition.Settings.MultipleInstances = 0 # TASK_INSTANCES_PARALLEL - run instances in parallel
$taskDefinition.Settings.AllowDemandStart = $False
$taskDefinition.Settings.ExecutionTimeLimit = "PT5M"

Create a trigger of type 0 (Event). Next, set up the XML query that selects the events we need. The query text can be obtained in the Event Viewer MMC console by selecting the required parameters and switching to the "XML" tab:



$Trigger = $taskDefinition.Triggers.Create(0)
$Trigger.Subscription = '<QueryList><Query Id="0" Path="Security"><Select Path="Security">*[System[(EventID=4660)]]</Select></Query></QueryList>'

The main trick: declare the variable that we want to pass to the script.

$Trigger.ValueQueries.Create("eventRecordID", "Event/System/EventRecordID")

And now, the description of the command to execute:

$Action = $taskDefinition.Actions.Create(0)
$Action.Path = 'PowerShell.exe'
$Action.WorkingDirectory = 'C:\Temp'
$Action.Arguments = '.\ParseDeleted.ps1 $(eventRecordID) C:\Temp\DeletionLog.log'

And liftoff!

$rootFolder.RegisterTaskDefinition("Log Deleted Files", $taskDefinition, 6, 'SYSTEM', $null, 5) # 6 = TASK_CREATE_OR_UPDATE, 5 = TASK_LOGON_SERVICE_ACCOUNT

“The concept has changed”


Let's go back to the log-writing script. Now we no longer need to fetch all the events; we need the single event whose number is passed as an argument. To do this, we add a header that turns the script into a cmdlet with parameters. While we're at it, we'll make it possible to change the log path "on the fly"; it may come in handy:

[CmdletBinding()]
Param(
  [Parameter(Mandatory=$True,Position=1)]$RecordID,
  [Parameter(Mandatory=$False,Position=2)]$LogPath = "C:\DeletedFiles.log"
)

Here a nuance arises: until now we retrieved events with the Get-WinEvent cmdlet, filtering via the -FilterHashtable parameter. It understands a limited set of attributes, and EventRecordID is not among them. So instead we will filter via the -FilterXml parameter; now we can:

# fetch the 4663 event immediately preceding the triggering 4660 record
$XmlQuery = "<QueryList><Query Id='0' Path='Security'><Select Path='Security'>" `
    + "*[System[(EventID=4663) and (EventRecordID=$($RecordID - 1))]]</Select></Query></QueryList>"
$Event = Get-WinEvent -FilterXml $XmlQuery `
    | Select TimeCreated,ID,RecordID,@{n="EventXML";e={([xml]$_.ToXml()).Event.EventData.Data}}

We also no longer need the Foreach-Object enumeration, since only one event is processed. One, not two, because the 4660 event serves only to trigger the script; it carries no useful information of its own.

Remember how at the beginning I wanted users to be able to identify the villain without my involvement? So: if a file is deleted in a department's document folder, we also write a log to the root of that department's folder.

$EventLine = ""
if (($Event.EventXML | where Name -eq "ObjectName").'#text' -notmatch ".*tmp$|.*~lock$|.*~$"){
    $EventLine += "$($Event.TimeCreated)`t"
    $EventLine += "$($Event.RecordID)`t"
    $EventLine += ($Event.EventXML | where Name -eq "SubjectUserName").'#text' + "`t"
    $EventLine += ($ObjectName = ($Event.EventXML | where Name -eq "ObjectName").'#text')
    if ($ObjectName -match "Documents\\Подразделения"){
        $OULogPath = $ObjectName `
            -replace "(.*Documents\\Подразделения\\[^\\]*\\)(.*)",'$1DeletedFiles.log'
        if (!(Test-Path $OULogPath)){
            "DeletionDate`tEventID`tUserName`tObjectPath"| Out-File -FilePath $OULogPath
        }
        $EventLine | Out-File -FilePath $OULogPath -Append
    }
    if (!(Test-Path $LogPath)){
        "DeletionDate`tEventID`tUserName`tObjectPath" | Out-File -FilePath $LogPath }
    $EventLine | Out-File -FilePath $LogPath -Append
}
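The department-root extraction hinges on that regex; its behavior is easy to check on a made-up path (note that the matched prefix already ends with a backslash):

```powershell
# Sample path (made up) under the department documents tree
$ObjectName = 'D:\Shares\Documents\Подразделения\Accounting\2019\report.xlsx'

if ($ObjectName -match '.*Documents\\Подразделения\\[^\\]*\\') {
    # $Matches[0] is the department root, trailing backslash included
    $OULogPath = $Matches[0] + 'DeletedFiles.log'
}
$OULogPath   # -> D:\Shares\Documents\Подразделения\Accounting\DeletedFiles.log
```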

The final cmdlet

Well, the pieces are ready; it remains to put everything together and optimize a little more. The result is something like this:

[CmdletBinding()]
Param(
  [Parameter(Mandatory=$True,Position=1,ParameterSetName='logEvent')][int]$RecordID,
  [Parameter(Mandatory=$False,Position=2,ParameterSetName='logEvent')]
    [string]$LogPath = "$PSScriptRoot\DeletedFiles.log",
  [Parameter(ParameterSetName='install')][switch]$Install
)
if ($Install) {
    $service = New-Object -ComObject "Schedule.Service"
    $service.Connect("localhost")
    $rootFolder = $service.GetFolder("\")
    $taskDefinition = $service.NewTask(0)
    $taskDefinition.Settings.Enabled = $True
    $taskDefinition.Settings.Hidden = $False
    $taskDefinition.Settings.MultipleInstances = 0 # TASK_INSTANCES_PARALLEL
    $taskDefinition.Settings.AllowDemandStart = $False
    $taskDefinition.Settings.ExecutionTimeLimit = "PT5M"
    $taskDefinition.Principal.RunLevel = 0
    $trigger = $taskDefinition.Triggers.Create(0)
    $trigger.Subscription = '<QueryList><Query Id="0" Path="Security"><Select Path="Security">' `
        + '*[System[(EventID=4660)]]</Select></Query></QueryList>'
    $trigger.ValueQueries.Create("eventRecordID", "Event/System/EventRecordID")
    $Action = $taskDefinition.Actions.Create(0)
    $Action.Path = 'PowerShell.exe'
    $Action.WorkingDirectory = $PSScriptRoot
    $Action.Arguments = '.\' + $MyInvocation.MyCommand.Name + ' $(eventRecordID) ' + $LogPath
    $rootFolder.RegisterTaskDefinition("Log Deleted Files", $taskDefinition, 6, 'SYSTEM', $null, 5)
} else {
    # fetch the 4663 event immediately preceding the triggering 4660 record
    $XmlQuery = "<QueryList><Query Id='0' Path='Security'><Select Path='Security'>" `
        + "*[System[(EventID=4663) and (EventRecordID=$($RecordID - 1))]]</Select></Query></QueryList>"
    $Event = Get-WinEvent -FilterXml $XmlQuery `
        | Select TimeCreated,ID,RecordID,@{n="EventXML";e={([xml]$_.ToXml()).Event.EventData.Data}}
    if (($ObjectName = ($Event.EventXML | where Name -eq "ObjectName").'#text') `
		-notmatch ".*tmp$|.*~lock$|.*~$"){
        $EventLine = "$($Event.TimeCreated)`t" + "$($Event.RecordID)`t" `
        + ($Event.EventXML | where Name -eq "SubjectUserName").'#text' + "`t" `
		+ $ObjectName
        if ($ObjectName -match ".*Documents\\Подразделения\\[^\\]*\\"){
            $OULogPath = $Matches[0] + 'DeletedFiles.log' # $Matches[0] already ends with a backslash
            if (!(Test-Path $OULogPath)){
                "DeletionDate`tEventID`tUserName`tObjectPath"| Out-File -FilePath $OULogPath
            }
            $EventLine | Out-File -FilePath $OULogPath -Append
        }
        if (!(Test-Path $LogPath)){
            "DeletionDate`tEventID`tUserName`tObjectPath" | Out-File -FilePath $LogPath }
        $EventLine | Out-File -FilePath $LogPath -Append
    }
}

It remains to place the script somewhere convenient and run it with the -Install switch.
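Assuming the script was saved as ParseDeleted.ps1 (the name used in the scheduler action above), deployment and a manual test run look like this (the RecordID value is a placeholder):

```powershell
# One-time installation: registers the event-triggered task (run elevated)
.\ParseDeleted.ps1 -Install

# Manual run against a specific Security-log record, e.g. for debugging
.\ParseDeleted.ps1 -RecordID 123456 -LogPath 'C:\Temp\DeletedFiles.log'
```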

Now the employees of any department can see in real time who deleted what from their directories, and when. Note that I have not covered access rights to the log files here (so that the villain cannot delete them), nor log rotation. The structure of, and permissions on, the directories on our filer deserve a separate article, and rotation somewhat complicates the search for the right line.

Used materials:

- The most beautiful guide to regular expressions
- Tutorial on creating an event-bound task
- Description of the Task Scheduler scripting API

UPD: There was a typo in the final script: an extra backtick after line 41. Thanks to Habr reader Ruslan Sultanov for spotting it.
