Importance of displaying relevant fields for process creation events in SIEMs
In this blog, we will look at some process creation events and how to structure a search around relevant fields to get the most out of the information. We will be using ELK as the SIEM and Windows Sysmon logs, where event ID 1 is the process creation event.
Importance of relevant fields
In general, SIEMs work by deploying agents on the machines. These agents collect logs from various sources on the endpoint and pass them on to the SIEM's centralized repository, which parses, cleans, indexes and stores them.
The search bar basically acts like a SQL query: we send a query to the centralized database for the logs we are looking for. Once the query is processed we get back a bunch of events, all of them raw, meaning each one is a long list of key-value pairs. Logs in this raw format are very hard to read, and it is difficult to get anything useful out of them at a glance. This is why we have fields: they are already present in the logs, but buried among everything else. We can select specific fields from the events and have them presented in tabular form.
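To make the idea of picking fields out of a raw event concrete, here is a minimal Python sketch. The event below is hypothetical (the host name, user and timestamp are made up), and get_field and to_row are illustrative helper names, not part of ELK. It only shows how a nested key-value event can be reduced to one table row.

```python
from typing import Any

# Hypothetical raw event, shaped like a nested Sysmon event ID 1 document.
RAW_EVENT = {
    "@timestamp": "2024-01-01T10:00:00Z",
    "winlog": {"computer_name": "HOST-01"},
    "user": {"name": "alice"},
    "process": {
        "name": "cmd.exe",
        "command_line": "cmd /c whoami",
        "parent": {"name": "installer.exe", "command_line": "installer.exe"},
    },
}

# The columns we want to see in the table, in display order.
FIELDS = [
    "@timestamp",
    "winlog.computer_name",
    "user.name",
    "process.parent.name",
    "process.parent.command_line",
    "process.name",
    "process.command_line",
]


def get_field(event: dict, dotted: str) -> Any:
    """Walk a dotted path such as 'process.parent.name' through nested dicts."""
    current: Any = event
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # field is missing from this event
        current = current[key]
    return current


def to_row(event: dict, fields=FIELDS) -> list:
    """Reduce one raw event to a single table row of the selected fields."""
    return [get_field(event, name) for name in fields]


if __name__ == "__main__":
    for name, value in zip(FIELDS, to_row(RAW_EVENT)):
        print(f"{name:30} {value}")
```

In the SIEM itself this is what adding columns to the results view does for you; the sketch just shows the same projection on a single event.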
Process creation events
Event ID 1 is the process creation event: whenever a new process is created (the execution of a file, not the mere presence of a file), a new event is logged with a dedicated process ID (PID). It also records which process launched it (if any), called the parent process, along with the parent process ID. This type of event provides more fields than these, but from an incident response point of view, the fields I prefer when investigating a host compromise are the following:
1. winlog.computer_name: This gives us the host name on which the activity is taking place.
2. user.name: This tells us which user the activity is running under. It also gives a general idea of which user has been compromised, so we can work backwards from it and look for the initial access method used against this user (emails, credential leaks, etc.).
3. process.parent.name: The name of the process that initiated the malicious activity. If necessary, the parent process ID field is also good to include, as it lets us go further back and investigate the entire process tree.
4. process.parent.command_line: This gives the full command line of the parent process, which shows how the child process was invoked.
5. process.name: In our context this is simply the name of the process spawned by the parent process (the child process).
6. process.command_line: This gives the command line executed by the child process.
E.g.:
Timestamps run bottom → top (the topmost logs are the most recent)
Let's look at the example in the image above, where all of these fields are selected. With them in place we can clearly understand what is happening. This kind of structuring helps an analyst build a sequential timeline and explain in their triage what steps the attacker took after compromising the host.
From the above image we can form a storyline in the following way:
The attacker used a malicious binary called installer.exe, which spawns cmd.exe to execute the command:
cmd /c "powershell Get-MpPreference | findstr /si DisableRealtimeMonitoring"
Following this event, a powershell process is spawned to run the PowerShell cmdlet Get-MpPreference, followed by another spawned process, findstr.exe. Because there is a pipe in the command, the output of Get-MpPreference is passed to findstr, which checks whether DisableRealtimeMonitoring is set to True or False.
Following this activity, the attacker launched another instance of cmd.exe to execute another command:
cmd /c "C:\'Program Files'\'Windows Defender'\MpCmdRun.exe -RemoveDefinitions -All"
This cmd process spawns another powershell instance to execute the command, and this powershell instance spawns MpCmdRun.exe to do the actual work. By this point it is clear that the attacker is trying to evade Windows Defender (checking the real-time monitoring status and removing all the signatures stored in Windows Defender).
So, the process tree goes like:
installer.exe
|_cmd.exe
  |_powershell.exe
  |_findstr.exe
|_cmd.exe
  |_powershell.exe
    |_MpCmdRun.exe
If we didn't have these specific fields selected, it would have been very hard to interpret this activity from the raw logs alone.
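The process tree can also be rebuilt programmatically from the PID and parent PID of each event. The sketch below is only an illustration: the PIDs are made up, the process names follow the storyline above, and findstr.exe is assumed to be a child of cmd.exe because cmd is the shell that handles the pipe. build_tree and render are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical, simplified events: (pid, parent pid, process name).
# The PIDs are invented for this example.
EVENTS = [
    (100, 1, "installer.exe"),
    (200, 100, "cmd.exe"),
    (210, 200, "powershell.exe"),
    (220, 200, "findstr.exe"),  # assumed child of cmd.exe (pipe handled by cmd)
    (300, 100, "cmd.exe"),
    (310, 300, "powershell.exe"),
    (320, 310, "MpCmdRun.exe"),
]


def build_tree(events):
    """Map each parent PID to the list of (pid, name) children it spawned."""
    children = defaultdict(list)
    for pid, ppid, name in events:
        children[ppid].append((pid, name))
    return children


def render(children, root_pid, root_name):
    """Return the tree below root_pid as a list of indented text lines."""
    lines = [root_name]

    def walk(pid, depth):
        for child_pid, child_name in children.get(pid, []):
            lines.append("  " * depth + "|_" + child_name)
            walk(child_pid, depth + 1)

    walk(root_pid, 0)
    return lines


if __name__ == "__main__":
    print("\n".join(render(build_tree(EVENTS), 100, "installer.exe")))
```

One caveat: PIDs get reused on a busy host, so on real data a unique identifier such as Sysmon's ProcessGuid and ParentProcessGuid is a safer key than the PID alone.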
Avoid Fragmentation of commands due to operators like “|” and “&&”
Operators like these cause each program in the command to be started as a separate process, so each one gets its own log entry in the SIEM with its own process.name. Let's consider the example below:
As we can see, cmd ran the command whoami && hostname, but process.command_line logged it as two separate events, “whoami” and “hostname”. If we were not using the process.parent.command_line field, it would have been very confusing to work out the actual command used.
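One way to put fragmented commands back together is to group the child events by their parent. The sketch below assumes each row carries the parent PID, the parent command line and the child command line. The PID and the exact parent command line string are made up for illustration, and group_by_parent is a hypothetical helper.

```python
# Hypothetical rows, oldest first:
# (parent pid, parent command line, child command line)
ROWS = [
    (4242, "cmd.exe /c whoami && hostname", "whoami"),
    (4242, "cmd.exe /c whoami && hostname", "hostname"),
]


def group_by_parent(rows):
    """Collect child command lines under the parent that launched them."""
    grouped = {}
    for ppid, parent_cmd, child_cmd in rows:
        # Key on both PID and command line so unrelated parents stay apart.
        grouped.setdefault((ppid, parent_cmd), []).append(child_cmd)
    return grouped


if __name__ == "__main__":
    for (ppid, parent_cmd), children in group_by_parent(ROWS).items():
        print(f"[{ppid}] {parent_cmd}")
        for child_cmd in children:
            print(f"    -> {child_cmd}")
```

Read this way, the two separate events clearly belong to one command typed by the attacker rather than two unrelated actions.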
Conclusion:
A machine logs far too many activities to interpret easily during an incident. Given the amount of mostly legitimate activity going on in the background, we can miss really important information hidden in these logs. Using relevant fields helps us pinpoint the malicious activity we are looking for and, if necessary, pivot from it into further investigation. So my general rule of thumb for investigating process creation logs is:
timestamp | computer/hostname | username | parent_processname | parent_process_commandline | processname(child process) | process_commandline
(This works best if you know the exact timestamps of the malicious activity; if you don't, you may get lots of legitimate events.)
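For anyone querying Elasticsearch directly instead of through the search bar, the rule of thumb could be sketched as a request body like the one below. This is an assumption-laden example: it assumes the event ID is stored in event.code (as Winlogbeat typically does), the host name and time window are made up, and build_query is a hypothetical helper. Field names and mappings in your own index may differ.

```python
import json

# Columns from the rule of thumb, in display order.
FIELDS = [
    "@timestamp",
    "winlog.computer_name",
    "user.name",
    "process.parent.name",
    "process.parent.command_line",
    "process.name",
    "process.command_line",
]


def build_query(host, start, end, fields=FIELDS):
    """Build a search body for process creation events on one host in a time window."""
    return {
        "_source": fields,  # only return the selected fields
        "sort": [{"@timestamp": {"order": "desc"}}],  # most recent first
        "query": {
            "bool": {
                "filter": [
                    {"term": {"event.code": "1"}},  # assumed location of the event ID
                    {"term": {"winlog.computer_name": host}},
                    {"range": {"@timestamp": {"gte": start, "lte": end}}},
                ]
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical host and time window.
    body = build_query("HOST-01", "2024-01-01T10:00:00Z", "2024-01-01T11:00:00Z")
    print(json.dumps(body, indent=2))
```

The time range filter is what keeps the result set small; without it the same query returns every process creation event on the host.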
References:
These screenshots were taken while solving challenge labs on TryHackMe: https://tryhackme.com/r/room/paymentcollectors