I recently had the privilege to
handle a unique case with a rather familiar narrative: a machine was talking to a known-bad malware command and control (C&C) server, other machines spoke to that C&C shortly thereafter, and now many machines are offline. And oh, by the way, "we ran A/V on them, did a defrag, and ran CCleaner."
Due to the dynamic nature of targeted attacks, effective prevention and response stem from better situational awareness, typically gained through forensic investigations.
In order to effectively mediate your adversary (and by mediate I mean actively monitor and contain) – you really need to spell out and understand your adversary, placing what you see under the microscope in the context of your operations. It doesn't matter if you've been hit by a foreign actor, a commodity drive-by, Sally down the street, or Billy down the hall – with each attack there is a lot of great information that can come out of a meaningful forensic investigation, shedding light not only on your technical vulnerabilities and the capability of your threat actors, but also on how you conduct your day-to-day activities.
But what if your investigation turns up absolutely nothing? What if
you’re not finding your smoking gun to confirm whether or not you’re dealing
with an incident or just another vexing event? How do you effectively mitigate something that
apparently does not exist?
One of the Achilles' heels of
forensics and incident response is time. If enough
time passes, the threat propagates, evidence is naturally overwritten, risk
increases, and providing effective mitigation strategies becomes proportionately
harder because your understanding of what facilitated the threat in the first
place is much more ambiguous.
Being able to recover deleted files in a forensic investigation is paramount and is one of the best ways to fully understand the nature of an incident in its totality. However, recovery is constrained by the way file systems operate: once allocated space is freed and enough time passes, a file of any size can be overwritten, and full recovery can become nearly impossible. Additionally, because of the way file recovery from unallocated space works, the recovered files carry no associated file system metadata to help ascertain their original location, name, or temporal information to place the evidence into context.
This blog post introduces and demonstrates two forensic techniques, built on open source tools, that use descriptive metadata to aid in the identification of malicious binaries in unallocated space, and hopefully they will be of use to you in your investigations.
Hunting a Ghost:
As the corporate breach narrative usually goes, the common denominator is almost always malware. This case was no
different.
After reviewing network logs and the filesystem timeline, the awkward nature of this case quickly presented itself: the machine in question had initiated communications to a known-bad command and control (C&C) server a month prior, lasting only a few days. The system, up until that point and well after, had gone through multiple Windows updates and was actively used by several employees. To add fuel to the fire, upon notification of the incident, the machine went through several rounds of A/V and spyware cleaner sweeps. Standard activities, but potentially devastating for evidence extraction and timeline creation.
After careful review of all the
data we had at our disposal for malware persistence and removal in conjunction with evidence of
file execution (historically, too, through System Restore Points), I was coming
up short. There just wasn’t anything on the system to indicate that it was
infected with ANYTHING. All in all, the system in my hands either looked
clean or had been very well cleaned.
AppCompat Cache:
Inspired by a tool released by Andrew Davis over at Mandiant [1], I've been spending a lot of
time lately understanding the Windows Application Compatibility Database (AppCompat Cache) as a
way to determine whether a questionable application was executed.
Now, if you are not familiar with the Windows Application Compatibility Database (and neither was I), it's a mechanism for the Windows operating system to identify problematic modules or executables that need to be handled differently upon load. In short, Windows maintains a list of programs and the corresponding "shims" (think of them as quick fixes for compatibility problems) to be applied by the loader upon program execution to handle problematic behaviors that would make the application - or your OS - croak. If your program contains methods or modules that have been known to cause issues, the Application Compatibility Engine, initialized by kernel32.dll, will do a lookup in a shared resource in the registry for the executable and will load the corresponding shim – a helper DLL.
What's cool about this feature is
that it is able to identify the application independently of its absolute path: to do so,
the application compatibility database maintains a list of pertinent
metadata to make identification of the problem program easier (metadata such
as file name, path, file size, last modification date, etc.).
There's a lot more going on than
what I’ve explained here, so to learn more about this neat feature, see Alex
Ionescu’s blog post [2].
What does this mean to us
forensically? Well, that’s simple: we have a database of programs that have
been executed on the system and a ton of supplementary metadata to aid in our
investigation.
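If you want to poke at this yourself, the raw data lives in the SYSTEM hive and is what ShimCacheParser [1] decodes for you. As a minimal sketch - assuming Willi Ballenthin's python-registry module (which isn't part of the tooling mentioned in this post) and assuming ControlSet001 is the active control set (on a real hive, resolve it from the Select key) - you can at least locate the raw value like this:

from Registry import Registry

def raw_appcompat_cache(system_hive_path):
    reg = Registry.Registry(system_hive_path)
    # Key and value names vary by Windows version (e.g., XP uses "AppCompatibility").
    key = reg.open("ControlSet001\\Control\\Session Manager\\AppCompatCache")
    for value in key.values():
        if value.name() == "AppCompatCache":
            return value.value()  # raw bytes; the entry format also differs per Windows version
    return None

blob = raw_appcompat_cache("SYSTEM")
if blob:
    print("AppCompatCache value is %d bytes - feed it to ShimCacheParser" % len(blob))

The parsing of those raw bytes is the hard part, which is exactly why ShimCacheParser exists.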
So after my ritual beer and slice
of buffalo (or bar-b-q) pizza to help me think my forensic problems over, I
went back to my evidence collection to review, specifically, the AppCompat
Cache.
Looking over the AppCompat Cache,
I noticed something very odd: an entry for what appeared to be cmd.exe in
system32. Not exciting, right? But why
would something so fundamental need to be shimmed? After closer inspection, I realized that I had
been fooled. The program in question was not actually cmd.exe, but cmcl.exe.
cmcl.exe posing as cmd.exe - sneaky, sneaky.
Funny.
Now that I had an executable in
mind, I immediately went back to my image to search for the elusive cmcl.exe,
and as you've already figured, it was deleted.
PE TimeDateStamp:
Even though the file was deleted, I still held on to some hope that it could be recovered. After discussion with my peers, my optimism that file recovery was still possible at this point quickly faded - and with the final report deliverable casting a long shadow, I needed something tangible to write about. So, to get
the job done, I resorted to The Sleuth Kit, and threw blkls at the image to
extract the unallocated clusters, and then proceeded to carve out Portable
Executables.
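If you want to play along at home, here's a rough, minimal sketch of that carving step - not the exact procedure I used, and the 5 MB carve limit and output naming are assumptions purely for illustration. It simply scans the unallocated stream produced by blkls (e.g., blkls image.dd > unallocated.bin) for MZ/PE signature pairs and dumps a blob at each hit:

import struct

MAX_CARVE = 5 * 1024 * 1024  # arbitrary per-file cap; real carvers size the file from its section table

def carve_pe_candidates(path, out_prefix="carved"):
    # Fine for a demo; stream the data in chunks for real-world image sizes.
    data = open(path, "rb").read()
    hits = 0
    offset = data.find(b"MZ")
    while offset != -1:
        if offset + 0x40 <= len(data):
            # e_lfanew at DOS header offset 0x3C points at the "PE\0\0" signature.
            e_lfanew = struct.unpack_from("<I", data, offset + 0x3C)[0]
            pe = offset + e_lfanew
            if 0 < e_lfanew < 0x1000 and data[pe:pe + 4] == b"PE\x00\x00":
                with open("%s_%08x.bin" % (out_prefix, offset), "wb") as out:
                    out.write(data[offset:offset + MAX_CARVE])
                hits += 1
        offset = data.find(b"MZ", offset + 2)
    return hits

print("carved %d PE candidates" % carve_pe_candidates("unallocated.bin"))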
In the end, over 500 EXEs and
DLLs had been carved, and an overwhelming feeling of "Now what?" ensued. A quick
moment of relief came when I remembered that the AppCompat Cache tracks byte
sizes, and I figured that would be a great parameter with which to narrow down the
seemingly hundreds of nameless carved executables. Turns out, I had about 50
other files that matched that same byte size, all of which were unknown to the
malware world. Great.
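That size filter is nothing fancy; assuming your carver recovers files at their true on-disk length, it boils down to something like this (the byte size below is a placeholder, not the real figure from the case):

import glob
import os

TARGET_SIZE = 73802  # hypothetical: the byte size recorded in the AppCompat Cache entry

matches = [p for p in glob.glob("carved_*.bin") if os.path.getsize(p) == TARGET_SIZE]
print("%d carved files match the recorded size" % len(matches))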
In order to save time on analyzing each and every file, I decided to take a stab at baselining and mapping the compile times for all executables (EXEs and DLLs). I remembered that a while back, Harlan Carvey [3] had
compiled a malware checklist for forensic analysts, and one of the items on the
list was the PE Compile Time attribute.
I remember questioning the utility of this supposed forensic gem… and by questioning I mean I literally said out loud, somewhat narrow-mindedly, "why the
hell would I care about the compile time." After all, I was well aware that
it could be forged and that, depending on the compiler chosen, the timestamp
could be excluded, defaulted to a standard value, or inherited from other
embedded objects and resources. But after a long, hard think about how this piece of information could be exploited to my advantage, the answer was simple: since the directory the file
originated from was known, perhaps the PE Compile Time would be anomalous in
relation to the time stamps of the files within that directory.
A while back, I had assimilated Ero
Carrera's pefile Python library [4] into some of my malware analysis scripts to extract specific elements of the PE
header for some other research I've been conducting on and off for the past
year, and I quickly modified them so that I could retrieve compile times from all of
the executables carved out of unallocated space. To have a good
reference, I first created a baseline of timestamps for all files in the operating system and then for all the
carved files. (Note: the HexCorn Blog has developed a very awesome tool to extract
the TimeDateStamps from PE files and I highly suggest you take a look at it [5].)
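For the curious, here's a rough sketch of that triage - not my actual scripts, and the directory names and the "newer than the baseline" test are illustrative assumptions - using pefile to pull the TimeDateStamp out of every PE it can find and flag carved files compiled after anything in the reference set:

import os
import time
import pefile

def pe_compile_time(path):
    # Return the PE header TimeDateStamp as epoch seconds, or None if it's not a parsable PE.
    try:
        return pefile.PE(path, fast_load=True).FILE_HEADER.TimeDateStamp
    except (pefile.PEFormatError, OSError):
        return None

def compile_times(directory):
    # Map every parsable PE under a directory to its compile time.
    times = {}
    for root, _, files in os.walk(directory):
        for name in files:
            full = os.path.join(root, name)
            ts = pe_compile_time(full)
            if ts:
                times[full] = ts
    return times

baseline = compile_times("mounted_image/Windows/system32")  # hypothetical mount point of the evidence
carved = compile_times("carved")                            # directory of PEs carved from unallocated space
newest_known = max(baseline.values())
for path, ts in sorted(carved.items(), key=lambda kv: kv[1]):
    flag = "  <-- compiled after everything in the baseline" if ts > newest_known else ""
    print(time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(ts)) + "  " + path + flag)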
Amidst all the noise that followed,
one lone file stood out: it did not match the byte size listed in the AppCompat
Cache, but it had oddly been compiled much later than the standard files
and was the only file I was able to find that had been compiled that year.
Holding back my excitement, I
carefully placed the executable into my analysis VM, executed it, and watched
as the binary called out to the C&C observed in the network logs, then dropped and
executed my good friend, cmcl.exe.
Conclusion:
So all in all, this ended up being
a good case from my perspective, and I was ecstatic that I could go home and
drink a beer out of enjoyment and eat a slice of pizza simply because I was
hungry.
When it comes to recovering deleted
files, you're pretty much at the mercy of Windows' allocation algorithms, so it is best to act in a timely manner. However,
every now and then luck pulls through.
In short, I highly recommend that
you take some time to play with ShimCacheParser - and now, HexCorn's sweet Time Stamp Utility, both linked below - and take time to think about the
TimeDateStamps, or "PE Compile Date," in your investigations. There is a reasonable
amount of intelligence you can gain from these data elements, and if nothing else,
they offer a great starting point for where and when to look.
References:
[2] - http://www.alex-ionescu.com/?p=39