Sunday, September 2, 2012

FORENSIC CASE BREAKS: Identifying Malware thanks to AppCompat Cache and PE TimeDateStamps


I recently had the privilege of handling an unusual case with a rather familiar narrative: a machine was talking to a known-bad malware command and control point, other machines spoke to that C&C shortly thereafter, and now many machines are offline. And oh, by the way, "we ran A/V on them, did a defrag, and ran CCleaner."

Due to the dynamic nature of targeted attacks, effective prevention and response stem from better situational awareness, typically gained through forensic investigations.

To effectively mediate your adversary (and by mediate I mean actively monitor and contain), you really need to spell out and understand your adversary, placing what you see under the microscope in the context of your operations. It doesn’t matter whether you’ve been hit by a foreign actor, a commodity drive-by, Sally down the street, or Billy down the hall: with each attack, a meaningful forensic investigation can yield a lot of great information that sheds light not only on your technical vulnerabilities and the capabilities of your threat actors, but also on how you conduct your day-to-day activities.

But what if your investigation turns up absolutely nothing? What if you’re not finding the smoking gun that confirms whether you’re dealing with an incident or just another vexing event? How do you effectively mitigate that which apparently does not exist?

One of the Achilles’ heels of forensics and incident response is time. If enough time passes, the threat propagates, evidence is naturally overwritten, risk increases, and providing effective mitigation strategies becomes proportionately harder because your understanding of what facilitated the threat in the first place is much more ambiguous.

Being able to recover deleted files is paramount in a forensic investigation and is one of the best ways to understand the nature of an incident in its totality. However, recovery is constrained by the way file systems operate. Once allocated space is freed and enough time passes, any file of any size can be overwritten, and a full file recovery can become nearly impossible. Additionally, due to the way in which file recovery operates, the recovered files will not have any associated file system metadata to help ascertain the original location, name, or temporal information needed to place the evidence into context.

This blog post introduces and demonstrates two forensic techniques, both using open source tools, that rely on descriptive metadata to aid in the identification of malicious binaries in unallocated space. Hopefully they will be of use to you in your investigations.

Hunting a Ghost:
As the corporate breach narrative goes, the common denominator is almost always malware. This case was no different.

After reviewing network logs and the filesystem timeline, the awkward nature of this case quickly presented itself: the machine in question had initiated communications to a known-bad command and control center (C&C) a month prior, and those communications lasted only a few days. The system, up until that point and well after, had gone through multiple Windows updates and was actively used by several employees. To add fuel to the fire, once the incident was reported, the machine was put through several rounds of A/V and spyware cleaner sweeps. These are standard activities, but they are potentially devastating for evidence extraction and timeline creation.

After careful review of all the data we had at our disposal for malware persistence and removal, in conjunction with evidence of file execution (historically, too, through System Restore Points), I was coming up short. There just wasn’t anything on the system to indicate that it was infected with ANYTHING. All in all, the system I had in my hands either looked clean or had been very well cleaned.

AppCompat Cache:
After being inspired by a tool released by Andrew Davis over at Mandiant [1], I’ve been spending a lot of time lately trying to understand the Windows Application Compatibility Database (AppCompat Cache) as a way to determine whether a questionable application was executed.

Now, if you are not familiar with the Windows Application Compatibility Database (and neither was I), it’s a mechanism for the Windows operating system to identify problematic modules or executables that need to be handled differently upon load. In short, the Windows operating system maintains a list of programs and the corresponding “shims” (think of them as “quick fixes” for compatibility problems) to be used within the loader upon program execution to handle problematic events that would make the application - or your OS - croak. If your program contains methods or modules that have been known to cause issues, the Application Compatibility Engine, initialized by kernel32.dll, will do a lookup in a shared resource in the registry for the executable and will load the corresponding shim, a helper DLL.

What’s cool about this feature is that it is able to identify the application independently of its absolute path. To do so, the application compatibility database maintains a list of pertinent metadata that makes identifying the problem program easier (metadata such as file name, path, file size, last modification date, etc.).

There’s a lot more going on than what I’ve explained here, so to learn more about this neat feature, see Alex Ionescu’s blog post [2].

What does this mean to us forensically? Well, that’s simple: we have a database of programs that have been executed on the system and a ton of supplementary metadata to aid in our investigation.

So after my ritual beer and slice of buffalo (or bar-b-q) pizza to help me think my forensic problems over, I went back to my evidence collection to review, specifically, the AppCompat Cache.

Looking over the AppCompat Cache, I noticed something very odd: an entry for what appeared to be cmd.exe in system32.  Not exciting, right? But why would something so fundamental need to be shimmed?  After closer inspection, I realized that I had been fooled. The program in question was not actually cmd.exe, but cmcl.exe. 

cmcl.exe posing as cmd.exe - sneaky, sneaky.
Funny.

Now that I had an executable in mind, I immediately went back to my image to search for the elusive cmcl.exe, and as you have probably already figured, it had been deleted.

PE TimeDateStamp:
Even though the file was deleted, I still held on to some hope that it could be recovered. After discussion with my peers, my optimism about recovering the file quickly faded - and with the final report deliverable casting a long shadow, I needed something tangible to write about. So, to get the job done, I resorted to The Sleuth Kit, threw blkls at the image to extract the unallocated clusters, and then proceeded to carve out Portable Executables.
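For readers who want to experiment with the carving step, here is a minimal Python sketch of how candidate PE files can be located in a blob of unallocated data such as blkls output. It is not the carver used in this case: the function name find_pe_offsets is hypothetical, and the sketch only checks for the MZ magic plus a PE signature at the offset stored in e_lfanew. It does not work out where each file ends, which a real carver has to do.

```python
import struct

def find_pe_offsets(blob: bytes):
    """Yield offsets in blob where a plausible PE file begins (hypothetical helper)."""
    pos = blob.find(b"MZ")
    while pos != -1:
        # The DOS header is 0x40 bytes long; e_lfanew sits at offset 0x3C.
        if pos + 0x40 <= len(blob):
            (e_lfanew,) = struct.unpack_from("<I", blob, pos + 0x3C)
            sig_at = pos + e_lfanew
            # A real PE has the 4-byte signature "PE\0\0" at e_lfanew.
            if blob[sig_at:sig_at + 4] == b"PE\x00\x00":
                yield pos
        pos = blob.find(b"MZ", pos + 1)
```

Because the two bytes "MZ" occur by chance all over raw disk data, the signature check is what keeps the candidate list manageable.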

In the end, over 500 EXEs and DLLs had been carved, and then an overwhelming feeling of “Now what?” ensued. A quick moment of relief came when I remembered that the AppCompat Cache tracks byte sizes, and I figured that would be a great parameter with which to narrow down the seemingly hundreds of nameless carved executables. Turns out, about 50 files matched that same byte size, all of which were unknown to the malware world. Great.
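The size filter itself is simple to reproduce. The sketch below is an assumption about how one might do it, not the script used in this case: files_matching_size, the directory layout, and the size value are all hypothetical. Keep in mind that a carved file's size depends on how the carver decided where the file ends, so an exact match is only a heuristic.

```python
import os

def files_matching_size(carve_dir, expected_size):
    """Return sorted paths under carve_dir whose size equals expected_size bytes."""
    matches = []
    for root, _dirs, names in os.walk(carve_dir):
        for name in names:
            path = os.path.join(root, name)
            if os.path.getsize(path) == expected_size:
                matches.append(path)
    return sorted(matches)
```

A call such as files_matching_size("carved", 123456) would list every carved file of exactly that many bytes, where both the directory name and the number are placeholders for the values in your own case.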

In order to save time on analyzing each and every file, I decided to take a stab at baselining and mapping the compile times for all executables (EXEs and DLLs). I remembered that a while back, Harlan Carvey [3] had compiled a malware checklist for forensic analysts, and one of the items on the list was the PE Compile Time attribute. I remember questioning the utility of this supposed forensic gem… and by questioning I mean I literally said out loud, somewhat narrow-mindedly, “why the hell would I care about the compile time.” After all, I was well aware that it could be forged and that, depending on the compiler chosen, the timestamp could be excluded, defaulted to a standard value, or inherited from other embedded objects and resources. But after thinking long and hard about how this piece of information could be exploited to my advantage, the answer was simple: since the directory the file originated from was known, perhaps the PE Compile Time would be anomalous in relation to the timestamps of the other files within that directory.

A while back, I integrated Ero Carrera’s pefile Python library [4] into some of my malware analysis scripts to extract specific elements of the PE header for other research I’ve been conducting on and off for the past year. I quickly modified those scripts so that I could retrieve compile times from all of the executables carved out of unallocated space. To have a good reference, I first created a baseline of timestamps for all files in the operating system and then did the same for all the carved files. (Note: the Hexacorn blog has developed a very awesome tool to extract the TimeDateStamps from PE files, and I highly suggest you take a look at it [5].)
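To show what is actually being read, here is a dependency-free sketch that pulls the TimeDateStamp straight out of the COFF file header with Python's standard library. It is not the pefile-based script described above, and the function name pe_compile_time is hypothetical. With pefile, the same value is exposed as FILE_HEADER.TimeDateStamp on a parsed PE object.

```python
import struct
from datetime import datetime, timezone

def pe_compile_time(data: bytes):
    """Return the PE TimeDateStamp as a UTC datetime, or None if data is not a PE."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # e_lfanew (offset 0x3C of the DOS header) points at the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        return None
    if len(data) < e_lfanew + 12:
        return None
    # COFF header after the signature: Machine (2 bytes),
    # NumberOfSections (2 bytes), then TimeDateStamp (4 bytes, seconds since 1970).
    (stamp,) = struct.unpack_from("<I", data, e_lfanew + 8)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)
```

Only the first few hundred bytes of each file are needed, so running this over both the baseline and the carved set is cheap.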

Amidst all the noise that followed, one lone file stood out. It did not match the byte size listed in the AppCompat Cache, but it had been compiled much later than the standard files and was the only file I was able to find that had been compiled that year.
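One way to surface that kind of outlier automatically is to compare compile years between the baseline and the carved set. The sketch below is only an illustration of the idea: compile_year_outliers is a hypothetical helper, and the file names and dates are made-up sample data, not values from this case.

```python
from datetime import datetime, timezone

def compile_year_outliers(baseline_times, carved_times):
    """Return carved names whose compile year is later than any baseline year."""
    if not baseline_times:
        return []  # nothing to compare against
    latest_year = max(ts.year for ts in baseline_times.values())
    return sorted(name for name, ts in carved_times.items()
                  if ts.year > latest_year)

# Made-up sample data for illustration only.
baseline = {
    "notepad.exe": datetime(2008, 4, 14, tzinfo=timezone.utc),
    "calc.exe": datetime(2010, 11, 20, tzinfo=timezone.utc),
}
carved = {
    "f0001.exe": datetime(2008, 4, 14, tzinfo=timezone.utc),
    "f0002.exe": datetime(2012, 7, 1, tzinfo=timezone.utc),
}
suspects = compile_year_outliers(baseline, carved)
```

Since the timestamp can be forged or zeroed, treat the result as a list of files to look at first rather than as a verdict.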

Holding back my excitement, I carefully placed the executable into my analysis VM, executed it, and watched as the binary called out to the C&C observed in the network logs, then dropped and executed my good friend, cmcl.exe.

Conclusion:
So all in all, this ended up being a good case from my perspective, and I was ecstatic that I could go home and drink a beer out of enjoyment and eat a slice of pizza simply because I was hungry.

When it comes to recovering deleted files, you’re pretty much at the mercy of Windows’ allocation algorithms and it is best to conduct everything in a timely manner. However, every now and then luck pulls through.

In short, I highly recommend that you take some time to play with ShimCacheParser and, now, Hexacorn’s sweet Time Stamp Utility, both linked below, and that you think about the TimeDateStamp or “PE Compile Date” in your investigations. There is a reasonable amount of intelligence you can gain from these data elements, and if anything, they offer a great starting point for where and when to look.

References:
[2] - http://www.alex-ionescu.com/?p=39