Monday, May 09, 2011

External Memory Acquisition, Analysis and Compromise Detection For Linux Virtual Machines

In the physical world, host-level forensics is complicated by the fact that once a host OS has been thoroughly compromised, the information it provides may be unreliable: rootkits subvert OS functions and take steps to conceal or obfuscate their presence.

Disk image acquisition and analysis has the advantage of not relying on a subverted host’s OS for information, provided you can grab the disk, but memory acquisition and forensics often have to be performed inside the subject OS. Live memory analysis can also be frustrated by any number of dependencies, including the need to log in and obtain privileged user access; the possibility of having to build and load a memory access driver; the need to dump memory to a file and move the file elsewhere; and the possibility of running up against restrictions on memory devices while trying to dump memory. Various hardware-based memory acquisition options, and associated evasion techniques, exist for physical machines; one of the best treatments of these is Rutkowska's http://invisiblethings.org/papers/cheating-hardware-memory-acquisition-updated.ppt

In the virtual world, a snapshot can be made of a running or suspended virtual machine. Snapshots are created at the hypervisor layer, without interfacing with the guest OS, and yield a set of files – one of which is essentially a memory dump. This provides a relatively effortless method for acquiring memory with, it would seem, a minimal attack surface for interference from within the subject OS. Of course it’s also possible to acquire memory from a VM the traditional way by logging into a guest OS and dumping memory to a file.

I’ve recently been testing an interesting memory forensics tool for Linux called Second Look® from the security R&D firm Pikewerks. As their documentation explains, “The supported memory image format is a raw physical memory image; i.e., a byte at position x in an image file corresponds to a byte at physical address x. If you cannot acquire memory via virtualization or other out-of-band technique, the Second Look memory acquisition script may be used to acquire memory from a system.”

It turns out that .vmem snapshot files follow this format closely enough that this tool can actually parse memory from VMware snapshot files in addition to traditional memory dumps. Acquiring memory by taking a snapshot is relatively effortless: there is no need for tedious mucking around inside the guest OS, and there should be no disruption to the guest VM.
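To make the "byte at position x corresponds to physical address x" property concrete, here is a minimal Python sketch. It is not part of Second Look; the function name and the stand-in file are assumptions for illustration. The point is only that reading a physical address from a raw image is a plain seek and read.

```python
import os
import tempfile


def read_physical(image_path, phys_addr, length):
    """Return `length` bytes starting at physical address `phys_addr`.

    In a raw physical memory image the file offset equals the physical
    address, so no translation table is needed: just seek and read.
    """
    with open(image_path, "rb") as image:
        image.seek(phys_addr)
        return image.read(length)


# Demo on a tiny stand-in file, not a real memory image.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 0x1000 + b"Linux version" + b"\x00" * 0x100)
    demo_path = tmp.name

banner = read_physical(demo_path, 0x1000, 13)
os.remove(demo_path)
print(banner)
```

Against a real snapshot you would pass the path of the .vmem file instead of the stand-in. Note that this only covers physical addresses; kernel virtual addresses still have to be translated before they can be looked up this way.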

Second Look is a security and forensics tool for Linux. It detects the version of the Linux kernel in a memory image and compares it to a reference kernel from the Second Look repository. Pikewerks includes a script for generating reference kernels for systems with custom kernels (more on that later).

The tool’s capabilities are summarized by Pikewerks: “Second Look® captures, and forensically preserves, a computer's volatile random-access memory (RAM). It analyzes the Linux Operating System Kernel in live memory or via a memory image, verifying its integrity and searching for signs of rootkits or other subversive software that have modified the executable kernel code or kernel data structures. With Second Look, analysts and investigators have a tool that provides a comprehensive view of a system, uninfluenced by any malware that might be running on it. Information pulled directly out of memory includes running processes, active network connections, loaded kernel modules, and many other essential system parameters. Second Look uncovers hidden kernel modules, processes, and network activity with ease. Additionally, in an effort to assist with the analysis of kernel memory, Second Look integrates a real-time disassembler that allows inspection of any function or segment of kernel memory."

You can still do things the hard way: memory dumps can be generated from inside a virtual machine's guest OS, just as on a physical machine. The Second Look tool includes a script for making a traditional memory dump inside the guest OS with this syntax:

sudo secondlook-memdump.sh MEMORY_IMAGE [MEMORY_SOURCE]

The script defaults to /dev/crash, though you can alternatively specify /dev/mem as an argument. The toolkit includes source for their “Physical Memory Access Driver” to acquire memory from systems where neither the Red Hat crash driver nor a usable /dev/mem device is available.

The memory dump procedure looks like this:

# ./secondlook-memdump.sh dumpfile
Second Look (tm) Release TRUNK - Physical Memory Acquisition Script

Copyright (c) 2010-2011 Pikewerks Corporation
All rights reserved.

Loading crash driver...
Reading RAM-backed physical address ranges from /proc/iomem...
Dumping pages 16 to 158...
143+0 records in
143+0 records out
585728 bytes (586 kB) copied, 0.0175581 seconds, 33.4 MB/s
Dumping pages 256 to 261871...
261616+0 records in
261616+0 records out
1071579136 bytes (1.1 GB) copied, 22.576 seconds, 47.5 MB/s
Dumping pages 261888 to 262143...
256+0 records in
256+0 records out
1048576 bytes (1.0 MB) copied, 0.0306265 seconds, 34.2 MB/s
Unloading crash driver...

The result is a memory dump file which can be taken away for analysis:

-rwxrwxrwx 1 root root 1073741824 May 5 15:08 dumpfile
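The page ranges in the transcript come from the RAM-backed ranges listed in /proc/iomem. The following Python sketch shows how such ranges map to the page numbers and byte counts printed above. It is an illustration, not the vendor's script: the /proc/iomem excerpt is hypothetical, reconstructed so that it matches the transcript, and the 4096-byte page size is an assumption that agrees with the byte counts shown.

```python
PAGE_SIZE = 4096  # assumed x86 page size; consistent with the dd byte counts above

# Hypothetical /proc/iomem excerpt, reconstructed to match the transcript.
SAMPLE_IOMEM = """\
00010000-0009efff : System RAM
000a0000-000bffff : Video RAM area
00100000-3feeffff : System RAM
  00400000-00622aa8 : Kernel code
3fef0000-3fefefff : ACPI Tables
3ff00000-3fffffff : System RAM
"""


def ram_page_ranges(iomem_text):
    """Return (first_page, last_page) for each top-level 'System RAM' range."""
    ranges = []
    for line in iomem_text.splitlines():
        if line.startswith(" "):
            continue  # indented entries describe regions inside a parent range
        span, _, label = line.partition(" : ")
        if label.strip() != "System RAM":
            continue
        start, end = (int(part, 16) for part in span.split("-"))
        ranges.append((start // PAGE_SIZE, end // PAGE_SIZE))
    return ranges


page_ranges = ram_page_ranges(SAMPLE_IOMEM)
for first, last in page_ranges:
    count = last - first + 1
    print(f"pages {first} to {last}: {count} records, {count * PAGE_SIZE} bytes")
```

The dump file in the listing is 1073741824 bytes, which is exactly 262144 pages of 4096 bytes. That is larger than the sum of the three ranges, which is consistent with each range being written at its own physical offset so that the file keeps the raw image layout.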

Analysis of a Linux virtual machine – CentOS 5.5 in this case – yields the following output using the command line version of the Second Look tool:

Second Look (tm) Release 2010.11 (c) 2008-2010 Pikewerks Corporation

No reference module is available to verify loaded kernel module 'vmxnet'
No reference module is available to verify loaded kernel module 'vmxnet3'
No reference module is available to verify loaded kernel module 'pvscsi'
No reference module is available to verify loaded kernel module 'vmci'
No reference module is available to verify loaded kernel module 'vmhgfs'
No reference module is available to verify loaded kernel module 'vmmemctl'
No reference module is available to verify loaded kernel module 'vsock'
No reference module is available to verify loaded kernel module 'vmblock'


This output is benign; these alerts simply flag the drivers and kernel modules loaded by the VMware tools, which are not present in the reference kernel. The alerts could be suppressed by creating a reference kernel on a VM with VMware tools installed.

Let’s look at something more interesting: an example of how external memory acquisition analysis can be employed to quickly identify a virtual machine compromise by a rootkit which takes steps to conceal its presence.

The VM in this case is a CentOS 5.5 guest running on a host instrumented with a VMsafe firewall/IDS. The vIDS alerted on a series of ICMP exchanges between this VM and a neighboring VM, which is indeed atypical behavior for this machine as it has no users logged on most of the time.

Digging deeper reveals the ICMP exchange was initiated by a neighbor VM. Immediately after the ICMP exchange, the target VM opened a connection to tcp/8822 on the same neighbor which initiated the ICMP exchange. This is not a port normally used by these machines and all of this is inexplicable behavior.

So what’s sending this traffic? Interestingly, netstat output on the target VM does not show a connection from this VM to tcp/8822 on the neighbor VM:

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN 3583/portmap
tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN 3093/cupsd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 3857/sendmail: acce
tcp 0 0 0.0.0.0:829 0.0.0.0:* LISTEN 3615/rpc.statd
udp 0 0 0.0.0.0:823 0.0.0.0:* 3615/rpc.statd
udp 0 0 0.0.0.0:826 0.0.0.0:* 3615/rpc.statd
udp 0 0 0.0.0.0:68 0.0.0.0:* 6011/dhclient
udp 0 0 0.0.0.0:5353 0.0.0.0:* 3958/avahi-daemon:
udp 0 0 0.0.0.0:111 0.0.0.0:* 3583/portmap
udp 0 0 0.0.0.0:631 0.0.0.0:* 3093/cupsd
udp 0 0 0.0.0.0:39166 0.0.0.0:* 3958/avahi-daemon:
udp 0 0 :::5353 :::* 3958/avahi-daemon:


We should be able to see this network socket and its endpoint PID, unless the socket is being hidden or the network detection data is simply wrong and the traffic doesn’t exist, which would be a rather large and unusually specific malfunction. A memory dump could reveal any hidden network connections.

I made a snapshot of this VM and analyzed it with the GUI version of Second Look which reports active network socket connections found in the memory dump. Lo and behold, the output shows the vFirewall/IDS is correct and there is indeed a socket connection from local tcp/42746 to tcp/8822 on the neighbor VM:

So what’s going on? The alert output from the Second Look tool reveals a hidden kernel module:

$ secondlook-cli -m 564d4b1a-221f-4eea-4b74-70cf436f043c.vmem -a
Second Look (tm) Release 2010.11 (c) 2008-2010 Pikewerks Corporation

Mismatch of 6 bytes at 0xc0404fd9 [syscall_trace_entry+21] (kernel .text+15849)
Mismatch of 6 bytes at 0xc0404f05 [no_singlestep+12] (kernel .text+15637)
Mismatch of 6 bytes at 0xc0404e9b [sysenter_past_esp+68] (kernel .text+15531)
No reference module is available to verify loaded kernel module 'vmhgfs'
No reference module is available to verify loaded kernel module 'vmmemctl'
No reference module is available to verify loaded kernel module 'vmxnet'
No reference module is available to verify loaded kernel module 'vmci'
No reference module is available to verify loaded kernel module 'vsock'
No reference module is available to verify loaded kernel module 'vmblock'
Potential hidden module detected: vmalloc allocation at 0xf8d14000 appears to contain a module ('enyelkm') that is not in the kernel modules list
Potential hidden module detected: sysfs module list entry 0xf8d15d80 (enyelkm) does not appear in the kernel modules list
Packet type structure at 0xf8d16fc0 [enyelkm:my_pkt+0] has protocol handler function in a region of unverified kernel code or data


The GUI tool gives the same alerts on the presence of a hidden module named enyelkm:


From the alert screen you can pivot to Second Look's disassembly view:

I didn’t take a screenshot of this, but if you go to the "Vmalloc Allocations" screen in the Second Look GUI's "Information" tab, you would see the entry for EnyeLKM highlighted in red (indicating that a module was found in this region of memory that is not present in the kernel's "official" modules list). The address and size columns show the location and extent of the allocation containing the rootkit. Right-clicking on the address and selecting "Show data..." takes you to the data tab, showing a hexdump of this portion of memory. Update the "Length" box to reflect the size of the allocation, less 4096 to account for virtual padding at the end of the allocation that is not mapped to actual physical memory. Clicking "Save to file..." will export the data. This data could then be viewed with other disassemblers. For example, with x86dis:

$ x86dis -r 0 12288 -s att < enyedump.bin
00000000 83 EC 10 sub $0x10, %esp
00000003 0F 01 4C 24 0A sidt 0xA(%esp)
00000008 8B 54 24 0C movl 0xC(%esp), %edx
0000000C 81 C2 00 04 00 00 add $0x00000400, %edx
00000012 8B 02 movl (%edx), %eax
00000014 89 44 24 02 movl %eax, 0x2(%esp)
00000018 8B 42 04 movl 0x4(%edx), %eax
0000001B 0F B7 54 24 02 movzxw 0x2(%esp), %edx
00000020 89 44 24 06 movl %eax, 0x6(%esp)
00000024 0F B7 44 24 08 movzxw 0x8(%esp), %eax

Pikewerks comments: “This basically matches what you see in Second Look's disassembly view, except that we are using a different disassembly library (udis86 vs. x86dis's libdisasm) so the precise formatting of the instructions may differ slightly, and we are able to supply symbolic information that should greatly help an analyst in deciphering the code: both symbols found within the rootkit module, and kernel symbols for when the rootkit references kernel functions or data.”

So, as some readers may have already surmised, this VM has the enyelkm rootkit installed. The tcp/8822 traffic is not visible on the host because the rootkit hides it. On a well-instrumented network, these sorts of stealth techniques could actually become a detection method for rootkits, if we could collect and correlate network and host data in order to detect traffic that is visible on the network but not on the host.
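The correlation idea can be sketched as a simple set difference. This is a minimal illustration, assuming connections have already been normalized into tuples from two sources (for example network sensor logs and host netstat output). The IP addresses are hypothetical documentation addresses; the ports 42746 and 8822 are the ones from this case.

```python
def hidden_connections(network_flows, host_connections):
    """Return flows seen on the wire that the host itself does not report."""
    return sorted(set(network_flows) - set(host_connections))


# Each entry is (local_ip, local_port, remote_ip, remote_port).
# Addresses are hypothetical; the 42746 -> 8822 ports come from the case above.
seen_on_network = [
    ("192.0.2.10", 42746, "192.0.2.20", 8822),
    ("192.0.2.10", 22, "192.0.2.30", 51515),
]
reported_by_host = [
    ("192.0.2.10", 22, "192.0.2.30", 51515),
]

hidden = hidden_connections(seen_on_network, reported_by_host)
for flow in hidden:
    print("on the network but not on the host:", flow)
```

In practice the two data sets are collected at slightly different times, so short-lived connections would produce false positives; a real implementation would need some tolerance for that.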

The presence of an unexpected protocol handler, which this rootkit employs, is also flagged:

Protocol handler detects, as Pikewerks comment, are “attempts to detect network backdoors implemented deep in the kernel. This presentation has some nice diagrams and background information:

http://invisiblethings.org/papers/ITUnderground2004_Linux_kernel_backdoors.ppt

If the address of a protocol handler function is exactly that of a known kernel symbol in the kernel proper or a verified module, we consider it valid. If the address corresponds to a symbol in an unverified module, we consider it a yellow alert. If the address does not correspond to any symbol at all, we consider it a red alert condition. Not all yellow alerts on protocol handlers are indicative of rootkits. For example, VMware Server registers a protocol handler which will generate an alert when the VMware kernel modules can't themselves be verified. In this case, though, the alert is a backdoor of the EnyeLKM rootkit. It is yellow rather than red only because we found the EnyeLKM module and it contained symbols. By following the disassembly of the "capturar" function, or the source code in EnyeLKM's remoto.c, you can see that upon receiving an ICMP or TCP packet with a certain embedded key the rootkit will launch a reverse shell for the attacker.

In order to completely follow this it is important to have some understanding of the context in which the function registered in a protocol handler (or "packet type" structure, as it's called in the kernel) is invoked. Some background in the kernel networking implementation is needed here. The diagram in slide 3 of the presentation I referenced gives the big picture. In the lower left, you can see a decision being made based on the protocol.

The corresponding kernel code can be seen here, in kernel function netif_receive_skb:
http://lxr.linux.no/#linux+v2.6.18/net/core/dev.c#L1233 This uses inlined function deliver_skb, which contains the actual invocation of the ptype's function: http://lxr.linux.no/#linux+v2.6.18/net/core/dev.c#L1687 Note that the first thing passed to the function is the skb ("socket buffer").

Now right-click on the function field of the suspect protocol handler, and view disassembly. What is "capturar" doing with that skb? At a glance, you can see it's checking some field of the skb, but this is a bit hard without looking at the skb struct definition so we know what offset various fields are at (SL has this info, maybe in the future we'll add some additional markup to the disassembly).

The code to the protocol handler hook function is in file remoto.c. The capturar function is the "entry point" for the rootkit's backdoor. That's the function the kernel invokes with an skb. The control flow in Enye gets a bit complex, because when it finds a packet with matching "key", this function just sets a flag that causes another part of Enye's code to launch the shell, but with either source or disassembly one can eventually piece it all together.

One need not understand all this to be alarmed upon initially seeing this alert, however. It is enough that there is a protocol handler function in a hidden kernel module to be worried!"
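The valid/yellow/red rule that Pikewerks describes for protocol handlers can be written down in a few lines. The sketch below is only a model of the rule as quoted, not Second Look's implementation; the symbol table layout and the addresses are hypothetical (the capturar address is simply placed inside the vmalloc allocation reported at 0xf8d14000).

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Symbol:
    name: str
    module: str      # "kernel" for the kernel proper, otherwise the module name
    verified: bool   # True if the code was checked against a reference


def classify_handler(func_addr: int, symbols: Dict[int, Symbol]) -> str:
    """Classify a protocol handler function address as valid, yellow, or red."""
    symbol = symbols.get(func_addr)  # the address must match a symbol exactly
    if symbol is None:
        return "red"     # no symbol at all at this address
    if symbol.verified:
        return "valid"   # kernel proper or a verified module
    return "yellow"      # symbol found, but in an unverified module


# Hypothetical symbol table; the addresses are made up for illustration.
symbols = {
    0xC05D1234: Symbol("ip_rcv", "kernel", True),
    0xF8D14ABC: Symbol("capturar", "enyelkm", False),
}

print(classify_handler(0xC05D1234, symbols))  # handler in the verified kernel
print(classify_handler(0xF8D14ABC, symbols))  # handler in the hidden module
print(classify_handler(0xF8D20000, symbols))  # handler with no symbol at all
```

This mirrors why the EnyeLKM alert was yellow rather than red: the handler address resolved to a symbol, but the module holding it could not be verified.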

Another next step at this point, in addition to analysis of the memory dump, could be to obtain this kernel module’s file for identification and/or malware analysis. It would be convenient if we could carve this module file out of the memory dump but, as Pikewerks point out, this isn’t entirely possible:

“It is possible to identify the end address of EnyeLKM and to extract the memory containing the EnyeLKM rootkit to a file. However, this will not result in exactly the original .ko file used to load the rootkit. Besides the relocations that occur as a module is loaded, some sections of a kernel module are marked "init" and are discarded entirely after the module is loaded, so it just isn't possible to reconstruct a .ko from what's found in memory. Modules are typically loaded either by the 'modprobe' or 'insmod' programs. (Modprobe being a "smart" loader that can search and resolve dependencies, and insmod being a "dumb" loader that must explicitly be pointed at the module to load; analogous to apt-get vs dpkg, or yum vs rpm, in the realm of packages.) But at the end of the day, the way that modules are actually loaded into the kernel is via the init_module system call, and that call takes not a path but rather a buffer containing the module. So the kernel doesn't know what file the module came from... or if it came from a file at all. It is possible to write an alternate kernel module program that would receive a module over the network and load it without ever writing it out to disk. It is not necessarily the case that a module ever existed as a file on the system where it is loaded! That said, you certainly would want to search your disk for any modules you unexpectedly found loaded in your kernel. But it would likely have to be an exhaustive search; what you find in memory probably won't tell you exactly where to look.”

Searching the filesystem from a command shell on the infected VM’s operating system yields nothing as this rootkit conceals itself in several ways; it is hidden on the file system and it is not reported in the output of lsmod. What we really need is a disk image, or the ability to mount and examine a copy of a virtual disk .vmdk file so we can search this virtual disk from a different workstation. This is also fairly easy; virtual disks can be copied while a guest is running and they can be mounted using the Virtual Disk Development Kit. The VDDK uses this syntax:

Usage: vmware-mount diskPath [partition num] mountPoint
vmware-mount [option] [opt args]

There are two modes for mounting disks. If no option is specified, we mount individual partitions from virtual disks independently. The filesystem on the partition will be accessible at the mount point specified.

The -f option mounts a flat representation of a disk on a user-specified mount point. The user must explicitly unmount the disk when finished. A disk may not be in both modes at once.

diskID is an identifier of the form username@hostname:/path/to/vm for remote disks and just the path for local disks. Options that mount a remote disk also require -h -u -F and optionally -v options. The -v option is required when connecting to a Virtual Center.

Options:
-p list all partitions on a disk
-l list all mounted partitions on a disk
-L list all mounted disks
-d cleanly unmount this partition
(closes disk if it is the last partition)
-f mount a flat representation of the disk
at "mountPoint/flat."
-k unmount all partitions and close disk
-K force unmount all partitions and close disk
-x unmount all partitions and close all disks
-X force unmount all partitions and close all disks
Options for remote disks:
-v inventory path of the vm
-h hostname of remote server
-u username for remote server
-F file containing password
-P optional TCP port number (default: 902)

I should note that while these tools are useful for routine analysis and threat detection, it is unclear whether acquiring and examining a virtual disk in this way is forensically sound. I have not found an option to mount a virtual disk read-only using the VDDK, so I instead rely on combinations of file permissions and the “read-only” shared folder setting when working with read-only data using forensic analysis tools in virtual machine workstations. I don’t know that this technique is forensically sound, so I assume that it isn’t. I plan to explore what best practices and/or consensus exist on forensic acquisition of disk and memory evidence from virtual machines.
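One modest check, which does not by itself make the procedure forensically sound, is to hash the copied disk file before mounting and again after unmounting, to see whether the examination changed it. The sketch below uses Python's hashlib on a small stand-in file; the helper name is an assumption, and a virtual disk split across several files would need every file hashed.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Stand-in for a copied .vmdk; a real disk copy would be hashed the same way.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"stand-in for a copied virtual disk")
    disk_copy = tmp.name

before = sha256_of(disk_copy)
# ... mount, examine, and unmount the disk copy here ...
after = sha256_of(disk_copy)
unchanged = before == after
print("disk copy unchanged:", unchanged)

# Show that a modification would be noticed.
with open(disk_copy, "ab") as handle:
    handle.write(b"x")
modified_detected = sha256_of(disk_copy) != before
os.remove(disk_copy)
```

Recording the hash at acquisition time also gives you something to compare against later if the copy is handed to someone else for analysis.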

Using the following syntax, I mounted the copied virtual disk and verified I could see its root filesystem:

user@ubuntu:/mnt/hgfs/Setup$ sudo vmware-mount /mnt/hgfs/Setup/centos-1066.vmdk 2 /mnt/vdisk

user@ubuntu:/mnt/vdisk$ ls
bin dev home lost+found misc net proc sbin srv tmp var
boot etc lib media mnt opt root selinux sys usr


Now that we can see this disk’s filesystem without interference, we can search for and examine the enyelkm module:

user@ubuntu:/mnt/vdisk$ find /mnt/vdisk -name '*enye*'
/mnt/vdisk/etc/.enyelkmHIDE^IT.ko


user@ubuntu:/mnt/vdisk$ file /mnt/vdisk/etc/.enyelkmHIDE^IT.ko
/mnt/vdisk/etc/.enyelkmHIDE^IT.ko: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped


The HIDE^IT string is a marker used by enyelkm; it conceals files with names containing this string. This enye sample is not recognized by VirusTotal or the Norman sandbox at the time of this writing, but it can be confirmed as enye through malware analysis: enye has a hard-coded local privilege escalation, and it ships with a program called “connect” that triggers a reverse root shell to tcp/8822 or 8823 on the caller after a simple ICMP-based authentication exchange. If it were an unknown malware sample, it could be sent to malware analysis in-house, or to the SANS ISC, or another malware intelligence source.
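Since the marker is just a substring of the file name, a mounted disk copy can be swept for it without knowing the module's name in advance. The sketch below is an illustration using os.walk on a temporary directory that stands in for the mount point; on a real examination the root would be the mounted virtual disk (for example /mnt/vdisk as above).

```python
import os
import tempfile

MARKER = "HIDE^IT"  # the enyelkm hiding marker described above


def find_marked(root, marker=MARKER):
    """Return paths under `root` whose file or directory name contains the marker."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if marker in name:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


# Demo on a throwaway directory tree standing in for the mounted disk.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "etc"))
    for name in (".enyelkmHIDE^IT.ko", "passwd"):
        with open(os.path.join(root, "etc", name), "w") as handle:
            handle.write("placeholder")
    relative_hits = [os.path.relpath(path, root) for path in find_marked(root)]

print(relative_hits)
```

This only works from outside the infected guest, because inside it the rootkit filters exactly these names out of directory listings.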

It is also important to perform a serious forensic investigation in cases like this, in order to determine how the host was compromised and what, if any, actions it took while under the control of someone else.

This is largely an example of manual response and analysis, but the Second Look tool could also be used in batch mode by scheduling snapshots or memory dumps and submitting them to batch analysis in order to continuously perform kernel memory inspection and malware detection. Some of Pikewerks’ customers apparently do this now in order to monitor for intrusions and novel malware on Linux populations. An additional case study is on the web at http://pikewerks.com/_datasheets/secondlook.pdf
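A batch workflow along these lines might wrap the command-line tool and triage its output. The Python sketch below is an assumption-laden illustration, not a vendor-supplied workflow: it reuses the `secondlook-cli -m IMAGE -a` invocation shown earlier, assumes the alerts arrive on standard output, and drops the "No reference module" lines for the VMware tools modules by name. Filtering by name is a shortcut that a rootkit could abuse by borrowing one of those names; generating a proper reference kernel is the cleaner fix.

```python
import re
import subprocess

# VMware tools modules seen in the benign output earlier in this post.
BENIGN_VMWARE_MODULES = {
    "vmxnet", "vmxnet3", "pvscsi", "vmci", "vmhgfs", "vmmemctl", "vsock", "vmblock",
}
NO_REFERENCE = re.compile(
    r"^No reference module is available to verify loaded kernel module '([^']+)'$"
)


def triage(output_text):
    """Return alert lines, dropping the banner and known-benign VMware notices."""
    alerts = []
    for line in output_text.splitlines():
        line = line.strip()
        if not line or line.startswith("Second Look (tm)"):
            continue
        match = NO_REFERENCE.match(line)
        if match and match.group(1) in BENIGN_VMWARE_MODULES:
            continue
        alerts.append(line)
    return alerts


def analyze(vmem_path):
    """Run the CLI on one image (assumes alerts are written to stdout)."""
    result = subprocess.run(
        ["secondlook-cli", "-m", vmem_path, "-a"], capture_output=True, text=True
    )
    return triage(result.stdout)


# Sample lines taken from the output shown earlier in this post.
SAMPLE_OUTPUT = """\
Second Look (tm) Release 2010.11 (c) 2008-2010 Pikewerks Corporation

Mismatch of 6 bytes at 0xc0404fd9 [syscall_trace_entry+21] (kernel .text+15849)
No reference module is available to verify loaded kernel module 'vmhgfs'
No reference module is available to verify loaded kernel module 'vmci'
Potential hidden module detected: sysfs module list entry 0xf8d15d80 (enyelkm) does not appear in the kernel modules list
Packet type structure at 0xf8d16fc0 [enyelkm:my_pkt+0] has protocol handler function in a region of unverified kernel code or data
"""

alerts = triage(SAMPLE_OUTPUT)
for alert in alerts:
    print(alert)
```

A scheduler such as cron could then call `analyze` on each new snapshot file and raise a ticket whenever the returned list is not empty.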

It would be interesting to explore creating reference kernels for the hypervisor platforms in order to investigate memory-analysis-based techniques for detecting tampering with the kernel and/or hypervisors. It is not difficult to create a reference kernel, if you have either a debug kernel or kernel headers, using a script included with the Second Look tool, as the vendor describes:

“You can use the "makezrk" script to create a reference kernel (ZRK) for a system. This script is included with Second Look and can be found at /usr/bin/secondlook-makezrk.sh. You'll also need to copy the "structinfo generation" program from /usr/bin/secondlook-genstructinfo to the same location on the target system. The makezrk script uses this program to create kernel data structure metadata for the ZRK. And likewise the "dummymod" from /usr/share/secondlook/dummymod to the same location on the target. But please note that creating a reference kernel does require kernel headers to be installed so it may not be possible to analyze it unless the vendor provides the headers and/or source for the kernel they're shipping.”

