From SMM to userland in a few bytes
10 Jan 2016
In 2014, @coreykal, @xenokovah, @jwbutterworth3 and @ssc0rnwell gave a talk entitled Extreme Privilege Escalation on Windows 8/UEFI Systems at Black Hat USA. They introduced the idea of a SMM rootkit called The Watcher (slides 57 to 63). To sum it up:
- The Watcher lives in SMM (where you can’t look for him)
- It has no built-in capability except to scan memory for a magic signature
- If it finds the signature, it treats the data immediately after the signature as code to be executed
- In this way the Watcher performs arbitrary code execution on behalf of some controller.
This idea is awesome, and I wanted to try to implement it on Linux. Actually, it was far easier than expected, thanks to QEMU and SeaBIOS.
The BIOS flash is initially modified so that malicious code is executed in SMM by the SMI handler, which scans memory for a magic signature on a regular basis.
Periodic SMI generation
In order to scan memory regularly, the SMI handler must be called regularly. There’s a mechanism that’s exactly meant for this: the periodic SMI Rate Select. It guarantees the generation of a SMI at least once every 8, 16, 32, or 64 seconds. On modern hardware, it’s straightforward to use this register to ensure that the SMI handler is called regularly. Nevertheless, it doesn’t seem to exist on the emulated chipset (Intel® 440FX) of the virtual machine.
But this isn’t the only way to call the SMI handler regularly. The APIC can be configured to redirect IRQs to SMM. In fact, the I/O APIC defines a redirection table for this purpose. This technique was already used by chpie and in SMM-Rootkits-Securecom08.pdf to implement a keylogger in 2008. It can also be used to call a SMI handler regularly.
For instance, IRQs on ata_piix occur quite frequently (about every 2 seconds) in my VM:
$ grep ata_piix /proc/interrupts
14: 4357 IO-APIC-edge ata_piix
ATA PIIX seems to be the Intel PATA/SATA controller. This looks like a great opportunity to let our SMI handler be called regularly by redirecting IRQ #14 to SMM.
The Intel® 82093AA I/O APIC Datasheet describes how to use the I/O redirection table registers. The registers are 64 bits wide. Basically, they’re just set to the interrupt vector, for instance IOREDTBL 14 = 0x000000000000003e. But one can change the Delivery Mode field to another value, for example 0b010 to deliver a SMI.
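The bit manipulation on a redirection entry can be sketched as follows. This is only an illustration, not the code of the proof of concept: the helper names are hypothetical, and the field positions (vector in bits 7:0, Delivery Mode in bits 10:8, entry n reachable through register indexes 0x10 + 2n and 0x11 + 2n) are taken from the 82093AA datasheet, which also asks for the vector field to be zero when the delivery mode is SMI.

```c
#include <stdint.h>

/* Field layout of the low 32 bits of an IOREDTBL entry (82093AA datasheet). */
#define IOREDTBL_VECTOR_MASK  0xffu                      /* bits 7:0  */
#define IOREDTBL_DELMOD_SHIFT 8                          /* bits 10:8 */
#define IOREDTBL_DELMOD_MASK  (0x7u << IOREDTBL_DELMOD_SHIFT)
#define DELMOD_FIXED          0x0u
#define DELMOD_SMI            0x2u                       /* 0b010 */

/* Hypothetical helper: register index of the low dword of entry `irq`. */
static unsigned ioredtbl_index_low(unsigned irq)
{
    return 0x10u + 2u * irq;
}

/* Hypothetical helper: replace only the Delivery Mode field of an entry. */
static uint64_t ioredtbl_set_delivery_mode(uint64_t entry, unsigned mode)
{
    entry &= ~(uint64_t)IOREDTBL_DELMOD_MASK;
    entry |= (uint64_t)(mode & 0x7u) << IOREDTBL_DELMOD_SHIFT;
    return entry;
}

/* Hypothetical helper: SMI delivery with the vector cleared, as the
 * datasheet requests for this delivery mode. */
static uint64_t ioredtbl_smi_entry(uint64_t entry)
{
    entry &= ~(uint64_t)IOREDTBL_VECTOR_MASK;
    return ioredtbl_set_delivery_mode(entry, DELMOD_SMI);
}
```

With the example entry 0x3e, changing only the Delivery Mode gives 0x23e, and additionally clearing the vector gives 0x200. Since the vector is lost in SMI mode, the handler has to remember which vector to forward.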
Once the SMI handler is called, the interrupt must be sent to the Local APIC through an IPI. Simply write to the Local APIC registers:
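#include <stdint.h>

#define LOCAL_APIC_BASE (void *)0xfee00000
#define INTERRUPT_VECTOR 0x3e

/* Re-send the redirected IRQ to the Local APIC as an IPI. */
static void forward_interrupt(void)
{
    uint32_t volatile *a, *b;
    unsigned char *p;

    p = LOCAL_APIC_BASE;
    b = (void *)(p + 0x310); /* ICR high dword: destination field */
    a = (void *)(p + 0x300); /* ICR low dword: writing it sends the IPI */
    *b = 0x00000000;
    *a = INTERRUPT_VECTOR << 0;
}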
The IRQ redirection must be created once the OS has finished configuring the APIC, otherwise it may be overwritten.
If you’re too lazy to read MSRs to get IOAPIC and Local APIC memory addresses, they’re located here:
$ grep APIC /proc/iomem
fec00000-fec003ff : IOAPIC 0
fee00000-fee00fff : Local APIC
Memory scan from the SMM handler
Since the SMM handler is called regularly, it must be as fast as possible. The memory scan is easy: just walk through all the physical pages. Even if not bullet-proof, the techniques from OS Dev.org are sufficient for a proof of concept. If one page starts with the magic signature, the payload located just after is executed.
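The page walk boils down to a comparison at the start of each page. Here is a minimal sketch that runs over an ordinary buffer instead of physical memory; the signature value, the SCAN_PAGE_SIZE constant and the find_payload name are assumptions made for the example, not the ones used by the proof of concept.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SCAN_PAGE_SIZE 4096u

/* Hypothetical signature: any value agreed on by controller and handler. */
static const uint8_t MAGIC[8] = { 'W', 'A', 'T', 'C', 'H', 'E', 'R', '!' };

/*
 * Walk `len` bytes of memory page by page. Only the first bytes of each
 * page are compared, which keeps the cost per SMI low. Returns a pointer
 * to the bytes following the signature, or NULL if no page starts with it.
 */
static const uint8_t *find_payload(const uint8_t *mem, size_t len)
{
    for (size_t off = 0; off + SCAN_PAGE_SIZE <= len; off += SCAN_PAGE_SIZE) {
        if (memcmp(mem + off, MAGIC, sizeof(MAGIC)) == 0)
            return mem + off + sizeof(MAGIC);
    }
    return NULL;
}
```

A signature that is not page-aligned is deliberately ignored, which is why the controller has to get its data placed at the very beginning of a page.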
The SMM payload must be as simple as possible since SMM handlers execute with paging disabled, interrupts disabled, etc.
Fortunately, the VDSO library is mmapped in every userland process. A few syscalls (on x64: clock_gettime and time, among others) use VDSO to be faster. One can dump the VDSO of a random process to take a look at it:
$ gdb /bin/ls
gdb$ b __libc_start_main
Breakpoint 1, __libc_start_main
gdb$ dump_binfile vdso 0x7ffff7ffa000 0x7ffff7ffc000
and notice that clock_gettime is right at the entrypoint address:
$ readelf -a vdso | grep Entry
Entry point address: 0xffffffffff700700
$ gdb vdso
gdb$ x/5i 0xffffffffff700700
0xffffffffff700700 <clock_gettime>: push rbp
0xffffffffff700701 <clock_gettime+1>: mov rbp,rsp
0xffffffffff700704 <clock_gettime+4>: push r15
0xffffffffff700706 <clock_gettime+6>: push r14
0xffffffffff700708 <clock_gettime+8>: push r13
Moreover, there are about 2800 bytes of free space to put some code at the end of the mapping:
$ hd vdso | tail -3
00001510 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
All of this makes VDSO look like a promising target. VDSO is easy to fingerprint in memory because the first bytes are the ELF header, and a few strings are always present. Since clock_gettime is the entrypoint of the library, there’s no need to implement an ELF parser in ring -2. Finally, the VDSO library is mapped on 2 consecutive physical pages. One doesn’t have to mess with the page tables of the current process to find the second page.
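The ELF header part of such a fingerprint can be sketched as below. It is written for a hosted Linux environment (it relies on <elf.h>, which would not be available in SMM), the function names are hypothetical, and the header check alone matches any 64-bit shared object: a real fingerprint would also look for the strings mentioned above.

```c
#include <elf.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the page starts like a 64-bit little-endian shared object. */
static int looks_like_elf64_dso(const uint8_t *page)
{
    Elf64_Ehdr ehdr;

    memcpy(&ehdr, page, sizeof(ehdr)); /* copy to avoid unaligned access */
    return memcmp(ehdr.e_ident, ELFMAG, SELFMAG) == 0
        && ehdr.e_ident[EI_CLASS] == ELFCLASS64
        && ehdr.e_ident[EI_DATA] == ELFDATA2LSB
        && ehdr.e_type == ET_DYN;
}

/* Offset of the entry point inside its 4 KiB page,
 * e.g. 0x700 for an entry point at 0xffffffffff700700. */
static uint64_t entry_page_offset(const uint8_t *page)
{
    Elf64_Ehdr ehdr;

    memcpy(&ehdr, page, sizeof(ehdr));
    return ehdr.e_entry & 0xfff;
}
```

Reading e_entry from the header is the only "parsing" needed when the function of interest sits at the entry point.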
Here’s the plan: once the magic is found by the SMI handler at the beginning of a physical page, the code located after it is executed. The code executed in SMM walks (again) through the physical pages to find the 2 consecutive physical pages of VDSO. Finally, the prologue of clock_gettime is hijacked to jump to a custom userland payload written at the end of VDSO.
This SMM payload, written in C and compiled with metasm, is about 750 bytes long.
Once again, this is sufficient for a proof of concept, but a fully weaponized exploit should parse the VDSO ELF carefully. One could note that once this SMM payload is executed, all present and future userland processes will be backdoored and the IRQ redirection can be safely removed.
The userland payload will be called every time a process makes a call to one of the previously mentioned system calls. Since clock_gettime() might be called frequently by different processes, we don’t want to overload the machine with a lot of Python processes. Thus, the payload checks if the file /tmp/.x exists, and in that case does nothing because it has already been executed. On the other hand, if /tmp/.x doesn’t exist, the payload forks:
- the parent restores registers before returning to the hijacked function
- the child executes a Python one-liner which creates /tmp/.x and does a TCP reverse shell to the attacker.
This userland payload, written in assembly, is 280 bytes long.
For a local attacker, it’s trivial to execute code in SMM: just mmap a page and write MAGIC+CODE at its beginning. That’s it.
For a remote attacker, there are several ways to reach one’s goal. For example, send a lot of UDP packets containing MAGIC+CODE prefixed with random padding (PAGE_SIZE-sizeof(CODE) bytes) to any UDP port. With a bit of luck, it triggers the SMM handler in a few seconds.
Waiting for the shell
Once the SMM payload is executed, the attacker only has to wait for the reverse shell. It can take a few seconds to a few minutes depending on the processes running on the machine.
An impatient attacker may want to check if everything went well by triggering
one of the VDSO syscalls. For instance, one can try to log in to an SSH server
with an invalid account. OpenSSH server calls
clock_gettime() before writing
the log entry, which triggers the execution of the userland payload by any
A non-root process may however call clock_gettime(), and the reverse shell would not have root privileges. The uid of the process may be checked in the payload, but to save a few bytes, the attacker may also remove /tmp/.x and wait for a new reverse shell.
Finally, here’s a screenshot of the backdoor. On the left, the debug log /tmp/bios.log of the virtual machine whose SMM is backdoored. On the right, a Python script which sends the payload in a loop, to be executed by the SMI handler, in the hope that it ends up at the very beginning of a page. And at the bottom, the reverse shell from a root process of the VM whose VDSO has been hijacked:
The idea of The Watcher is practical on Linux: it’s pretty straightforward to execute code in userland from SMM reliably, thanks to VDSO. The code of SeaBIOS is modified to include a malicious SMI handler, and the memory of the OS is never altered until the attacker manages to put the payload in memory. The whole payload is no larger than 1084 bytes, and can be injected through the network even if no port is open.
Nevertheless, SeaBIOS’ SMM support is basic, and I didn’t find a way to automatically install The Watcher at the boot of the machine (there should be a more elegant way than a bootkit). At present, a SMI must be issued (outb(0xXY, 0xb2)) to start The Watcher.
The code of this proof-of-concept is available on github: the-sea-watcher.
On a more encouraging note, there is a simple trick to potentially determine if a machine is infected by a SMM rootkit: just count the number of SMIs since the last reset of the machine, with MSR_SMI_COUNT (since Nehalem):
$ sudo aptitude install linux-tools-common linux-tools-generic
$ wget http://kernel.ubuntu.com/git/cking/debug-code/.git/tree/smistat/smistat.c http://kernel.ubuntu.com/git/cking/debug-code/.git/tree/smistat/Makefile
$ sudo modprobe msr
$ sudo ./smistat
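For readers who prefer not to build an external tool, the counter can also be read directly. The sketch below is not the smistat code: it assumes the msr module exposes /dev/cpu/0/msr, that MSR_SMI_COUNT lives at address 0x34, and that reading 8 bytes at that file offset returns the register. The read_msr helper is a hypothetical name.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define MSR_SMI_COUNT 0x34 /* assumed address, Nehalem and later */

/*
 * Read one 64-bit MSR through an msr-style device file such as
 * /dev/cpu/0/msr (requires root and `modprobe msr`).
 * Returns 0 on success and -1 on failure.
 */
static int read_msr(const char *path, uint32_t reg, uint64_t *value)
{
    ssize_t n = -1;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    /* The register address is used as the file offset. */
    if (lseek(fd, (off_t)reg, SEEK_SET) != (off_t)-1)
        n = read(fd, value, sizeof(*value));
    close(fd);
    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

Calling read_msr("/dev/cpu/0/msr", MSR_SMI_COUNT, &count) twice, a few seconds apart, shows whether SMIs are being generated at a steady rate.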
As pointed out by @XenoKovah, MSR_SMI_COUNT isn’t a meaningful detector.