How do you protect a C program against memory corruption exploits? We should try to write code with no bugs, but we also need protection against any bugs which may lurk. Put another way, I try not to crash my bike but I still wear a helmet.
Operating systems now support a variety of tricks to make life difficult for would-be attackers. But most of these hardening features need to be enabled at compile time. When I started contributing to Mosh, I made it a goal to build with full hardening on every platform, not just proactive distributions like Ubuntu. This means detecting available hardening features at compile time.
Mosh uses Autotools, so this code is naturally part of the Autoconf script. I know that Autotools has a bad reputation in some circles, and I'm not going to defend it here. But a huge number of existing projects use Autotools. They can benefit today from a drop-in hardening recipe.
I've published an example project which uses Autotools to detect and enable some binary hardening features. To the extent possible under law, I waive all copyright and related or neighboring rights to the code I wrote for this project. (There are some third-party files in the m4/ subdirectory; those are governed by the respective licenses which appear in each file.) I want this code to be widely useful, and I welcome any refinements you have.
This article explains how my auto-detection code works, with some detail about the hardening measures themselves. If you just want to add hardening to your project, you don't necessarily need to read the whole thing. At the end I talk a bit about the performance implications.
How it works
The basic idea is simple. We use AX_CHECK_{COMPILE,LINK}_FLAG from the Autoconf Archive to detect support for each feature. The syntax is
AX_CHECK_COMPILE_FLAG(flag, action-if-supported, action-if-unsupported, extra-flags)
For extra-flags we generally pass -Werror so the compiler will fail on unrecognized flags. Since the project contains both C and C++ code, we check each flag once for the C compiler and once for the C++ compiler. Also, some flags depend on others, or have multiple alternative forms. This is reflected in the nesting structure of the action-if-supported and action-if-unsupported blocks. You can see the full story in configure.ac.
We accumulate all the supported flags into HARDEN_{C,LD}FLAGS and substitute these into each Makefile.am. The hardening flags take effect even if the user overrides CFLAGS on the command line. To explicitly disable hardening, pass
./configure --disable-hardening
A useful command when testing is
grep HARDEN config.log
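To make the check-and-accumulate idea concrete, here is a minimal shell sketch of roughly what each flag check boils down to. It is not the code from configure.ac, nor the Autoconf Archive macro itself. The function names check_cflag and accumulate_cflags and the conftest.c file name are made up for illustration, and the compiler is assumed to be a single command name passed as the first argument.

```shell
# Hypothetical helper: succeed if the compiler named in $1 accepts flag $2.
# This mirrors the idea behind AX_CHECK_COMPILE_FLAG with -Werror as
# extra-flags; it is not the macro's real implementation.
check_cflag() {
    _cc=$1
    _flag=$2
    _tmpdir=$(mktemp -d) || return 1
    printf 'int main(void) { return 0; }\n' > "$_tmpdir/conftest.c"
    # $_flag is left unquoted on purpose so that a two-word flag such as
    # "--param ssp-buffer-size=1" is split into separate arguments.
    "$_cc" -Werror $_flag -c "$_tmpdir/conftest.c" -o "$_tmpdir/conftest.o" \
        >/dev/null 2>&1
    _status=$?
    rm -rf "$_tmpdir"
    return "$_status"
}

# Hypothetical helper: print the subset of the given flags that the
# compiler named in $1 accepts, separated by spaces.
accumulate_cflags() {
    cc=$1
    shift
    accepted=""
    for flag in "$@"; do
        if check_cflag "$cc" "$flag"; then
            accepted="${accepted:+$accepted }$flag"
        fi
    done
    printf '%s\n' "$accepted"
}
```

Assuming gcc is on your PATH, something like HARDEN_CFLAGS=$(accumulate_cflags gcc -fno-strict-overflow -fstack-protector-all -fPIE) would collect whichever of those flags the compiler accepts. The real script also handles dependencies between flags, which this sketch leaves out.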
Complications
Clang will not error out on unrecognized flags, even with -Werror. Instead it prints a message like
clang: warning: argument unused during compilation: '-foo'
and continues on blithely. I don't want these warnings to appear during the actual build, so I hacked around Clang's behavior. The script wrap-compiler-for-flag-check runs a command and errors out if the command prints a line containing "warning: argument unused". Then configure temporarily sets
CC="$srcdir/scripts/wrap-compiler-for-flag-check $CC"
while performing the flag checks.
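The wrapper's behavior as described above can be sketched in a few lines of shell. This is a hypothetical re-creation written as a function named wrap_flag_check, not the actual script from the example project, which may differ in its details.

```shell
# Hypothetical re-creation of the idea behind wrap-compiler-for-flag-check.
# Usage: wrap_flag_check COMMAND [ARGS...]
wrap_flag_check() {
    # Run the command, capturing stdout and stderr together.
    _out=$("$@" 2>&1)
    _status=$?
    # Pass the captured output through on stderr so nothing is hidden.
    if [ -n "$_out" ]; then
        printf '%s\n' "$_out" >&2
    fi
    # Fail if the compiler reported that it ignored an argument.
    case $_out in
        *"warning: argument unused"*) return 1 ;;
    esac
    # Otherwise keep the command's own exit status.
    return "$_status"
}
```

With this in place, a check such as wrap_flag_check clang -Werror -foo -c conftest.c would fail for a flag Clang silently ignores, which is the signal the configure-time flag checks need.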
When I integrated hardening into Mosh, I discovered that Ubuntu's default hardening flags conflict with ours. For example we set -Wstack-protector, meaning "warn about any unprotected functions", and they set --param=ssp-buffer-size=4, meaning "don't protect functions with fewer than 4 bytes of buffers". Our stack-protector flags are strictly more aggressive, so I disabled Ubuntu's by adding these lines to debian/rules:
export DEB_BUILD_MAINT_OPTIONS = hardening=-stackprotector
-include /usr/share/dpkg/buildflags.mk
We did something similar for Fedora.
Yet another problem is that Debian distributes skalibs (a Mosh dependency) as a static-only library, built without -fPIC, which in turn prevents Mosh from using -fPIC. Mosh can build the relevant parts of skalibs internally, but Debian and Ubuntu don't want us doing that. The unfortunate solution is simply to reimplement the small amount of skalibs we were using on Linux.
The flags
Here are the specific protections I enabled.
-D_FORTIFY_SOURCE=2 enables some compile-time and run-time checks on memory and string manipulation. This requires -O1 or higher. See also man 7 feature_test_macros.
-fno-strict-overflow prevents GCC from optimizing away arithmetic overflow tests.
-fstack-protector-all detects stack buffer overflows after they occur, using a stack canary. We also set -Wstack-protector (warn about unprotected functions) and --param ssp-buffer-size=1 (protect regardless of buffer size). (Actually, the "-all" part of -fstack-protector-all might imply ssp-buffer-size=1.)
Attackers can use fragments of legitimate code already in memory to stitch together exploits. This is much harder if they don't know where any of that code is located. Shared libraries get random addresses by default, but your program doesn't. Even an exploit against a shared library can take advantage of that. So we build a position independent executable (PIE), with the goal that every executable page has a randomized address.
Exploits can't overwrite read-only memory. Some areas could be marked as read-only except that the dynamic loader needs to perform relocations there. The GNU linker flag -z relro arranges to set them as read-only once the dynamic loader is done with them.
In particular, this can protect the PLT and GOT, which are classic targets for memory corruption. But PLT entries normally get resolved on demand, which means they're writable as the program runs. We set -z now to resolve PLT entries at startup so they get RELRO protection.
In the example project I also enabled -Wall -Wextra -Werror. These aren't hardening flags and we don't need to detect support, but they're quite important for catching security problems. If you can't make your project -Wall-clean, you can at least add security-relevant checks such as -Wformat-security.
Demonstration
On x86 Linux, we can check the hardening features using Tobias Klein's checksec.sh. First, as a control, let's build with no hardening.
$ ./build.sh --disable-hardening
+ autoreconf -fi
+ ./configure --disable-hardening
...
+ make
...
$ ~/checksec.sh --file src/test
No RELRO No canary found NX enabled No PIE
The no-execute bit (NX) is mainly a kernel and CPU feature. It does not require much compiler support, and is enabled by default these days. Now we'll try full hardening:
$ ./build.sh
+ autoreconf -fi
+ ./configure
...
checking whether C compiler accepts -fno-strict-overflow... yes
checking whether C++ compiler accepts -fno-strict-overflow... yes
checking whether C compiler accepts -D_FORTIFY_SOURCE=2... yes
checking whether C++ compiler accepts -D_FORTIFY_SOURCE=2... yes
checking whether C compiler accepts -fstack-protector-all... yes
checking whether C++ compiler accepts -fstack-protector-all... yes
checking whether the linker accepts -fstack-protector-all... yes
checking whether C compiler accepts -Wstack-protector... yes
checking whether C++ compiler accepts -Wstack-protector... yes
checking whether C compiler accepts --param ssp-buffer-size=1... yes
checking whether C++ compiler accepts --param ssp-buffer-size=1... yes
checking whether C compiler accepts -fPIE... yes
checking whether C++ compiler accepts -fPIE... yes
checking whether the linker accepts -fPIE -pie... yes
checking whether the linker accepts -Wl,-z,relro... yes
checking whether the linker accepts -Wl,-z,now... yes
...
+ make
...
$ ~/checksec.sh --file src/test
Full RELRO Canary found NX enabled PIE enabled
We can dig deeper on some of these. objdump -d shows that the unhardened executable puts main at a fixed address, say 0x4006e0, while the position-independent executable specifies a small offset like 0x9e0. We can also see the stack-canary checks:
b80: sub $0x18,%rsp
b84: mov %fs:0x28,%rax
b8d: mov %rax,0x8(%rsp)
... function body ...
b94: mov 0x8(%rsp),%rax
b99: xor %fs:0x28,%rax
ba2: jne bb4 <c_fun+0x34>
... normal epilogue ...
bb4: callq 9c0 <__stack_chk_fail@plt>
The function starts by copying a "canary" value from %fs:0x28 to the stack. On return, that value had better still be there; otherwise, an attacker has clobbered our stack frame.
The canary is chosen randomly by glibc at program start. The %fs segment has a random offset in linear memory, which makes it hard for an attacker to discover the canary through an information leak. This also puts it within thread-local storage, so glibc could use a different canary value for each thread (but I'm not sure if it does).
The hardening flags adapt to any other compiler options we specify. For example, let's try a static build:
$ ./build.sh LDFLAGS=-static
+ autoreconf -fi
+ ./configure LDFLAGS=-static
...
checking whether C compiler accepts -fPIE... yes
checking whether C++ compiler accepts -fPIE... yes
checking whether the linker accepts -fPIE -pie... no
...
+ make
...
$ file src/test
src/test: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux),
statically linked, for GNU/Linux 2.6.26, not stripped
$ ~/checksec.sh --file src/test
Partial RELRO Canary found NX enabled No PIE
We can't have position independence with static linking. And checksec.sh thinks we aren't RELRO-protecting the PLT, but that's because we don't have one.
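The three RELRO verdicts seen above can be approximated from readelf output. The following is a rough sketch of the kind of check a tool like checksec.sh performs, not its actual code. It assumes that a GNU_RELRO program header indicates some RELRO protection and that a BIND_NOW marker in the dynamic section indicates -z now. The function name classify_relro is hypothetical, and it takes the readelf output as text so the logic can be tried without any particular binary.

```shell
# Hypothetical helper: classify RELRO from readelf output.
#   $1: text of `readelf -l FILE` (program headers)
#   $2: text of `readelf -d FILE` (dynamic section)
classify_relro() {
    case $1 in
        *GNU_RELRO*)
            # Some regions become read-only after relocation. With
            # immediate binding, the lazily resolved entries are covered too.
            case $2 in
                *BIND_NOW*) echo "Full RELRO" ;;
                *)          echo "Partial RELRO" ;;
            esac
            ;;
        *)
            echo "No RELRO"
            ;;
    esac
}
```

Assuming readelf from binutils is installed, classify_relro "$(readelf -l src/test)" "$(readelf -d src/test)" prints one of the three labels. A static binary has no dynamic section, so under these assumptions it can never report more than "Partial RELRO", which is consistent with the static build above.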
Performance
So what's the catch? These protections can slow down your program significantly. I ran a few benchmarks for Mosh, on three test machines:
A wimpy netbook: 1.6 GHz Atom N270, Ubuntu 12.04 i386
A reasonable laptop: 2.1 GHz Core 2 Duo T8100, Debian sid amd64
A beefy desktop: 3.0 GHz Phenom II X6 1075T, Debian sid amd64
In all three cases I built Mosh using GCC 4.6.3. Here's the relative slowdown, in percent.
Protections | Netbook | Laptop | Desktop |
---|---|---|---|
Everything | 16.0 | 4.4 | 2.1 |
All except PIE | 4.7 | 3.3 | 2.2 |
All except stack protector | 11.0 | 1.0 | 1.1 |
PIE really hurts on i386 because data references use an extra register, and registers are scarce to begin with. It's much cheaper on amd64 thanks to PC-relative addressing.
There are other variables, of course. One Debian stable system with GCC 4.4 saw a 30% slowdown, with most of it coming from the stack protector. So this deserves further scrutiny if your project is performance-critical. Mosh doesn't use very much CPU anyway, so I decided security is the dominant priority.