How do you protect a C program against memory corruption exploits? We should try to write code with no bugs, but we also need protection against any bugs which may lurk. Put another way, I try not to crash my bike but I still wear a helmet.
Operating systems now support a variety of tricks to make life difficult for would-be attackers. But most of these hardening features need to be enabled at compile time. When I started contributing to Mosh, I made it a goal to build with full hardening on every platform, not just proactive distributions like Ubuntu. This means detecting available hardening features at compile time.
Mosh uses Autotools, so this code is naturally part of the Autoconf script. I know that Autotools has a bad reputation in some circles, and I'm not going to defend it here. But a huge number of existing projects use Autotools. They can benefit today from a drop-in hardening recipe.
I've published an example project which uses Autotools to detect and enable some binary hardening features. To the extent possible under law, I waive all copyright and related or neighboring rights to the code I wrote for this project. (There are some third-party files in the m4/ subdirectory; those are governed by the respective licenses which appear in each file.) I want this code to be widely useful, and I welcome any refinements you have.
This article explains how my auto-detection code works, with some detail about the hardening measures themselves. If you just want to add hardening to your project, you don't necessarily need to read the whole thing. At the end I talk a bit about the performance implications.
How it works
The basic idea is simple. We use AX_CHECK_{COMPILE,LINK}_FLAG from the Autoconf Archive to detect support for each feature. The syntax is
AX_CHECK_COMPILE_FLAG(flag, action-if-supported, action-if-unsupported, extra-flags)
For extra-flags we generally pass -Werror so the compiler will fail on unrecognized flags. Since the project contains both C and C++ code, we check each flag once for the C compiler and once for the C++ compiler. Also, some flags depend on others, or have multiple alternative forms. This is reflected in the nesting structure of the action-if-supported and action-if-unsupported blocks. You can see the full story in configure.ac.
We accumulate all the supported flags into HARDEN_{C,LD}FLAGS and substitute these into each Makefile.am. The hardening flags take effect even if the user overrides CFLAGS on the command line. To explicitly disable hardening, pass
./configure --disable-hardening
A useful command when testing is
grep HARDEN config.log
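The detect-and-accumulate idea in this section can be sketched in plain shell. This is not the code from configure.ac, which uses the Autoconf Archive macros. It is a minimal illustration, and the helper name check_compile_flag and the temporary file names are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: append "$1" to HARDEN_CFLAGS if the compiler accepts it.
# CC defaults to "cc". -Werror is passed so that a warning about an
# unrecognized flag becomes a failure.
HARDEN_CFLAGS=""

check_compile_flag() {
    flag=$1
    tmpdir=$(mktemp -d) || return 1
    printf 'int main(void) { return 0; }\n' > "$tmpdir/conftest.c"
    # $flag is deliberately unquoted so that a flag written as two words,
    # such as "--param ssp-buffer-size=1", is passed as two arguments.
    if ${CC:-cc} -Werror $flag -c "$tmpdir/conftest.c" \
            -o "$tmpdir/conftest.o" >/dev/null 2>&1; then
        HARDEN_CFLAGS="$HARDEN_CFLAGS $flag"
        result=0
    else
        result=1
    fi
    rm -rf "$tmpdir"
    return "$result"
}

for flag in -fno-strict-overflow -fstack-protector-all -fPIE; do
    check_compile_flag "$flag" || echo "skipping unsupported flag: $flag"
done
echo "HARDEN_CFLAGS=$HARDEN_CFLAGS"
```

The real macros also handle the C++ compiler, the linker, and the dependencies between flags, none of which this sketch attempts.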
Complications
Clang will not error out on unrecognized flags, even with -Werror. Instead it prints a message like
clang: warning: argument unused during compilation: '-foo'
and continues on blithely. I don't want these warnings to appear during the actual build, so I hacked around Clang's behavior. The script wrap-compiler-for-flag-check runs a command and errors out if the command prints a line containing "warning: argument unused". Then configure temporarily sets
CC="$srcdir/scripts/wrap-compiler-for-flag-check $CC"
while performing the flag checks.
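A wrapper along these lines could look like the following. This is a sketch of the behavior described above, not the actual wrap-compiler-for-flag-check script. The function name wrap_for_flag_check and the fake_clang stand-in are invented for illustration.

```shell
#!/bin/sh
# Run a command and fail if its output mentions an unused argument,
# even when the command itself exits successfully.
wrap_for_flag_check() {
    # Capture stdout and stderr together, remembering the exit status.
    if output=$("$@" 2>&1); then
        rc=0
    else
        rc=$?
    fi
    # Pass the captured output through so nothing is hidden from config.log.
    if [ -n "$output" ]; then
        printf '%s\n' "$output" >&2
    fi
    case $output in
        *"warning: argument unused"*) return 1 ;;
    esac
    return "$rc"
}

# Stand-in for a compiler that warns about a flag but still exits 0.
fake_clang() {
    echo "clang: warning: argument unused during compilation: '-foo'" >&2
}

wrap_for_flag_check fake_clang -foo -c test.c || echo "flag rejected"
```

As a standalone script, the body of the function would run on the script's own arguments, which is what lets configure prepend it to $CC.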
When I integrated hardening into Mosh, I discovered that Ubuntu's default hardening flags conflict with ours. For example we set -Wstack-protector, meaning "warn about any unprotected functions", and they set --param=ssp-buffer-size=4, meaning "don't protect functions with fewer than 4 bytes of buffers". Our stack-protector flags are strictly more aggressive, so I disabled Ubuntu's by adding these lines to debian/rules:
export DEB_BUILD_MAINT_OPTIONS = hardening=-stackprotector
-include /usr/share/dpkg/buildflags.mk
We did something similar for Fedora.
Yet another problem is that Debian distributes skalibs (a Mosh dependency) as a static-only library, built without -fPIC, which in turn prevents Mosh from using -fPIC. Mosh can build the relevant parts of skalibs internally, but Debian and Ubuntu don't want us doing that. The unfortunate solution is simply to reimplement the small amount of skalibs we were using on Linux.
The flags
Here are the specific protections I enabled.
- -D_FORTIFY_SOURCE=2 enables some compile-time and run-time checks on memory and string manipulation. This requires -O1 or higher. See also man 7 feature_test_macros.
- -fno-strict-overflow prevents GCC from optimizing away arithmetic overflow tests.
- -fstack-protector-all detects stack buffer overflows after they occur, using a stack canary. We also set -Wstack-protector (warn about unprotected functions) and --param ssp-buffer-size=1 (protect regardless of buffer size). (Actually, the "-all" part of -fstack-protector-all might imply ssp-buffer-size=1.)
- Attackers can use fragments of legitimate code already in memory to stitch together exploits. This is much harder if they don't know where any of that code is located. Shared libraries get random addresses by default, but your program doesn't. Even an exploit against a shared library can take advantage of that. So we build a position independent executable (PIE), with the goal that every executable page has a randomized address.
- Exploits can't overwrite read-only memory. Some areas could be marked as read-only except that the dynamic loader needs to perform relocations there. The GNU linker flag -z relro arranges to set them as read-only once the dynamic loader is done with them.
- In particular, this can protect the PLT and GOT, which are classic targets for memory corruption. But PLT entries normally get resolved on demand, which means they're writable as the program runs. We set -z now to resolve PLT entries at startup so they get RELRO protection.
In the example project I also enabled -Wall -Wextra -Werror. These aren't hardening flags and we don't need to detect support, but they're quite important for catching security problems. If you can't make your project -Wall-clean, you can at least add security-relevant checks such as -Wformat-security.
Demonstration
On x86 Linux, we can check the hardening features using Tobias Klein's checksec.sh. First, as a control, let's build with no hardening.
$ ./build.sh --disable-hardening
+ autoreconf -fi
+ ./configure --disable-hardening
...
+ make
...
$ ~/checksec.sh --file src/test
No RELRO No canary found NX enabled No PIE
The no-execute bit (NX) is mainly a kernel and CPU feature. It does not require much compiler support, and is enabled by default these days. Now we'll try full hardening:
$ ./build.sh
+ autoreconf -fi
+ ./configure
...
checking whether C compiler accepts -fno-strict-overflow... yes
checking whether C++ compiler accepts -fno-strict-overflow... yes
checking whether C compiler accepts -D_FORTIFY_SOURCE=2... yes
checking whether C++ compiler accepts -D_FORTIFY_SOURCE=2... yes
checking whether C compiler accepts -fstack-protector-all... yes
checking whether C++ compiler accepts -fstack-protector-all... yes
checking whether the linker accepts -fstack-protector-all... yes
checking whether C compiler accepts -Wstack-protector... yes
checking whether C++ compiler accepts -Wstack-protector... yes
checking whether C compiler accepts --param ssp-buffer-size=1... yes
checking whether C++ compiler accepts --param ssp-buffer-size=1... yes
checking whether C compiler accepts -fPIE... yes
checking whether C++ compiler accepts -fPIE... yes
checking whether the linker accepts -fPIE -pie... yes
checking whether the linker accepts -Wl,-z,relro... yes
checking whether the linker accepts -Wl,-z,now... yes
...
+ make
...
$ ~/checksec.sh --file src/test
Full RELRO Canary found NX enabled PIE enabled
We can dig deeper on some of these. objdump -d shows that the unhardened executable puts main at a fixed address, say 0x4006e0, while the position-independent executable specifies a small offset like 0x9e0. We can also see the stack-canary checks:
b80: sub $0x18,%rsp
b84: mov %fs:0x28,%rax
b8d: mov %rax,0x8(%rsp)
... function body ...
b94: mov 0x8(%rsp),%rax
b99: xor %fs:0x28,%rax
ba2: jne bb4 <c_fun+0x34>
... normal epilogue ...
bb4: callq 9c0 <__stack_chk_fail@plt>
The function starts by copying a "canary" value from %fs:0x28 to the stack. On return, that value had better still be there; otherwise, an attacker has clobbered our stack frame.
The canary is chosen randomly by glibc at program start. The %fs segment has a random offset in linear memory, which makes it hard for an attacker to discover the canary through an information leak. This also puts it within thread-local storage, so glibc could use a different canary value for each thread (but I'm not sure if it does).
The hardening flags adapt to any other compiler options we specify. For example, let's try a static build:
$ ./build.sh LDFLAGS=-static
+ autoreconf -fi
+ ./configure LDFLAGS=-static
...
checking whether C compiler accepts -fPIE... yes
checking whether C++ compiler accepts -fPIE... yes
checking whether the linker accepts -fPIE -pie... no
...
+ make
...
$ file src/test
src/test: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux),
statically linked, for GNU/Linux 2.6.26, not stripped
$ ~/checksec.sh --file src/test
Partial RELRO Canary found NX enabled No PIE
We can't have position independence with static linking. And checksec.sh thinks we aren't RELRO-protecting the PLT — but that's because we don't have one.
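Without checksec.sh, the RELRO and PIE results can be approximated from readelf output (GNU binutils). The sketch below is a rough equivalent under that assumption, not checksec.sh's actual code. The helper names relro_status and pie_status are made up for this example.

```shell
#!/bin/sh
# $1: output of "readelf -l" (program headers)
# $2: output of "readelf -d" (dynamic section)
relro_status() {
    case $1 in
        *GNU_RELRO*)
            # A GNU_RELRO segment plus eager binding means the GOT is covered too.
            case $2 in
                *BIND_NOW*) echo "Full RELRO" ;;
                *)          echo "Partial RELRO" ;;
            esac ;;
        *) echo "No RELRO" ;;
    esac
}

# $1: output of "readelf -h" (ELF header)
# Note: shared libraries are also of type DYN, so this only makes sense
# when the file is known to be an executable.
pie_status() {
    if printf '%s\n' "$1" | grep -q 'Type:[[:space:]]*DYN'; then
        echo "PIE enabled"
    else
        echo "No PIE"
    fi
}

# Only inspect the example binary if the tool and the file are both present.
if command -v readelf >/dev/null 2>&1 && [ -f src/test ]; then
    relro_status "$(readelf -l src/test)" "$(readelf -d src/test)"
    pie_status "$(readelf -h src/test)"
fi
```

This also shows why the static build reports Partial RELRO: the GNU_RELRO segment is present, but there is no dynamic section in which BIND_NOW could appear.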
Performance
So what's the catch? These protections can slow down your program significantly. I ran a few benchmarks for Mosh, on three test machines:
- A wimpy netbook: 1.6 GHz Atom N270, Ubuntu 12.04 i386
- A reasonable laptop: 2.1 GHz Core 2 Duo T8100, Debian sid amd64
- A beefy desktop: 3.0 GHz Phenom II X6 1075T, Debian sid amd64
In all three cases I built Mosh using GCC 4.6.3. Here's the relative slowdown, in percent.
| Protections | Netbook | Laptop | Desktop |
|---|---|---|---|
| Everything | 16.0 | 4.4 | 2.1 |
| All except PIE | 4.7 | 3.3 | 2.2 |
| All except stack protector | 11.0 | 1.0 | 1.1 |
PIE really hurts on i386 because data references use an extra register, and registers are scarce to begin with. It's much cheaper on amd64 thanks to PC-relative addressing.
There are other variables, of course. One Debian stable system with GCC 4.4 saw a 30% slowdown, with most of it coming from the stack protector. So this deserves further scrutiny, if your project is performance-critical. Mosh doesn't use very much CPU anyway, so I decided security is the dominant priority.
Comments
Does this work on the BSDs?
The Autoconf code runs without error on all of the supported Mosh platforms, including FreeBSD. I'm not sure what set of protections you get on FreeBSD.
I just tested it on FreeBSD 9.0 for amd64 and we do get the full set of hardening features. This is using GCC 4.2.1 and GNU ld 2.17.50.
I believe for Mac OS X you need to use -Wl,-pie even with gcc. If you use "gcc -v" when linking on Mac OS X you won't see -pie being passed to the linker. Here's the output of "otool -h" on an executable built with -pie in LDFLAGS on Mac OS X 10.6:
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
0xfeedfacf 16777223 3 0x80 2 11 1776 0x00000085
Compare that to one built with -Wl,-pie
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
0xfeedfacf 16777223 3 0x80 2 11 1776 0x00200085
In /usr/include/mach-o/loader.h we find this:
#define MH_PIE 0x200000 /* When this bit is set, the OS will
load the main executable at a
random address. Only used in
MH_EXECUTE filetypes. */
@Todd C. Miller: Thanks, we'll look into that!
I think you'll find that it is not just clang that gives a non-fatal warning for unrecognized options. The HP-UX C compiler does this too. What you really want is to toggle the value of ac_c_werror_flag / ac_cxx_werror_flag like autoconf's -g checks do. Rather than twiddle ac_c_werror_flag directly, I use AC_LANG_WERROR, but since there is no (exported) way to restore ac_c_werror_flag, the checks need to be at the end of the configure script. This is not a huge deal if you are careful, but it sure would be nice to be able to just toggle werror.