Debugger features such as breakpoints go well beyond the normal ways in which programs interact. How can one process force another process to pause at a particular line of code? We can bet that the debugger is using some special features of the CPU and/or operating system.
As I learned about implementing breakpoints, I was surprised to find that these interfaces, while arcane, are not actually very complex. In this article we'll implement a minimal but working breakpoints library in about 75 lines of C code. All of the code is shown here, and is also available on GitHub.
True, most of us will never write a debugger. But breakpoints are useful in many other tools: profilers, compatibility layers, security scanners, etc. If your program has to interact deeply with another process, without modifying its source code, breakpoints might just do the trick. And I think there's value in knowing what your debugger is really doing.
Our code will compile with gcc and will run on Linux machines with 32- or 64-bit x86 processors. I'll assume you're familiar with the basics of UNIX programming in C. We'll make heavy use of the ptrace
system call, which I'll try to explain as we go. If you're looking for other reading about ptrace
, I highly recommend this article and of course the manpage.
The API
Our little breakpoint library is named breakfast
. Here's breakfast.h
:
#ifndef _BREAKFAST_H
#define _BREAKFAST_H
#include <sys/types.h> /* for pid_t */
typedef void *target_addr_t;
struct breakpoint;
void breakfast_attach(pid_t pid);
target_addr_t breakfast_getip(pid_t pid);
struct breakpoint *breakfast_break(pid_t pid, target_addr_t addr);
int breakfast_run(pid_t pid, struct breakpoint *bp);
#endif
Before we dive into the implementation, we'll describe the API our library provides. breakfast
is used by one process (the tracing process) to place breakpoints inside the code of another process (the target process). To establish this relationship, the tracing process calls breakfast_attach
with the target's process ID. This also suspends execution of the target.
We use breakfast_getip
to read the target's instruction pointer, which holds the address of the next instruction it will execute. Note that this is a pointer to machine code within the target's address space. Dereferencing the pointer within our tracing process would make little sense. The type alias target_addr_t
reminds us of this fact.
We use breakfast_break
to create a breakpoint at a specified address in the target's machine code. This function returns a pointer to a breakpoint
structure, which has unspecified contents.
Calling breakfast_run
lets the target run until it hits a breakpoint or it exits; breakfast_run
returns 1
or 0
respectively. On the first invocation, we'll pass NULL
for the argument bp
. On subsequent calls, we're resuming from a breakpoint, and we must pass the corresponding breakpoint
struct pointer. The breakfast_getip
function will tell us the breakpoint's address, but it's our responsibility to map this to the correct breakpoint
structure.
The trap instruction
We will implement a breakpoint by inserting a new CPU instruction into the target's memory at the breakpoint address. This instruction should suspend execution of the target and give control back to the OS.
There are many ways to return control to the OS, but we'd like to minimize disruption to the code we're hot-patching. x86 provides an instruction for exactly this purpose. It's named int3
and is encoded as the single byte 0xCC
. When the CPU executes int3
, it will stop what it's doing and jump to the interrupt 3 service routine, which is a piece of code chosen by the OS. On Linux, this routine will send the signal SIGTRAP
to the current process.
Since we're placing int3
in the target's code, the target will receive a SIGTRAP
. Under normal circumstances this would invoke the target's SIGTRAP
handler, which usually kills the process. Instead we'd like the tracing process to intercept this signal and interpret it as the target hitting a breakpoint. We'll accomplish this through the magic of ptrace
.
Let's start off breakfast.c
with some includes:
#include <sys/ptrace.h>
#include <sys/reg.h>
#include <sys/wait.h>
#include <stdlib.h>
#include "breakfast.h"
Then we'll define our trap instruction:
#if defined(__i386)
#define REGISTER_IP EIP
#define TRAP_LEN 1
#define TRAP_INST 0xCC
#define TRAP_MASK 0xFFFFFF00
#elif defined(__x86_64)
#define REGISTER_IP RIP
#define TRAP_LEN 1
#define TRAP_INST 0xCC
#define TRAP_MASK 0xFFFFFFFFFFFFFF00
#else
#error Unsupported architecture
#endif
We use some GCC pre-defined macros to discover the machine architecture, and we define a few constants accordingly. REGISTER_IP
expands to a constant from sys/reg.h
which identifies the machine register holding the instruction pointer.
The trap instruction is stored as an integer TRAP_INST
and its length in bytes is TRAP_LEN
. These are the same on 32- and 64-bit x86. The trap instruction is a single byte, but we'll read and write the target's memory in increments of one machine word, meaning 32 or 64 bits. So we'll read 4 or 8 bytes of machine code, clear out the first byte with TRAP_MASK
, and substitute 0xCC
. Since x86 is a little-endian architecture, the first byte in memory is the least-significant byte of an integer machine word.
Invoking ptrace
All of the various ptrace
requests are made through a single system call named ptrace
. The first argument specifies the type of request and the meaning of any remaining arguments. The second argument is almost always the process ID of the target.
Before we can mess with the target process, we need to attach to it:
void breakfast_attach(pid_t pid) {
int status;
ptrace(PTRACE_ATTACH, pid);
waitpid(pid, &status, 0);
ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_TRACEEXIT);
}
The PTRACE_ATTACH
request will stop the target with SIGSTOP
. We wait for the target to receive this signal. We also request to receive a notification when the target exits, which I'll explain below.
Note that ptrace
and waitpid
could fail in various ways. In a real application you'd want to check the return value and/or errno
. For brevity I've omitted those checks in this article.
We use another ptrace
request to get the value of the target's instruction pointer:
target_addr_t breakfast_getip(pid_t pid) {
long v = ptrace(PTRACE_PEEKUSER, pid, sizeof(long)*REGISTER_IP);
return (target_addr_t) (v - TRAP_LEN);
}
Since the target process is suspended, it's not running on any CPU, and its instruction pointer is not stored in a real CPU register. Instead, it's saved to a "user area" in kernel memory. We use the PTRACE_PEEKUSER
request to read a machine word from this area at a specified byte offset. The constants in sys/regs.h
give the order in which registers appear, so we simply multiply by sizeof(long)
.
After we hit a breakpoint, the saved IP points to the instruction after the trap instruction. When we resume execution, we'll go back and execute the original instruction that we overwrote with the trap. So we subtract TRAP_LEN
to give the true address of the next instruction.
Creating breakpoints
We need to remember two things about a breakpoint: the address of the code we replaced, and the code which originally lived there.
struct breakpoint {
target_addr_t addr;
long orig_code;
};
To enable a breakpoint, we save the original code and insert our trap instruction:
static void enable(pid_t pid, struct breakpoint *bp) {
long orig = ptrace(PTRACE_PEEKTEXT, pid, bp->addr);
ptrace(PTRACE_POKETEXT, pid, bp->addr, (orig & TRAP_MASK) | TRAP_INST);
bp->orig_code = orig;
}
The PTRACE_PEEKTEXT
request reads a machine word from the target's code address space, which is named "text" for historical reasons. PTRACE_POKETEXT
writes to that space. On x86 Linux there's actually no difference between code and data spaces, so PTRACE_PEEKDATA
and PTRACE_POKEDATA
would work just as well.
Creating a breakpoint is straightforward:
struct breakpoint *breakfast_break(pid_t pid, target_addr_t addr) {
struct breakpoint *bp = malloc(sizeof(*bp));
bp->addr = addr;
enable(pid, bp);
return bp;
}
To disable a breakpoint we just write back the saved word:
static void disable(pid_t pid, struct breakpoint *bp) {
ptrace(PTRACE_POKETEXT, pid, bp->addr, bp->orig_code);
}
Running and stepping
Once we attach to the target, its execution stops. Here's how to resume it:
static int run(pid_t pid, int cmd); /* defined momentarily */
int breakfast_run(pid_t pid, struct breakpoint *bp) {
if (bp) {
ptrace(PTRACE_POKEUSER, pid, sizeof(long)*REGISTER_IP, bp->addr);
disable(pid, bp);
if (!run(pid, PTRACE_SINGLESTEP))
return 0;
enable(pid, bp);
}
return run(pid, PTRACE_CONT);
}
We ask ptrace
to continue execution — but if we're resuming from a breakpoint, we have to do some cleanup first. We rewind the instruction pointer so that the next instruction to execute is at the breakpoint. Then we disable the breakpoint and make the target execute just one instruction. Once we're past the breakpoint, we can re-enable it for next time. run
returns 0
if the target exits, which theoretically could happen during our single-step.
The last piece of the puzzle is run
:
static int run(pid_t pid, int cmd) {
int status, last_sig = 0, event;
while (1) {
ptrace(cmd, pid, 0, last_sig);
waitpid(pid, &status, 0);
if (WIFEXITED(status))
return 0;
if (WIFSTOPPED(status)) {
last_sig = WSTOPSIG(status);
if (last_sig == SIGTRAP) {
event = (status >> 16) & 0xffff;
return (event == PTRACE_EVENT_EXIT) ? 0 : 1;
}
}
}
}
cmd
is either PTRACE_CONT
or PTRACE_SINGLESTEP
. In the latter case, the OS will set a control bit to make the CPU raise interrupt 3 after one instruction has completed.
ptrace
will let the target run until it exits or receives a signal. We use the wait
macros to see what happened. If the target exited, we simply return 0
. If it received a signal, we check which one. Anything other than SIGTRAP
should be delivered through to the target, which we accomplish by passing the signal number to the next ptrace
call.
In the case of SIGTRAP
, we check bits 16-31 of status
for the value PTRACE_EVENT_EXIT
, which indicates that the target is about to exit. Recall that we requested this notification by setting the option PTRACE_O_TRACEEXIT
. You'd think (at least, I did) that checking WIFEXITED
should be enough. But I ran into a problem where delivering a fatal signal to the target would make the tracing process loop forever. I fixed it by adding this extra check. I'm certainly no ptrace
expert and I'd be interested to hear more about what's going on here.
Testing it
That's all for breakfast.c
! Now let's write a small test.c
.
#include <sys/types.h>
#include <signal.h>
#include <unistd.h>
#include <stdio.h>
#include "breakfast.h"
void child();
void parent(pid_t);
int main() {
pid_t pid;
if ((pid = fork()) == 0)
child();
else
parent(pid);
return 0;
}
We fork two separate processes, a parent and a child. The child will do some computation we hope to observe. Let's make it compute the famous factorial function:
int fact(int n) {
if (n <= 1)
return 1;
return n * fact(n-1);
}
void child() {
kill(getpid(), SIGSTOP);
printf("fact(5) = %d\n", fact(5));
}
The parent is going to invoke PTRACE_ATTACH
and send the child SIGSTOP
, but the child might finish executing before the parent gets a chance to call ptrace
. So we make the child preemptively stop itself. This isn't a problem when attaching to a long-running process.
Actually, for this fork-and-trace-child pattern, we should be using PTRACE_TRACEME
. But we implemented a minimal library, so we work with what we have.
The parent uses breakfast
to place breakpoints in the child:
void parent(pid_t pid) {
struct breakpoint *fact_break, *last_break = NULL;
void *fact_ip = fact, *last_ip;
breakfast_attach(pid);
fact_break = breakfast_break(pid, fact_ip);
while (breakfast_run(pid, last_break)) {
last_ip = breakfast_getip(pid);
if (last_ip == fact_ip) {
printf("Break at fact()\n");
last_break = fact_break;
} else {
printf("Unknown trap at %p\n", last_ip);
last_break = NULL;
}
}
}
In principle we can use breakfast
to trace any running process we own, even if we don't have its source code. But we still need a way to find interesting breakpoint addresses. Here it's as easy as we could want: the child forked from our process, so we get the address of its fact
function just by taking the address of our fact
function.
Let's try it out:
$ gcc -Wall -o test test.c breakfast.c
$ ./test
Break at fact()
Break at fact()
Break at fact()
Break at fact()
Break at fact()
fact(5) = 120
We can see the five calls to fact
.
Inspecting the target
The ability to count function calls is already useful for performance profiling. But we'll usually want to get more information out of the stopped target. Let's see if we can read the arguments passed to fact
. This part will be specific to 64-bit x86, though the idea generalizes.
Each architecture defines a C calling convention, which specifies how function arguments are passed, often using a combination of registers and stack slots. On 64-bit x86, the first argument is passed in the RDI
register. You can verify this by running objdump -d test
and looking at the disassembled code for fact
.
So we'll modify test.c
to read this register:
#include <sys/ptrace.h>
#include <sys/reg.h>
...
if (last_ip == fact_ip) {
int arg = ptrace(PTRACE_PEEKUSER, pid, sizeof(long)*RDI);
printf("Break at fact(%d)\n", arg);
...
And we get:
$ gcc -Wall -o test test.c breakfast.c
$ ./test
Break at fact(5)
Break at fact(4)
Break at fact(3)
Break at fact(2)
Break at fact(1)
fact(5) = 120
Cool!
Notes
This is my first non-trivial project using ptrace
, and it's likely I got some details wrong. Please reply with corrections or advice and I'll update the code if necessary.
This code is available for download at GitHub. It's released under a BSD license, which imposes few restrictions. It's definitely not suitable as-is for production use, but feel free to take the code and build something more with it.
The x86 architecture has dedicated registers for setting breakpoints, subject to various limitations. We ignored this feature in favor of the more flexible software breakpoints. Hardware breakpoints can do some things this technique can't, such as breaking on reads from a particular memory address.
Nice post.
ReplyDeleteOne subtlety: It seems that replacing a long's worth in disable() means that bad things could happen when two breakpoints are within sizeof(long)-1 of each other. To avoid this I think it suffices for disable() to replace only the single byte occupied by the int3 instruction.
still don't know what happens in:
ReplyDeleteptrace(PTRACE_POKETEXT, pid, bp->addr, (orig & TRAP_MASK) | TRAP_INST);
overwrite this single assemble code ? what happens?
record pc, jump to int3, and then jump back?
Sophia,
ReplyDeleteThat is replacing the instruction at the breakpoint address with the trap machine instruction int3.
ptrace(PTRACE_POKETEXT, /* write into data area */
pid, /* process identifier */
bp->addr, /* address of the breakpoint */
(orig & TRAP_MASK) | TRAP_INST); /* see below */
Here, the variable orig contains the original instruction data at our desired breakpoint address. For x86, this is 4 bytes. Imagine orig contains the value 0xDEADBEEF (some arbitrary machine instruction), then orig & TRAP_MASK will evaluate to 0xDEADBE00. 0xDEADBE00 | TRAP_INST is the same as 0XDEADBE00 | 0xCC since TRAP_INST is defined as 0xCC (the int3 machine instruction). This expression will evaluate to 0xDEADBECC.
Because of an oddity with x86 (it's little endian), the bytes are actually reversed when they're written back. So the value 0xCCBEADDE will be written to memory at our breakpoint address. The end result is that the x86 instruction int3, our "break" instruction, is now located at the address we wanted to break at: bp->addr.
We are situated in Rugby, North Dakota, not a long way from Minot International Airport. We offer conveyance anyplace in the United States. You can likewise get your furbaby from our cattery or airport..our favored decision. During the time we will have an assortment of excellent Standard Munchkin Kitten for sale accessible including silvers.
ReplyDeletehttps://munchkinskitty.company.com
Common tags: munchkin cat breeder, munchkin cat breeders near me, munchkin cats for sale near me, munchkin cat adoption, munchkin cat price, munchkin kitty for sale, buy munchkin kitty, buy munchkin kitty online, buy munchkin kitty for charismas, best kitty for charismas gifts, munchkin kitty, munchkin cat for sale
munchkin cat breeder
munchkin cat breeders near me
munchkin cats for sale near me
munchkin cat adoption
munchkin cat for sale
munchkin kittens for sale
munchkin cats for sale
standard munchkin kittens for sale
munchkin cat price
munchkin breeders
munchkin cats kittens
munchkin kitty for sale
munchkin cat breeder
munchkin cat for sale ohio
https://munchkinskitty.company.com
munchkin cat near me
munchkin cat forsale
munchkin cat for sale
This is my blog. Click here.
ReplyDeleteรวมเทคนิคเล่นสล็อตออนไลน์
ReplyDeleteان الخزانات من اكثر الاشياء التي يجب الاهتمام بها ومتابعتها والحرص علي نظافتها ,لذلك فان عزل الخزانات يساعد بشكل كبير في التخلص من مسببات تلف الخزانات وعدم تأثر المياه بدرجة الحرارة واشعة الشمس. حيث تحافظ علي صحة وسلامة السكان من التلوث الذي يحدث نتيجة الصدأ او البكتيريا المتكونة بسبب عدم غسل الخزانات بإستمرار , لذلك يجب عليك عزل الخزانات حفاظا علي صحتك وصحة افراد عائلتك , وشركتنا تتخصص في عزل الخزانات بأفضل واحدث الطرق والتي نضمن من خلالها امان تام للمياه ,وحيث نستخدم مواد آمنة جدا بحيث لا تتأثر بوجود المياه حولها.
شركة عزل خزانات بتبوك
انتشار الحشرات بالمنزل يتسبب لنا عميلنا العزيز في الكثير من الأضرار؛ لذلك يكون التخلص منها أمر ضرروي للغاية، ويجب السعي له بكل الطرق، ونحاول استخدام كل الطرق والمبيدات الحشرية للتخلص من الحشرات المنتشرة داخل المنزل، ولكن في الكثير لا نصل لنتائج مرضية لنا، لذلك تقدم شركة مكافحة حشرات بتبوك افضل الخدمات باقل الاسعار
شركة مكافحة حشرات بتبوك
يعاني كثير من السيدات في أثناء تنظيف المنزل من ضيق الوقت وعدم كفايته للانتهاء من تنظيف المنزل بسرعة وبشكل كامل، ويبحثن عن طرق للقيام بالمهام بسرعة أكبر للتغلب على هذه المشكلة
شركة منزل الشموخ لتقديم خدمات التنظيف بارخص الاسعار وبعاملة مدربة، فنحن افضل شركة من خلال تخفيض الأسعار وليس تخفيض الاسعار فقط ولكن الجوده والخامه والحصول علي افضل نتائج حتي نبني سمعة طيبة عن شركتنا من خلال عملنا فقط لأننا رواد في مجالنا
شركة تنظيف بتبوك
ต้อนรับเดือนใหม่กันเเบบสวยๆ ขอเชิญเกมเมอร์พบกับเกมอัพเดทใหม่ล่าสุด เกมออกใหม่ เดือนมีนาคม 2022 ดูเพิ่มเติม เกมมากมายหลากหลายเเนวที่เหล่าเกมเมอร์ไม่ควรพลาด เกมไหนน่าโดน เกมไหนน่าเล่น ตามเรามาดูเลย รับรองว่าถูกใจคอเกมอย่างเเน่นอน เเก่เบื่อได้ดี กักตัวเเบบไม่มีเซ็งอย่างเเน่นอน.
ReplyDeleteGutt Websäit : Zonahobisaya
ReplyDeleteGutt Websäit : Terdalam
Gutt Websäit : Zonahobisaya
Gutt Websäit : Zonahobisaya
Gutt Websäit : Logo
Gutt Websäit : Zonahobisaya
Gutt Websäit : lambang
Gutt Websäit : Zonahobisaya
총판출장샵
ReplyDelete총판출장샵
고고출장샵
심심출장샵
서울출장샵
서울출장샵
홍천출장샵
서울출장샵
Pg Slot เครดิตฟรี หรือ มีกิจกรรมที่แจกเครดิตฟรี PG SLOT มีโปรโมชั่นและก็ยังมีแอดมินทำงานที่จะรอเอาใจใส่ดูแลผู้เล่นตลอด 24ชั่วโมง ฝากไม่มีอย่างต่ำที่เว็บของเรา Pg-Slot.Game
ReplyDeleteแพนด้า 555 เป็นสล็อตเว็บไซต์ตรงแท้ๆที่ผมขอบอกเลยจ้าครับผมว่า ไม่มีเว็บไซต์ตรงไหนแจกเครดิตฟรีให้ท่านได้เท่า แพนด้า อีกแล้วนะครับ pgslot ถูกใจเครดิตฟรีจุกๆอย่างไม่ต้องสงสัย
ReplyDeletepg soft เครดิต ฟรี ปลดล็อกความตื่นเต้นของเกมออนไลน์กับ PG Soft เครดิตฟรี เรียนรู้วิธีเพิ่มประสิทธิภาพการเล่นเกม PG SLOT รับเครดิตฟรี และดำดิ่งสู่โลกแห่งความบันเทิงอันน่าตื่นเต้น
ReplyDeleteslot pg new game ของนักพนันตัวจริง เกมสุดมัน ตื่นเต้น ชนะง่าย PG SLOT เกมสล็อตแตกง่าย ตฝ่าด่าน ไปกับ PG สล็อต เอาอกเอาใจเกมเมอร์ ทั่วทุกมุมโลก เกมสล็อตออนไลน์ ที่เล่นแล้ว
ReplyDelete단밤콜걸
ReplyDelete콜걸
서울콜걸
부산콜걸
인천콜걸
광주콜걸
세종콜걸
울산콜걸
ما هي شركة سيتي جروب العربية السعودية!
ReplyDelete
ReplyDeleteGreat site, continue the good work!
Keep up the wonderful works guys I've included you guys to my own blogroll.
ReplyDelete