(Side note: this blog post was written quite a while ago and then forgotten about when it was halfway done, so I’ve now finished it off.)
While on holiday, with downtime to spare in the evenings, I realised that I had not done any hardcore disassembly for quite a long time. That was a shame, since most of my knowledge of x86 ASM and architecture came from tearing down old (and sometimes new) samples and learning what they did, with all of the surprises along the way.
Lucky for me, I still had (despite being a very long way from home) all of the test samples that I have gathered over the years, backed up and ready to go. Looking through the list I had to pick from, I chose a sample called Trojan.DOS.Erase26.
Looking that name up on Google does not bring up much other than a few listings of “our AntiVirus product blocks this” or “here is where you can download this sample”. One of the great shames is that sites that used to explain what malware did are either being archived or shut down, making it really quite hard to figure out what a sample does (beyond what its name suggests).
The first thing I generally do with malware is test that it works. Lots of samples, for reasons I have never really understood, do not work at all or just hang. This one did work. This is what the payload looks like when it runs:
YAM… Yam, huh? That is strange. At the end of the YAM sled the machine hangs, and there is no way to back out other than restarting the VM. Once the VM restarts, no visible damage has been done.
Anyway, to find out what this malware is doing, I chose to load it up in IDA to see what’s up. For the first run, just the normal settings:
Selecting “No” on the 32/16-bit question: since MS-DOS runs in real mode, nearly all of its software is 16-bit.
One of the fun things about going back in time to take apart malware is that you have to figure out what format the file is in. Most small utilities (and by proxy, malware) present themselves as .COM files. These files use the CP/M style of execution, which basically goes along the lines of:
A) Read file from Disk
B) Load that file 100% into a place in RAM
C) Jump into that area in RAM
No validation is done on COM files as they are loaded. If someone were to rename a .TXT file to .COM and then proceed to execute it, they would find that their computer would most likely hang or behave in other very strange ways.
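To make the A/B/C steps above concrete, here is a minimal sketch of that loading model in Python. It is not an emulator: the function and constant names are my own, and the only DOS detail it relies on is that a COM image is placed at offset 0x100 of its segment, after the Program Segment Prefix.

```python
# Minimal model of how DOS loads a .COM file (a sketch, not an emulator).
COM_LOAD_OFFSET = 0x100      # the first 0x100 bytes of the segment hold the PSP
SEGMENT_SIZE = 0x10000       # one 64 KiB real-mode segment


def load_com_image(image: bytes) -> tuple[bytearray, int]:
    """Copy a raw .COM image into a 64 KiB segment and return (segment, entry_ip)."""
    if len(image) > SEGMENT_SIZE - COM_LOAD_OFFSET:
        raise ValueError("image does not fit in a single segment")
    segment = bytearray(SEGMENT_SIZE)
    # No header, no magic number, no checksum: the bytes are copied as-is.
    segment[COM_LOAD_OFFSET:COM_LOAD_OFFSET + len(image)] = image
    return segment, COM_LOAD_OFFSET


if __name__ == "__main__":
    # A renamed text file "loads" just as happily as real code would.
    segment, entry_ip = load_com_image(b"Hello, this is plain text")
    print(hex(entry_ip), hex(segment[entry_ip]))
```

The point of the sketch is what is missing: nothing in the loader ever looks at the contents, so the CPU simply starts decoding whatever bytes happen to be at the entry point.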
Because of this, there is no way to “discover” right off the bat that a file is a COM file, because it is just a binary blob that is run on the CPU. So IDA has a little trouble starting off:
However, since we already know that COM files are loaded into RAM and just run, it’s as simple as going to the beginning of the file in IDA and pressing C, which tells IDA to treat the bytes under the cursor as code.
With that, IDA will start running through the whole program and decode as much as it can follow.
This normally works well for nearly all executables, since they are produced by compilers, and while compilers are smart, they generally do not do things that confuse other tools. However, it became clear that this file was hand assembled.
The analysis stopped almost right away and left nearly none of the executable visible. This was because the person who wrote this file used a nice trick to both confuse people and save space in the program…
Shortly after the initial jump (that lands at loc_11) we set a few registers (with the mov’s) and then jump somewhere else. The smart part of this is the loc_14+1 part that IDA points out: the binary has two meanings.
The writer jumps to an offset of 1 into the same area, where the instruction bytes, when decoded from that offset, have a different meaning, one that is still useful to the program.
seg000:0012 loc_12:
seg000:0012 jmp short near ptr loc_1C+1
seg000:0015 ; ---------------------------------------------------------------------------
seg000:0015 add ax, 0EBFEh ; <<<<<<<<------- Entry point we talked about above
seg000:0018 cld
seg000:0019 add ah, 3Bh ; ';'
seg000:001C
seg000:001C loc_1C:
seg000:001C jmp short loc_12
seg000:001C ; ---------------------------------------------------------------------------
In this different “view” of the binary we see that we set a few registers and then clear the direction flag using CLD. After that we jump to loc_12 in the program, which shifts the program back into the original “mode” that it was in.
Once we jump back there, we are in the main run of the program. From here things are fairly simple.
The HLT opcode is a fairly deadly one if no interrupts have been set up. However, in DOS there are a few timers set up to do timer tasks, so calling this opcode only stalls the CPU until the next interrupt fires and then resumes at the next opcode (this is the delay you see when printing out YAMYAMYAMYA-).
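The listing above gives a way to sanity check the “two meanings” idea. This is only a sketch under two assumptions of mine: that the addresses in the listing are accurate, and that the `jmp short loc_12` at 0x1C uses the usual two-byte `EB rel8` encoding. The helper name is made up for illustration.

```python
JMP_SHORT_OPCODE = 0xEB   # x86 "jmp rel8"
HLT_OPCODE = 0xF4         # x86 "hlt"


def encode_jmp_short(address: int, target: int) -> bytes:
    """Encode `jmp short target` placed at `address` (hypothetical helper)."""
    # rel8 is relative to the address of the *next* instruction (address + 2).
    displacement = target - (address + 2)
    if not -128 <= displacement <= 127:
        raise ValueError("target is out of range for a short jump")
    return bytes([JMP_SHORT_OPCODE, displacement & 0xFF])


if __name__ == "__main__":
    # Addresses taken from the listing: the jump at 0x1C goes back to 0x12.
    encoded = encode_jmp_short(0x1C, 0x12)
    print(encoded.hex())
    # The second byte sits at 0x1D, i.e. loc_1C+1.
    print("byte at loc_1C+1 is HLT:", encoded[1] == HLT_OPCODE)
```

If that reading is right, the displacement byte of the backwards jump doubles as a one-byte HLT when execution enters at loc_1C+1, which would be one instance of the same bytes carrying two useful meanings.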
The PUSH and POP for cs -> ds is just a way to copy the contents of cs into ds. After that, the two MOVs set the values of dx and ax for the INT call to be run.
MS-DOS uses software interrupts as syscalls and, nicely for us, IDA has documented them in the disassembly it has provided. Here we can see two syscalls: one to print “YAM” (it points to memory at the tip of the binary; interestingly, it only points to the end of the string, as the larger string is PhuckedYAM) and one to write to the disk.
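Why pointing at the tail of the string prints only “YAM” can be sketched as follows. I am assuming here that the print call is DOS function 09h (INT 21h with AH=09h), which prints the bytes at DS:DX up to a `$` terminator; the post does not confirm which function the sample uses, and the buffer layout below, including the `$`, is made up for illustration.

```python
def dos_print_string(memory: bytes, dx: int) -> str:
    """Model INT 21h / AH=09h: return the text from DS:DX up to the '$' terminator."""
    end = memory.index(b"$", dx)   # raises ValueError if no terminator is found
    return memory[dx:end].decode("ascii")


if __name__ == "__main__":
    # Hypothetical data area; the '$' terminator is an assumption.
    data = b"PhuckedYAM$"
    yam_offset = data.index(b"YAM")
    print(dos_print_string(data, yam_offset))   # only the tail of the string
    print(dos_print_string(data, 0))            # what DX=0 would have shown
```

Under that assumption the full string is sitting in the binary the whole time, and the value loaded into dx simply decides how much of it ever reaches the screen.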
Another interesting thing to observe is that the call to write to the disk needs a little bit of setup. The parts that I am mainly interested in are where it is writing and to which disk. My suspicion as to why it is not killing the HDD of the VM is that I am running the sample from a read-only VFAT drive provided by QEMU. Since these are created at runtime, they are not affected by a program trashing the first sector. In fact, in this case, since it hangs the machine, I would not even be able to observe the file system damage if it had done any.
So I moved the image over to C: (a writeable area), ran it again, and got greeted with this:
ben@metropolis:~/Documents/yampost$ qemu-system-x86_64 -hda 31 -vnc :0 -sdl
qemu: fatal: Trying to execute code outside RAM or ROM at 0x00000000000a79d1
EAX=00000000 EBX=00000000 ECX=000f4241 EDX=000e0000
ESI=000010df EDI=0000432e EBP=0000d4a1 ESP=00000859
EIP=0000d4a1 EFL=00007246 [---Z-P-] CPL=0 II=0 A20=0 SMM=0 HLT=0
ES =4d4f 0004d4f0 0000ffff 00009300
CS =9a53 0009a530 0000ffff 00009a00
SS =1a71 0001a710 0000ffff 00009300
DS =1a71 0001a710 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 00000000 00008200
TR =0018 8000dd74 00002069 00008900
GDT= 800370a4 0000010f
IDT= 00000000 0000ffff
CR0=00000010 CR2=000a1000 CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
CCS=00000030 CCD=00000000 CCO=LOGICW
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=d08b1ffffff74eb6 404d FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
Aborted (core dumped)
Okay…
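One thing the dump does let us check is where the CPU was when QEMU gave up. In real mode a linear address is just segment * 16 + offset, so plugging in the CS and EIP values from the dump should give the address in the error message. A small sketch (the function name and the `VGA_WINDOW` constant are mine):

```python
def real_mode_linear(segment: int, offset: int) -> int:
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


VGA_WINDOW = range(0xA0000, 0xC0000)   # legacy video memory window on a PC

if __name__ == "__main__":
    # CS and EIP as printed in the QEMU register dump.
    fault = real_mode_linear(0x9A53, 0xD4A1)
    print(hex(fault))
    print("inside the legacy VGA window:", fault in VGA_WINDOW)
```

The result lines up with the 0xa79d1 in the error message. That address falls in the range normally used for video memory, which would fit with QEMU complaining that it is neither RAM nor ROM, though it does not explain how execution ended up there.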
So what actually changed on the disk?
ben@metropolis:~/Documents/yampost$ diff A B
469097c469097
< 008090d0 00 00 00 00 00 00 78 7b b1 46 4d 13 ef 00 00 00 |......x{.FM.....|
---
> 008090d0 00 00 00 00 00 00 9c 7c b1 46 4d 13 ef 00 00 00 |.......|.FM.....|
469101c469101
< 00809110 00 00 00 00 00 00 78 7b b1 46 4e 13 be 16 00 00 |......x{.FN.....|
---
> 00809110 00 00 00 00 00 00 9c 7c b1 46 4e 13 be 16 00 00 |.......|.FN.....|
469103c469103
< 00809130 00 00 00 00 00 00 78 7b b1 46 51 13 e7 24 00 00 |......x{.FQ..$..|
---
> 00809130 00 00 00 00 00 00 9c 7c b1 46 51 13 e7 24 00 00 |.......|.FQ..$..|
469105c469105
< 00809150 00 00 00 00 00 00 78 7b b1 46 56 13 c8 05 00 00 |......x{.FV.....|
---
> 00809150 00 00 00 00 00 00 9c 7c b1 46 56 13 c8 05 00 00 |.......|.FV.....|
469107c469107
< 00809170 00 00 00 00 00 00 78 7b b1 46 57 13 2c 00 00 00 |......x{.FW.,...|
---
> 00809170 00 00 00 00 00 00 9c 7c b1 46 57 13 2c 00 00 00 |.......|.FW.,...|
469109c469109
< 00809190 00 00 00 00 00 00 78 7b b1 46 58 13 c8 05 00 00 |......x{.FX.....|
---
> 00809190 00 00 00 00 00 00 9c 7c b1 46 58 13 c8 05 00 00 |.......|.FX.....|
469115c469115
< 008091f0 00 00 00 00 00 00 79 7b b1 46 e5 04 00 00 30 00 |......y{.F....0.|
---
> 008091f0 00 00 00 00 00 00 9c 7c b1 46 e5 04 00 00 30 00 |.......|.F....0.|
576405c576405
< 009ce450 00 00 00 00 00 00 77 7b b1 46 00 00 00 00 00 00 |......w{.F......|
---
> 009ce450 00 00 00 00 00 00 99 7c b1 46 00 00 00 00 00 00 |.......|.F......|
Not much, and the system still boots: each hunk changes only two or three bytes. Is this the malware being defective?
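For anyone wanting to repeat the comparison without going through hexdumps, the same check can be done directly on the raw bytes. This is a rough sketch rather than the method used for the diff above; the function names are my own, and the two buffers are just the first differing line of that diff typed back in.

```python
from typing import Iterator


def changed_offsets(before: bytes, after: bytes) -> Iterator[int]:
    """Yield every offset at which two equally sized images differ."""
    if len(before) != len(after):
        raise ValueError("images must be the same size")
    for offset, (old, new) in enumerate(zip(before, after)):
        if old != new:
            yield offset


def changed_runs(before: bytes, after: bytes) -> list[tuple[int, int]]:
    """Group differing offsets into (start, length) runs."""
    runs: list[tuple[int, int]] = []
    for offset in changed_offsets(before, after):
        if runs and runs[-1][0] + runs[-1][1] == offset:
            start, length = runs[-1]
            runs[-1] = (start, length + 1)
        else:
            runs.append((offset, 1))
    return runs


if __name__ == "__main__":
    # Stand-in buffers built from the first differing line of the diff above.
    old = bytes.fromhex("000000000000787bb1464d13ef000000")
    new = bytes.fromhex("0000000000009c7cb1464d13ef000000")
    for start, length in changed_runs(old, new):
        print(f"offset +0x{start:x}: {length} byte(s) changed")
```

On a real image you would read both files with `open(path, "rb").read()` and add the run offsets to the base address, which makes it easy to see whether anything near the start of the disk was touched at all.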
Looking back at the final stage (that the debugger can decode), it calls an int that isn’t used and that does not seem to be defined later on. Maybe this looked for an extension that a driver might have loaded?
After a very large amount of digging in GDB into what happens, it remains a mystery to me.
Here is the sample I’ve been working off. If anyone happens to know what is actually going on, do let me know!
My real suspicion at this point is that the malware has been nulled out to be less destructive.
Until next time.