Remote debugging of real mode code with gdb

08/2011

At work, I recently had to debug an old MS-DOS application running under QEMU. QEMU exhibited strange graphical behaviour with this particular application: only the upper half of the screen was displayed correctly, and the lower half was left blank. The same program works perfectly well on a "physical" old machine. Other facts: the behaviour was exactly the same with bochs, but the program worked well with dosbox. For various reasons the program had to work with QEMU, so I could not rely on dosbox.

The other difficulty was that I did not have access to the program's sources. It was apparently built with a DOS-based ancestor of Windev, called "Hyper Screen".

I used two complementary approaches to find and fix the problem: debugging and disassembling. Debugging makes it possible to narrow the search down to a short portion of code where the problem might be located. Once that portion is located, a disassembler becomes helpful.

Special mode for GDB

For the debugging part, since I was running the program under QEMU, I had access to the debugging infrastructure QEMU offers. In particular, it implements a gdb server, meaning a gdb client can connect to it and interact with the emulated CPU.

For that purpose, QEMU may be run with the flags -S -s: -S freezes the CPU at startup, and -s makes QEMU wait for a gdb client to connect on localhost:1234 (use the target remote localhost:1234 command inside gdb).

The problem is that we want to debug real mode code. This is a very annoying mode of the Intel processor, where registers and data are 16 bits wide and 20 bits (well ... 21 sometimes) of address are available. In this mode, every reference to a physical address in memory is made through two values: a segment and an offset inside this segment. Each of them is 16 bits wide, and the resulting physical address is obtained by segment * 16 + offset.
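The address computation can be sketched in a few lines of Python. This is only an illustration: the function name linear_address and the a20_enabled parameter are made up for this example, and the "21 bits" case is modelled as the A20 address line being enabled.

```python
def linear_address(segment: int, offset: int, a20_enabled: bool = True) -> int:
    """Translate a real-mode segment:offset pair into a linear address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both fit in 16 bits")
    address = (segment << 4) + offset  # segment * 16 + offset
    if not a20_enabled:
        # Without the 21st address line, addresses wrap around at 1 MiB.
        address &= 0xFFFFF
    return address


print(hex(linear_address(0xF000, 0xE384)))                     # 0xfe384
print(hex(linear_address(0xFFFF, 0x0010)))                     # 0x100000
print(hex(linear_address(0xFFFF, 0x0010, a20_enabled=False)))  # 0x0
```

The second call shows where the 21st bit comes from: the highest segment plus a small offset already exceeds 20 bits.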

GDB and the gdb remote server implementations of QEMU (and bochs) have poor support for this mode. For instance, they do not know anything about segmented memory access and only consider linear memory addresses. That is why you constantly have to switch between these two representations of memory.

Fortunately, GDB can be scripted quite easily. That is what I did to feel more comfortable with this old piece of code. The script has been largely inspired by this. The result is very close to what the author displays in one of his posts here (in French).

For gdb, the current instruction pointer is given by the register eip. However, in real mode, you also have to consider the code segment register cs. The current instruction in real mode is located in memory at an address pointed to by cs:ip.

When you want to add a breakpoint at a given address inside the code, gdb (and the QEMU/bochs remote server) will break when the value of eip reaches this particular address. There are then two problems:

When you enable a breakpoint on an address x, your emulated code might break on every address s:x, where s is an arbitrary segment address. In practice this is very unlikely.
You must know the content of cs at the time the code is executed before you can place a breakpoint.

For example, suppose the current values of cs and ip are respectively 0xF000 and 0xE384 and that you have asked for the disassembly at cs * 16 + ip. gdb will display something like:

0xfe384       xor ax,ax
0xfe385       call 0xfe395
0xfe388       ...
...
0xfe395

Suppose now that you want to add a breakpoint just after the function returns: you will have to set the breakpoint at 0xe388 and not 0xfe388. But if you are inspecting code in memory, you cannot be sure of the offset part of the address. For instance, here you would get the very same display with cs = 0xF100 and ip = 0xD384.
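To make the ambiguity concrete, here is a small Python sketch that enumerates every segment:offset pair mapping to one linear address. The function name segment_offset_aliases is hypothetical and only serves the illustration.

```python
def segment_offset_aliases(linear: int):
    """Yield every (segment, offset) pair that maps to the given linear address."""
    # The offset must stay within 0..0xFFFF, which bounds the usable segments.
    lowest = max(0, (linear - 0xFFFF + 15) // 16)
    highest = min(0xFFFF, linear // 16)
    for segment in range(lowest, highest + 1):
        yield segment, linear - segment * 16


aliases = list(segment_offset_aliases(0xFE384))
print(len(aliases))                  # 4096
print((0xF000, 0xE384) in aliases)   # True
print((0xF100, 0xD384) in aliases)   # True
```

A linear address seen in a disassembly therefore says nothing about the offset to break on until the value of cs is known.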

Below is an extract of a debugging session using this special gdb mode.

---------------------------[ STACK ]---
0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 
---------------------------[ DS:SI ]---
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
---------------------------[ ES:DI ]---
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
----------------------------[ CPU ]----
AX: 0000 BX: 0000 CX: 0000 DX: 0633
SI: 0000 DI: 0000 SP: 0000 BP: 0000
CS: F000 DS: 0000 ES: 0000 SS: 0000

IP: C41F EIP:0000C41F
CS:IP: F000:C41F (0xFC41F)
SS:SP: 0000:0000 (0x00000)
SS:BP: 0000:0000 (0x00000)
OF <0>  DF <0>  IF <0>  TF <0>  SF <0>  ZF <0>  AF <0>  PF <0>  CF <0>
ID <0>  VIP <0> VIF <0> AC <0>  VM <0>  RF <0>  NT <0>  IOPL <0>
---------------------------[ CODE ]----
   0xfc41f:	mov    eax,cr0
   0xfc422:	and    eax,0x9fffffff
   0xfc428:	mov    cr0,eax
   0xfc42b:	cli    
   0xfc42c:	cld     
   0xfc42d:	mov    eax,0x8f
   0xfc433:	out    0x70,al
   0xfc435:	in     al,0x71
   0xfc437:	cmp    al,0x0
   0xfc439:	jne    0xfc44e

The secret lies in the hook-stop definition of the script: it is executed each time gdb regains control, for instance after a breakpoint or a single step.
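As an illustration of what such a hook has to compute, the flag line of the CPU panel above can be derived from the EFLAGS register. The following Python sketch is not taken from the gdb script; the names EFLAGS_BITS, decode_flags and format_flags are assumptions made for this example, while the bit positions are the architectural ones.

```python
# Bit positions of the status and control flags shown in the first flag line.
EFLAGS_BITS = [
    ("OF", 11), ("DF", 10), ("IF", 9), ("TF", 8), ("SF", 7),
    ("ZF", 6), ("AF", 4), ("PF", 2), ("CF", 0),
]


def decode_flags(eflags: int) -> dict:
    """Map each flag name to 0 or 1 according to the EFLAGS value."""
    return {name: (eflags >> bit) & 1 for name, bit in EFLAGS_BITS}


def format_flags(eflags: int) -> str:
    """Render the flags in the same style as the CPU panel."""
    return "  ".join(f"{name} <{value}>" for name, value in decode_flags(eflags).items())


print(format_flags(0x246))
# OF <0>  DF <0>  IF <1>  TF <0>  SF <0>  ZF <1>  AF <0>  PF <1>  CF <0>
```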

You can then play with the usual nexti command, which steps to the next instruction.

Some helpful macros have been added:

break_int : adds a breakpoint on a software interrupt vector (the way good old MS-DOS and the BIOS expose their APIs).
break_int_if_ah : adds a conditional breakpoint on a software interrupt. AH has to be equal to the given parameter. This is used to filter the service calls of an interrupt. For instance, you sometimes only want to break when function AH=0h of interrupt 10h is called (change screen mode).
stepo : this is a cabalistic macro used to 'step over' function and interrupt calls. How does it work? The opcode of the current instruction is extracted and, if it is a function or interrupt call, the address of the "next" instruction is computed, a temporary breakpoint is added on that address and the 'continue' command is called.
step_until_ret : this is used to single-step until we encounter a 'RET' instruction.
step_until_iret : this is used to single-step until we encounter an 'IRET' instruction.
step_until_int : this is used to single-step until we encounter an 'INT' instruction.
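Two of these macros rely on small computations that are easy to sketch in Python. The code below is an illustration under stated assumptions, not the code of the gdb script: ivt_entry reads a handler address from the real-mode interrupt vector table (4 bytes per vector starting at linear address 0, offset first, then segment), and step_over_offset computes where a temporary breakpoint would go. Both function names are hypothetical, and the opcode table deliberately ignores prefixes and indirect calls (opcode 0xFF).

```python
import struct
from typing import Optional, Tuple


def ivt_entry(memory: bytes, vector: int) -> Tuple[int, int]:
    """Return (segment, offset) of the handler for a software interrupt vector."""
    # Each entry is two little-endian 16-bit words: offset, then segment.
    offset, segment = struct.unpack_from("<HH", memory, vector * 4)
    return segment, offset


# Instruction lengths for the call-like opcodes handled by this sketch.
CALL_LIKE_LENGTHS = {
    0xE8: 3,  # CALL rel16
    0x9A: 5,  # CALL ptr16:16 (far call)
    0xCD: 2,  # INT imm8
}


def step_over_offset(ip: int, opcode: int) -> Optional[int]:
    """Return the offset of the instruction following a call or int.

    Returns None when the opcode is not call-like, meaning a plain
    single step is enough.
    """
    length = CALL_LIKE_LENGTHS.get(opcode)
    if length is None:
        return None
    return (ip + length) & 0xFFFF  # ip wraps around inside the segment


# Fake first KiB of memory with a made-up handler for interrupt 10h.
memory = bytearray(1024)
struct.pack_into("<HH", memory, 0x10 * 4, 0x1234, 0xC000)
segment, offset = ivt_entry(bytes(memory), 0x10)
print(f"int 10h handler: {segment:04X}:{offset:04X}")  # int 10h handler: C000:1234
print(hex(step_over_offset(0xE385, 0xE8)))             # 0xe388
```

The last line matches the earlier example: stepping over the call at offset 0xE385 means breaking at 0xE388.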

Ideas for a better debugging experience

Considering that gdb server implementations still lack support for interesting features such as hardware watchpoints and have no true support for real mode (constantly switching between segmented and linear representations is really exhausting), why not use a dedicated client that speaks the gdb "protocol"?

I could not find anything fancy in this direction, except proprietary solutions such as IDA Pro. If someone knows of something better, I would be grateful.
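To give an idea of what such a client would involve, here is a minimal Python sketch of the packet framing used by the gdb remote serial protocol: a packet is sent as $payload#checksum, where the checksum is the sum of the payload bytes modulo 256, written as two hex digits. The function names are hypothetical, and the sketch leaves out acknowledgements, escaping of special characters and the socket handling. The read_memory_packet helper shows how a real-mode-aware client could accept segment:offset directly and do the linear conversion itself.

```python
def rsp_checksum(payload: bytes) -> int:
    """Checksum of a remote protocol packet: sum of the payload bytes modulo 256."""
    return sum(payload) % 256


def rsp_frame(payload: str) -> bytes:
    """Wrap a payload as $payload#xx, ready to be written to the socket."""
    data = payload.encode("ascii")
    return b"$" + data + b"#" + f"{rsp_checksum(data):02x}".encode("ascii")


def read_memory_packet(segment: int, offset: int, length: int) -> bytes:
    """Build an 'm addr,length' packet from a real-mode segment:offset pair."""
    linear = (segment << 4) + offset
    return rsp_frame(f"m{linear:x},{length:x}")


print(rsp_frame("g"))  # b'$g#67' (read all registers)
print(rsp_frame("?"))  # b'$?#3f' (ask why the target stopped)
print(read_memory_packet(0xF000, 0xE384, 16))
```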

You can find the gdb script here. Copy it to ~/.gdbinit and let the magic happen.

Tools for disassembling

Once I knew the approximate location of the offending code, I switched to a disassembler to figure out the overall logic of this particular portion of code.

Here again, I did not find anything that helps with the analysis of asm code as much as IDA Pro does. I recently stumbled upon metasm, but did not have time to test it in depth.

And the problem was ...

That the VGABios implementation used by QEMU and bochs was missing some obscure functions used in a very uncommon screen mode (640x350): the number of lines available for text was computed incorrectly if you selected a non-standard font first ...

It has now been fixed.