Assembly Language Programming and Shellcoding – Hello World

Hello all,

With this post, we start the actual programming. We will use what we have seen in previous posts to write a simple hello world program. Additionally, we will load the executable in GDB and see how GDB can be used to analyse such code. Let’s begin:

Important parts/keywords of assembly code:

Comments: To add a comment in assembly language, we use a semicolon (;) followed by the comment text. The first two lines of the code are examples of comments.

global: GLOBAL is a NASM directive. It exports the given symbol in the object code. The linker reads that symbol and its value from the object file and then decides where to put it in the final executable. What does this mean? Very simply:

  1. _start is marked as a GLOBAL symbol.
  2. NASM records _start and its value in the object file.
  3. The linker ld treats _start as the default entry point, i.e. the symbol from which execution of the program must start, and links the files accordingly.

_start: As we have seen above, _start decides where the execution flow starts. It plays the same role as the main() function in C/C++.

section: The section keyword is used for defining the memory segments we have seen in previous posts. In this program we use two of them: the TEXT segment (which holds the code) and the DATA segment (which holds the initialised variables).
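
To make these keywords concrete, here is a minimal skeleton. It is only a sketch: the file name is hypothetical, and the body of _start simply exits (the exit syscall is covered later in this post) so that the skeleton can be assembled and run on its own.

```nasm
; skeleton.asm - hypothetical file name, for illustration only
; Shows the four keywords: comments, global, _start and section

global _start           ; export _start so that the linker can see it

section .text           ; code goes in the TEXT segment
_start:                 ; execution begins at this label
    mov eax, 1          ; syscall number 1 = exit
    mov ebx, 0          ; exit status 0
    int 0x80            ; hand control to the kernel

section .data           ; initialised variables go here (none in this skeleton)
```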

General Code Execution

Normally, code execution includes the following steps:

  1. Initialization of the program
  2. Jumps within the program
  3. Graceful exit once everything is done.

In our program as well,

  1. Initialization starts from _start,
  2. The defined variables are fetched from the .data section,
  3. Hello world is printed on the screen,
  4. The program exits gracefully.

Now, to write this code we have to understand two syscalls: 1. Write 2. Exit

As discussed in the previous blog post, we can find these syscall numbers in the following file (the exact path depends on your kernel version and distribution):

/usr/src/linux-headers-4.13.0-36/arch/sh/include/uapi/asm/unistd_32.h

Now let’s look into these syscalls in more detail:

WRITE Syscall: As we have seen in an earlier post, the Linux Programmer's Manual (the man pages) is the best source of information. Running man 2 write gives the details of how the syscall is used.

From the man page, we need to find a couple of things:

  • Syscall number for WRITE: From unistd_32.h, we know that it is “4”.
  • File descriptor fd: This can be 0, 1 or 2 for standard input, standard output and standard error, respectively. Since we are interested in output, it will be “1”.
  • Buffer name buf: This is the name of the string we have defined in the program. In our case it is “msg”.
  • Size of buffer count: This one is tricky. We could count the length of the buffer by hand, but it is easier to let the assembler calculate it with $-msg. Here $ represents the assembler's current position. $-msg subtracts the address of msg from the current position, which, when written right after the string, is effectively the length of the buffer. Note that this is worked out at assembly time, not at run time.
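
Putting these four items together, the write call maps onto registers roughly as sketched below. On 32-bit Linux the syscall number goes in EAX, the arguments go in EBX, ECX and EDX in order, and int 0x80 invokes the kernel. The file name and the message text are assumptions made for illustration, and an exit call is included only so that the sketch terminates cleanly.

```nasm
; write_demo.asm - hypothetical file name, for illustration only
global _start

section .text
_start:
    mov eax, 4          ; syscall number for write
    mov ebx, 1          ; fd = 1 (standard output)
    mov ecx, msg        ; buf = address of the string
    mov edx, len        ; count = number of bytes to write
    int 0x80            ; invoke the kernel

    mov eax, 1          ; syscall number for exit
    mov ebx, 0          ; status 0
    int 0x80

section .data
msg db "Hello World!", 0x0a   ; the string followed by a newline
len equ $ - msg               ; current position minus start of msg = length in bytes
```

Because len is defined on the line immediately after msg, $ points just past the last byte of the string, so the subtraction gives the byte count without any manual counting.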

Exit Syscall: The man page for the exit syscall is shown below:

For a successful exit, we need the following details:

  • Syscall number for Exit syscall: From unistd_32.h, we know that it is “1”.
  • Status: Depending on the requirement, you can pass 0 (EXIT_SUCCESS) or 1 (EXIT_FAILURE). I’ll pass “0”.

As we have all the required details, we are ready to write the code.

Write the &%$#ing code already!!!!

Sure, let’s begin!!!!!

  1. First things first… Add a description of the code and your name as comments.

  2. Define _start as a global symbol.

  3. Define the .text section for adding code.

  4. Then define the _start: label.

  5. Below _start, start with register sanitization, i.e. setting the registers to 0. XORing a register with itself resets it to zero. MUL ECX multiplies EAX by ECX and stores the 64-bit result in EDX:EAX. Since ECX is already 0 at this point, this sets both EAX and EDX to 0.

  6. Since we have reset all the registers, we will move the values found above into the respective registers for the write syscall.

  7. We will repeat the same process for the graceful exit.

  8. Finally, we will define the .data section and define the buffer (msg) and its length (len).

The final code will look like below:
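
A minimal sketch of such a program, following the eight steps above, might look like this. The file name, the message text and the exact register loads are assumptions; the build commands in the header comments are one possible way to assemble and link it on an x64 host, not necessarily the exact commands used in this post.

```nasm
; hello.asm - prints a greeting and exits (file name and message are assumed)
; Author: your name here
;
; One possible way to build and run it on an x64 Linux host:
;   nasm -f elf32 -g -o hello.o hello.asm
;   ld -m elf_i386 -o hello hello.o
;   ./hello

global _start

section .text
_start:
    ; register sanitization
    xor ebx, ebx        ; EBX = 0
    xor ecx, ecx        ; ECX = 0
    mul ecx             ; EDX:EAX = EAX * ECX = 0, so EAX and EDX are 0 too

    ; write(1, msg, len)
    mov eax, 4          ; syscall number for write
    mov ebx, 1          ; fd = standard output
    mov ecx, msg        ; address of the buffer
    mov edx, len        ; number of bytes to write
    int 0x80

    ; exit(0)
    xor ebx, ebx        ; status = 0
    mov eax, 1          ; syscall number for exit
    int 0x80

section .data
msg db "Hello World!", 0x0a   ; the buffer, ending with a newline
len equ $ - msg               ; its length, computed by the assembler
```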

It’s time to assemble and link it

For assembling the code, we will use the following command.

  • -f is the option for selecting the output format. “elf32” is the Executable and Linkable Format for 32-bit systems. This is important in case you are using an x64 machine (which I’m 99.99% certain you are).
  • -g is for generating debugging information. “gdb” is the GDB-compatible symbol format.

In our case it will be

To link it into an executable file, we will use the following command.

Now the file is ready to execute. In our case, it will be like this:

It’s time for GDB!!!!

GNU Debugger (GDB) is one of the most important tools when writing any low-level program. We will see the basic usage of GDB. Using GDB is simple. Here, see for yourself 😛

Well, not really!!! Let me help you understand some important options of GDB.

First type “$ gdb -h” to get all the available options. Out of all of these, the only option we will use is “-q”, which stands for quiet start. It just suppresses the licensing info. Another option that can be helpful is “-p”, which attaches GDB to an already running process.

Now that the program is loaded, we will look at some commands inside GDB.

Inspecting loaded executable inside GDB:

  • “info” (i) command is used for extracting information of various kinds. Just type “(gdb) help info” to get every available option. Some useful options are:
  1. info registers — List of integer registers and their contents
  2. info symbol — Describe what symbol is at location ADDR
  3. info breakpoints — Status of specified breakpoints (all user-settable breakpoints if no argument)
  4. info files — Names of targets and files being debugged

You can try all of them 😀

  • “break” (b) command is used for setting a breakpoint. A breakpoint can be set on an address, a function, etc. GDB also offers conditional breakpoints, which we will be using multiple times.
  • “run” (r) command runs the loaded executable inside GDB.
  • “disassemble” (disas) command disassembles the code at a given address or function.
  • “stepi” command executes the program one machine instruction at a time.
  • “x” command is used for examining memory in various formats. We will look at it in more detail in future posts.
  • “print” (p) command prints the value of an expression, such as a register (for example “print $eax”).

So let’s use these commands step by step:

1. Load the program in GDB. I’ve shown this above, so I’m not going to screenshot it again.

2. Type the following command to list the symbols in the executable.

3. Now set a breakpoint on _start with “break _start”.

4. Now run the program with “run”.

You can observe that on running the program the breakpoint is hit. This is because execution starts at _start.

5. Now we can check the registers etc. Let’s do that with “info registers”.

As we can see, since the program has only just started and no instruction has executed yet, most register values are zero. Mind you, the exact values may differ on your machine.

6. Let’s disassemble the code. Note that the disas command can disassemble an address, a function or a register value. In our case we have disassembled the value of the EIP register (“disas $eip”), which is the address of the next instruction.

The arrow (=>) is showing next instruction to be executed.

7. Now let’s see one interesting feature of GDB called a hook. A hook binds a set of GDB commands to another command or event, so that they run automatically. The special hook “hook-stop” runs every time execution stops, for example after each “stepi”. So let’s “define hook-stop”.

Here I have defined a very simple hook. It will disassemble $eip and the next 10 instructions, then display the values of EAX, EBX, ECX and EDX respectively. On running the program, you’ll get the following output:

We can observe the step-by-step changes in the values of the registers. So after a couple of “stepi”s it will look something like below.

Update: I forgot to tell you one very important thing. By default, GDB shows disassembled code in AT&T syntax. The disassembly above is in AT&T syntax (full of $ and %). To change it to Intel syntax, use the command “set disassembly-flavor intel”.

Now if you run “disas” command you can see following:

You can see the difference, right?

8. Finally, just type “c” to continue execution. It will run the program to the end if no other breakpoint is present.

Stop this very long post already!!!!!

Yeah, yeah!! I guess I have covered the simple hello world program extensively, so it’s a good idea to stop now :D. In the next blog, I’ll cover some more basic programs in assembly. Long way to go!!!! Till then, Auf Wiedersehen!!!!!

I am guy in infosec, who knows bits and bytes of one thing and another. I like to fiddle in many things and one of them is computer.


SLAER