Getting started with Assembly language programming – End of theory (Not really :P)

Hi Everyone,

Sorry for not posting for such a long time. Some personal things + official business + procrastination (as usual) caused the delay. But now I’m back from the hiatus.

This is part 3 of the series on assembly language programming and shellcoding. If you have not read the previous two parts, you can read those here and here. As the title suggests, this will be the last post with theory. Well, that’s not entirely true. This will be the last “all theory” post. From the next post onward, we will start actual programming and I’ll add theory wherever required.

In the last post, we looked at the anatomy of Intel processors. In this post, we will take a high-level overview of CPU modes, memory models, Linux system structure and some important tools we will be using while coding. So, let’s start!!!

CPU Modes

Ever wondered what decides the maximum RAM an operating system can support on a particular CPU architecture? It is the CPU mode, the state in which the CPU is working. Different CPUs may have different modes. As a loose analogy, consider a normal Android phone: it has User mode, Safe mode and Recovery mode (maybe more).

For Intel x86 architecture, there are two important modes:

  1. Real Mode: The CPU starts in this mode when the computer is powered on or reset. Only the 16-bit parts of the 32/64-bit registers are available. With 16 bits, a single segment covers 64KB (2^16 bytes), and segment:offset addressing reaches roughly 1MB in total. There is no virtual memory or memory protection in this mode.
  2. Protected Mode: This is the mode in which a normal user works. Unlike real mode, the full 32-bit registers are available. Up to 4GB of memory is addressable (2^32 bytes = 4GB). Memory protection, multitasking, paging etc. are available. All our programming and shellcoding activities will be carried out in this mode.
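The sizes quoted for the two modes are plain powers of two. Here is a minimal Python sketch of that arithmetic, together with the way real mode builds a physical address from a 16-bit segment and a 16-bit offset (physical = segment * 16 + offset). The name `real_mode_physical` is only an illustrative helper, not something from a library.

```python
# Address-space arithmetic for the two x86 modes discussed in this post.

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# A 16-bit offset can select 2^16 distinct bytes: one 64 KiB segment.
segment_size = 2 ** 16

# A 32-bit address can select 2^32 distinct bytes: 4 GiB.
protected_mode_space = 2 ** 32


def real_mode_physical(segment: int, offset: int) -> int:
    """Real-mode translation: shift the 16-bit segment left by 4 bits
    (multiply by 16) and add the 16-bit offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must fit in 16 bits")
    return (segment << 4) + offset


if __name__ == "__main__":
    print(segment_size // KIB, "KiB per real-mode segment")
    print(protected_mode_space // GIB, "GiB in 32-bit protected mode")
    # 0000:7C00 is the address where a BIOS loads the boot sector.
    print(hex(real_mode_physical(0x0000, 0x7C00)))
```

Because the segment is shifted by only 4 bits, the highest address reachable this way is just over 1 MiB, which is why real mode is usually described as a 1MB environment made of 64KB segments.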

Memory Models

Different Memory Models

A memory model is the addressing method through which a running process views memory as a contiguous address space. This wiki page gives fairly good details of the various memory models.

On 32-bit x86, the accessible memory is 4GB. Since a 32-bit register can address that whole range directly, there is no need to juggle segments, and Linux uses the flat (linear) memory model.

What is Linux?

Linux is an open-source operating system, originally created by Linus Torvalds. It is free (mostly), popular among tech people and fairly secure. Linux comes in many versions developed by various communities, known as distros. Some of the famous distros are Ubuntu, BackTrack, Kali, RHEL, Debian etc. Honestly, the list goes on and the history behind them is interesting. So let’s dive into the common Linux architecture.

Linux Architecture:

Basic Linux Architecture
  • Hardware layer – The actual hardware devices (RAM/HDD/CPU etc).
  • Shell – An interface to the kernel that hides the complexity of the kernel’s functions from users, much like Windows CMD. It takes commands from the user and invokes the kernel’s functions, e.g. Korn shell, Bourne shell etc.
  • Kernel – The core component of the operating system. It interacts directly with the hardware and provides low-level services to the upper-layer components. Shellcode typically reaches these services through system calls.
  • Utilities/Applications – Normal programs, like notepad or ping.

x86 Linux uses the flat memory model in protected mode. It has various protection mechanisms such as access control, privileged code etc. Now let’s see what a process looks like in Linux.

Linux x86 Memory map

Any modern operating system follows something called the virtual memory model. On 32-bit Linux, the 4 GB virtual address space of each process is divided into two parts: the upper 1 GB is reserved for the kernel, while the lower 3 GB is kept for user space. It looks something like this in concept:

Memory map for Linux x86 process
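To make the 3 GB / 1 GB split concrete, here is a small Python sketch that classifies a 32-bit virtual address as user space or kernel space. It assumes the common default boundary of 0xC0000000; kernels can be built with a different split, so treat the constant (and the helper name `classify`) as illustrative only.

```python
# Assumption: the common default 3 GB / 1 GB split on 32-bit Linux,
# where kernel space starts at 0xC0000000. Other splits exist.
KERNEL_BASE = 0xC0000000
ADDRESS_SPACE_END = 2 ** 32  # one past the highest 32-bit address


def classify(addr: int) -> str:
    """Return 'user' or 'kernel' for a 32-bit virtual address."""
    if not 0 <= addr < ADDRESS_SPACE_END:
        raise ValueError("not a 32-bit address")
    return "kernel" if addr >= KERNEL_BASE else "user"


if __name__ == "__main__":
    gib = 1024 ** 3
    print("user space:  ", KERNEL_BASE // gib, "GiB")
    print("kernel space:", (ADDRESS_SPACE_END - KERNEL_BASE) // gib, "GiB")
    for addr in (0x08048000, 0xBFFFFFFF, 0xC0000000):
        print(hex(addr), "->", classify(addr))
```

Anything our shellcode touches directly will sit below that boundary; the part above it is only reachable by asking the kernel to do work on our behalf.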

We will use these segments for various purposes while writing shellcode:

  • Text (.text) segment: Also known as the code segment. This section contains the actual executable instructions.
  • Data (.data) segment: This is the segment where initialized variables live.
  • BSS (.bss) segment: This is the segment where uninitialized variables live.
  • Heap segment: This is where dynamic memory allocation takes place.
  • Stack segment: Pssst!!! We have discussed this earlier if you remember.
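On a running Linux system you can see these regions for any process in /proc/<pid>/maps, where each line has the form "start-end perms offset dev inode [pathname]". The sketch below is one possible way to parse that file in Python; `Mapping` and `parse_maps_line` are names made up for this example. Executable code usually shows up with permissions like r-xp, and the kernel labels the heap and stack regions as [heap] and [stack].

```python
import os
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int   # first address of the region
    end: int     # one past the last address of the region
    perms: str   # e.g. 'r-xp' for code, 'rw-p' for writable data
    path: str    # backing file, '[heap]', '[stack]', or '' if anonymous


def parse_maps_line(line: str) -> Mapping:
    """Parse one line of /proc/<pid>/maps:
    'start-end perms offset dev inode [pathname]'."""
    fields = line.split(maxsplit=5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(start, end, fields[1], path)


if __name__ == "__main__":
    maps_file = "/proc/self/maps"  # only exists on Linux
    if os.path.exists(maps_file):
        with open(maps_file) as fh:
            for raw in fh:
                m = parse_maps_line(raw)
                print(f"{m.start:#x}-{m.end:#x} {m.perms} {m.path}")
```

Running it prints the memory map of the Python interpreter itself. On a 64-bit machine the addresses will be much larger than the 32-bit layout described in this post, but the same kinds of regions appear.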

With this I’ll end the OS theory. Now let’s look at the tools we will be using for shellcode development.

Shellcoding Arsenal

Having looked at the underlying architecture, it’s time for the most essential tools we will be using during assembly programming exercises.

Assembler: An assembler is a piece of software that converts assembly language code into binary machine code that runs directly on the processor. We have to choose an assembler based on the platform (in our case Linux on an x86 processor). We will be using the Netwide Assembler (NASM) in future posts as it is free and easy to use.

Linker: A linker takes the object files generated by the assembler and combines them into an executable file. On Linux we will use the GNU linker, invoked with the “ld” command.

Debugger: A debugger is software used to inspect the low-level execution of an executable. We can observe as well as modify register values at run time. Honestly, writing assembly code without a debugger is close to impossible. GNU Debugger (GDB) will be our debugger of choice. We might add the “PEDA” framework for ease of use.

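Objdump: objdump is a Linux utility that displays information about object files and executables, including their disassembly. We will use it to extract shellcode bytes from an executable.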
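To give a feel for what "extracting shellcode" means, here is a rough Python sketch that pulls the opcode bytes out of disassembly text in the style printed by "objdump -d", where each instruction line is "address:<TAB>hex bytes<TAB>instruction". The `SAMPLE` text is hand-written for illustration, not real tool output, and the helper names `extract_bytes` and `to_c_string` are my own.

```python
import re

HEX_BYTE = re.compile(r"^[0-9a-fA-F]{2}$")

# Hand-written sample in the style of `objdump -d` (illustrative only).
SAMPLE = (
    "Disassembly of section .text:\n"
    "\n"
    "08048060 <_start>:\n"
    " 8048060:\t31 c0                \txor    %eax,%eax\n"
    " 8048062:\t40                   \tinc    %eax\n"
    " 8048063:\tcd 80                \tint    $0x80\n"
)


def extract_bytes(disassembly: str) -> bytes:
    """Collect the opcode bytes from every instruction line."""
    out = bytearray()
    for line in disassembly.splitlines():
        fields = line.split("\t")
        # Instruction lines look like '<addr>:\t<hex bytes>\t<mnemonic>'.
        if len(fields) < 2 or not fields[0].strip().endswith(":"):
            continue
        tokens = fields[1].split()
        if tokens and all(HEX_BYTE.match(t) for t in tokens):
            out.extend(int(t, 16) for t in tokens)
    return bytes(out)


def to_c_string(code: bytes) -> str:
    """Format bytes as the familiar \\x.. escaped shellcode string."""
    return "".join(f"\\x{b:02x}" for b in code)


if __name__ == "__main__":
    print(to_c_string(extract_bytes(SAMPLE)))  # prints \x31\xc0\x40\xcd\x80
```

Headers and label lines contain no tab-separated byte column, so they are skipped and only the instruction bytes end up in the result.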

We will look into these tools in more detail in future posts while doing the lab setup and actual code writing.

I guess that’s it for now. In the next post, we will look into the lab setup. So there will be a little less theory and more hands-on work from next time. If you have made it till the end, feel free to comment, criticize and share. Your comments are valuable: they bring me out of hibernation and help me focus on important things like studying and writing this blog. Till then, Auf Wiedersehen!!!

References:

  1. Processor Modes: http://flint.cs.yale.edu/feng/cos/resources/BIOS/procModes.htm
  2. Memory models on x86: http://www.c-jump.com/CIS77/ASM/Memory/lecture.html
  3. Modern Linux Architecture: https://cumulusnetworks.com/blog/linux-architecture/
  4. Linux Memory Management: https://elinux.org/images/b/b0/Introduction_to_Memory_Management_in_Linux.pdf
  5. NASM: https://en.wikipedia.org/wiki/Netwide_Assembler
  6. Linux Dynamic Linker: http://man7.org/linux/man-pages/man8/ld.so.8.html
  7. GDB: https://www.gnu.org/software/gdb/
  8. Object Dump: https://sourceware.org/binutils/docs/binutils/objdump.html

I am a guy in infosec who knows bits and bytes of one thing and another. I like to fiddle with many things, and one of them is computers.


SLAER
