Lately I’ve been reading the book Programming from the Ground Up by Jonathan Barlett. It teaches x86 assembly language programming from the very grounding blocks. Applying the concepts learnt in the book, I’ll show you how to write a factorial function in x86_64 Assembly.
There are many assembly languages: Intel, AT&T, NASM, etc,… In this tutorial, I’m going to use the AT&T syntax, which can be easily assembled in any linux machine through the GNU Assembler (GAS).
Let’s start by creating a file named
main.s (’s’ is the standard extension for assembly programs).
Every assembly program is composed by three sections: data, bss and text. The data section is used to initialize constants. Those constants are preallocated during the program initialization. The bss section is used to declare buffers, or dynamically allocated data. Finally, the text section is used to keep the actual code. None of them are mandatory, but it’s a good pratice to define them explicitally.
The “.” indicates that the instruction is a directive to the assembler (just like # in C is a directive to the compiler).
In order to execute your program, the operating system needs to know where’s the start of your program (like a “main” function). In Assembly, this is accomplished through the
1 2 3
A label is just a placeholder for a memory position. On Von Neumann architecture (which is the architecture used by most of modern computers), code is actually kept on memory along with the data that it manipulates. For the processor, there’s absolutely no distinction into code and data. In the above example, the label would exercise the function of pointing to a ‘code line’ (This is a oversimplification).
You must have noticed the use of
globl directive. It’s used to make a symbol visible to the linker. We’ll talk about it later, just be informed that it’s mandatory to the _start label.
Alright! Now let’s make our program just exit in order to obtain a basic executable program.
1 2 3 4 5 6
rbx are both general purporse registers. The
int instruction is a interruption. When we use the 0x80 code, we are saying that this interruption must be handled by the linux operating system. In another words, we are performing a system call. The code of the system call is stored in the
rax register. The
1 is the code for the exit system call. The exit system call takes one parameter, the return value, which is stored in the
rbx register. In this case, we are returning the value
0. By the way, the
$ is used to indicate constant values. If we omit it, it would be interpreted as a memory address instead.
In C, the code we just typed would be equivalent to:
1 2 3 4
You can turn it into a executable code by calling:
1 2 3
But a program like that is of no use, right? So let’s create our factorial function. Functions in assembly are a kind of a headache, because you need to explicitally manipulate the stack.
Stack is a contiguous region of memory reserved for the program by the operating system.
There’s a special register called
rsp (stack pointer) that points to the ‘top’ of the stack. So how do we store things in the stack? By simply using
mov instruction and manipulating the stack pointer. For example, let’s suppose we want to store two values in the stack,
2. It could be accomplished that way:
1 2 3 4 5
Stack is a structure that grows upward, in the sense of it first begins with addresses of higher values and grows towards addresses of lower values. In the above example, we are using 8 bytes of the stack to store the value
1 and equally 8 bytes to store the value
2, so we need to step down the stack pointer by a value of 16 (we use 8 bytes because on x64 architecture, the registers have 8 bytes. 8 x 8 = 64 bits).
Since manipulating the stack pointer directly is boring, there are two special instructions to accomplish the same effect:
push subtracts the stack pointer by 8 and stores the parameter into the current address pointed by the stack pointer.
pop moves the data stored in the address currently pointed by the stack pointer to a register and adds the stack pointer by 8.
Recursive factorial function
Create a file named
factorial.s. Insert the following code:
1 2 3 4
We use the
globl directive because we want the factorial label to be visible to other programs (we want to use the function in other programs). Here’s the first trick: “Calling” a function is simply jumping to the memory address where this function is defined.
The new thing here is the
type directive. Don’t worry, it’s just a directive to indicate that the label is not a common label but actually a function.
Now let’s go back to our main program and call this function:
1 2 3 4 5 6 7 8 9
This may seem complicated at the first glance, but it’s rather simple. We first store the value
5 in the stack. Since we cannot pass arguments directly while calling the function like in other languages, we need to store the arguments in the stack. In this case, 5 will be first the argument for the
factorial function: The number we want to calculate the factorial.
After that, we use the
call instruction. What the
call instruction actually does is just storing the current address into the stack pointer (because we need to know to where return after the function has been finished!) and jumping to the
Finally, we increase the stack pointer (because we no longer need the function parameters stored in the stack, we can safely override them to prevent memory leaks) and move the function return (the function return is, by convenction, always stored in the
rax register) to the
rbx so we can display it after the program has been executed (through the echo $? command).
Our main program is finished. It’s roughly equivalent to the following C program:
1 2 3 4 5
Now let’s focus on our factorial function.
The very first thing we need to do in a function is to store the
rbp is the base pointer register. It points to the beginning of the current stack frame. Stack frame is the region of stack where we stored our function parameters and function return address. Since we can call functions inside functions, we need to store the ‘context’ of the previous function call (if any).
1 2 3 4 5
We move the current stack pointer to the
rbp register. Now we are within the context of the current function. Let’s retrieve the arguments:
1 2 3 4 5 6
Remember: We use
rbp + 16 because we pushed the parameter value into the stack first (the second pushed value was the function return address, therefore it stands on top).
Since we are implementing the recursive version, we first need to define the recursion base: If the parameter value is less or equal than 1, return 1.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
Wow, a lot of things happened here! Let’s calmly analyze each one of them. First, we compare the parameter value to one (through the
cmp instruction). Then, we check if the parameter value is equal than one. If so, we jump to the
factorial_base label. This conditional jump is accomplished by the
je instruction (jump on equal). The
cmp instruction sets a flag on
flags register, which the conditional jump will look up to decide if it will jump or not.
Once within the
factorial_base label, we move the value
1 into the
rax register. The
rax will store the final value of our calculation. The program flow will then move automatically to the label right below,
factorial_end label will restore the stack pointer to where it was when the function was called (in case we have manipulated it inside the function body, which we didn’t, but it’s a good pratice to keep our code generic) and will then restore the base pointer register. Finally, there’s this
ret instruction, which will return to the value currently pointed by the stack pointer (the function return address).
Now let’s implement the recursive calls:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
After we checked our recursion base condition (and failed), we decremented the value of the parameter in order to pass it as argument to the same function again (through the
decl instruction). The way of calling the same function is identical to the code we wrote on our main program: We push the arguments into the stack and call the
Once the recursive function has been finished, its return value is stored in the
rax register. Before multiplying the current parameter by the return value of the recursive call, we first need to restore its original value (remember: When we made recursive call, it was overrided). We accomplish it by calling:
mov 16(%rbp), %rbx. We then multiply the value of
rax and store the result in
rax through the
imul instruction. After it, we jump to the end of our function (
That’s it! Not that hard, right? Take of time to ensure you have understood every piece of our code.
The above code is roughly equivalent to the following C code:
1 2 3 4 5
hehe, Assembly is such a pain. :)
Generating executable code
Now let’s generate our executable code. First, let’s generate object code from our both assembly programs:
Let’s link our two object codes into a single executable code:
We can check our program return by calling:
A value of
120 must have prompted, that is the factorial of 5.
It’s worth noting that if we omit the
.globl factorial in
factorial.s, the linker would prompt an error, since it cannot resolve the symbol.
Iterative factorial function
The factorial function can also be implemented in a single loop. Observe the following C code:
1 2 3 4 5 6 7 8 9 10
Since the program is bigger in C than the previous version, we may think that it would be even more complex in Assembly than the previous Assembly program. However, it’s actually simpler:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
We copy the parameter value to the
rbx register. It will exercise the same function of the
i variable in the C code. We then decrease the value of i and then jump to the loop.
On our loop, we compare the value of
rbx (i) to one. If it’s equal, we exit our loop. Otherwise, we multiply
rax and store the result in
rax. In the end, we return to the beginning of our loop.
Well, well… We learnt important Assembly concepts in this tutorial: System calls, the stack, functions… It’s a lot! You already are able to implement some other simple algorithms (try Fibonacci!).