In many applications, mixing Assembly and C is routine (pun intended). There are many reasons for it, but in general you use Assembly when you need to deal with the hardware directly or perform a task with maximum speed and minimum resource usage, and you use C for the higher-level parts that don’t have those requirements. Either way, you’ll need to combine both into one integrated system.
There are three ways to mix Assembly and C:
- Using Assembly-defined functions in C
- Using C-defined functions in Assembly
- Using Assembly code in C
We’ll explore them all in this tutorial.
Using Assembly-defined functions in C
Let’s first take the example of a function that takes no parameters and doesn’t return anything, like one that just prints something on screen.
(If you don’t quite understand the above syntax, read my previous tutorial)
Now let’s create a C program to call this function:
Notice the use of the extern keyword. It tells the compiler that a given function or variable is defined somewhere other than the current file. It’s the linker’s job to connect this declaration with the actual definition.
Now let’s compile and link both programs at the same time to obtain an executable file, by passing both source files to GCC in a single command.
That’s all! Pretty easy, right? Now let’s advance to a more challenging scenario: a function that returns a value. As I said in the previous tutorial, by convention, Assembly functions return values in the AX register (EAX or RAX, depending on the size of the value). This is also what C programs expect. Check out this example:
This function only puts the value ‘10’ into the EAX register. Now on C side:
It’s worth noting that, on the Assembly side, I’m moving a two-word (32-bit) value into the EAX register, which matches the 32-bit int that the C side expects. I could move a four-word (64-bit) value into the RAX register instead, but then the sizes would no longer match. Here’s why that matters:
As you may know, RAX is the 64-bit version of the AX register, and EAX is simply a name for its lower 32 bits (the least significant half). Let’s suppose we move the decimal value ‘10’ into the RAX register. It would look like this:
000…01010 (60 zeroes + 1010).
Since a C int is 32 bits wide, the caller reads the return value through EAX and ignores the upper half of RAX. For a small value like 10 nothing is lost, because it fits entirely in the lower half. But a value that needs more than 32 bits is silently truncated: if only the upper 32 bits of RAX are set, the C program prints 0! To avoid this problem, I should either stick with the 32-bit registers (EAX, EBX…) or declare the return type as long int on the C side.
Lesson learnt: always check that the size of the registers you use matches the size of the corresponding C types.
Now the last scenario: a function that takes parameters and returns a value, like one that returns the sum of two values:
Now the Assembly definition:
You may be asking: Hey, what’s wrong? Why am I reading the arguments from registers instead of the stack?
Here’s the trick: on x86-64 Linux, GCC follows the System V AMD64 calling convention. Instead of the caller pushing the parameters onto the stack for the called function to read, the first six integer or pointer arguments are passed in registers. It’s the called function’s job to save them on the stack if it needs to. Those registers are used in the following order:
- rdi (edi for 32-bit values): Holds the first argument
- rsi (esi): Holds the second argument
- rdx (edx): Holds the third argument
- rcx (ecx): Holds the fourth argument
- r8 (r8d): Holds the fifth argument
- r9 (r9d): Holds the sixth argument
Any further arguments are passed on the stack. In the above example, the value 2 is stored in the edi register and the value 3 is stored in the esi register. Therefore, we simply sum them (through the addl instruction) and move the result to the eax register, which holds the return value.
Using C-defined functions in Assembly
Here’s the first example: using the printf C function in Assembly:
Now compile the Assembly program with GCC, passing it the Assembly source file just as you would a C file. GCC will automatically link it with the function definition. In the same way we used the extern keyword in C, we use the .extern directive to tell the assembler that printf is defined externally.
That is equivalent to a C program whose main function simply calls printf with a format string and a value, and then returns.
When compiling Assembly programs with GCC, the starting symbol is no longer _start, but main. Since main is a function, it must end with the ret instruction.
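printf in C takes one or more parameters: the format and, optionally, the value(s) to print. As said previously, the first parameter goes in the rdi register and the second goes in the rsi register. Note: printf is a variadic function, so before calling it, al (the low byte of rax) must hold the number of vector registers used for floating-point arguments. Since we pass none, rax must be zero!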
Our second example uses the scanf function. Like printf, it takes one or more parameters: the format and the destination addresses where the parsed input will be stored. Note: the second and subsequent parameters are no longer values, but memory addresses (pointers).
First, we declare three “variables” in the data section:
- a: A two-word (32-bit) region of memory that initially stores the value zero;
- b: A two-word (32-bit) region of memory that initially stores the value zero;
- format: A region of memory that stores the ASCII string “%d %d”.
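We then pass the address of format as the first parameter, the address of a as the second parameter and the address of b as the third parameter. Before calling scanf, we set RAX to 0 (just like in the printf example). After the call, we move the value stored at address a into the eax register and the value stored at address b into the ebx register. We then sum them and store the result in eax, which main returns as the program’s exit status. (Strictly speaking, rbx is a callee-saved register, so a more careful version would save and restore it, or use a scratch register such as edx instead.)
After executing the program, if we echo the program’s exit status (for example with echo $? in a shell), we’ll be able to see the sum of both typed numbers, as long as it fits in the 0 to 255 range of an exit status.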
The above example is equivalent to a C program that declares two int variables a and b, reads them with scanf using the “%d %d” format, and returns a + b from main.
Using Assembly code in C
Our third category is pretty straightforward. See the example:
Now you can compile it normally, like any other C program. The compiler will simply insert the Assembly code at the appropriate place in the compiled code.
We’ve just learnt some very powerful tools! Learning how to mix Assembly and C gives us deep insight into how the C compiler actually works. I strongly recommend this website for further learning. Play around with it, try some snippets, and see how they’re translated into Assembly.