0xV3n0m

An Empty Function

The book's author starts with the simplest function that can be written: the empty function.

Let’s see how it turns into Assembly.

Here’s the code:

void f() {
    return;
}

Now, let’s compile it for:

1 - x86

And here’s what we got:

f:
    ret

So the compiler emitted only one instruction, ret, which returns control to the caller: execution resumes at the instruction right after the call.

2 - ARM

And here’s what we got:

f PROC
    BX      lr
    ENDP

Notice that in ARM the return address is not stored on the stack as it is in x86; instead it is kept in a register called the Link Register (lr).

Let me explain it to you in a simpler way and go through it step by step.

In x86, when you call a function, the processor pushes the return address onto the stack.

When the function finishes and executes ret, the processor pops that address off the stack and jumps back to it.
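The two steps above can be sketched as a toy model in C. This is only an illustration, not how a real CPU is built: the ToyX86 struct, toy_call, toy_ret, and the call_size parameter are all made-up names, addresses are plain integers, and there are no overflow checks.

```c
#include <stddef.h>

/* Toy model only: a tiny "machine" with a program counter and a stack.
   All names here are hypothetical. */
#define STACK_SLOTS 16

typedef struct {
    unsigned pc;                 /* address of the current instruction */
    unsigned stack[STACK_SLOTS]; /* return addresses are saved here */
    size_t   sp;                 /* number of used stack slots */
} ToyX86;

/* call: push the address of the instruction after the call, then jump. */
void toy_call(ToyX86 *m, unsigned target, unsigned call_size)
{
    m->stack[m->sp++] = m->pc + call_size; /* the return address */
    m->pc = target;
}

/* ret: pop the saved return address and continue from there. */
void toy_ret(ToyX86 *m)
{
    m->pc = m->stack[--m->sp];
}
```

If a call sitting at address 100 and taking 5 bytes jumps to address 500, toy_ret brings the program counter back to 105, the instruction right after the call.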

In ARM, the method is a bit different. ARM has a place where it stores the return address called the Link Register (lr).

That’s where the address that the function should return to after finishing is saved.

So instead of pushing it onto the stack, the processor stores it in LR.

Now, what is this BX instruction, and how does it jump back to whatever is inside LR?

Simply, BX means “Branch and eXchange”.

And this instruction tells the processor:

"Jump to the address stored in this register (here LR)"

And it can also switch between ARM and Thumb mode (that’s something specific to the ARM architecture).

But here we’re using it just to “return”, not to switch modes.
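Here is a rough sketch of the link-register idea in C. It is a toy model under loud assumptions: ToyArm, toy_bl, and toy_bx are hypothetical names, the call is modeled in ARM state only (4-byte instructions), and real pipeline details are ignored. The one hardware detail it does mirror is that BX looks at bit 0 of the target address to pick the mode: 1 means Thumb, 0 means ARM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; all names are hypothetical. */
typedef struct {
    uint32_t pc;    /* address of the current instruction */
    uint32_t lr;    /* link register: holds the return address */
    bool     thumb; /* true when the core is in Thumb state */
} ToyArm;

/* Call: save the return address in lr, then jump. No stack access. */
void toy_bl(ToyArm *m, uint32_t target)
{
    m->lr = m->pc + 4; /* assumes ARM state, where instructions are 4 bytes */
    m->pc = target;
}

/* BX reg: jump to the address held in the register.
   Bit 0 of that address selects the mode: 1 = Thumb, 0 = ARM. */
void toy_bx(ToyArm *m, uint32_t reg_value)
{
    m->thumb = (reg_value & 1u) != 0;
    m->pc    = reg_value & ~1u;
}
```

Calling toy_bx(&m, m.lr) is the toy version of BX lr: a plain return when bit 0 of lr is clear, which is the case in the listing above.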

3 - MIPS

In the MIPS architecture, there are two ways to write register names:

Either by numbers (from $0 to $31),
Or by symbolic names (like $v0, $a0, etc.)
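The two spellings map onto each other by position, so a small lookup table is enough to translate between them. The sketch below uses the conventional O32 names as I know them; mips_reg_name is a hypothetical helper, and some tools show $30 as s8 instead of fp.

```c
#include <stddef.h>

/* Conventional (O32 ABI) names for MIPS registers $0..$31.
   Index = register number. $30 is shown as "s8" by some tools. */
static const char *const mips_reg_names[32] = {
    "zero", "at", "v0", "v1", "a0", "a1", "a2", "a3",
    "t0",   "t1", "t2", "t3", "t4", "t5", "t6", "t7",
    "s0",   "s1", "s2", "s3", "s4", "s5", "s6", "s7",
    "t8",   "t9", "k0", "k1", "gp", "sp", "fp", "ra",
};

/* Hypothetical helper: returns the symbolic name, or NULL if out of range. */
const char *mips_reg_name(unsigned number)
{
    return number < 32 ? mips_reg_names[number] : NULL;
}
```

With this table, register 31 comes back as ra and register 2 as v0, which are the two registers that show up in the listings in this post.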

The GCC compiler outputs the code like this:

j   $31
nop

But the IDA program writes it with symbolic names like this:

j  $ra
nop

The first instruction j $ra means jump to the address stored in $ra — and that’s the same idea as “returning” to the function that called you.

The register $ra is the same concept as the LR register in ARM.

As for the second instruction, nop is short for No Operation, which means “do nothing”. Here it fills the slot right after the jump (the branch delay slot, explained in the next section); elsewhere it is sometimes used for alignment.

Returning Values

The author also said that there’s another type of function that basically just returns a constant number, like this one:

int f()
{
    return 123;
}

Let’s take a look at it.

1 - x86

When compiled (whether by GCC or MSVC) with optimization enabled on an x86 architecture, the compiler produces this code:

f:
    mov eax, 123
    ret

This code does two things:

First, it moves the value 123 into the EAX register. By the way, this is a calling convention rather than something the processor enforces: compilers on x86 agree to put a function's integer return value in EAX.
Second, the ret instruction means “return to the place where this function was called.”
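To see why the convention matters, here is the other side of the story in C: a caller that picks up the value. The caller function is a hypothetical addition of mine, and what the compiler actually emits depends on the compiler and its flags, so treat the comments as the typical case only.

```c
/* The function from the text: returns a constant. */
int f(void)
{
    return 123;
}

/* Hypothetical caller, added only for illustration. */
int caller(void)
{
    int result = f(); /* on x86 the value typically arrives in EAX */
    return result + 1;
}
```

Because both sides agree on the register, the caller can read the result right after the call without any extra bookkeeping.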

2 - ARM

Here things are a bit different, and it looks like this:

f PROC
    MOV r0, #0x7B ; 123
    BX lr
    ENDP

In short, the ARM architecture uses the R0 register to hold the return value of a function. That’s why the number 123 is copied into R0.

By the way, the instruction MOV doesn’t actually move the value — it just makes a copy of it and places it there.

3 - MIPS

In this case, the author said that when compiled with GCC, the code looks like this:

j $31
li $2,123 # 0x7B

And in IDA, it shows like this:

jr $ra
li $v0, 0x7B

The register $2 (also known as $v0) is the one that stores the value returned by the function.

The instruction LI stands for “Load Immediate”, which means it loads that number directly into the register — the same concept as MOV.

Now, the question is — why do we jump to $31 first and then load the number, when logically it should be the other way around?

Well, the author explained that this happens because of a feature of some RISC architectures, MIPS included, called the “branch delay slot”.

Simply put, this means the instruction placed right after a branch (or jump) gets executed before the jump takes effect.

So the compiler swaps their order — it’s not a big deal for now, but keep it in mind.
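A tiny interpreter makes the ordering easier to see. This is a toy model and nothing more: the three-instruction opcode set, the ToyMips struct, toy_run, and demo_return_123 are all hypothetical, the program counter is an index into an array rather than a byte address, and corner cases (such as a jump sitting in a delay slot) are ignored.

```c
#include <stdint.h>

/* Toy model only; the opcode set and all names are hypothetical. */
typedef enum { OP_LI, OP_JR, OP_NOP, OP_HALT } Op;

typedef struct {
    Op       op;
    unsigned reg; /* destination (LI) or source (JR) register */
    int32_t  imm; /* immediate value for LI */
} Insn;

typedef struct {
    int32_t  regs[32];
    unsigned pc;  /* index into the program, not a byte address */
} ToyMips;

/* Execute one non-jump instruction. */
static void exec_simple(ToyMips *m, Insn in)
{
    if (in.op == OP_LI)
        m->regs[in.reg] = in.imm;
    /* OP_NOP: nothing to do */
}

/* Run until OP_HALT; returns the number of instructions executed. */
unsigned toy_run(ToyMips *m, const Insn *prog)
{
    unsigned executed = 0;
    while (prog[m->pc].op != OP_HALT) {
        Insn in = prog[m->pc];
        if (in.op == OP_JR) {
            unsigned target = (unsigned)m->regs[in.reg];
            exec_simple(m, prog[m->pc + 1]); /* delay slot runs first */
            m->pc = target;                  /* only now does the jump land */
            executed += 2;
        } else {
            exec_simple(m, in);
            m->pc += 1;
            executed += 1;
        }
    }
    return executed;
}

/* The listing from the text: jump to $31 with "li $2,123" in the delay slot. */
int32_t demo_return_123(void)
{
    const Insn prog[] = {
        { OP_JR,   31, 0   }, /* j  $31 */
        { OP_LI,   2,  123 }, /* li $2,123 (delay slot) */
        { OP_HALT, 0,  0   }, /* stands in for the caller's code */
    };
    ToyMips m = { {0}, 0 };
    m.regs[31] = 2;           /* pretend the caller resumes at index 2 */
    toy_run(&m, prog);
    return m.regs[2];
}
```

Even though the jump comes first in the program, register $2 already holds 123 by the time control is back in the "caller", which is exactly why the compiler is free to put the li after the j.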

And just as an example, in UNIX you have:

/bin/true → returns 0
/bin/false → returns 1

That’s because in UNIX, 0 means “everything is fine”, and any other number means “an error occurred”.
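Seen from C, these two utilities are about as small as the constant-returning function above. The sketch below is not the real source of either tool: true_main, false_main, and succeeded are hypothetical stand-ins that only mirror the exit codes listed above.

```c
/* Hypothetical stand-ins for the main() of /bin/true and /bin/false. */
int true_main(void)  { return 0; } /* exit status 0: success */
int false_main(void) { return 1; } /* exit status 1: failure */

/* Shell-style check: only status 0 counts as success. */
int succeeded(int exit_status)
{
    return exit_status == 0;
}
```

This is the same check a shell makes when it decides whether to run the command after && or the one after ||.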

Note

Here’s a short demo video showing how to test the code on Compiler Explorer (godbolt.org) to see the actual assembly output — just like in the book.