1.15 More about results returning
The author notes that in x86, the result of a function is usually returned in the EAX register. If the type is byte or char, the lowest part of EAX, the AL register, is used. If the function returns a floating-point number, the FPU register ST(0) is used instead. In ARM, the result is usually returned in the R0 register.
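The register rules above can be tied to concrete return types. This is a minimal sketch with hypothetical function names (they do not come from the book), and the register named in each comment is only what a typical 32-bit x86 compiler would use. Other ABIs may differ.

```c
/* Hypothetical helpers, named only for this illustration. */

/* 8-bit result: a 32-bit x86 caller typically reads it from AL. */
char return_char(void)
{
    return 'A';
}

/* 32-bit result: typically left in EAX (R0 on ARM). */
int return_int(void)
{
    return 42;
}

/* Floating-point result: typically left in ST(0) on 32-bit x86. */
float return_float(void)
{
    return 1.5f;
}
```

Compiling these with assembly output enabled (for example `gcc -m32 -S`, assuming a 32-bit toolchain is installed) is one way to see which register each function writes just before ret.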
1.15.1 Attempt to use the result of a function returning void
So what happens if main() is declared as returning void instead of int?
The so-called startup-code calls main() approximately like this:
push envp ; push the environment pointer onto the stack
push argv ; push the argument vector onto the stack
push argc ; push the argument count onto the stack
call main ; call the main function
push eax  ; push the return value of main (in EAX) onto the stack
call exit ; call exit with the pushed value
In other words:
exit(main(argc, argv, envp)); // call exit with the return value of main as argument
If you write void main() instead of int main(), what happens?
- void main() means that no value is returned explicitly, but the EAX register may still hold a meaningless leftover value from earlier instructions.
- When the startup code executes push eax after call main, it passes whatever is in EAX to exit(). The exit code is therefore effectively arbitrary: typically the return value of the last function that main() called (such as puts() or printf(), if used).
We can illustrate this with code like this:
#include <stdio.h>  // standard I/O header

void main()         // main declared as void: no explicit return value
{
    printf("Hello, world!\n");  // print the greeting followed by a newline
}
GCC might replace printf with puts here.
In this build, puts() returns the number of characters it printed (the C standard only promises a non-negative value on success), and that value is left in EAX. Since main() does not set a return value of its own, EAX still holds it when main() returns.
.LC0:                                           ; label for the string
        .string "Hello, world!"                 ; the string itself
main:                                           ; start of main
        push ebp                                ; save the base pointer
        mov ebp, esp                            ; set the base pointer to the stack pointer
        and esp, -16                            ; align the stack to a 16-byte boundary
        sub esp, 16                             ; allocate 16 bytes on the stack
        mov DWORD PTR [esp], OFFSET FLAT:.LC0   ; store the string address on the stack
        call puts                               ; print the string
        leave                                   ; restore the base and stack pointers
        ret                                     ; return; EAX still holds the result of puts
We write a shell script that displays the exit status:
Listing 1.101: tst.sh
#!/bin/sh
./hello_world   # run the hello_world executable
echo $?         # print the exit status of the previous command
And we run it:
$ tst.sh
Hello, world!
14
14 is the number of characters that were printed.
The character count returned by printf() (or puts()) stayed in EAX/RAX and ended up as the exit code.
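For contrast, the same effect can be produced on purpose instead of by accident. This is a minimal sketch in which `hello_status` is a hypothetical stand-in for main(): it returns the result of puts() explicitly, so the value no longer depends on what happens to be left in EAX.

```c
#include <stdio.h>

/* Hypothetical stand-in for main(): the result of puts() is returned
   explicitly instead of being leaked through EAX. */
int hello_status(void)
{
    /* puts() returns a non-negative value on success. Some C libraries
       return the number of bytes written (14 for this string plus the
       newline), others may return a different non-negative value. */
    return puts("Hello, world!");
}
```

Writing `return hello_status();` inside a proper `int main(void)` gives an exit status defined by the program itself. The exact number still depends on the C library in use.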
By the way, when we decompile C++ code with Hex-Rays, we sometimes encounter a function that ends with a call to a class destructor:
...
call ??1CString@@QAE@XZ ; CString::~CString(void)
mov ecx, [esp+30h+var_C] ; load the value saved at var_C into ECX
pop edi                  ; restore EDI
pop ebx                  ; restore EBX
mov large fs:0, ecx      ; write ECX to FS:0 (thread information block)
add esp, 28h             ; release 28h bytes of stack
retn                     ; return from the function
According to the C++ standard, a destructor does not return anything. But when Hex-Rays does not know that, and assumes that both the destructor and the enclosing function return int, we see something like this in the output:
...
return CString::~CString(&Str); // Hex-Rays mistakenly shows the destructor as returning a value
}
Put more plainly: Hex-Rays treats whatever the destructor leaves in EAX as a result and shows the function returning it, even though retn here is just a return to the caller and nothing more.
1.15.3 Returning a structure
The author then returns to the fact that the return value is left in the EAX register.
That is why old C compilers could not create a function that returns something that does not fit in one register (usually int).
If one needs to return something bigger, the data must be returned through pointers passed as function arguments.
So it is quite normal for a function to return only one value directly and everything else through pointers.
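The "one value directly, the rest through pointers" pattern looks like this in practice. This is only a sketch: `divide` and its parameter names are hypothetical and chosen for illustration.

```c
#include <assert.h>

/* Hypothetical example: the quotient is the direct return value,
   the remainder is written through a pointer argument. */
int divide(int dividend, int divisor, int *remainder)
{
    assert(divisor != 0);              /* division by zero is undefined */
    *remainder = dividend % divisor;   /* second result, via the pointer */
    return dividend / divisor;         /* first result, via the register */
}
```

A caller would write `int r; int q = divide(17, 5, &r);` and afterwards find the quotient in `q` and the remainder in `r`.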
Nowadays it is possible to return an entire struct, but this is still not very popular.
If a function has to return a large struct, the caller must allocate it and pass a pointer to it as the first argument, and all of this happens hidden from the programmer.
This is the same idea as passing a pointer in the first argument by hand, except that the compiler hides it.
A small example:
struct s {      // define structure s
    int a;      // field a
    int b;      // field b
    int c;      // field c
};

struct s get_some_values(int a)     // function that returns struct s
{
    struct s rt;                    // local struct rt
    rt.a = a+1;                     // set rt.a to a+1
    rt.b = a+2;                     // set rt.b to a+2
    rt.c = a+3;                     // set rt.c to a+3
    return rt;                      // return the struct
}
What we got (MSVC 2010 /Ox):
$T3853 = 8 ; size = 4, hidden pointer to the struct
_a$ = 12   ; size = 4, parameter a
?get_some_values@@YA?AUs@@H@Z PROC ; get_some_values
    mov ecx, DWORD PTR _a$[esp-4]       ; ECX = a
    mov eax, DWORD PTR $T3853[esp-4]    ; EAX = pointer to the struct
    lea edx, DWORD PTR [ecx+1]          ; EDX = a+1
    mov DWORD PTR [eax], edx            ; store a+1 in field a
    lea edx, DWORD PTR [ecx+2]          ; EDX = a+2
    add ecx, 3                          ; ECX = a+3
    mov DWORD PTR [eax+4], edx          ; store a+2 in field b
    mov DWORD PTR [eax+8], ecx          ; store a+3 in field c
    ret 0                               ; return; EAX still holds the pointer
?get_some_values@@YA?AUs@@H@Z ENDP ; get_some_values
The macro name the compiler uses here for the hidden pointer to the struct is $T3853.
We can write the same example using C99:
struct s {      // define structure s
    int a;      // field a
    int b;      // field b
    int c;      // field c
};

struct s get_some_values(int a)     // function that returns struct s
{
    return (struct s){.a=a+1, .b=a+2, .c=a+3};  // return an initialized struct
}
GCC 4.8.1:
_get_some_values proc near
ptr_to_struct = dword ptr 4     ; hidden pointer to the struct
a = dword ptr 8                 ; parameter a
    mov edx, [esp+a]                ; EDX = a
    mov eax, [esp+ptr_to_struct]    ; EAX = pointer to the struct
    lea ecx, [edx+1]                ; ECX = a+1
    mov [eax], ecx                  ; store a+1 in field a
    lea ecx, [edx+2]                ; ECX = a+2
    add edx, 3                      ; EDX = a+3
    mov [eax+4], ecx                ; store a+2 in field b
    mov [eax+8], edx                ; store a+3 in field c
    retn                            ; return; EAX still holds the pointer
_get_some_values endp
As we see, the function just fills the fields of a struct that the caller allocated beforehand, as if a pointer to the struct had been passed as an argument.
So there is no loss in performance.
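To see why the two forms cost the same, here is the by-value function from the text next to a hand-written counterpart. This is a sketch only: `fill_some_values` is a hypothetical name, and the claim that a compiler lowers the first form into something like the second holds for the 32-bit x86 listings shown above, not for every ABI.

```c
struct s {
    int a;
    int b;
    int c;
};

/* By-value version, as written in the text. */
struct s get_some_values(int a)
{
    struct s rt;
    rt.a = a + 1;
    rt.b = a + 2;
    rt.c = a + 3;
    return rt;
}

/* Hypothetical hand-written counterpart: the caller supplies the
   storage and the function fills it in through the pointer. */
void fill_some_values(struct s *out, int a)
{
    out->a = a + 1;
    out->b = a + 2;
    out->c = a + 3;
}
```

Both versions leave the same three values in memory owned by the caller. The only difference is whether the programmer or the compiler writes the pointer argument.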
To make this part easier, here is a simple step-by-step picture.
First, suppose the struct has this shape:
struct s {      // define structure s
    int a;      // field a
    int b;      // field b
    int c;      // field c
};
This is its layout in memory:
┌─────────┐
│    a    │
├─────────┤
│    b    │
├─────────┤
│    c    │
└─────────┘
Before calling get_some_values(a), the caller allocates space for the struct in its own memory:
Caller memory:
┌────────────────────────────┐
│ Empty space for the struct │ ← the result will be written here
│ Address = 5000             │
└────────────────────────────┘
Then it passes the address of that space to the function as a hidden argument:
Caller
  │
  │ passes pointer = 5000
  ▼
Callee (get_some_values)
The function receives a pointer to the empty space and writes the values into it:
Address 5000:
┌─────────┐
│ a=a+1   │
├─────────┤
│ b=a+2   │
├─────────┤
│ c=a+3   │
└─────────┘
This is the final state:
Caller memory:
┌────────────────────────────┐
│ struct at 5000:            │
│   a = a+1                  │
│   b = a+2                  │
│   c = a+3                  │
└────────────────────────────┘
  ↑
  │ callee wrote the values here
When the function finishes, it does not return the struct itself. It returns the (hidden) pointer that the caller passed in.
So the caller finds the complete struct in its own memory:
return value ← same address 5000
Caller now sees:
a = a+1
b = a+2
c = a+3
In summary: the caller allocates the struct, passes its address as a hidden first argument, the callee fills it in, and the same address comes back in EAX.
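The walkthrough can also be replayed in plain C. This is a rough model under stated assumptions: `callee_model` and `caller_model` are hypothetical names, and the explicit `hidden` parameter only imitates the pointer that the compiler passes behind the scenes in the listings above.

```c
struct s {
    int a;
    int b;
    int c;
};

/* Hypothetical model of the callee: 'hidden' plays the role of the
   pointer the compiler adds as the first argument. */
static struct s *callee_model(struct s *hidden, int a)
{
    hidden->a = a + 1;
    hidden->b = a + 2;
    hidden->c = a + 3;
    return hidden;      /* the same address goes back to the caller */
}

/* Hypothetical model of the caller: returns 1 if every step of the
   walkthrough holds, 0 otherwise. */
int caller_model(int a)
{
    struct s slot;                              /* reserve space for the struct */
    struct s *ret = callee_model(&slot, a);     /* pass its address, callee fills it */

    return ret == &slot                         /* returned pointer is the same address */
        && slot.a == a + 1
        && slot.b == a + 2
        && slot.c == a + 3;                     /* caller sees the filled fields */
}
```

The address 5000 in the diagrams corresponds to `&slot` here. Its actual value is chosen by the compiler and the runtime stack.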