Reverse Engineering for Beginners (CH1.5 Hello, world!) {Part_1}
1.5 Hello, world!

The author starts with the classic example that prints "hello, world":
1.5.1 x86 - MSVC
Let's compile this code using MSVC 2010:
(The /Fa option makes the compiler generate an assembly listing file.)
Here's the generated code:
Note that MSVC generates assembly listings in Intel syntax; the difference between Intel and AT&T syntax is explained later.
The compiler produces a file called 1.obj, which is later linked to create 1.exe.
That file contains several sections, including:
- CONST: for constant data (like strings).
- _TEXT: for the code itself.
The string "hello, world" in C/C++ is of type const char[], but since it doesn't have an explicit name, the compiler gives it an internal name like $SG3830.
So we can write the code like this:
If we look again at the assembly, we'll notice the string ends with a zero byte (00H), which is the standard terminator for C/C++ strings.
Analyzing the Assembly code
1. CONST SEGMENT
This part contains constant data (such as the string literals used by the program).
The string "hello, world" is stored here under an internal name so the code can refer to it later.
- $SG3830: the name chosen by the compiler.
- DB: "Define Byte", i.e., emit the bytes that follow.
- 'hello, world': the actual text.
- 0AH: the newline character (\n).
- 00H: the zero byte marking the end of the string.
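The byte layout described above can be checked with a few lines of Python. This is only a sketch of the data the DB directive emits; the names sg3830 and c_strlen are made up for illustration and are not part of the compiler output.

```python
# Byte layout that the DB directive emits for the string constant.
TEXT = b"hello, world"   # the 12 visible characters
NEWLINE = 0x0A           # 0AH, i.e. '\n'
TERMINATOR = 0x00        # 00H, the end-of-string marker

sg3830 = TEXT + bytes([NEWLINE, TERMINATOR])


def c_strlen(buf: bytes) -> int:
    """Count bytes up to (not including) the first zero byte, like C's strlen()."""
    return buf.index(0)


print(sg3830.hex(" "))                  # the raw bytes, in hexadecimal
print(len(sg3830), c_strlen(sg3830))    # 14 bytes stored, 13 before the terminator
```

The terminator occupies space in the file but is not counted as part of the string's length.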
2. PUBLIC _main
This means the function main (which appears as _main) is public, i.e., its symbol is visible to the linker and to other modules.
3. EXTRN _printf:PROC
This means there's an external function called printf that isn't defined here and will come from another library (the C standard library).
After that comes the _TEXT SEGMENT part, where the actual executable code resides.
_main PROC
This marks the beginning of the main() function.
push ebp
This is the first instruction of a typical function prologue. It saves the caller's ebp (the base pointer) on the stack so it can be restored later.
mov ebp, esp
Here we say: "Make the base pointer (ebp) point to the same place as the stack pointer (esp)."
That means we've started a new stack frame for this function's work.
push OFFSET $SG3830
Here we push the address of the string "hello, world" onto the stack.
After printf() finishes and returns, the address we pushed is still on the stack, but we no longer need it, so we fix the stack pointer by:
add esp, 4
Why 4? Because the program is 32-bit, and an address takes exactly 4 bytes. If it were 64-bit, we'd need 8 bytes.
The instruction:
add esp, 4
is almost the same as:
pop register
but without actually using a register.
Some compilers (like Intel C++ Compiler) prefer:
pop ecx
to make the code smaller (1 byte instead of 3).
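To make the stack bookkeeping concrete, here is a toy model of a 32-bit stack. It is a sketch, not an emulator: the class Stack32 and both addresses are hypothetical, and the return address that CALL pushes and RET pops is left out because it cancels itself.

```python
class Stack32:
    """Toy model of a 32-bit x86 stack that grows toward lower addresses."""

    def __init__(self, esp: int) -> None:
        self.esp = esp
        self.slots = {}  # address -> 32-bit value

    def push(self, value: int) -> None:
        self.esp -= 4                               # PUSH decrements ESP by 4...
        self.slots[self.esp] = value & 0xFFFFFFFF   # ...then stores the value

    def pop(self) -> int:
        value = self.slots[self.esp]                # POP reads the top slot...
        self.esp += 4                               # ...then increments ESP by 4
        return value

    def add_esp(self, amount: int) -> None:
        self.esp += amount                          # ADD ESP, n only moves the pointer


STRING_ADDR = 0x00403000   # hypothetical address of $SG3830
START_ESP = 0x0012FF80     # hypothetical stack pointer before the PUSH

# Variant 1: push OFFSET $SG3830 / call _printf / add esp, 4
s1 = Stack32(START_ESP)
s1.push(STRING_ADDR)
s1.add_esp(4)

# Variant 2: push OFFSET $SG3830 / call _printf / pop ecx
s2 = Stack32(START_ESP)
s2.push(STRING_ADDR)
ecx = s2.pop()             # ECX is overwritten with the discarded argument

print(hex(s1.esp), hex(s2.esp), hex(ecx))
```

Both variants leave ESP where it started; the only difference is that the POP variant clobbers a register, which is harmless when the compiler knows that register holds nothing it needs.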
Example from Oracle RDBMS code:
Even MSVC can do that sometimes:
After calling printf(), the original C/C++ code has return 0;
In assembly, that turns into:
xor eax, eax
XOR means "exclusive OR". XORing a register with itself always gives zero, and the compiler uses it instead of MOV EAX, 0 because the code becomes shorter (2 bytes instead of 5).
Out of curiosity, I wanted to know why XOR EAX, EAX is shorter than MOV EAX, 0. Turns out the reason is simple. When encoded in x86 machine code, XOR EAX, EAX becomes:
33 C0
That's only 2 bytes: an opcode and a ModRM byte that selects EAX for both operands (31 C0 is an equivalent encoding).
While MOV EAX, 0 becomes:
B8 00 00 00 00
That's 5 bytes in total (1 + 4): a one-byte opcode followed by the 32-bit immediate value.
This was just something extra I wanted to understand better, so I decided to write it down as well.
Some other compilers use:
sub eax, eax
which means "subtract EAX from itself"; the result is also zero.
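The three ways of zeroing EAX can be compared side by side. The sketch below hard-codes one common encoding for each instruction (33 C0, 2B C0, and B8 followed by a 32-bit immediate) and "executes" them on a toy EAX value; the function run is a hypothetical helper that recognizes only these three byte patterns, not a real decoder.

```python
CANDIDATES = {
    "xor eax, eax": bytes.fromhex("33c0"),
    "sub eax, eax": bytes.fromhex("2bc0"),
    "mov eax, 0":   bytes.fromhex("b800000000"),
}


def run(code: bytes, eax: int) -> int:
    """Return the new EAX after executing one of the hand-picked encodings."""
    if code == bytes.fromhex("33c0"):            # xor eax, eax
        return eax ^ eax
    if code == bytes.fromhex("2bc0"):            # sub eax, eax
        return (eax - eax) & 0xFFFFFFFF
    if len(code) == 5 and code[0] == 0xB8:       # mov eax, imm32
        return int.from_bytes(code[1:], "little")
    raise ValueError("encoding not modelled in this sketch")


for text, code in CANDIDATES.items():
    print(f"{text:14} {len(code)} bytes -> EAX = {run(code, 0xDEADBEEF):#x}")
```

All three leave zero in EAX; only the size differs, which is why optimizing compilers tend to pick one of the 2-byte forms.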
Finally:
RET
This returns control to the program that called main() (usually the C runtime code), which then returns back to the operating system.
GCC
Now let's try compiling the same C/C++ "Hello, world" code, but this time using GCC on a Linux system, with this command:
Then we'll use the IDA disassembler to see how the function main() was built after compilation. IDA uses the same Intel-syntax style as MSVC.
The result is almost identical to the code generated by MSVC. The address of the string "hello, world" (stored in a data section, typically .rodata with GCC) is first loaded into the EAX register, then stored on the stack.
Also, at the beginning of the function, there's this line:
and esp, 0FFFFFFF0h
Here, GCC performs stack alignment.
That means it ensures that the value of ESP is a multiple of 16 (i.e., its lowest four bits are zero, so the address ends in 0 when written in hexadecimal).
Why? Because the CPU accesses memory in aligned blocks, and data that starts at an aligned address (like 0x1000 instead of 0x1003) can be accessed more efficiently.
So this line aligns the stack for better performance.
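The effect of masking with 0FFFFFFF0h is easy to try out. In this sketch the function name align_down_16 and the sample addresses are illustrative assumptions; only the mask value comes from the text above.

```python
ALIGN_MASK = 0xFFFFFFF0  # clears the lowest four bits of a 32-bit value


def align_down_16(esp: int) -> int:
    """Round a 32-bit stack pointer down to the nearest multiple of 16."""
    return esp & ALIGN_MASK


for esp in (0x1003, 0x100F, 0x1010, 0xBFFFF3AC):   # sample values only
    aligned = align_down_16(esp)
    print(f"{esp:#010x} -> {aligned:#010x} (moved down by {esp - aligned})")
```

The mask can only lower the value, which is safe here because the stack grows toward lower addresses: rounding ESP down never exposes memory that is already in use.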
Then we have this line:
sub esp, 10h
This allocates 16 bytes on the stack (since 10h = 16). In reality, we only need 4 bytes, but the compiler reserves 16 to maintain proper alignment.
After that, the address of the string is stored on the stack directly without using PUSH.
var_10 is a local variable, and it's also used as the argument to the printf() function.
Then the function printf() is called.
When GCC is running without optimization, it uses:
mov eax, 0
instead of shorter instructions like XOR EAX, EAX.
The last instruction LEAVE is equivalent to:
mov esp, ebp
pop ebp
This restores the stack to its original state and recovers the previous EBP value that existed before the function started.
GCC: AT&T syntax
Now let's see how this code looks when written in AT&T syntax. This style is more common on UNIX systems.
This command tells GCC to generate assembly code instead of an executable file.
Here's the generated code:
The code contains many lines starting with a dot (.). The book calls them macros; strictly speaking, they are assembler directives, and we don't need to worry about them now. We can safely ignore them, except for .string, which stores the text "hello, world\n" in memory as a C string (terminated by a zero byte).
After removing unnecessary lines, the simplified version looks like this:
Differences between Intel and AT&T syntax
1 - The order of source and destination is reversed:
- Intel: <instruction> <destination>, <source>
- AT&T: <instruction> <source>, <destination>
So, in Intel:
mov eax, ebx
In AT&T, it becomes:
movl %ebx, %eax
To remember: think of Intel like an equals sign (=) and AT&T like an arrow (→), meaning the value moves from left to right.
2 - In AT&T:
- Registers start with % (e.g., %eax).
- Constants start with $ (e.g., $16).
- Parentheses ( ) are used instead of square brackets [ ].
3 - AT&T also adds a suffix to the instruction mnemonic to indicate the operand size:
- q: quad (64-bit)
- l: long (32-bit)
- w: word (16-bit)
- b: byte (8-bit)
Back to our code:
The generated code looks very similar to what IDA produces, but there's a small difference:
The value 0FFFFFFF0h appears here as $-16. They're actually the same:
- In decimal: -16
- In hexadecimal: 0xFFFFFFF0
Both are the same 32-bit pattern: 0xFFFFFFF0 is the two's complement representation of -16.
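The equivalence can be verified directly. This is a small sketch using Python's struct module; the helper names to_unsigned32 and to_signed32 are made up for illustration.

```python
import struct


def to_unsigned32(value: int) -> int:
    """Reinterpret an integer as an unsigned 32-bit value."""
    return value & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret the low 32 bits of an integer as a signed value."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# Pack -16 as a signed and 0xFFFFFFF0 as an unsigned little-endian 32-bit integer.
as_signed = struct.pack("<i", -16)
as_unsigned = struct.pack("<I", 0xFFFFFFF0)

print(hex(to_unsigned32(-16)), to_signed32(0xFFFFFFF0), as_signed == as_unsigned)
```

The assembler and the disassembler simply chose different ways to print the same four bytes.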
Another note: The return value is set using MOV instead of XOR.
So we see:
movl $0, %eax
This copies the value 0 into %eax. The word "move" is a bit misleading: it doesn't move the value, it copies it. In other architectures, you'll find similar instructions called "LOAD" or "STORE".
String patching (Win32)
We can easily locate the string "hello, world" inside the executable file. It looks something like this:
If we wanted to translate it into Spanish, it would look like:
The original string length is 14 bytes (like in the example above), and the new string is 6 bytes long. You can replace it directly and leave the remaining bytes as-is, or pad them with 00 until the space is filled. Example after modification:
If we wanted to insert a longer message, there might be some null bytes (00) after the original English text. It's not always safe to overwrite them, because they might be used by CRT code. So only do it if you know what you're doing.
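The patching rule above (never write more bytes than the original string occupied) can be sketched as a small helper. Everything here is hypothetical: the function patch_string, the fake image contents, and the Spanish text are illustrative stand-ins, not the bytes from the book's executable.

```python
def patch_string(image: bytes, old: bytes, new: bytes) -> bytes:
    """Overwrite the zero-terminated string `old` with `new`, padding with 00."""
    if len(new) > len(old):
        raise ValueError("replacement is longer than the original string")
    offset = image.find(old + b"\x00")
    if offset < 0:
        raise ValueError("original string not found")
    padded = new + b"\x00" * (len(old) - len(new))
    # Only the bytes of the old string are replaced; everything else is untouched.
    return image[:offset] + padded + image[offset + len(old):]


# A made-up "executable": two code bytes, the string, two more bytes.
image = b"\x90\x90" + b"hello, world\n\x00" + b"\xCC\xCC"
patched = patch_string(image, b"hello, world\n", b"hola, mundo\n")

print(patched)
print(len(image) == len(patched))
```

Refusing longer replacements is the conservative choice: the file size and every offset after the string stay the same, so nothing else in the image has to be fixed up.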
The author shared a real story about software cracking:
There was an image-processing program that, when unregistered, would add watermarks like: "This image was processed by the evaluation version of [Program Name]".
By coincidence, they found this string inside the executable and replaced it with spaces, and the watermark disappeared! Technically, the program was still adding the watermark, but the text became invisible.
Software localization in the MS-DOS era
This method was common for translating MS-DOS programs into Russian during the 1980s and 1990s. It was suitable even for people unfamiliar with machine code or executable file formats.
The new text couldn't be longer than the original, because adding bytes might overwrite nearby code or data. Russian words were often longer, so translated versions had many abbreviations to make the text fit.
The same might have happened for other languages too. And with Delphi strings, the string length field also had to be updated if needed.
1.6 x86-64
Now let's compile the same code using 64-bit MSVC:
In x86-64, all general-purpose registers were extended to 64 bits, and their names now start with R.
To reduce stack usage (i.e., to avoid extra memory/cache accesses), there is a common convention to pass function arguments in registers rather than on the stack, commonly called fastcall.
That means some arguments are passed in registers, and the rest (if any) go on the stack.
On Win64, the first four arguments of any function are passed in these registers:
- RCX
- RDX
- R8
- R9
That's what we see here: the pointer to the string passed to printf() is now passed in RCX instead of being pushed on the stack.
Also, pointers are now 64-bit, so they are passed in the 64-bit registers (the ones starting with R-).
For backward compatibility you can still access the lower 32-bit part via the E- prefix.
Example: the RAX / EAX / AX / AL hierarchy:
Byte number:
+-----+-----+-----+-----+-----+-----+-----+-----+
| 7th | 6th | 5th | 4th | 3rd | 2nd | 1st | 0th |
+-----+-----+-----+-----+-----+-----+-----+-----+
|                 RAX (64-bit)                  |
+-----------------------+-----------------------+
|                       |     EAX (32-bit)      |
+-----------------------+-----------+-----------+
|                                   |AX (16-bit)|
+-----------------------------------+-----+-----+
|                                   | AH  | AL  |
+-----------------------------------+-----+-----+
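The same hierarchy can be expressed with bit masks and shifts. This is a sketch: the function subregisters and the sample value are made up, chosen so that every byte is different and easy to spot.

```python
def subregisters(rax: int) -> dict:
    """Return the views of a 64-bit RAX value that share its low bytes."""
    return {
        "RAX": rax & 0xFFFFFFFFFFFFFFFF,   # bytes 7..0
        "EAX": rax & 0xFFFFFFFF,           # bytes 3..0
        "AX":  rax & 0xFFFF,               # bytes 1..0
        "AH":  (rax >> 8) & 0xFF,          # byte 1
        "AL":  rax & 0xFF,                 # byte 0
    }


views = subregisters(0x1122334455667788)
for name, value in views.items():
    print(f"{name:3} = {value:#x}")
```

All five names refer to the same storage; the smaller ones are just windows onto the low end of RAX.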
The main() function returns an int, and for backward compatibility and portability int is still 32-bit in C/C++ on x86-64. Therefore the compiler only needs to zero EAX (the 32-bit part of RAX) for the return value.
Also, the function allocates 40 bytes on the stack: 32 bytes of shadow space for the four register arguments (explained later), plus 8 bytes that keep the stack 16-byte aligned.
GCC: x86-64
Now let's try GCC on a 64-bit Linux system:
On Linux/BSD/macOS the calling convention also passes arguments in registers. According to the System V ABI (used by Unix-like systems), the first six integer or pointer arguments are passed in the registers RDI, RSI, RDX, RCX, R8, and R9.
If there are more than six arguments the rest go on the stack as usual.
In the example above the pointer to the string is passed in EDI (the lower 32-bit part of RDI).
Why EDI and not RDI? This is an optimization trick by the compiler:
- Writing to the 32-bit subregister (e.g., EDI) automatically clears the upper 32 bits of the full 64-bit register (RDI).
- This means a mov edi, imm32 instruction (5 bytes) can be used instead of the REX-prefixed mov rdi, imm32 form (7 bytes), saving space in the binary. Loading a full 64-bit immediate with mov rdi, imm64 would take 10 bytes.
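The zero-extension rule in the first bullet applies only to 32-bit writes; 16-bit writes leave the rest of the register alone. The sketch below models both cases with plain integers; write_32, write_16, and the sample values are hypothetical.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def write_32(reg64: int, value: int) -> int:
    """Model a write to a 32-bit subregister (e.g. EDI): upper 32 bits become 0."""
    return value & 0xFFFFFFFF


def write_16(reg64: int, value: int) -> int:
    """Model a write to a 16-bit subregister (e.g. DI): upper 48 bits are kept."""
    return (reg64 & (MASK64 ^ 0xFFFF)) | (value & 0xFFFF)


rdi = 0xFFFFFFFFFFFFFFFF                 # pretend RDI holds leftover garbage
after32 = write_32(rdi, 0x004005E8)      # like: mov edi, 4005E8h
after16 = write_16(rdi, 0x05E8)          # like: mov di, 5E8h

print(f"{after32:#018x}")
print(f"{after16:#018x}")
```

Because the 32-bit write wipes the upper half, the compiler can load a pointer that fits in 32 bits through EDI and still hand a clean 64-bit RDI to the callee.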
Example machine code bytes (from the object file) show this size saving:
As you see, the instruction writing into EDI at 0x4004D4 is 5 bytes long; the same instruction writing a 64-bit value into RDI would be 7 bytes. GCC chooses the shorter encoding because it's safe (string addresses are below 4 GB in these examples) and saves space.
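The size difference can be reproduced by assembling the candidate encodings by hand. This is a sketch: the three helper functions are made up, and FORMAT_ADDR is a hypothetical string address rather than the one in the book's listing.

```python
import struct


def mov_edi_imm32(imm: int) -> bytes:
    """mov edi, imm32: opcode BF followed by a 32-bit immediate."""
    return b"\xBF" + struct.pack("<I", imm)


def mov_rdi_simm32(imm: int) -> bytes:
    """mov rdi, imm32: REX.W prefix, C7 /0, ModRM, sign-extended 32-bit immediate."""
    return b"\x48\xC7\xC7" + struct.pack("<i", imm)


def mov_rdi_imm64(imm: int) -> bytes:
    """mov rdi, imm64: REX.W prefix, opcode BF, full 64-bit immediate."""
    return b"\x48\xBF" + struct.pack("<Q", imm)


FORMAT_ADDR = 0x004005E8   # hypothetical address of the string

for encode in (mov_edi_imm32, mov_rdi_simm32, mov_rdi_imm64):
    code = encode(FORMAT_ADDR)
    print(f"{encode.__name__:15} {len(code):2} bytes  {code.hex(' ')}")
```

All three put the same value in RDI as long as the address fits in 31 bits, so the 5-byte form costs nothing except the assumption that the string lives in the low 4 GB.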
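Also note that EAX is zeroed before the call to printf(). printf() is a variadic function, and the System V x86-64 calling convention requires the caller of a variadic function to pass the number of vector registers used for arguments in AL (the lowest byte of EAX); here that number is zero.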
Address patching (Win64)
If we compile this example with MSVC 2013 and the /MD option (linking to external MSVCR*.DLL), the main() function is typically easy to find in the binary. The pointer load might look like:
As an experiment, if we increment that address by 1:
The program will read from the second byte of the string, and the output becomes ello, world instead of hello, world. Running the patched executable indeed prints that altered string.
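The effect of incrementing the pointer can be simulated on a byte buffer. In this sketch the image layout and the address 16 are made up; read_c_string is a hypothetical helper that mimics how printf() walks a C string until the zero byte.

```python
def read_c_string(image: bytes, address: int) -> bytes:
    """Return the bytes from `address` up to (not including) the next zero byte."""
    end = image.index(0, address)
    return image[address:end]


# A made-up image: 16 filler bytes, then the string and its terminator.
image = b"\xCC" * 16 + b"hello, world\n\x00"
STRING_ADDR = 16                       # hypothetical address of the string

original = read_c_string(image, STRING_ADDR)
patched = read_c_string(image, STRING_ADDR + 1)

print(original)
print(patched)
```

Nothing in the string data changes; only the pointer handed to printf() moves forward by one byte.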
