x86 Assembly/FASM Syntax
FASM, also known as Flat Assembler, is an optimizing assembler for the x86 architecture. FASM is written in assembly, so it can assemble (bootstrap) itself. It runs on various operating systems, including DOS, Linux, Unix, and Windows. It supports the x86 and x86-64 instruction sets, including the SIMD extensions MMX, SSE through SSE4, and AVX.
Hexadecimal Numbers
FASM supports all popular syntaxes used to define hexadecimal numbers:
0xbadf00d ; C-Like Syntax
$badf00d ; Pascal-Like Syntax
0badf00dh ; h suffix syntax; a leading zero is required when the number starts with a letter
Labels
FASM supports several unique labeling features.
Anonymous Labels
FASM supports labels that use no identifier or label name.
- @@: represents an anonymous label. Any number of anonymous labels can be defined.
- @b refers to the closest @@ found when looking backward in the source. @r is an equivalent alias for @b.
- @f refers to the closest @@ that can be found when looking forward in source.
@@:
inc eax
push eax
jmp @b ; jumps back to the first @@; the repeated push will eventually overflow the stack
jmp @f ; This instruction will never be hit
@@: ; if jmp @f was ever hit, the instruction pointer would be set to this anonymous label
invoke ExitProcess, 0 ; invoke is a macro from the FASM Windows include files, so this line is Windows-only
Local Labels
Local labels begin with a . (period) and belong to the most recently defined global label. You can reference a local label from elsewhere by prefixing it with the name of its global label parent.
entry globallabel
globallabel:
.locallabelone:
jmp globallabel2.locallabelone
.locallabeltwo:
globallabel2:
.locallabelone:
.locallabeltwo:
jmp globallabel.locallabelone ; infinite loop
Operators
FASM supports several unique operators to simplify assembly code.
The $ Operator
$ evaluates to the current address in the addressing space. Subtracting a label from $ gives the size of the code or data assembled since that label, which serves the same purpose as the SIZEOF operator in MASM.
mystring db "This is my string", 0
mystring.length = $ - mystring
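The same idiom extends to counting the entries of a table. The following is a minimal sketch; the primes label and its values are hypothetical and chosen only for illustration.
primes dd 2, 3, 5, 7 ; four 32-bit entries, 16 bytes in total
primes.count = ($ - primes) / 4 ; size in bytes divided by the entry size gives 4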
The # Operator
# is the symbol concatenation operator, used to join multiple symbols into one. It can only be used inside the body of a macroinstruction, such as a rept block or a user-defined macro, where the name of a macro argument is replaced with its value before the symbols are joined.
macro contrived value {
some#value db 22
}
; ...
contrived 2
; assembles to...
some2 db 22
The ` Operator
` is used to obtain the name of a symbol passed to a macro, converting it to a string.
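macro print_contrived value {
formatter db "%s", 0Ah, 0 ; FASM does not expand C escapes such as \n, so the newline and terminator are given as bytes
invoke printf, formatter, `value
}
; ...
print_contrived SOMEVALUE
; assembles to...
formatter db "%s", 0Ah, 0
invoke printf, formatter, "SOMEVALUE"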
Built In Macros
FASM has several useful built-in macros to simplify writing assembly code.
Repetition
The rept directive repeats a block of assembly instructions. It begins with the word rept, followed by a number giving the repetition count and then the block to repeat, enclosed in curly braces. A counter symbol can optionally be named after the count; it starts at 1 and can be used as a symbol, or as part of an instruction, within the rept block.
rept 2 {
db "Hello World!", 0Ah, 0
}
; assembles to...
db "Hello World!", 0Ah, 0
db "Hello World!", 0Ah, 0
; and...
rept 2 helloNumber {
hello#helloNumber db "Hello World!", 0Ah, 0 ; use the symbol concatenation operator '#' to create unique labels hello1 and hello2
}
; assembles to...
hello1 db "Hello World!", 0Ah, 0
hello2 db "Hello World!", 0Ah, 0
Structures
The struc directive defines a data layout similar to a C structure with members. A struc definition uses local labels to name its members.
struc 3dpoint x, y, z
{
.x db x
.y db y
.z db z
}
some 3dpoint 1, 2, 3
; assembles to...
some:
.x db 1
.y db 2
.z db 3
; access a member through some.x, some.y, or some.z for x, y, and z respectively
Custom Macros
FASM supports custom macros as a way of packaging multiple instructions, or conditional assembly, into what looks like a single larger instruction. A macro requires a name and can take an optional list of arguments, separated by commas.
macro name arg1, arg2, ... {
; <macro body>
}
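As a concrete illustration of the template above, here is a minimal sketch of a one-argument macro. The name zero is hypothetical and not part of FASM.
macro zero reg {
xor reg, reg ; clear the register passed as the argument
}
; ...
zero eax
; assembles to...
xor eax, eax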
Variable Arguments
Macros can support a variable number of arguments through the square bracket syntax.
macro name arg1, arg2, [varargs] {
; <macro body>
}
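By default the macro body is processed once for each value supplied for the bracketed argument; the forward, reverse, and common directives control that repetition. The sketch below assumes a hypothetical macro pushcall and a hypothetical procedure myfunction, and pushes the arguments in reverse order before a single call.
macro pushcall proc, [arg] {
reverse
push arg ; repeated for each argument, from last to first
common
call proc ; emitted once, after all pushes
}
; ...
pushcall myfunction, eax, ebx, ecx
; assembles to...
push ecx
push ebx
push eax
call myfunction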
Required Operands
The FASM macro syntax can mark an operand as required by placing a * after its name in the macro definition.
; all three operands are required; assembly fails if any is omitted
macro mov op1*, op2*, op3*
{
mov op1, op2
mov op2, op3
}
Operator Overloading
The FASM macro syntax allows overloading the syntax of an existing instruction or creating a new instruction. Below, the mov instruction has been overloaded to accept a third operand. If none is supplied, the regular mov instruction is assembled. Otherwise, the data in op2 is moved to op1, and then op3 is moved to op2.
; not all operands required, though if op1 or op2 are not supplied
; assembly should fail
; could also be defined as 'macro mov op1*, op2*, op3' to force requirement of the first two arguments
macro mov op1, op2, op3
{
if op3 eq
mov op1, op2
else
mov op1, op2
mov op2, op3
end if
}
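With the macro above in effect, both forms can be used side by side. The register choices in this short usage sketch are arbitrary.
mov eax, ebx ; op3 is empty, so this assembles to the plain: mov eax, ebx
mov eax, ebx, ecx ; assembles to: mov eax, ebx followed by mov ebx, ecx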
Hello World
This is a complete example of a Win32 assembly program that prints 'Hello World!' to the console and then waits for the user to enter a character before exiting the application.
format PE console ; Win32 portable executable console format
entry _start ; _start is the program's entry point
include 'win32a.inc'
section '.data' data readable writable ; data definitions
hello db "Hello World!", 0
stringformat db "%s", 0ah, 0
section '.code' code readable executable ; code
_start:
cinvoke printf, stringformat, hello ; call printf, defined in msvcrt.dll (C calling convention, so cinvoke cleans up the stack)
cinvoke getchar ; wait for a character on standard input
invoke ExitProcess, 0 ; exit the process
section '.imports' import data readable ; data imports
library kernel, 'kernel32.dll',\ ; link to kernel32.dll, msvcrt.dll
msvcrt, 'msvcrt.dll'
import kernel, \ ; import ExitProcess from kernel32.dll
ExitProcess, 'ExitProcess'
import msvcrt, \ ; import printf and getchar from msvcrt.dll
printf, 'printf',\
getchar, '_fgetchar'
This is an example for x86_64 GNU+Linux:
format ELF64 executable 3 ;; ELF64 Format for GNU+Linux
entry _start ;; Declare _start as the entry point explicitly
segment readable executable ;; Executable code section
;; Some definitions for readability purposes
define SYS_exit 60
define SYS_write 1
define stdout 1
define exit_success 0
_start: ;; Entry point for our program
mov eax, SYS_write ;; SYS_write( // Call the write(2) syscall
mov edi, stdout ;; STDOUT_FILENO, // Write to stdout
mov esi, hello_world ;; hello_world, // Buffer to write to STDOUT_FILENO: hello_world
mov edx, hello_world_length ;; hello_world_length, // Buffer length
syscall ;; );
mov eax, SYS_exit ;; SYS_exit( // Call the exit(2) syscall; without it, execution would run past the end of the code and crash
mov edi, exit_success ;; EXIT_SUCCESS, // Exit with the success status code
syscall ;; );
segment readable ;; Read-only constant data section
hello_world: db "Hello world", 10 ;; const char *hello_world = "Hello world\n";
hello_world_length = $ - hello_world ;; const size_t hello_world_length = 12; // current address minus hello_world; the string is not NUL-terminated