SPARC Assembly/SPARC Details
RISC Computers
Registers
SPARC processors make 32 integer registers visible to a program at any one time. These registers are divided into four basic categories: globals, outputs, locals, and inputs. The table below shows the general breakdown:
Number | Purpose | Specific name |
%r0–%r7 | Globals: accessible anywhere in a program | %g0–%g7 |
%r8–%r15 | Outputs: used to pass values to and obtain values from subroutines | %o0–%o7 |
%r16–%r23 | Locals: used within subroutines to manipulate data | %l0–%l7 |
%r24–%r31 | Inputs: contain data passed to a subroutine | %i0–%i7 |
Dispersed throughout these categories are several special purpose registers:
Name | Number | Purpose | Pseudonym |
Stack pointer | 14 | Pointer to the top of the stack. | %sp / %o6 |
Frame pointer | 30 | Pointer to the current stack frame. | %fp / %i6 |
Return address | 31 | Return address of the subroutine. | %i7 |
Called return address | 15 | Return address of the called subroutine. | %o7 |
As the tables above show, each register has at least two names, and some of the special-purpose registers have three. Any of the names for a given register is acceptable in any context, and it is up to the programmer to choose which one to use. However, using the stack pointer and frame pointer registers for anything other than their intended purpose is not recommended and can cause a program to malfunction badly.
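As a small illustration of the naming rule, the fragment below writes a register under one name and reads it back under another. It is only a sketch: the values are arbitrary, and it assumes your assembler accepts both the %r names and the usual synthetic mov instruction.

```asm
        mov     5, %o0          ! write 5 to output register 0 (also named %r8)
        add     %r8, 1, %l0     ! read the same register as %r8: %l0 = 5 + 1 = 6
        add     %sp, 0, %l1     ! %sp is another name for %o6 (%r14)
        add     %o6, 0, %l2     ! so %l2 now holds the same value as %l1
```

Mixing names like this is legal, but picking one style and keeping to it makes code easier to read.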
SPARC processors also contain an array of floating-point registers and a small number of special-purpose registers. (further description needed here)
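To tie the register categories and the special-purpose registers together, here is a hedged sketch of a subroutine call. It assumes the 32-bit SPARC V8 calling convention, in which a save instruction makes the caller's output registers appear as the subroutine's input registers and 96 bytes is a minimal stack frame. The subroutine name double_it is hypothetical. The instruction placed after each call and ret occupies a delay slot, which is described under Delayed Branch below.

```asm
! Caller
        mov     7, %o0          ! argument goes in an output register
        call    double_it       ! %o7 receives the address of this call
        nop                     ! delay slot
        ! the result is now in %o0

! Subroutine (hypothetical)
double_it:
        save    %sp, -96, %sp   ! new frame: caller's %o0-%o7 become %i0-%i7,
                                ! so the caller's %sp is now %fp
        add     %i0, %i0, %i0   ! argument arrives in %i0; leave the result there
        ret                     ! return to the address in %i7, plus 8
        restore                 ! delay slot: %i0 becomes the caller's %o0 again
```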
The Fetch and Execute Instruction Cycle
Delayed Branch
SPARC processors are pipelined, and branching is accomplished through a technique called Delayed Branch Execution. A Control Transfer Instruction (CTI) is any instruction that changes the program counter. For instance, jmp and call are both CTIs.
In SPARC, when a CTI is executed, the transfer of control does not happen immediately. Instead, there is a one-instruction delay before the branch takes effect. This means that the first instruction after the jump instruction is executed before the jump takes place. Here is an example:
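add %r3, %r2, %r5      ! executes first
call SetR5ToZero       ! executes second, but the transfer of control is delayed
add %r4, %r5, %r2      ! executes third, before SetR5ToZero begins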
Notice that the last instruction executes before the jump takes place, not after the subroutine returns. The position immediately following a control transfer instruction is called the delay slot. It is common practice to fill the delay slot with a special instruction that performs no task, called a no-operation, or nop.
Instruction: nop
Because nop performs no action, it does not matter when it executes relative to the branch. However, putting a nop after every branch instruction wastes a processor cycle each time. Where possible, it is good practice to move a useful instruction into the delay slot instead, provided that the branch does not depend on its result and that it is safe to execute before the branch target is reached.
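The following sketch shows what filling the delay slot can look like. It assumes a hypothetical subroutine print_sum that expects its argument in %o0; the register choices are illustrative only. The first version pads the delay slot with nop, and the second moves the add into the slot.

```asm
! Version 1: delay slot padded with nop (three instructions)
        add     %l0, %l1, %o0   ! compute the argument
        call    print_sum       ! transfer control to the subroutine
        nop                     ! delay slot: does nothing

! Version 2: delay slot does useful work (two instructions)
        call    print_sum       ! transfer control to the subroutine
        add     %l0, %l1, %o0   ! delay slot: runs before print_sum's first instruction
```

The move is safe here because call does not read %o0, and the add does not touch %o7, the register in which call stores its own address. An instruction that the branch itself depends on, such as a comparison that sets the condition codes for a conditional branch, cannot be moved into the slot this way.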