Tuesday, January 29, 2013

*Instruction Joke Here* pt. 2

Last time, we discussed what assembly is at its core: a series of mnemonic devices that allow a programmer to direct a CPU at the lowest level.  Now that we're acquainted with them, let's see what they do.  A simple example:

mov    r0, r1

This moves (copies) the contents of register 1 (r1) into r0; r1 itself is left unchanged.  This is the very basic layout of an instruction:

command rd, rs


The command is the instruction type.  As a note, instructions are also sometimes called opcodes; the two terms get used interchangeably.  The first register is the destination and the second is the source.  This brings up a whole new question, though:  what exactly is a register?  A register is a tiny bit of storage on a CPU that is used as a sort of scratch space.  There are 16 general-purpose 32-bit registers available at one time on the ARM7TDMI and ARM946E-S (though there are 31 general-purpose registers in total across all CPU modes) along with the status registers, the CPSR and SPSR (current program status register and saved program status register).  The general-purpose registers are labelled in 2 ways:

r0   | a1
r1   | a2
r2   | a3
r3   | a4
r4   | v1
r5   | v2
r6   | v3
r7   | v4
r8   | v5
r9   | v6
r10  | sl
r11  | fp
r12  | ip
r13  | sp
r14  | lr
r15  | pc

r0-r15 is how they're labelled in debuggers, so that's our main concern.  The second set is more for identification, and we will come back to it at a later point.
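
To make the two naming schemes concrete, here's a small Python sketch (a toy model, not ARM code and not taken from any real debugger) of a register file where either label points at the same slot.  The names ALIASES, reg_index and mov are made up for this illustration, and the model ignores everything about CPU modes and the status registers.

```python
# Toy model of the 16 visible registers; an illustration, not an emulator.
ALIASES = ["a1", "a2", "a3", "a4", "v1", "v2", "v3", "v4",
           "v5", "v6", "sl", "fp", "ip", "sp", "lr", "pc"]

def reg_index(name):
    """Map either label (r0-r15 or a1..pc) to a register number."""
    name = name.lower()
    if name in ALIASES:
        return ALIASES.index(name)
    if name.startswith("r") and name[1:].isdigit() and 0 <= int(name[1:]) <= 15:
        return int(name[1:])
    raise ValueError("unknown register: " + name)

def mov(regs, rd, rs):
    """mov rd, rs: copy the source register into the destination."""
    regs[reg_index(rd)] = regs[reg_index(rs)] & 0xFFFFFFFF  # 32-bit registers

regs = [0] * 16
regs[1] = 42
mov(regs, "r0", "r1")   # same idea as: mov r0, r1
```

After the call, both r0 and r1 hold 42, and asking for "sp" or "r13" gives the same slot.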

Now back to the actual use of the instructions.  Say we have some kind of algorithm.  Let's go with something simple to start:

n = 2x + 1

So we want to compute n using assembly.  First thing: r0 is the first register, and it is the one the result is returned in, so we will be putting the final number in there.  We'll be taking 1 argument for x and returning the final value.  Under the ARM calling convention, the first 4 arguments are passed in the first 4 registers, r0-r3, and the rest are stored on the stack and loaded as needed.  So we will operate under the assumption that r0 holds the value we will be acting upon.
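
As a rough picture of that argument rule, here's a Python sketch that splits a list of arguments into the register part and the stack part.  The function name place_arguments is hypothetical, and the sketch assumes every argument fits in a single 32-bit word (larger arguments follow extra rules that aren't covered here).

```python
def place_arguments(args):
    """Split call arguments into register slots and stack slots.

    Assumes each argument is one 32-bit word.
    """
    # The first four arguments travel in r0-r3.
    in_registers = {"r%d" % i: value for i, value in enumerate(args[:4])}
    # Anything after that is stored on the stack.
    on_stack = list(args[4:])
    return in_registers, on_stack

in_registers, on_stack = place_arguments([10, 20, 30, 40, 50, 60])
```

For our function there is only one argument, so x simply arrives in r0 and nothing touches the stack.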

Here's the simple way of doing it:

mov r1, #2
mul r0, r1
add r0, #1
bx lr

Here's a more efficient way of doing it:

lsl r0, r0, #1
add r0, #1
bx lr

And finally, here it is with a more complex ARM instruction:

mov r1, #1
add r0, r1, r0, lsl #1
bx lr  

These all accomplish the same thing, but at different speeds.  We don't need to be concerned about that right now since we're just learning the basics, but just as a note, the order would be (fastest to slowest) 2, 3, 1.

We'll start with taking apart the first one.  It looks like this:

  1. move #2 into r1
  2. multiply the value of r0 by the value of r1 and put the result into r0
  3. add #1 to r0
  4. return

The second looks like this:

  1. left-shift r0 by #1 (this is equivalent to multiplying by 2)
  2. add #1 to r0
  3. return

The third looks like this:

  1. move #1 into r1
  2. left-shift r0 by #1 (multiply by 2), add r1 to the shifted value, and put the result into r0
  3. return
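
If it helps to check that all three really compute 2x + 1, here's a rough Python model of the step lists above, one line per instruction.  The function names version1, version2 and version3 are just labels for this sketch; the only assumption is that registers are 32 bits wide, so every result is masked to 32 bits.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide

def version1(x):
    r0 = x & MASK
    r1 = 2                      # mov r1, #2
    r0 = (r0 * r1) & MASK       # mul r0, r1
    r0 = (r0 + 1) & MASK        # add r0, #1
    return r0                   # bx lr (result is in r0)

def version2(x):
    r0 = x & MASK
    r0 = (r0 << 1) & MASK       # lsl r0, r0, #1
    r0 = (r0 + 1) & MASK        # add r0, #1
    return r0                   # bx lr

def version3(x):
    r0 = x & MASK
    r1 = 1                      # mov r1, #1
    shifted = (r0 << 1) & MASK  # the "r0, lsl #1" part happens first
    r0 = (r1 + shifted) & MASK  # add r0, r1, r0, lsl #1
    return r0                   # bx lr

for x in (0, 1, 5, 1000):
    print(x, version1(x), version2(x), version3(x))
```

This says nothing about speed, only that the three sequences agree on the answer.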

There is a small order-of-operations concern here, as well.  Take a look at the add in example 3:

add r0, r1, r0, lsl #1

This has to be evaluated in the correct order, which is basically right to left: r0 is shifted using what's called the "barrel shifter", then it's added to r1, and the result ends up in r0.  The barrel shifter is a feature of ARM CPUs that allows a register to be shifted as part of another instruction.  This becomes important because it shows up in something as common as plain bit shifting.  There are no plain right and left shift instructions in ARM mode, so these shifts are taken care of using mov instructions with the barrel shifter.  In THUMB, we would see:

lsl r1, r1, #2

In ARM:

mov r1, r1, lsl #2

These are equivalent in the different instruction sets.  In the next part we'll get into the difference between ARM and THUMB. 
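
Here's one way to picture the barrel shifter in Python: the last operand goes through a shift first, and only then does the instruction itself (mov or add) use it.  The helper names barrel_shift, mov_shifted and add_shifted are invented for this sketch, and it only covers lsl and lsr with immediate amounts from 0 to 31; the real hardware has more shift types and some special cases that are left out.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide

def barrel_shift(value, kind, amount):
    """Shift the last operand before the instruction uses it."""
    if not 0 <= amount <= 31:
        raise ValueError("this sketch only covers shift amounts 0-31")
    if kind == "lsl":
        return (value << amount) & MASK
    if kind == "lsr":
        return (value & MASK) >> amount
    raise ValueError("unsupported shift: " + kind)

def mov_shifted(rs, kind, amount):
    """mov rd, rs, <kind> #amount: shift and move in one instruction."""
    return barrel_shift(rs, kind, amount)

def add_shifted(rn, rm, kind, amount):
    """add rd, rn, rm, <kind> #amount: shift rm first, then add rn."""
    return (rn + barrel_shift(rm, kind, amount)) & MASK

r1 = mov_shifted(3, "lsl", 2)      # mov r1, r1, lsl #2 with r1 = 3
r0 = add_shifted(1, 5, "lsl", 1)   # add r0, r1, r0, lsl #1 with r1 = 1, r0 = 5
```

With those inputs r1 ends up as 12 (3 times 4) and r0 as 11 (2 times 5, plus 1), matching what the instructions would leave in the registers.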
