
ARM Assembler
*************


Using the assembler
===================

Coding in ARM assembler is straightforward.  If you have used an 
ARM assembler before, you already know the instructions.  
Otherwise I recommend 

'Acorn RISC Machine Family Data Manual, Prentice Hall, ISBN 0-13-781618-9' 

for further information.  It covers everything you should know about 
the ARM CPU: the whole instruction set plus lots of hardware details.  
In fact it was my only source of information at hand when starting 
this Risc-OS Forthmacs port.  

This documentation is far from complete, but it covers most aspects.  
If you are writing code, have a look at some kernel sources to see 
how it works.  Whenever you are not sure about the generated code, 
inspect it with 
    code demo ...... c;
    see demo
and you have the code right in front of you.  

Also there is a chapter "Assembler Tutorial".  

As with most Forth assemblers, this assembler is really just a 
vocabulary containing the words for assembling ARM code.  It is 
"activated" by adding the assembler vocabulary to the search order.  
There are also some common ways to control assembly which do more than 
just put the assembler vocabulary in the search order.  Like Forth in 
general, it uses an 'operands first - operator last' syntax.  

Let's now have a look at some kernel source and compare the Forth 
assembler syntax with the original Acorn syntax (as displayed by the 
disassembler utility).  

    code count      (s adr -- adr1 cnt ) 
            r0      top     mov
            top     r0 byte )+ ldr
            r0      sp      push c;
    see count
    code count 
     (   a148 )  mov     r0,r10
     (   a14c )  ldrb    r10,[r0],#1
     (   a150 )  str     r0,[r13,#-4]!
     (   a154 )  ldr     pc,[r8],#4


General syntax
==============

All instructions follow the general syntax: 

        ARM:        opcode  r-dest r-n operand
        Forth:      r-dest r-nsrc operand modifiers  condition op-code


The brackets and commas in the original assembler source are replaced 
by spaces, 'addressing mode indicators' and macros.  

The post-indexed form [r0],#4 (addressing mode ia) becomes )+ , 
indicating a post-increment by 1 or 4 according to byte or word 
access.  

push is a macro meaning 
    -( str

c; at the end assembles a next instruction, shown here in Acorn and 
in Forth syntax, 
    ldr     pc,[r8],#4
    pc  ip )+  ldr
and quits assembling.  

The operands ( registers or numbers ) must appear in the correct order 
followed by modifiers.  


Conditions
==========

All instructions can be executed conditionally on ARM CPUs.  All 
condition codes are implemented; they should preferably be written 
just before the opcode itself.  You don't have to write the AL 
condition, as it is the default.  

Note: According to ARM standards, NV is NOT implemented and should 
never be used, because of future instruction set extensions.  

Condition codes available : 
EQ NE CS CC MI PL VS VC HI LS GE LT GT LE AL 
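
To make the role of the condition concrete, here is a small Python 
sketch (not part of Forthmacs) of how a condition ends up in the 
instruction word.  On ARM the condition occupies bits 31..28 of every 
instruction; the helper name with_condition is made up for this 
illustration.  

```python
# Condition field values (bits 31..28 of every ARM instruction word).
CONDITIONS = {
    "EQ": 0x0, "NE": 0x1, "CS": 0x2, "CC": 0x3,
    "MI": 0x4, "PL": 0x5, "VS": 0x6, "VC": 0x7,
    "HI": 0x8, "LS": 0x9, "GE": 0xA, "LT": 0xB,
    "GT": 0xC, "LE": 0xD, "AL": 0xE,
}


def with_condition(word: int, cond: str = "AL") -> int:
    """Return `word` with its condition field replaced by `cond` (hypothetical helper)."""
    return (CONDITIONS[cond] << 28) | (word & 0x0FFFFFFF)


MOV_R0_R10 = 0xE1A0000A                      # mov r0,r10 assembled with the default AL
MOVEQ_R0_R10 = with_condition(MOV_R0_R10, "EQ")
print(hex(MOVEQ_R0_R10))
```

The value 15 (NV) is deliberately left out of the table, matching the 
note above.  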


Shifts
======

There are numerous shifts available for operands: 

ASL #ASL LSL #LSL LSR #LSR ASR #ASR ROR #ROR RRX 

Shift operators prefixed with # take the shift count from a number; 
the others take it from a register.  

This assembler is clever enough to find out shifted immediates itself, 
so you don't have to worry about lines like 
    top th f0 #  td 24 #lsl mov
just write 
    top th f0000000 # mov
instead.  
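
As an illustration of what "finding out shifted immediates" involves, 
here is a minimal Python sketch.  It relies only on the ARM rule that 
an immediate operand is an 8-bit value rotated right by an even 
amount; the function name encode_immediate is an assumption for this 
example and says nothing about how Forthmacs implements the search.  

```python
def encode_immediate(value: int):
    """Return (imm8, rot) such that value == imm8 rotated right by 2*rot, or None."""
    value &= 0xFFFFFFFF
    for rot in range(16):
        shift = 2 * rot
        # Rotating left by `shift` undoes a rotate right by `shift`.
        imm8 = ((value << shift) | (value >> (32 - shift))) & 0xFFFFFFFF
        if imm8 <= 0xFF:
            return imm8, rot
    return None  # not expressible as a single immediate operand


print(encode_immediate(0xF0000000))
print(encode_immediate(0x12345678))
```

A constant that spans more than eight bits, or that would need an odd 
rotation, yields None and cannot go into a single instruction.  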


Register usage
==============

Registers R0 - R6 are available for use within code definitions.  
Don't try to use them for permanent storage, because they are used by 
many code words with no attempt to preserve the previous contents.  

    r7      floating stack pointer  fsp
    r8      instruction pointer     ip
    r9      user area pointer       up
    r10     top-of-stack register   top
    r11     return stack pointer    rp
    r12     RiscOS frame pointer    fp never touch this
    r13     stack pointer           sp
    r14     link register           lk
    r15     pc + status + flags     pc
Note: In future CPU versions the internal structure of the PC 
register will be different, so it is better to think of the PC and the 
status register as two separate registers.  The hardware-errors and 
the .REGISTERS instruction already know about this.  


Structured programming
======================

This assembler supports structured programming by using common 
Forth-like structures instead of labels.  The structures do not have 
to fit on one line, and they may be nested to any level.  The range of 
the branches assembled by these structures is not restricted.  

Implemented structures are: 
    set the flags                   \ produce the condition
    condition if ...                \ if condition is met do this
              else ...              \ otherwise this
              then
    
    
    
    begin ....
          set the flags             \ produce the condition
    condition     while ...         \ do this when condition met
          ( you may set the flags )
    ( condition ) repeat            \ the repeat is normally always done
                                    \ but you may also test for another
                                    \ condition.
    
    
    begin ...
          set the flags             \ produce the condition
    condition until                 \ leave the loop when condition is met
    
    
    
    begin ... again                 \ loop forever, whatever may happen
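
Each of these structures finally boils down to ARM branch 
instructions.  The following Python sketch shows how such a branch 
word is formed on ARM; it is only an illustration, and the function 
name encode_branch is made up.  The offset is counted in words 
relative to the branch address plus 8 and stored in 24 bits, which is 
enough to span the whole 26-bit address space of ARM2/ARM3.  

```python
def encode_branch(here: int, target: int, cond: int = 0xE, link: bool = False) -> int:
    """Encode B (or BL when `link` is set) located at `here`, jumping to `target`."""
    # The PC reads as here+8 when the branch executes; the offset counts words.
    offset = (target - (here + 8)) >> 2
    return (cond << 28) | (0b101 << 25) | (int(link) << 24) | (offset & 0xFFFFFF)


# A backward branch as "ne until" might need it: loop body starts at 0x8000,
# the conditional branch itself sits at 0x8010 (both addresses invented).
print(hex(encode_branch(0x8010, 0x8000, cond=0x1)))
```

A forward branch, as needed by if or while, can only be completed once 
the matching then or repeat is reached, because the target address is 
not known earlier.  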


Porting
=======

The ARM assembler can also be used by other Forth systems; all 
hardware-specific parts are written portably and can very easily be 
changed in case of problems.  So a 68k-Forthmacs can metacompile ARM 
code with this assembler without any change.  In fact, the very first 
metacompilation of this Risc-OS Forthmacs took place on an ATARI-ST 
with 1MB RAM and a 720k disk.  


Byte-sex
========

Both byte-sexes can be produced by this assembler, which allows 
portable assembler code for all ARM CPUs.  LITTLE-ENDIAN and 
BIG-ENDIAN do the switch.  
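
What the switch means for the bytes in the target image can be 
sketched in Python.  This is an illustration only; code_bytes is a 
hypothetical helper, not the Forthmacs implementation.  

```python
import struct


def code_bytes(word: int, big_endian: bool = False) -> bytes:
    """Return the four bytes of one instruction word in target byte order."""
    return struct.pack(">I" if big_endian else "<I", word & 0xFFFFFFFF)


MOV_R0_R10 = 0xE1A0000A                      # mov r0,r10
print(code_bytes(MOV_R0_R10).hex())          # little-endian target
print(code_bytes(MOV_R0_R10, True).hex())    # big-endian target
```

The instruction word itself is identical in both cases; only the 
order in which its bytes are stored differs.  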


ARM2/3/6
========

The assembler takes care of some CPU-dependent restrictions: ARM2 
disallows the more advanced instructions, ARM3 allows them.  


Forth Virtual Machine Considerations
====================================

The Forth parameter stack is implemented with r13, but the name SP 
should be used instead of r13, in case the virtual machine 
implementation should change.  

The return stack is implemented with r11, and the name RP should be 
used to refer to it.  

The base address of the user area ( the user pointer) is r9 but should 
be referred to as UP. User variable number 124 (for instance) may be 
accessed with the 
    up td 124 d)
addressing mode.  There is a macro 'USER which will assemble this 
addressing mode for you.  

The interpreter pointer IP is r8.  The interpreter is 
post-incrementing, so when a code definition is being executed, IP 
points to the token after the one being executed.  A "token" is the 
number that is compiled into the dictionary for each Forth word in a 
definition.  For Risc-OS Forthmacs, a token is a 32-bit absolute 
address.  
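
The post-incrementing behaviour can be modelled in a few lines of 
Python.  This is a toy model, not Forthmacs code: the class VM, the 
three primitives and all addresses are invented for the illustration.  
It shows that a primitive such as a literal handler can rely on IP 
already pointing behind its own token.  

```python
class VM:
    """Toy model of a threaded interpreter with a post-incrementing IP."""

    def __init__(self, memory, primitives):
        self.memory = memory          # address -> 32-bit cell
        self.primitives = primitives  # token (code address) -> Python function
        self.ip = 0
        self.stack = []
        self.running = False

    def next(self):
        token = self.memory[self.ip]  # like ldr pc,[r8],#4 : fetch the token ...
        self.ip += 4                  # ... then advance IP by one cell
        return token

    def run(self, start):
        self.ip = start
        self.running = True
        while self.running:
            self.primitives[self.next()](self)


LIT, ADD, HALT = 0xA000, 0xA010, 0xA020      # made-up code addresses


def do_lit(vm):
    vm.stack.append(vm.memory[vm.ip])         # IP already points past the LIT token
    vm.ip += 4                                # skip the inline value


def do_add(vm):
    b = vm.stack.pop()
    a = vm.stack.pop()
    vm.stack.append(a + b)


def do_halt(vm):
    vm.running = False


memory = {0x100: LIT, 0x104: 2, 0x108: LIT, 0x10C: 3, 0x110: ADD, 0x114: HALT}
vm = VM(memory, {LIT: do_lit, ADD: do_add, HALT: do_halt})
vm.run(0x100)
print(vm.stack, hex(vm.ip))
```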


Assembler Glossary 
===================


________________________________________________________________________
PC                  ( -- n )                                
portable name for the PC register 


________________________________________________________________________
SP                  ( -- n )                                
portable name for the stack pointer 


________________________________________________________________________
FSP                 ( -- n )                                
portable name for the floating stack pointer 


________________________________________________________________________
UP                  ( -- n )                                
portable name for the user pointer 


________________________________________________________________________
IP                  ( -- n )                                
portable name for the instruction pointer 


________________________________________________________________________
RP                  ( -- n )                                
portable name for the return stack pointer 


________________________________________________________________________
TOP                 ( -- n )                                
portable name for the top of stack register 


________________________________________________________________________
'USER               ( --  )        'name'                   
Executed in the form: 
         top   'user <name>   ldr
<name> is the name of a User variable.  Assembles the appropriate 
addressing mode for accessing that User variable.  

In Risc-OS Forthmacs, the addressing mode for User variables is 
                 up #n d)
where #n is the offset of that variable within the User area.  


________________________________________________________________________
;CODE               ( --  )                  C,I            
semi-colon-code
                    ( --  )
Used in the form: 
                 : <name>  ... create ... ;code ... c; (or end-code)
Stops compilation, terminates the defining word <name>, executes 
ASSEMBLER, and does DO-ENTERCODE. 

When <name> is later executed in the form: 
                 <name> <new-name>
to define the word <new-name>, the later execution of <new-name> will 
cause the machine code sequence following the ;CODE to be executed.  

This is analogous to DOES>, except that the behavior of the defined 
words <new-name> is specified in assembly language instead of 
high-level Forth.  

;CODE calls DO-ENTERCODE, which is implementation specific and 
assembles the code needed to start the assembler code with the body of 
the defined word in TOP 
         top     sp      push
         top     lk      th fc000003 # bic

Note for specialists: You may do 
    ;code
      -8 ass-allot
      ...
and handle the link register and stack on your own which can be 
somewhat faster.  

See: CODE DOES> 


________________________________________________________________________
ADR                 ( rx addr --  )                         
Assembler macro with the following effect: 

addr is moved to register rx.  Within short distances this is achieved 
by a PCR instruction, otherwise it's more complicated.  

Note: The address will be relocated correctly! 


________________________________________________________________________
ALIGNING?           ( -- addr  )                            
Variable holding a flag; true means the assembler does aligning on 
its own.  Implemented for CPU-independent metacompiling.  


________________________________________________________________________
ALU-INSTRUCTIONS    ( r-dest r-op1 op2{r-op2|imm} --  )     
Available instructions with this syntax: 

AND EOR SUB RSB ADD ADC SBC RSC TST TEQ CMP CMN ORR BIC  

These instructions all have two data inputs to the ALU: the register 
r-op1 and the operand op2, which can be another register or an 8-bit 
immediate.  

The register r-op2 can be shifted as given by a shift specifier, with 
the shift count supplied either as a 5-bit integer or in another 
register.  The immediate operand can be rotated right by 
2*(4-bit-integer).  

If you give "large" literals as arguments, the assembler will generate 
the correct shifts itself.  

The # modifier declares an immediate operand, as in: 
         top r0 3 # add

The S modifier sets the flags according to the result; the 
instruction will then be ADDS instead of ADD.  
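
For readers who want to check the assembled words by hand, here is a 
Python sketch of the ARM data-processing encoding.  It is a simplified 
illustration: the function alu and its parameters are invented, 
register operands are taken without a shift, and an immediate is 
passed as an already split (imm8, rot) pair.  

```python
OPCODES = {
    "AND": 0, "EOR": 1, "SUB": 2, "RSB": 3, "ADD": 4, "ADC": 5, "SBC": 6, "RSC": 7,
    "TST": 8, "TEQ": 9, "CMP": 10, "CMN": 11, "ORR": 12, "MOV": 13, "BIC": 14, "MVN": 15,
}


def alu(op, rd, rn, operand2, immediate=False, set_flags=False, cond=0xE):
    """Encode a data-processing instruction (hypothetical helper).

    operand2 is a register number, or an (imm8, rot) pair when `immediate` is set.
    """
    word = (cond << 28) | (OPCODES[op] << 21) | (rn << 16) | (rd << 12)
    if set_flags:
        word |= 1 << 20                        # the S modifier
    if immediate:
        imm8, rot = operand2
        word |= (1 << 25) | (rot << 8) | imm8  # the # modifier
    else:
        word |= operand2                       # plain register, no shift
    return word


TOP, LK = 10, 14                               # register numbers from the table above
print(hex(alu("ADD", TOP, 0, (3, 0), immediate=True)))       # top r0 3 # add
print(hex(alu("BIC", TOP, LK, (0xFF, 3), immediate=True)))   # top lk th fc000003 # bic
```

The sketch does not set the S bit automatically for TST, TEQ, CMP and 
CMN, although those instructions always need it.  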

MOV and MVN are somewhat different: the operand r-op1 isn't needed.  
Also, both can handle "big" immediates themselves, 
         top th 12345678 # mov
won't be a problem, MOV assembles all instructions needed.  
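
One possible way to break such a constant into pieces that each fit a 
single instruction (a MOV followed by ORRs, say) is sketched below in 
Python.  This is an assumption about the general technique, not a 
description of what Forthmacs actually assembles; split_constant and 
is_encodable are made-up names.  

```python
def is_encodable(value: int) -> bool:
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= 0xFFFFFFFF
    for shift in range(0, 32, 2):
        rotated = ((value << shift) | (value >> (32 - shift))) & 0xFFFFFFFF
        if rotated <= 0xFF:
            return True
    return False


def split_constant(value: int):
    """Split a 32-bit constant into at most four encodable chunks."""
    value &= 0xFFFFFFFF
    chunks = []
    while value:
        low = (value & -value).bit_length() - 1   # index of the lowest set bit
        low &= ~1                                 # rotations must be even
        chunk = value & ((0xFF << low) & 0xFFFFFFFF)
        chunks.append(chunk)
        value &= ~chunk                           # remove what this chunk covers
    return chunks


parts = split_constant(0x12345678)
print([hex(p) for p in parts])
```

OR-ing the chunks together gives back the original constant.  The 
value 0 produces an empty list and would need a single MOV of 0.  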

CMP and CMN can both handle negative immediate operands; they try 
to find out which operand is possible.  


________________________________________________________________________
ASS-ALLOT           ( n --  )                deferred       
Allocates n bytes in the dictionary.  The address of the next 
available dictionary location is adjusted accordingly.  

The default is ALLOT; implemented for (CPU-independent) metacompiling.  


________________________________________________________________________
ASSEMBLER           ( --  )                                 
Execution replaces the first vocabulary in the search order with the 
ASSEMBLER vocabulary, making all the assembler words accessible.  


________________________________________________________________________
BIG-ENDIAN          ( -- )                                  
Switches assembler to big-endian target code 


________________________________________________________________________
BRANCH              ( addr --  )                            
Assembles a branch instruction to addr.  Can be modified by DOLINK and 
all condition codes.  


________________________________________________________________________
BYTE                ( -- )                                  
Modifier for the assembler; memory accesses are byte wide.  


________________________________________________________________________
CODE                ( --  )        'name'    M              
A defining word executed in the form: 
                 code <name> ... end-code or c;
Creates a dictionary entry for <name> to be defined by a following 
sequence of assembly language words.  Words thus defined are called 
code definitions or primitives.  Executes ASSEMBLER and sets the 
opcode defaults .  

This is the most common way to begin assembly.  


See: END-CODE C; 


________________________________________________________________________
CODE!               ( n addr --  )           Deferred       
Stores a 32-bit word into the code at addr.  

This word is deferred so that the metacompiler may change it to 
assemble code into the target dictionary rather than the resident 
dictionary.  It also handles little/big endian target code.  


________________________________________________________________________
CODE,               ( n --  )                Deferred       
Places n in the dictionary at ( assemblers ) HERE and ASS-ALLOTs 
enough space for a word.  

This word is deferred so that the metacompiler may change it to 
assemble code into the target dictionary rather than the resident 
dictionary.  It also handles little/big endian target code.  


________________________________________________________________________
C;                  ( --  )                                 
c-semi-colon
Terminates the current code definition and allows its name to be found 
in the dictionary.  

Sets the CONTEXT vocabulary to be same as the CURRENT vocabulary 
(which removes the ASSEMBLER vocabulary from the search order, unless 
you have explicitly done something funny to the search order while 
assembling the code).  

Executes NEXT to assemble the "next" routine at the end of the code 
word being defined.  The "next" routine causes the Forth 
interpreter to continue execution with the next word.  


This is the most common way to end assembly; it calls END-CODE. 


________________________________________________________________________
CONDITIONS          ( --  )                                 
Every instruction is executed only if its condition is met.  The 
assembler's default is AL (always), but these are also available: 

EQ NE CS CC MI PL VS VC HI LS GE LT GT LE AL 


________________________________________________________________________
DECR                ( reg n# --  )                          
Macro, n# will be subtracted from reg.  


________________________________________________________________________
DOLINK              ( --  )                                 
modifier for BRANCH instruction, the current pc will be saved to the 
link register.  


________________________________________________________________________
END-CODE            ( --  )                                 
Terminates a code definition and allows the <name> of the 
corresponding code definition to be found in the dictionary.  

The CONTEXT vocabulary is set to the same as the CURRENT vocabulary 
(which removes the ASSEMBLER vocabulary from the search order, unless 
you have explicitly done something funny to the search order while 
assembling the code).  

The NEXT routine is not automatically added to the end of the code 
definition.  Usually you want NEXT to be at the end of the definition, 
but sometimes the last thing in the definition is a branch to 
somewhere else, so the NEXT at the end is not needed.  


See: C; 


________________________________________________________________________
ENTERCODE           ( --  )                                 
Starts assembling after stack checking, setting the assembler defaults 
and switching to ASSEMBLER. 


________________________________________________________________________
GET-LINK            ( reg --  )                             
Assembler macro, equivalent to: 
    reg lk th fc000003 # bic
This is useful to get the address after a branch instruction.  
    xxxxx dolink branch  ---+
      A) data ...           |
                            |
                            |
      B) top get-link  <----+
So after branching to B), TOP will be set to A) 
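
The effect of the mask can be checked with a small Python sketch.  It 
assumes the combined PC/status layout of the 26-bit ARM CPUs, where 
bits 31..26 of the link register hold the flags and bits 1..0 the 
processor mode; the function name get_link simply mirrors the macro 
and the sample value is invented.  

```python
PSR_MASK = 0xFC000003   # flags N Z C V I F (bits 31..26) and mode (bits 1..0)


def get_link(lk: int) -> int:
    """Strip flags and mode bits from a link register value, leaving the address."""
    return lk & ~PSR_MASK & 0xFFFFFFFF


# Link value with Z and C set, processor mode 3, return address 0xA150.
print(hex(get_link(0x6000A153)))
```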


________________________________________________________________________
INCR                ( reg n# --  )                          
Macro, n# will be added to reg.  


________________________________________________________________________
LABEL               ( --  )        'name'    F83            
A defining word used in the form: 
    label <name> ... end-code
    label <name> ... c;
Creates a dictionary entry for <name> consisting of a following 
sequence of assembly language words.  When <name> is later executed, 
the address of the first word of the assembly language sequence is 
left on the stack.  


See: END-CODE 


________________________________________________________________________
LDM                 ( rx1 rx2 .. rxn  n#  r-adr --  )       
Load multiple registers from the address pointed to by r-adr; an 
addressing mode must be given.  

The register list is given by all register names (don't name a 
register twice) and the number of registers.  
     r0 r1 r2 r3 4   sp ia! ldm
This loads registers r0-r3 from the stack and sets the stack pointer 
to the next stack entry.  
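
The register list ends up as a 16-bit mask in the instruction word.  
The following Python sketch of the ARM block transfer encoding is an 
illustration only; block_transfer and its parameters are made-up names 
and not part of the assembler.  

```python
MODES = {"da": (0, 0), "ia": (0, 1), "db": (1, 0), "ib": (1, 1)}   # mode -> (P, U) bits


def block_transfer(load, regs, base, mode, cond=0xE):
    """Encode LDM (load=True) or STM (load=False); hypothetical helper."""
    writeback = mode.endswith("!")
    p, u = MODES[mode.rstrip("!")]
    if len(set(regs)) != len(regs):
        raise ValueError("a register must not be named twice")
    reglist = 0
    for r in regs:
        reglist |= 1 << r                       # one bit per register
    return ((cond << 28) | (0b100 << 25) | (p << 24) | (u << 23)
            | (int(writeback) << 21) | (int(load) << 20) | (base << 16) | reglist)


SP = 13
print(hex(block_transfer(True, [0, 1, 2, 3], SP, "ia!")))   # r0 r1 r2 r3 4 sp ia! ldm
```

Because the list is a bit mask, the order in which the registers are 
named has no influence on the assembled word.  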


See: LDR STM 


________________________________________________________________________
LDR                 ( r-data r-adr operand2 --  )           
r-data is read from memory.  The default is word (32-bit) wide, but 
the modifier BYTE selects byte-wide access.  

The address is calculated using r-adr and the operand2.  It can be 
another register (the shift specified as usual by a 5-bit literal and 
a shift type) or a 12-bit immediate offset.  

operand2 can be added to or subtracted from r-adr according to the 
addressing mode defined by two letters.  The first tells whether 
(i)ncreasing or (d)ecreasing should be used, the second whether the 
in/decreasing takes place (b)efore or (a)fter the memory access.  A 
"!" at the end means that write-back will take place.  So these modes 
are possible 
         da  ia  db  ib    \ decrease/increase after/before
         da! ia! db! ib!   \ as above plus write-back


Some macros make life a bit easier.  They are somewhat 68k-like and 
must follow a BYTE modifier, because an offset will be calculated 
by the assembler itself.  

    : )      0 #   ib ;
    : )+     @increment  ia ;
    : )-     @increment  da ;
    : -(     @increment  db! ;
    : +(     @increment  ib! ;
    
    : d)     dup abs # offset?  swap 0<  if db  else ib  then ;
    : d)!    dup abs # offset?  swap 0<  if db! else ib! then ;
    : push   -( str ;
    : pop    )+ ldr ;
Examples: 
      top  r6 byte )+ ldr
      top  up 8 d)    ldr
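
To relate the addressing modes to the assembled words, here is a 
Python sketch of the ARM single data transfer encoding with an 
immediate offset.  It is an illustration under simplifying 
assumptions (no register offsets, no T modifier); single_transfer is 
a made-up name.  The results can be compared with the disassembly of 
count shown earlier.  

```python
MODES = {"da": (0, 0), "ia": (0, 1), "db": (1, 0), "ib": (1, 1)}   # mode -> (P, U) bits


def single_transfer(load, rd, rn, offset, mode, byte=False, cond=0xE):
    """Encode LDR/STR with a 12-bit immediate offset (hypothetical helper)."""
    if not 0 <= offset <= 0xFFF:
        raise ValueError("offset must fit in 12 bits")
    p, u = MODES[mode.rstrip("!")]
    # Post-indexed forms (P=0) always write back; the W bit is only set for pre-indexed "!".
    w = int(mode.endswith("!") and p == 1)
    return ((cond << 28) | (0b01 << 26) | (p << 24) | (u << 23) | (int(byte) << 22)
            | (w << 21) | (int(load) << 20) | (rn << 16) | (rd << 12) | offset)


TOP, UP, SP = 10, 9, 13
print(hex(single_transfer(True, TOP, 6, 1, "ia", byte=True)))   # top r6 byte )+ ldr
print(hex(single_transfer(True, TOP, UP, 8, "ib")))             # top up 8 d) ldr
print(hex(single_transfer(False, 0, SP, 4, "db!")))             # r0 sp push
```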


See: STR 


________________________________________________________________________
LITTLE-ENDIAN       ( -- )                                  
Switches assembler to little-endian target code 


________________________________________________________________________
MLA                 ( r-dest r-op1 r-op2 )                  
Assembles a multiply-and-accumulate instruction.  


________________________________________________________________________
MUL                 ( r-dest r-op1 r-op2 )                  
Assembles a multiply instruction.  


________________________________________________________________________
NEXT                ( --  )                                 
Assembler macro which assembles the NEXT routine, which is the Forth 
address interpreter.  

In Risc-OS Forthmacs this is one single instruction.  
         pc  ip  )+  ldr


________________________________________________________________________
NOP                 ( --  )                                 
Assembler macro, equivalent to 
    r0 r0 mov


________________________________________________________________________
PCR                 ( addr -- pc offset  )                  
Assembler macro, expects an address on the stack and calculates its 
address offset from PC. The addressing mode is also set.  


________________________________________________________________________
RETURN              ( --  )                                 
macro for 
    pc  lk  mov


________________________________________________________________________
S                   ( --  )                                 
Modifier; the instruction will set the flags according to the result.  
This is the default for tst, teq, tstp, teqp, cmp, cmn, cmpp and cmnp.  


________________________________________________________________________
STM                 ( rx1 rx2 .. rxn  n#  r-adr --  )       
Store multiple registers to the address pointed to by r-adr; an 
addressing mode must be given.  


See: LDM for more details.  


________________________________________________________________________
STR                 ( r-data r-adr operand2 --  )           
r-data is stored to memory.  The default is word (32-bit) wide, but 
the modifier BYTE selects byte-wide access.  


See: LDR 


________________________________________________________________________
SWI                 ( swi# --  )                            
assembles a swi instruction, the number is swi#.  


________________________________________________________________________
SWIX                ( swi# --  )                            
assembles a swix instruction, the number is swi#.  
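
The difference between the two can be sketched in Python.  Under 
RISC OS the error-returning "X" form of a SWI is selected by bit 17 of 
the SWI number; that SWIX does nothing more than set this bit is an 
assumption here, and the function names just mirror the assembler 
words.  

```python
X_BIT = 0x20000   # RISC OS: bit 17 of the SWI number selects the "X" form


def swi(number: int, cond: int = 0xE) -> int:
    """Encode a SWI instruction with a 24-bit SWI number."""
    return (cond << 28) | (0xF << 24) | (number & 0xFFFFFF)


def swix(number: int, cond: int = 0xE) -> int:
    """Encode the same SWI with the X bit set (assumed behaviour of SWIX)."""
    return swi(number | X_BIT, cond)


OS_WRITEC = 0x00                  # RISC OS SWI number of OS_WriteC
print(hex(swi(OS_WRITEC)), hex(swix(OS_WRITEC)))
```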


________________________________________________________________________
SWP                 ( r-dest r-base r-source --  )          
Assembles a swp instruction, provided ARM3 code has been allowed by ARM3. 


________________________________________________________________________
T                   ( --  )                                 
modifier, force -T pin.  


________________________________________________________________________
^                   ( --  )                                 
modifier, force access to user-mode registers.  

