INDEX
~~~~~

     1   ASM COMMAND SYNTAX

     2   SOURCE FORMAT
           Instruction Set
           Symbols in ASM
           Areas
           Close and Far Symbols
           Limited-scope labels
           Other Exotica

     3   STRINGS, CHARACTER CONSTANTS AND NUMERICS
           Arithmetic expressions
           Assembly-time functions

     4   MACROS
           Labels as macro parameters

     5   CONDITIONAL ASSEMBLY

     6   STRUCTURES
           Defining structures
           Using structures in constant expressions
           Using structures in address expressions

     7   NEW FEATURES

     8   WARNING

     9   DIRECTIVES


1    ASM COMMAND SYNTAX
     ~~~~~~~~~~~~~~~~~~
        ASM is a (mostly) full function assembler for Acorn ARM-based computers running RISCOS,
producing an AOF (Acorn Object Format) file suitable for use with the linker.

        To install ASM, simply move the file "ASM" into your Library directory.

        To assemble a file, use the command:

        ASM [options] <sourcefile>

        where <sourcefile> is the name of the file to be assembled. If no directory is specified,
ASM will look for the source file in sub-directory A. For compatibility with C, the filename
"SOURCE.A" is taken to refer to "A.SOURCE". The output will have the same leaf name as the
sourcefile, and be stored in sub-directory O.
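
        The naming rules above can be sketched in a few lines of Python. This is only an
illustration of the text, not ASM's own code: the function name resolve_names is hypothetical,
and matching the ".A" suffix case-insensitively is an assumption.

```python
def resolve_names(sourcefile):
    """Return (source path, object path) for an ASM command-line filename.

    RISC OS uses '.' as the directory separator, so 'A.tharg' means
    the file 'tharg' in sub-directory 'A'.
    """
    parts = sourcefile.split(".")
    if len(parts) == 1:
        # No directory given: look in sub-directory A.
        leaf = parts[0]
        source = "A." + leaf
    elif len(parts) == 2 and parts[1].upper() == "A":
        # C-style 'SOURCE.A' is taken to mean 'A.SOURCE'.
        # (Case-insensitive matching of the suffix is an assumption.)
        leaf = parts[0]
        source = "A." + leaf
    else:
        # A directory was specified: use the name as given.
        leaf = parts[-1]
        source = sourcefile
    # The object file keeps the leaf name and goes in sub-directory O.
    return source, "O." + leaf
```

For instance, resolve_names("tharg") gives ("A.tharg", "O.tharg").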

        Options are used to modify the standard behaviour of the assembler, and are introduced by
a hyphen, the option name, and any allowed parameters. So long as enough letters of the option
name are given to preclude ambiguity, any number of letters may be given; all options are case-
insensitive. In the list of options below, the upper case part of the name indicates the minimum
that can be specified to preclude ambiguity.

Option  Effect

-Output <filename>      The assembled code is written to <filename> rather than the
                        default. If no directory is given, subdirectory O is assumed.

-List [<filename>]      An assembler listing is produced. If no filename is given, the
                        file is given the same name as the sourcefile. By default, the
                        file is produced in subdirectory L.

-Hex                    Include the generated hex code in the -List file. If -List is
                        not explicitly specified, this option implies it.

-Include <pathlist>     Specifies a search path for use in the 'INCLUDE' directive (qv).
                        <pathlist> is a comma-separated list of directories (or RISCOS
                        symbols) to be searched for the file.

                        This is a slight deviation from the method used by the C compiler,
                        which allows the -I option (which serves the same purpose) to be
                        specified several times, each use extending the search list.
                        ASM only allows a single use of -Include, where all the paths are
                        specified at once.

-VErbose                Output informational messages describing the current phase of assembly.

-VAlidate               The List file produced contains a disassembly of the generated code
                        rather than the source code. It is not anticipated that users will
                        make much use of this. As for -Hex, -VAlidate implies -List.

-Define <flags>         Defines a number of flags to be set before beginning the assembly, for
                        use in the IFDEF/IFNDEF directives. <flags> is a comma-separated
                        list of flag names. Note that these flags are NOT assembler symbols,
                        and cannot be used as such - they are only recognized by IFDEF and
                        IFNDEF. Flags set in this way will override any DEFINE or UNDEF
                        directives in the source code.

-Undefine <flags>       As for -define, but will unset the flag. This will override any DEFINE
                        or UNDEF directives in the source code.

-Flags <flags>          A synonym for -define. Deprecated, but retained for backwards compatibility.

-Nocase                 Forces ASM to use case-insensitive name matching for user-defined
                        symbols. This is now equivalent to using -pragma c0 (see below).

-Throwback              Enables the Acorn desktop throwback error-reporting mechanism.

-PROcessor <CPU>        Specifies the target processor. This constrains which of the extended
                        instruction set operations are available. Legal values for <CPU> are:
                                ARM2, ARM3, ARM6, ARM7 and StrongARM.

-PRAgma <pragmalist>    Specify a set of flags that control the way the assembler works. For
                        a definition of the available pragmas, see the entry for PRAGMA in
                        the assembler directives part of this document.
                        eg:  -pragma c0,l0

-Help                   Produce brief help text for the ASM syntax and options.
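
        The abbreviation rule can be modelled as a case-insensitive unique-prefix match.
The Python below is a sketch under that assumption; match_option and OPTIONS are hypothetical
names, and ASM's real matching code may differ.

```python
# Option names from the table above, in lower case.
OPTIONS = ["output", "list", "hex", "include", "verbose", "validate",
           "define", "undefine", "flags", "nocase", "throwback",
           "processor", "pragma", "help"]

def match_option(arg):
    """Return the full option name for a (possibly abbreviated) '-xxx' argument."""
    if not arg.startswith("-") or len(arg) < 2:
        raise ValueError("not an option: " + arg)
    prefix = arg[1:].lower()              # options are case-insensitive
    hits = [name for name in OPTIONS if name.startswith(prefix)]
    if len(hits) == 1:
        return hits[0]
    if not hits:
        raise ValueError("unknown option: " + arg)
    raise ValueError("ambiguous option: " + arg)
```

Under this model "-VE" selects -VErbose and "-pra" selects -PRAgma, while "-v" and "-pr" are
rejected as ambiguous. Note that a pure prefix match also treats "-h" as ambiguous between
-Hex and -Help; how ASM itself resolves that case is not stated here.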


Examples

        ASM tharg

        will assemble the code in A.tharg and produce an object file O.tharg

        ASM -hex -flags debug,check -include <C$UserLibRoot>,<ASM$MylibRoot> -pragma c0,l1 eric

        will assemble the file A.eric and produce an object file O.eric; a listing file called
L.eric will be produced containing the hex code generated. Any 'INCLUDE' statements will look in
directories <C$UserLibRoot> and <ASM$MylibRoot> for the include files. The conditional assembly
flags debug and check will be treated as set. The assembler c and l pragmas are set to off and on
respectively.


2    SOURCE FORMAT
     ~~~~~~~~~~~~~
        It is assumed that the user is fully familiar with ARM assembly language (what are you
doing with ASM if you aren't?) and therefore this note contains no tutorial-type introduction;
only the differences between the ASM format and that in use in the BASIC assembler are detailed.

        It is recommended that users of ASM be familiar with the "RISCOS Programmer's Reference
Manual" (especially those sections on the linker and Acorn Procedure Calling Standard) and Peter
Cockerell's "ARM Assembly Language Programming".

        Throughout, mnemonics and their options follow the de facto standard, that in Cockerell's
book. For certain FP coprocessor instructions this differs slightly from that used by the
disassembler (SWI "Debugger_Disassemble"), which seems to contain a bug anyway.

        The general format of a line is:

        [<label-part>]  [<mnemonic-part>]  [<comment-part>]

        Any or all of these parts are optional. Unlike the BASIC assembler, there is no facility
to put multiple mnemonics on a single line by separating them with a colon.

<label-part>    A label is any legal ASM Symbol (see below), terminated by a colon.

<mnemonic-part> This part is any ARM instruction mnemonic (eg ADDS, STMFD), generic coprocessor
                mnemonic (eg MCR, CDPEQ), or any Floating Point coprocessor mnemonic (eg MUFD,
                FIXS), plus any associated parameters.
                Alternatively, this could be a pseudo-operation (ADR or NOP), or any ASM
                directive (eg IFDEF, MACRO).

<comment-part>  A comment is introduced by a semi-colon. Any text following a semi-colon on a
                line is ignored.

In addition to the above format, ASM will also accept constant definitions of the form:

        <label> = <constant-expression>
and     <label> = <register-expression>

where <label> is a legal ASM identifier (without the colon this time),  <constant-expression>
evaluates to an integer, and <register-expression> is one of the built-in register symbols
(see Symbols in ASM below).

The following code fragment demonstrates both possible formats (it actually performs a 64-bit
integer addition):

        result = 0
        lhs = result + 1
        rhs = lhs + 2
        
                EXPORT  Long_Add        ; Make the function external
Long_Add:       STMFD   sp!,{lhs ,lhs+1,link}
                ADDS    lhs,lhs,rhs
                ADCS    lhs+1,lhs+1,rhs+1
                BVS     overflow
                STMIA   result,{lhs ,lhs+1}
                LDMFD   sp!,{lhs ,lhs+1,pc}^
overflow:       ADR     R0,oflerr
                SWI     "OS_GenerateError"
oflerr:         DCD     &901
                DCS     "Arithmetic overflow"
                DCB     0
                ALIGN

                etc

Instruction Set
~~~~~~~~~~~~~~~
        The additional instructions introduced by the ARM3, ARM6, ARM7 and StrongARM series
processors are now supported. ASM must be informed of the architecture for which the code is
targeted, using the '-processor <CPU>' qualifier. If this qualifier is not present, ASM
assumes an ARM3 target. The use of an opcode that is not available on the specified target
processor will generate an error. The extra instructions available are:

            Mnemonic         Earliest        Description
                              Target
            SWP                ARM3     Single data swap
            MRS                ARM6     Processor Status Register transfer (PSR->Register)
            MSR                         Processor Status Register transfer (Register->PSR)
            UMULL              ARM7     Unsigned 64-bit multiply
            SMULL                       Signed 64-bit multiply
            UMLAL                       Unsigned 64-bit multiply with accumulate
            SMLAL                       Signed 64-bit multiply with accumulate
            LDRH/STRH        StrongARM  Half-word load/store
            LDRSH/STRSH                 Signed half-word load/store
            LDRSB/STRSB                 Signed byte load/store
            
        NB. The 64-bit multiply instructions only work in 32-bit modes, while RISCOS works in
26-bit mode, which makes their use in ASM somewhat moot.


Symbols in ASM
~~~~~~~~~~~~~~
        Symbols (Identifier names) in ASM may contain the following characters: A-Z, a-z, 0-9,
underscore (_) and dollar ($). These may be in any order, with the exception that an identifier
cannot begin with a digit. Alternatively, ASM will now accept the ObjAsm syntax of a vertical
bar ("|"), any sequence of printable characters (other than whitespace), and another bar.
For example, |This.is.a.legal.identifier|


        There are several classes of identifiers in ASM :

Constants       Constants are defined by the use of the <constant> = <expression> format. The
                expression must evaluate to an integer, and may contain other constants. For
                example, the symbol "result" in the example above is a constant.

Local Symbols   A local symbol is a label. The symbols "overflow" and "oflerr" in the example
                above are local symbols. Local symbols may be either "Close" or "Far": see below.

Global Symbols  A global symbol is similar to a local symbol, but one that has been made
                externally visible at the link stage by the use of the EXPORT directive. Global
                Symbols may be "Close" or "Far".

Externals       An external symbol is similar to a label, but is not defined within this assembly
                unit. It is introduced by the IMPORT directive. References to External symbols
                must be resolved by the linker. External Symbols are always "Far".

User-defined symbols are case-sensitive (ie "SYMBOL" is not the same as "symbol") unless the
-nocase qualifier has been specified on the command line or PRAGMA C0 has been used.

In addition to the above user-defined symbols, ASM provides a set of built-in symbols of special
types, which are case-insensitive. These symbols refer specifically to registers, either in the
ARM itself or in the FP coprocessor.

These built-in symbols are:

r0-r15  ARM registers 0 to 15.
sp      Stack Pointer. Set to R13.
link    Link register (R14).
pc      Program Counter/Status register (R15).
f0-f7   FP Coprocessor register 0 to 7.

a1-a4   )
v1-v6   )
ip      )  These are defined as part of the Acorn Procedure Calling Standard.
fp      )  ASM binds them to their RISCOS values (APCS-R).
sp      )
lr      )
pc      )

        ASM tries to be as flexible as possible with its parsing: where a mnemonic requires an
ARM register name, ASM will accept any of the register constants above (except the FP registers
f0-f7) OR an integer expression which evaluates to a number in the range 0-15. Where a FP
coprocessor register is required, ASM will accept the register symbols f0-f7, or an integer
expression which evaluates to a number in the range 0-7. However, if a constant is required, ASM
will reject the use of any of the register symbols.
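
        As an illustration of the operand rule above, the following Python sketch resolves an
ARM register operand given either a built-in symbol or an already-evaluated integer. The names
ARM_REGISTERS, FP_REGISTERS and arm_register are hypothetical; the a1-a4, v1-v6, fp and ip
numbers are the standard APCS-R bindings.

```python
ARM_REGISTERS = {f"r{n}": n for n in range(16)}
ARM_REGISTERS.update({"sp": 13, "link": 14, "lr": 14, "pc": 15})
# APCS-R bindings: a1-a4 are r0-r3, v1-v6 are r4-r9.
ARM_REGISTERS.update({f"a{n}": n - 1 for n in range(1, 5)})
ARM_REGISTERS.update({f"v{n}": n + 3 for n in range(1, 7)})
ARM_REGISTERS.update({"fp": 11, "ip": 12})
FP_REGISTERS = {f"f{n}": n for n in range(8)}

def arm_register(operand):
    """Resolve an ARM register operand: a built-in symbol or an integer 0-15."""
    if isinstance(operand, int):
        if 0 <= operand <= 15:
            return operand
        raise ValueError("register number out of range")
    name = operand.lower()                # built-in symbols are case-insensitive
    if name in FP_REGISTERS:
        raise ValueError("FP register not allowed here")
    return ARM_REGISTERS[name]
```

With lhs defined as 1, as in the Long_Add fragment earlier, arm_register(1 + 1) yields
register 2, which is how an operand such as "lhs+1" names R2.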


Areas
~~~~~
        When the linker is combining several AOF files, it does so on the basis of "Areas". An
area is a named chunk of contiguous memory with associated attributes (eg area contains code,
area contains data, area is readonly etc) (See the AREA directive). ASM allows multiple areas to
be created in a single source file, but the ordering of areas in the final executable image is
wholly the responsibility of the linker. This leads to the concept of Close and Far symbols.


Close and Far Symbols
~~~~~~~~~~~~~~~~~~~~~
        While some ARM mnemonics allow access to the entire address space (eg BL A_Sub) others
allow only a restricted window into the address space (eg PC-relative addressing, such as
LDR R0,Tharg). Where the entire address space is legal, ASM will accept either a symbol which is
close to the instruction or one that is far away. Where only a restricted address space is
available, ASM will accept only a close symbol.

        ASM regards a symbol as "Close" if and only if it is a local or global symbol, AND ITS
DEFINING POINT IS IN THE SAME AREA AS THE INSTRUCTION THAT REFERENCES IT. All other symbols are
"Far".

        For example :
        AREA    ASM$$CodeA,Code
Y:      etc
        AREA    ASM$$CodeB,Code,ReadOnly
        etc
        BL      X       ; Legal.
        BL      Y       ; Legal. Y is Far, but allowed.
        ADR     R0,X    ; Legal. X is Close
        LDR     R0,Y    ; Illegal. Y is Far & address space not available
        etc
X:      MOV     r0,r1
        etc

        The user need not particularly worry about close and far symbols; ASM handles this
automatically. It is included in this documentation simply to explain the cause of the error
"Local Symbol Expected" that the above code fragment will generate, despite the fact that the
symbol 'Y' is defined locally.
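
        The Close/Far test can be expressed as a one-line comparison. The Python below is a
sketch of the rule as stated in this section; reference_error and its parameters are
hypothetical names, not part of ASM.

```python
def reference_error(symbol, ref_area, defined_in, needs_close):
    """Return None if the reference is legal, else the error text.

    defined_in  -- maps each label defined in this file to its area name
                   (IMPORTed externals are simply absent)
    needs_close -- True for restricted-range uses such as ADR or LDR Rn,label
    """
    # A symbol is Close only if it is defined in this file AND in the
    # same area as the referencing instruction; everything else is Far.
    close = defined_in.get(symbol) == ref_area
    if needs_close and not close:
        return "Local Symbol Expected"
    return None
```

Applied to the fragment above, with Y defined in ASM$$CodeA and X in ASM$$CodeB, "BL Y" is
accepted because BL does not need a Close symbol, while "LDR R0,Y" is rejected.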


Limited-scope labels
~~~~~~~~~~~~~~~~~~~~
        A symbol of the form $<number> has special scoping rules - the visibility of such labels
is delimited by global symbols. This allows the same label to be multiply-defined, if the two defining
points are separated by the defining point of a global symbol. For example:

        EXPORT  A,B

A:      STMFD   sp!,{r0,r2,link}
        <some code>
        B       $1
        <some code>
$1:     MOV     R0,R1
        LDMFD   sp!,{r0,r2,PC}
        
B:      CMP     R0,#1
        BEQ     $1
        <some code>
$1:
        <some code>
        MOV     PC,link

        As A and B are both global identifiers, the two "$1" labels are legal. The first can be
referenced anywhere after the A label and before the B label, while the second exists between B
and the end of the fragment.

        Note, however, that the use of such labels may require some caution. Firstly, if B is
not global the above code fragment will generate an "Illegal symbol redefinition" error for the
second $1. The second, and perhaps more troublesome, pitfall is illustrated below:

        EXPORT A
A:      STMFD   sp!,{r0,r2,link}
        <some code>
        B       $1
        <some code>
$1:     MOV     R0,R1
        LDMFD   sp!,{r0,r2,PC}
        
B:      CMP     R0,#1
        BEQ     $2
        <some code>
        B       $1
$2:
        <some code>
        MOV     PC,link

        This code fragment will assemble correctly as it stands. If, however, the source is
modified to export B as well as A it will no longer assemble - the "B  $1" instruction will fail
as $1 is no longer visible at that point (exporting B has reduced the scope of $1).

        The purpose of limited scope labels is to make it easier for a single source file to be
extended or merged with other source files - there is less chance of an identifier clash if
such labels are used rather than normal labels (which have a scope of the entire source file).

        A note for advanced users: limited scope labels are implemented by prepending the last
global identifier and a full stop to the label name (in the example above, $1 is actually stored
internally as A.$1). Should it be necessary to make B global, the code fragment will still
assemble if the

        B        $1

instruction is changed to 

        B        |A.$1|

        NB: If you do not like limited scope identifiers they can be disabled by the use of the
PRAGMA directive.
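
        The internal naming scheme described in the note for advanced users can be modelled as
follows. This Python sketch is illustrative only: internal_names is a hypothetical helper, and
leaving a $<number> label unchanged when no global label precedes it is an assumption.

```python
import re

def internal_names(labels, exported):
    """Map each label definition, in source order, to its internal name.

    labels   -- label names in order of definition
    exported -- set of names made global with EXPORT
    """
    scope = None                          # most recent global label seen
    seen = set()
    result = []
    for name in labels:
        if name in exported:
            scope = name
        if re.fullmatch(r"\$\d+", name) and scope is not None:
            name = scope + "." + name     # e.g. $1 becomes A.$1
        if name in seen:
            raise ValueError("Illegal symbol redefinition: " + name)
        seen.add(name)
        result.append(name)
    return result
```

With both A and B exported, the labels A, $1, B, $1 become A, A.$1, B, B.$1. With only A
exported, the second $1 also maps to A.$1 and the redefinition error is raised.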

        
Other Exotica
~~~~~~~~~~~~~
        Where a PC-relative address is allowed, ASM supports an additional syntax, as in:

        B       *+12
or      LDR     R4,*+8

        In this notation, the asterisk * means "the address of the current instruction" (Not the
PC - allowing for pipelining, the PC will be at *+8). ASM resolves this as a PC-relative address.
Its use is not particularly recommended, as local code changes may require that the offset be
manually changed (whereas labels are self-adjusting). However, for short jumps it does negate the
need for a label, which is why it is provided.
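
        To make the relationship between * and the PC concrete, the Python sketch below works
out the 24-bit offset field that a "B *+n" instruction would carry, using the standard ARM
branch encoding (a signed word offset measured from the instruction address plus 8). The
function name is hypothetical.

```python
def branch_offset_field(instr_addr, star_offset):
    """24-bit offset field for 'B *+star_offset' assembled at instr_addr."""
    target = instr_addr + star_offset     # '*' is the instruction's own address
    pc = instr_addr + 8                   # pipelining: the PC reads as * + 8
    delta = target - pc
    if delta % 4:
        raise ValueError("branch target must be word aligned")
    return (delta >> 2) & 0xFFFFFF        # signed word offset, kept to 24 bits
```

So "B *+12" encodes an offset of 1 word and "B *+8" an offset of 0. By the same reasoning
"LDR R4,*+8" addresses the word at the PC itself, ie a PC-relative offset of zero.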

        The '@' character also has a special meaning if it is immediately followed by an identifier
(no intervening spaces are allowed). It has the effect of converting the identifier into a string
containing the name of the identifier. For example:

l1:     DCSZ    @FindLeaf

is equivalent to

l1:     DCSZ    "FindLeaf"

        This may seem like a fairly bizarre thing to wish to do, but it is potentially useful in
macros. As an example, the C compiler normally generates a brief preamble to each function giving
the function name, and (in the word preceding the function entry point) a marker that points to
the name. This is used so the run-time system can find function names when generating traceback
information. This can be duplicated in ASM using a macro as follows:

        MACRO   GLOBAL,func
        DCSZ    @func
        ALIGN
        DCD     (((size(@func)>>2)+1)<<2) + &ff000000
        EXPORT  func
proc:   ENDM

(For the meaning of the complex expression in the final DCD, see the following section on
expression evaluation.)

This can then be used in the body of the code as in:

        GLOBAL crc_check

which will be expanded to

        DCSZ     "crc_check"
        ALIGN
        DCD      (((size("crc_check")>>2)+1)<<2) + &ff000000  ; = &ff00000c
        EXPORT   crc_check
crc_check:
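
        The value of the marker word can be checked by hand or with a few lines of Python.
The helper name_marker below is hypothetical; it simply repeats the arithmetic of the DCD
expression.

```python
def name_marker(func_name):
    """Value stored by the DCD in the GLOBAL macro for a given function name."""
    size = len(func_name)                 # size("crc_check") is 9
    padded = ((size >> 2) + 1) << 2       # name plus terminator, rounded up to words
    return padded + 0xFF000000
```

For "crc_check" the name is 9 characters, so 9>>2 is 2, adding 1 gives 3 words, and 3<<2 is
12 bytes: the ten bytes stored by DCSZ plus two bytes of padding from ALIGN.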


3    STRINGS, CHARACTER CONSTANTS AND NUMERICS
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        ASM differentiates strongly between strings and character constants. A string is enclosed
in double quotation marks ("), and can be of any length. A character constant is enclosed in
single quotation marks (') and its maximum length depends on the current context:

        MOV     R0,#'A'  ; Move ASCII 'A' (65) into R0
        MOV     R0,#"A"  ; Will generate an assembly error.
        DCW     'AB'

        The last statement above will store two bytes, ASCII A followed by ASCII B.

        ASM, like C, uses the backslash (\) as an escape character to store odd values in
strings:

            \n      Newline
            \r      Carriage Return
            \t      Tab
            \'      Single Quote
            \"      Double Quote
            \\      Backslash
            \xhh    Hex code hh. eg \x0E is ASCII 14 (<CTRL>N).

        Integers may be specified in a number of ways: & treats the characters following as a
hexadecimal number, % treats it as binary, no character treats it as base 10.

        ADD     R0,R0,#&F0      ; Add F0 Hexadecimal (240 decimal) to R0
        SUB     R1,R1,#%1100    ; Subtract 1100 binary (12 decimal) from R1
        MOV     R2,#15          ; Move 15 decimal into R2
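
        The three integer notations map directly onto number bases, as this small Python
sketch shows (parse_integer is a hypothetical helper, not an ASM function):

```python
def parse_integer(text):
    """Convert an ASM-style integer literal to its value."""
    if text.startswith("&"):
        return int(text[1:], 16)          # & introduces hexadecimal
    if text.startswith("%"):
        return int(text[1:], 2)           # % introduces binary
    return int(text, 10)                  # no prefix means base 10
```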

        Floating point numbers may only be specified in base 10. They may include both decimal
points and exponents. However, there must be at least one digit before either the decimal point
or the exponent.

        ADFD    f0,f0,#3.0      ; Add 3.0 to f0.
        DCFD    1.0654E-6       ; Store a double-precision constant.
        DCFE    .3421           ; Illegal - no digit before the decimal point.

        It should be remembered that there are only 8 floating point values that can be used as
immediate value constants: these are 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 0.5 and 10.0.
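
        Both floating point rules (the literal format and the short list of immediate values)
can be sketched in Python. The names below are hypothetical, and requiring at least one digit
after a decimal point is an assumption; the text only states the rule for digits before it.

```python
import re

# At least one digit must come before the decimal point or exponent.
FP_LITERAL = re.compile(r"\d+(\.\d+)?([Ee][+-]?\d+)?")
# The eight values that can be encoded as FP immediate constants.
FP_IMMEDIATES = {0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 0.5, 10.0}

def parse_float(text):
    """Return the value of a base-10 floating point literal, or raise ValueError."""
    if not FP_LITERAL.fullmatch(text):
        raise ValueError("bad floating point literal: " + text)
    return float(text)

def is_fp_immediate(text):
    """True if the literal can be used as an immediate, as in ADFD f0,f0,#3.0."""
    return parse_float(text) in FP_IMMEDIATES
```

Thus "#3.0" and "#0.5" are usable as immediates, while a value such as 6.0 would have to be
stored with one of the DCF directives and loaded from memory instead.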


Arithmetic Expressions
~~~~~~~~~~~~~~~~~~~~~~
        Wherever an integer or floating point number is required, ASM will accept an expression
that evaluates to the correct type (ie integer expressions must not contain any floating point
values). ASM recognizes the following operators:

            Operator                  Precedence
            Unary +                       1
            Unary -                       1
               *                          2
               /                          2
            % (remainder)                 2
            & (bitwise AND)               2
            +                             3
            -                             3
            | (bitwise OR)                3
            ^ (bitwise Exclusive OR)      3
            >>  (Arithmetic shift right)  4
            >>> (Logical shift right)     4
            <<  (Logical shift left)      4

        Sub-expressions in parentheses are allowed, and have the highest priority. They may be
nested to an arbitrary degree.

        Within a single level of priority, expressions are evaluated left to right.

        There are a number of constraints on the use of certain operators:

1)      Only + and - operators can be applied to non-constant symbols (ie addresses).

        Thus
                BL Tharg+2*4
        is legal, but
                BL Tharg*2
        is not.

2)      Bit-based operators (&,  |, ^, >>, >>> and <<) may only be used in integer constant
expressions or sub-expressions.

        Thus
                DCFE  12345.6 + (1<<12)
        is legal, but
                DCFD  12345.6 >> 2
        is not.

Rule 1 is to ensure that addresses are always properly relocatable. Rule 2 is because it is not
very meaningful to apply bit operators to floating point values.
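
        Because the precedence table differs from that of C (for example, & binds more
tightly than +), a small evaluator may help when checking expressions by hand. The Python
below is a sketch only: it handles decimal integers, parentheses and the operators in the
table, assumes 32-bit values for >>>, and assumes floor rounding for division. All names are
hypothetical.

```python
import re

# Binary operators grouped by precedence level (first group binds tightest).
LEVELS = [
    {"*", "/", "%", "&"},
    {"+", "-", "|", "^"},
    {">>", ">>>", "<<"},
]
MASK = 0xFFFFFFFF

def _apply(op, a, b):
    if op == "*": return a * b
    if op == "/": return a // b           # rounding of negative quotients is an assumption
    if op == "%": return a % b
    if op == "&": return a & b
    if op == "+": return a + b
    if op == "-": return a - b
    if op == "|": return a | b
    if op == "^": return a ^ b
    if op == "<<": return a << b
    if op == ">>": return a >> b          # arithmetic: the sign is preserved
    if op == ">>>": return (a & MASK) >> b  # logical: treat as unsigned 32-bit
    raise ValueError(op)

def evaluate(text):
    """Evaluate an integer expression using the precedence table above."""
    tokens = re.findall(r"\d+|>>>|>>|<<|[-+*/%&|^()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def primary():
        tok = take()
        if tok == "(":
            value = binary(len(LEVELS) - 1)
            take()                        # consume the closing parenthesis
            return value
        if tok == "-":
            return -primary()             # unary operators bind tightest
        if tok == "+":
            return primary()
        return int(tok)

    def binary(level):
        if level < 0:
            return primary()
        value = binary(level - 1)
        while peek() in LEVELS[level]:    # same level: left to right
            op = take()
            value = _apply(op, value, binary(level - 1))
        return value

    return binary(len(LEVELS) - 1)
```

Under these rules "2 + 3 & 1" is 2 + (3 & 1), ie 3, whereas a C compiler would compute
(2 + 3) & 1, ie 1. Parenthesising mixed expressions avoids any doubt.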


Assembly-time functions
~~~~~~~~~~~~~~~~~~~~~~~

        ASM now provides a number of functions that can be used in expression evaluation. Currently
supported are:

    Name     Type      Param Type     Meaning

    size    integer      string       length of the string or structure
                       or structure
    sin      float       float        Trigonometric sin
    cos      float       float        Trigonometric cos
    tan      float       float        Trigonometric tan
    asn      float       float        Trigonometric inverse sin
    acs      float       float        Trigonometric inverse cos
    atn      float       float        Trigonometric inverse tan
    exp      float       float        Exponentiation
    ln       float       float        Natural log
    log10    float       float        Base-10 (common) log
    sqrt     float       float        Square root
    fix     integer      float        Integer value (truncation)
    float    float      integer       Convert int to float
    diff    integer  address,address  The (signed) difference between the addresses

NB These functions can only be used on constants or constant expressions (ie they cannot be used
unless the input parameter can be fully evaluated at assembly time).

NNB Although addresses cannot be fully evaluated at assembly time, the difference between two
addresses is a constant. This is exploited by the diff function, which is intended for use in
jump tables:

	CMP	R0,#diff(end,start)/4
	BCS	Error
	ADD	PC,PC,R0,LSL #2
	NOP
start:	B	address1
	B	address2
	B	address3
end:

Irrespective of where the linker puts this segment of code, diff(end,start) will always be 12 bytes.
The CMP checks that R0 is less than 3 (i.e. 12/4), to ensure that R0 does not point off the end of
the jump table. If R0 is greater than 2, the "BCS Error" will be executed. Note that diff is signed:
in the example above, diff(start,end) has the value -12, so the jump table will fail.
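
The behaviour of the jump table can be traced with a short Python model. This is a sketch of
the four-instruction sequence only; dispatch is a hypothetical name, and the addresses used
are arbitrary.

```python
def dispatch(r0, add_addr, table_start, table_end):
    """Address loaded into the PC by the sequence, or "Error" if BCS is taken.

    add_addr is the address of the ADD instruction; table_start and
    table_end are the addresses of the 'start' and 'end' labels.
    """
    limit = (table_end - table_start) // 4          # diff(end,start)/4 entries
    if (r0 & 0xFFFFFFFF) >= (limit & 0xFFFFFFFF):   # CMP then BCS is an unsigned test
        return "Error"
    pc = add_addr + 8                               # the PC reads two instructions ahead
    return pc + (r0 << 2)                           # ADD PC,PC,R0,LSL #2
```

With the ADD at &8008 the table starts at &8010, so R0 values 0, 1 and 2 reach &8010, &8014
and &8018, and 3 is caught by the bounds check. If the two labels are swapped, the limit
becomes -3, the unsigned comparison no longer catches small out-of-range values, and R0 = 3
falls through to the address just past the table.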

The "NOP" instruction is a pseudo-op (introduced in ASM 4.03) that generates a "MOV R0,R0" instruction
- this is the ARM-approved method of specifying a no-op instruction, now that use of the "NV"
condition code is deprecated.


4    MACROS
     ~~~~~~
        A Macro is a section of code defined in a single place, and then used in several places.
It differs from a Branch-with-link in that the macro is copied in whole wherever it is used; it
thus generates more code than a branch-with-link, but will usually execute faster; in addition,
it does not make any use of the stack.

        Macros may be given any number of parameters, of any type (including labels - see below).
These are automatically substituted when the macro instruction is used. The handling of labels
within macros is quite straightforward: if the defining point of the label is within the limits
of the macro, it is regarded as a local label, and substituted; any labels referenced that are
not within the bounds of the macro are not substituted. Local label names generated by ASM are of
the form $nnnnnn, where nnnnnn is 300000 plus a count of the number of labels generated. Thus,
the first label will be called $300001, the second $300002, etc. For obvious reasons, labels of
this form should be avoided by the user.

        A macro is defined with the MACRO directive, and terminated with the ENDM directive.

        A macro is used simply by giving its name as a standard mnemonic, followed by a list of
its parameters.

        As an example, consider a macro to take the absolute value of a 64-bit integer held in two
consecutive registers. This will be defined as ABS64, and then used twice, once for register pair
R0,R1 and once for register pair R2,R3.


        MACRO   ABS64,regno
        CMP     regno+1,#0      ; Is the upper word negative?
        BPL     notneg

        ; It is negative. Subtract it from zero to get the abs value

        RSBS    regno,regno,#0
        RSC     regno+1,regno+1,#0

notneg: ENDM

        ; Now use the macro
        ABS64   R0      ; Ensure R0,R1 is positive
        ABS64   R2      ; Ensure R2,R3 is positive

        etc

        When expanded, this will be assembled as:

        ; ABS64 R0
        CMP     R0+1,#0 ; Is the upper word negative?
        BPL     $300001

        ; It is negative. Subtract it from zero to get the abs value

        RSBS    R0,R0,#0
        RSC     R0+1,R0+1,#0
$300001:
        ; ABS64 R2
        CMP     R2+1,#0 ; Is the upper word negative?
        BPL     $300002

        ; It is negative. Subtract it from zero to get the abs value

        RSBS    R2,R2,#0
        RSC     R2+1,R2+1,#0
$300002:
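
        The substitution performed during expansion can be modelled in Python. The sketch
below is a simplification (plain textual replacement of whole words, at least one parameter
assumed) and the class and variable names are hypothetical; it is meant only to show how
parameters and local labels are rewritten.

```python
import re

class MacroExpander:
    def __init__(self):
        self.count = 0                    # number of local labels generated so far

    def expand(self, params, body, args):
        """Return the expanded lines for one use of a macro."""
        # Labels defined inside the body are local to this expansion.
        local = [m.group(1) for line in body
                 for m in [re.match(r"(\w+):", line)] if m]
        rename = dict(zip(params, args))  # parameter name -> actual argument
        for label in local:
            self.count += 1
            rename[label] = "$" + str(300000 + self.count)
        pattern = re.compile(r"\b(" + "|".join(map(re.escape, rename)) + r")\b")
        return [pattern.sub(lambda m: rename[m.group(1)], line) for line in body]

# Body of the ABS64 macro from the example above, without comments.
ABS64_BODY = [
    "        CMP     regno+1,#0",
    "        BPL     notneg",
    "        RSBS    regno,regno,#0",
    "        RSC     regno+1,regno+1,#0",
    "notneg:",
]
```

Calling expand(["regno"], ABS64_BODY, ["R0"]) and then the same with ["R2"] on one
MacroExpander yields the two expansions shown above, ending in $300001: and $300002:.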


NB The use of the INCLUDE directive is specifically forbidden within a macro definition. It is
legal for the body of one macro to use another macro; however, it is illegal to nest the actual
definitions.


Labels as macro parameters
~~~~~~~~~~~~~~~~~~~~~~~~~~
As stated above, it is perfectly legal to pass a label into a macro. For example:


        MACRO   JMPZ,regno,a_label
        CMP     regno,#0
        BEQ     a_label
        ENDM

        JMPZ    R0,tharg

        etc

tharg:  do something


5    CONDITIONAL ASSEMBLY
     ~~~~~~~~~~~~~~~~~~~~
        ASM gives simple support to conditional assembly: ie, the code produced can be made to
depend on the value of some flags defined by the assembly command.

        The directives involved in conditional assembly are IFDEF (a specified flag is defined),
IFNDEF (a specified flag is not defined), ELSE and ENDIF.

        As described above, the flags used by IFDEF/IFNDEF are set/unset either using the -define
and -undefine command line qualifiers, or by the DEFINE and UNDEF directives. Any flags not
explicitly set are treated as unset.

        For example:

                  DEFINE  Debug

                  IFDEF   Debug
                  STMFD   sp!,{r0}
                  ADR     R0,DebugMsg
                  SWI     "OS_Write0"
                  LDMFD   sp!,{r0}
                  B       EndDBug
DebugMsg:         DCS     "Hello there"   ; OS_Write0 expects a plain zero-terminated string
                  DCB     0
                  ALIGN
EndDBug:
                  ENDIF

        If the assembly command includes a "-undefine debug", the code fragment will not be assembled, as
the -undefine will override the DEFINE directive.

        IFNDEF works in exactly the same way, except the code is included if the specified flag
is not set.

        ELSE works in the expected way.

        Any number of IFDEF/IFNDEFs may be nested; they may also be nested inside MACRO
definitions, and vice versa.

        The same flag may be DEFINEd and UNDEFed any number of times within the source code, with IFDEF
and IFNDEF using the most recently set values (unless the command line qualifiers -define and -undefine
override them).
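
        The precedence between command line qualifiers and directives can be summarised in a
short Python sketch. The function flag_is_set is hypothetical; flag names are compared
exactly, which is an assumption, and the case of a flag given to both -define and -undefine
is not covered by the text.

```python
def flag_is_set(flag, directives, cli_defined=(), cli_undefined=()):
    """State of a flag as IFDEF would see it after the given directives.

    directives -- ("DEFINE" | "UNDEF", name) pairs in source order
    """
    if flag in cli_defined:
        return True                       # -define overrides the source
    if flag in cli_undefined:
        return False                      # -undefine overrides the source
    state = False                         # flags not explicitly set are unset
    for directive, name in directives:
        if name == flag:
            state = (directive == "DEFINE")   # the most recent directive wins
    return state
```

For the example above, the DEFINE directive sets Debug unless the command line undefines it,
in which case the IFDEF block is skipped.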


6    STRUCTURES
     ~~~~~~~~~~
        Structures offer a way to make the source code clearer; rather than reserving a block of memory
for (say) a WIMP message using RESB, a structure may be defined and individual elements of the block
referred to by name. Structures are loosely based on the C "struct" datatype.


Defining structures
~~~~~~~~~~~~~~~~~~~
        A structure must be defined before it can be used, using the "STRUCTURE" (or "STRUCT") directive.
The directive has a single parameter, the name of the structure. Individual fields (named sub-elements)
are then declared (one to a line), and the structure terminated with an ENDSTRUC directive. A field
declaration consists of the name of the field, followed by one of:

        a) A constant expression giving the number of bytes used by the field
        b) An alignment class (HALF, WORD, DOUBLE); interpreted also as a number of bytes
           (i.e. HALF is 2 bytes, WORD is 4, DOUBLE is 8).
        c) A previously-declared structure name (i.e. an embedded structure).

        For example:

        STRUCTURE WimpMessageHeader
        size      WORD
        sender    WORD
        my_ref    WORD
        your_ref  WORD
        action    WORD
        ENDSTRUC

        STRUCTURE WimpLoadSaveMessage
        header    WimpMessageHeader
        dest_wind WORD
        dest_icon WORD
        dest_x    WORD
        dest_y    WORD
        est_size  WORD
        filetype  WORD
        leafname  256 - size(WimpMessageHeader) - 4 * 6
        ENDSTRUC

        The only other directive that can be used within a structure is ALIGN: i.e. it is specifically
forbidden to nest structure definitions (although references to previously-defined structures are legal).
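
        The field offsets implied by the two definitions above can be worked out with a short
Python sketch. The helper layout is hypothetical and deliberately simple: it treats every
field as a byte count and does not model any padding that an alignment class or ALIGN might
introduce (none is needed here, as every field is a whole number of words).

```python
WORD = 4

def layout(fields, structures):
    """Return (offsets, total size) for a list of (name, size-or-structure) fields."""
    offsets, position = {}, 0
    for name, kind in fields:
        offsets[name] = position
        # A string names a previously-defined structure; use its total size.
        size = structures[kind][1] if isinstance(kind, str) else kind
        position += size
    return offsets, position

structures = {}
structures["WimpMessageHeader"] = layout(
    [("size", WORD), ("sender", WORD), ("my_ref", WORD),
     ("your_ref", WORD), ("action", WORD)], structures)
header_size = structures["WimpMessageHeader"][1]
structures["WimpLoadSaveMessage"] = layout(
    [("header", "WimpMessageHeader"), ("dest_wind", WORD), ("dest_icon", WORD),
     ("dest_x", WORD), ("dest_y", WORD), ("est_size", WORD), ("filetype", WORD),
     ("leafname", 256 - header_size - 4 * 6)], structures)
```

The header occupies 20 bytes, so dest_wind lies at offset 20, leafname at offset 44 with
212 bytes reserved, and the whole WimpLoadSaveMessage structure is 256 bytes long.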


Using structures in constant expressions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        A structure and field can be used as a constant, as in: 

        LDR       R0,[R1,#WimpLoadSaveMessage.dest_wind]

where the value of the constant equals the offset (from the start of the structure) of the named field.
The embedded header occupies five words (20 bytes), so the example is equivalent to:

        LDR       R0,[R1,#20]


Using structures in address expressions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        It is possible to reserve a block of memory as a structure, as in:

WimpBlock: WimpLoadSaveMessage

which is the equivalent of:

WimpBlock: RESB  size(WimpLoadSaveMessage)

However, the former also allows WimpBlock to be used as a structure to generate address expressions:

           LDR   R0,WimpBlock.header.my_ref
           STR   R0,WimpBlock.header.your_ref

etc.

However, this mechanism is of limited utility, as it requires that PC-relative addressing is legal (i.e.
that the base label is in the same area as the instruction referencing it).
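
One way to picture how a dotted name such as WimpBlock.header.my_ref becomes an address is as the
base label plus the sum of the field offsets along the path. The Python sketch below models that idea
only; the STRUCTS table, the field_offset function and the address given to WimpBlock are all
invented for illustration, and only the fields needed here are listed.

```python
# Hypothetical model: each structure maps a field name to
# (offset within the structure, name of embedded structure or None).
STRUCTS = {
    "WimpMessageHeader": {
        "size": (0, None), "sender": (4, None), "my_ref": (8, None),
        "your_ref": (12, None), "action": (16, None),
    },
    "WimpLoadSaveMessage": {
        "header": (0, "WimpMessageHeader"),
    },
}

def field_offset(struct_name, path):
    """Sum the offsets along a dotted field path such as 'header.my_ref'."""
    offset = 0
    current = struct_name
    for part in path.split("."):
        if current is None:
            raise KeyError(part + " is not inside a structure")
        field_off, embedded = STRUCTS[current][part]
        offset += field_off
        current = embedded
    return offset

wimp_block = 0x8100  # made-up address for the WimpBlock label

print(hex(wimp_block + field_offset("WimpLoadSaveMessage", "header.my_ref")))
print(hex(wimp_block + field_offset("WimpLoadSaveMessage", "header.your_ref")))
```

With the made-up base address, the two instructions in the example would refer to &8108 and &810C
respectively.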


7    NEW FEATURES
     ~~~~~~~~~~~~

a)      ObjAsm symbol syntax recognised (see "Symbols in ASM" above).

b)      Use of '@' to convert from an identifier name to a string (see "Other exotica" above).

c)      Assembly-time functions (see "Arithmetic expressions" above).

d)      Limited-scope labels.

e)      Pragmas. See the description of the PRAGMA directive below and the -pragma qualifier.

f)      Structures. 

g)      DEFINE and UNDEF directives.

ASM 4.03 introduces the new function "diff" and the new pseudo-op "NOP".


8    WARNING
     ~~~~~~~
        ASM uses the Debugger_Disassemble SWI to decode the generated opcodes for the '-validate'
option. Sadly, this SWI does not seem to recognise the SWP instruction, and reports every SWP
as "Undefined Instruction". Users should ignore these errors, as ASM has been tested fairly
extensively in this area, and the generated opcodes match the Acorn documentation. Should you
be able to shed any definitive light on this situation (ie whether the SWI or the documentation
is at fault), please contact the author.

        Similarly, Debugger_Disassemble does not know about the halfword/signed byte variants of
LDR and STR. These are typically shown as BIC, TST, TEQ or CMP instructions.


9    DIRECTIVES
     ~~~~~~~~~~
        The rest of this documentation concentrates on the assembler directives provided by ASM.


=================================================================================================
ALIGN           Ensures that the code pointer is on an appropriate memory boundary.

Format: ALIGN [<size>]

Certain DC directives (DCB, DCW, DCS) may store a number of bytes that is not an exact multiple of
four, and RESB may reserve fewer bytes than a whole number of words. A label that follows can
therefore be given an address that is not on the boundary required. ALIGN bumps the code pointer
forward to the next correctly aligned address.

<size> is optional; if given, it must be one of:

  <size>   Alignment performed
   HALF    This aligns memory to a multiple of 2 bytes.
   WORD    Word-aligns memory (ie bumps the memory to a multiple of 4 bytes)
   DBLE    Aligns memory to a multiple of 8 bytes. DOUBLE may also be used.

The default alignment is WORD, for consistency with existing code.

It should be noted that the alignment sizes are not consistently named when compared to the DC and 
EQU directives (which are, of course, misnamed for consistency with the BASIC assembler). It was 
originally intended to use WORD, DBLE and QUAD, but it was felt that this extended the culture of 
misnaming too far.

NB All mnemonics and all other DC operations automatically carry out an align, and labels are
adjusted to allow for this. Thus the two code fragments below lead to different values for L1 and
L2 (L1 is NOT word-aligned, but L2 is aligned).

1)              DCB     0
        L1:
                B       Tharg

2)              DCB     0
        L2:     B       Tharg

If code fragment 1 had an ALIGN directive after the DCB, L1 and L2 would have the same value.

eg      ALIGN
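
The difference between L1 and L2 in the fragments above can be followed in a short Python model.
This is a sketch only: the names ALIGN_SIZES and align are invented here, and the code pointer is
assumed to be 0 at the DCB.

```python
ALIGN_SIZES = {"HALF": 2, "WORD": 4, "DBLE": 8, "DOUBLE": 8}

def align(pointer, size="WORD"):
    """Round pointer up to the next multiple of the named alignment."""
    step = ALIGN_SIZES[size]
    return (pointer + step - 1) // step * step

pointer = 0               # assumed starting value of the code pointer
pointer += 1              # DCB 0 stores a single byte
l1 = pointer              # fragment 1: the label stands alone, before any align
pointer = align(pointer)  # the B mnemonic aligns automatically
l2 = pointer              # fragment 2: the label shares a line with the mnemonic

print(l1, l2)
```

In this model L1 is left at 1 and L2 is moved up to 4; placing an ALIGN between the DCB and L1
would give both labels the value 4.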

See Also:       DC, RESB



=================================================================================================
AREA    Instructs ASM to start a new area.

Format: AREA    <areaname>,<attributes>

The areaname parameter is any legal ASM identifier. The attributes parameter(s) must specify one
of CODE or DATA; READONLY may also be specified.

eg      AREA    ASM$$Code,Code,ReadOnly

Note that an AREA statement must precede any mnemonics or directives that cause data to be
written to the area.

Later versions of ASM may support additional AOF area types (eg Zero-Initialized, Common Block,
Common Block Reference).

See Also:       



=================================================================================================
DC      Store a constant value.

Format: DC<type>        <value>[,<value>....]

The <type> is one or two characters that define the type of the constant being stored, and hence
the type expected of <value>:

     <type>  Type of Value
        B       Byte
        W       Word. Actually this is a 16-bit value (ie a half-word). This letter is used for
                consistency with the BASIC assembler and AASM.
        D       Double Word. Actually a 32-bit value. See above.
        Q       Quad Word. A 64-bit integer value.
        S       String. (NB C programmers: this is NOT zero-terminated - use DCSZ below).
        SZ      Zero-terminated string.
        FS      Single-precision floating point
        FD      Double-precision floating point
        FE      Extended-precision floating point
        FP      Packed floating point.
        
eg      DCB     &FF
        DCS     "Hello there"
        DCSZ    "Hello again"
        DCFE    12.32E12, 3.141592
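
The number of bytes stored by the integer and string types can be illustrated with a short Python
model. This is a sketch, not ASM's code: the dc function and INT_FORMATS table are invented names,
values are assumed to be stored little-endian (as the ARM is normally run under RISC OS), strings
are assumed to be stored one byte per character, and the floating point types are not modelled.

```python
import struct

# Little-endian layouts for the integer types (B = 8, W = 16, D = 32, Q = 64 bits).
INT_FORMATS = {"B": "<B", "W": "<H", "D": "<I", "Q": "<Q"}

def dc(type_code, *values):
    """Return the bytes a DC<type> directive might store for the given values."""
    out = b""
    for value in values:
        if type_code in INT_FORMATS:
            out += struct.pack(INT_FORMATS[type_code], value)
        elif type_code == "S":
            out += value.encode("latin-1")          # no terminator
        elif type_code == "SZ":
            out += value.encode("latin-1") + b"\0"  # zero-terminated
        else:
            raise ValueError("type not modelled: " + type_code)
    return out

print(dc("B", 0xFF))
print(len(dc("W", 1)), len(dc("D", 1)), len(dc("Q", 1)))
print(len(dc("S", "Hello there")), len(dc("SZ", "Hello again")))
```

Note that the DCS example occupies 11 bytes and the DCSZ example 12, so neither leaves the code
pointer word-aligned.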

The use of DCD allows addresses to be stored, and hence the address of Far symbols to be loaded
into registers:

        IMPORT  myaddress
        etc
        LDR     R1,addr
        etc
addr:   DCD     myaddress

Where a DCD is used that is interpreted as an address, ASM will automatically generate any
required relocations.

See Also:       EQU, RESB



=================================================================================================
DEFINE  Set a conditional assembly flag for use by IFDEF/IFNDEF.

Format: DEFINE <flag>[,<flag>...]

DEFINE sets the status of the flag identifiers to "defined", for use by later IFDEF or IFNDEF
conditional assembly directives.

eg      DEFINE   Debug,Tharg

See Also:       UNDEF, IFDEF, IFNDEF, ELSE, ENDIF



=================================================================================================
ELSE    Part of IF/ELSE/ENDIF conditional assembly sequence.

Format: ELSE

ELSE negates the current condition flag. See the section on Conditional Assembly for details of
the use of IFDEF/IFNDEF/ELSE/ENDIF.

eg      ELSE

See Also:       IFDEF, IFNDEF, ENDIF, DEFINE, UNDEF



=================================================================================================
ENDIF   Part of IF/ELSE/ENDIF conditional assembly sequence.

Format: ENDIF

ENDIF terminates the current conditional assembly statement. See the section on Conditional
Assembly for details of the use of IFDEF/IFNDEF/ELSE/ENDIF.

eg      ENDIF

See Also:       IFDEF, IFNDEF, ELSE, DEFINE, UNDEF



=================================================================================================
ENDM    Part of a MACRO/ENDM macro definition sequence.

Format: ENDM

ENDM terminates the current macro definition. See the section on Macros for details of the use
of MACRO/ENDM.

eg      ENDM

See Also:       MACRO



=================================================================================================
ENDSTRUC Part of a STRUCTURE/ENDSTRUC structure definition sequence.

Format: ENDSTRUC

ENDSTRUC terminates the current structure definition. See the section on Structures for details of
the use of STRUCTURE/ENDSTRUC.

eg      ENDSTRUC

See Also:       STRUCTURE



=================================================================================================
ENTRY   Define the entry address for a program.

Format: ENTRY   <symbol>

The linker requires that one and only one of the object files given to it contain an entry
address; this is the location that will be called when the linked program is run. The ENTRY
directive specifies a (Local or Global) symbol which will be used for the entry point.

eg      ENTRY   Start

Note, however, that a considerable amount of code is required during program initialization (to
set up application memory constraints etc), which is outside the scope of this documentation. For
this reason, the author does not advocate the use of the ENTRY directive unless you know exactly
what you are doing; it is much easier to use C for the majority of an application, and just
program the time-critical parts in ASM.

See Also:



=================================================================================================
EQU     Store a constant value.

Format: EQU<type>       <value>[,<value>....]

EQU is a synonym for DC, and works in exactly the same way. It is supported simply for
consistency with the BASIC assembler.

eg      EQUB    12

See Also:       DC, RESB



=================================================================================================
EXPORT  Gives a local symbol global scope.

Format: EXPORT  <label>[,<label>...]

By default, all labels used in an ASM source file are local, and hence other source files cannot
call locally-defined routines. The EXPORT directive gives the linker visibility over the
specified label.

Note that the EXPORT directive and the defining point of the label may appear in either order -
ASM will accept both. However, it is essential that any EXPORT directive precedes the first
reference to any locally-scoped variables.

eg      EXPORT  ADD_64

See Also:       IMPORT



=================================================================================================
IFDEF   Part of IF/ELSE/ENDIF conditional assembly sequence.

Format: IFDEF   <flag>

The flag parameter is any legal ASM identifier. Note that although it follows the same naming
conventions, such a flag cannot be used in any expression evaluation, for example as an integer.
See the section on Conditional Assembly for details of the use of IFDEF/IFNDEF/ELSE/ENDIF.

eg      IFDEF   Debug
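
The interplay of DEFINE, UNDEF, IFDEF, IFNDEF, ELSE and ENDIF can be modelled in a few lines of
Python. This is a sketch of the idea rather than ASM's implementation: the function name
assemble_visible is invented, and the use of a stack (so that conditionals may be nested) is an
assumption, not something this section confirms.

```python
def assemble_visible(lines):
    """Return the source lines that survive conditional assembly filtering."""
    defined = set()  # flags currently in the "defined" state
    stack = []       # one condition per open IFDEF/IFNDEF (nesting is assumed)
    kept = []
    for line in lines:
        words = line.split(None, 1)
        if not words:
            continue
        op = words[0]
        arg = words[1] if len(words) > 1 else ""
        active = all(stack)  # assemble only if every open condition holds
        if op in ("IFDEF", "IFNDEF"):
            is_set = arg.strip() in defined
            stack.append(is_set if op == "IFDEF" else not is_set)
        elif op == "ELSE":
            stack[-1] = not stack[-1]  # negate the current condition
        elif op == "ENDIF":
            stack.pop()
        elif not active:
            continue  # skipped source, including any DEFINE/UNDEF inside it
        elif op in ("DEFINE", "UNDEF", "UNDEFINE"):
            flags = {f.strip() for f in arg.split(",")}
            defined = defined | flags if op == "DEFINE" else defined - flags
        else:
            kept.append(line.strip())
    return kept

source = [
    "DEFINE Debug,Tharg",
    "IFDEF Debug",
    "MOV R0,#1",
    "ELSE",
    "MOV R0,#0",
    "ENDIF",
    "UNDEF Debug",
    "IFNDEF Debug",
    "MOV R1,#2",
    "ENDIF",
]
print(assemble_visible(source))
```

In this model the first MOV R0 and the MOV R1 are assembled, while the instruction in the ELSE
branch is skipped.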

See Also:       IFNDEF, ELSE, ENDIF, DEFINE, UNDEF



=================================================================================================
IFNDEF  Part of IF/ELSE/ENDIF conditional assembly sequence.

Format: IFNDEF  <flag>

IFNDEF acts exactly as an IFDEF, but with the condition negated. See the section on Conditional
Assembly for details of the use of IFDEF/IFNDEF/ELSE/ENDIF.

eg      IFNDEF  Verbose

See Also:       IFDEF, ELSE, ENDIF, DEFINE, UNDEF



=================================================================================================
IMPORT  Declares the name of a symbol defined in a separate module.

Format: IMPORT  <symbol>[,<symbol>...]

IMPORT informs ASM that the specified symbol needs to be visible to the current module (ie the
name is referenced somewhere in the current source file). The symbol can then be used as a (Far)
address, and any references to it will be handled by the linker. The IMPORT directive may be
placed before or after the symbol is first used (although it is normal to declare the symbol
before it is used).

eg      IMPORT  ErrorHandler

        B       ErrorHandler

See Also:       EXPORT



=================================================================================================
INCLUDE Directs ASM to include another file in the current module.

Format: INCLUDE <filename>

The INCLUDE directive causes ASM to begin reading source text from the specified file. When that
file is exhausted, ASM will continue with the calling file.

The <filename> parameter must be specified as a quoted string. If no directory specification
exists, ASM will look in subdirectory 'I' for the file. For compatibility with C, ASM will accept
a filename of the form "FRED.I" as meaning "I.FRED".

The -Include option on the ASM command allows a list of paths to be specified for included files;
if the file is not found locally, then each path will be checked in order. Only if the file is not
found on any path will ASM report an error.

NB The use of INCLUDE is forbidden in a macro definition.

eg      INCLUDE "UtilityHdr.I"
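
The filename handling described above might be pictured with the following Python sketch. It is
heavily hedged: include_leaf and search_order are invented names, the rule used to tell the C-style
form "FRED.I" from a genuine directory specification is an assumption, and so is the idea that each
path in the list is simply prefixed to the name (the paths shown are made up).

```python
def include_leaf(filename, subdir="I"):
    """Map an INCLUDE filename onto the name to be opened.

    Assumption: a name ending in '.I' is the C-style form; any other name
    containing '.' already carries a directory specification.
    """
    suffix = "." + subdir
    if filename.upper().endswith(suffix) and len(filename) > len(suffix):
        return subdir + "." + filename[: -len(suffix)]
    if "." in filename:
        return filename
    return subdir + "." + filename

def search_order(filename, include_paths):
    """List candidate names: the local one first, then each path in order."""
    leaf = include_leaf(filename)
    return [leaf] + [path + leaf for path in include_paths]

print(include_leaf("FRED.I"))
print(include_leaf("UtilityHdr"))
print(search_order("UtilityHdr.I", ["$.Shared.", "$.Project."]))
```

Under these assumptions both "FRED.I" and "FRED" resolve to "I.FRED", and the candidates are tried
in the order returned by search_order until one is found.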

See Also:



=================================================================================================
MACRO   Part of MACRO/ENDM macro definition sequence.

Format: MACRO   <macroname>[,<parameterlist>]

MACRO begins the definition of a new macro. <macroname> can be any legal ASM identifier.
<parameterlist> is a comma-separated list of formal parameter names, which will be substituted
when the macro is instantiated. The actual parameters may be of any type (string, integer, float
etc). See the section on Macros for details of the use of MACRO/ENDM.

eg      MACRO   ERROR,errnum,errtext

A Macro is instantiated by using its name as if it were a mnemonic, and passing values for the
formal parameters.

eg      ERROR   &901,"This is an error" 

See Also:       ENDM



=================================================================================================
PRAGMA  Control special features of the assembler.

Format: PRAGMA   <pragma>[,<pragma>]

The PRAGMA directive is used to provide special controls over the way the assembler functions.
The <pragma> parameters each consist of a single (case-insensitive) character specifying the
feature to be controlled, followed by a 0 (to disable it) or 1 (to enable it). Pragmas may also
be specified at the command line by the use of the -pragma qualifier. Where there is a conflict
between a command-line pragma and one set in the source code, the command-line setting overrides
the source code setting.

Currently defined pragmas are:

        Code           Meaning                            Default
        
         C      Case-sensitivity of identifiers       1 : Identifiers are case-sensitive.
                Using PRAGMA C0 is the equivalent
                of using the -nocase qualifier.

         L      Limited scope labels.                 1 : Limited scope labels are allowed.
         
         S      Include local identifiers in          0 : Only EXPORTed and IMPORTed symbols
                object code symbol table.                 appear in the symbol table.

eg      PRAGMA  C1
        PRAGMA  L0,S1
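
The precedence rule (defaults first, then source code settings, then the command line) can be
sketched in Python as below. This is a model only: parse_pragmas and effective_pragmas are invented
names, the -pragma qualifier is assumed to take the same comma-separated list as the directive, and
the model ignores the point in the source at which each PRAGMA appears.

```python
DEFAULTS = {"C": 1, "L": 1, "S": 0}  # defaults from the table above

def parse_pragmas(text):
    """Parse a pragma list such as 'L0,S1' into {'L': 0, 'S': 1}."""
    settings = {}
    for item in text.split(","):
        item = item.strip()
        code, value = item[:1].upper(), item[1:]  # the code letter is case-insensitive
        if code not in DEFAULTS or value not in ("0", "1"):
            raise ValueError("bad pragma: " + item)
        settings[code] = int(value)
    return settings

def effective_pragmas(source_directives, command_line=""):
    """Combine the defaults, PRAGMA directives and the command-line setting."""
    settings = dict(DEFAULTS)
    for text in source_directives:
        settings.update(parse_pragmas(text))
    if command_line:
        settings.update(parse_pragmas(command_line))  # the command line wins
    return settings

print(effective_pragmas(["C1", "L0,S1"], command_line="l1"))
```

Here the source code asks for L0, but the (assumed) command-line setting of L1 takes precedence,
leaving limited scope labels enabled.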

See Also:



=================================================================================================
RESB    Reserve a (zero-initialized) block of memory.

Format: RESB    <length>

The RESB directive instructs ASM to increment the code pointer by the specified length; the
block of memory so reserved will be filled with zeros. Note that <length> is specified in bytes,
and hence the code pointer may not be word-aligned after executing the RESB operation.

RESB is generally used where a large buffer is required, to save the use of many DC directives.

eg      RESB    128

See Also:       DC, EQU



=================================================================================================
STRUCTURE  Part of a STRUCTURE/ENDSTRUC sequence to define a structure.

Format: STRUCTURE    <name>

The STRUCTURE directive is used to initiate the definition of a structure. See the section on
Structures for details of the use of STRUCTURE/ENDSTRUC.

STRUCT can be used as a synonym for STRUCTURE.

eg      STRUCTURE  WimpMessageHeader
        STRUCT     WimpDataLoadSaveMessage

See Also:       ENDSTRUC



=================================================================================================
UNDEF   Unset a conditional assembly flag for use by IFDEF/IFNDEF.

Format: UNDEF  <flag>[,<flag>...]

UNDEF sets the status of the flag identifiers to "not defined", for use by later IFDEF or IFNDEF
conditional assembly directives.

UNDEFINE can be used as a synonym for UNDEF.

eg      UNDEF    Debug,Tharg
        UNDEFINE Trace

See Also:       DEFINE, IFDEF, IFNDEF, ELSE, ENDIF



