Writing an ARMv2 assembler in Forth
Introduction
See sources.
This article is about a Gforth implementation of a simple ARMv2 macro
assembler. It is loosely related to a series of articles about an
ARM Forth dialect (part 1 / part 2).
My idea for this assembler was to stay close to the conventional
ARM syntax in a Forth way (everything is a word), with mnemonics
that assemble directly into the current word. A side goal was to
keep it as simple as possible so it can be compiled by a very light
Forth (e.g. ARM-ForthLite) with few primitives.
Generating opcodes into words is useful as a way to extend low
level Forth primitives: they can be comfortably defined with the
assembler in a sort of inline mode.
(Note: this feature applies less to "high level" multi-platform
Forths such as Gforth.)
This assembler was sketched out while I was making my
ARM based Forth, as a side exploration. I later took the
opportunity to redo it from scratch as I felt it would be a fun
project that may be useful for Acorn computer stuff and for my next
ARM based Forth.
Note: Gforth is a standardized Forth that works well on Linux and
has many features. The assembler code can be pasted into the Gforth
command line interpreter to test it.
Why ARMv2 ?
Mainly due to my fondness for this subset (RISC,
elegance with pragmatic quirkiness etc.) and my associated interest
in early Acorn computers. ARMv2 is a subset of 32-bit ARM from the
late 80s up to now (ARMv2 up to ARMv7): it has worked mostly
unchanged across 30+ years of history and is still widely used,
although it is slowly being replaced by 64-bit ARM. The instruction
set is very small.
Resources
Some of the resources I used to build this:
- The ARMv1 documentation is impressively to the point (only 8 pages for all the instructions!)
- The ARMv2 documentation, for more details and the multiply instructions
Sample code / Syntax comparison
Here is a syntax comparison between the Forth ARMv2 assembler
and a conventional assembler (BBC BASIC inline
assembly) for the same
program :
\ a bunch of predefined OS constants (BBC BASIC bundles these constants)
: OS_WriteI $100 ; immediate
: OS_ReadMonotonicTime $42 ; immediate
: OS_ReadEscapeState $2c ; immediate
: OS_Exit $11 ; immediate
variable archismall_loop
create ARCHISMALL_ARMv2_CODE
OS_WriteI $16 + swi            \ swi OS_WriteI+22
OS_WriteI $d + swi             \ swi OS_WriteI+13
[] $2c imm r15 r9 ldr          \ ldr r9,[r15,#44]
$140 imm r4 mov                \ mov r4,#320
archismall_loop !LABEL         \ .archismall_loop
OS_ReadMonotonicTime swi       \ swi OS_ReadMonotonicTime
$1 asr r3 r2 r2 add            \ add r2,r2,r3,asr #1
$1 asr r2 r3 r3 sub            \ sub r3,r3,r2,asr #1
$13 lsl r0 r2 r2 sub           \ sub r2,r2,r0,lsl #19
$18 lsr r3 r6 mov              \ mov r6,r3,lsr #24
r9 r4 r6 r7 mla                \ mla r7,r6,r4,r9
$4 lsr r0 r6 mov               \ mov r6,r0,lsr #4
[] $18 lsr r2 r7 r6 strb       \ strb r6,[r7,r2,lsr #24]
OS_ReadEscapeState swi         \ swi OS_ReadEscapeState
archismall_loop @LABEL bcc     \ bcc archismall_loop
OS_Exit swi                    \ swi OS_Exit
                               \ .screenAddr
$1fec020 l,                    \ dcd &1fec020
here constant ARCHISMALL_ARMv2_CODE_END

The assembly code is directly inserted (inlined) into the
ARCHISMALL_ARMv2_CODE word. The last line marks the end
of the assembly code so the content can be retrieved later on
through the ARCHISMALL_ARMv2_CODE_END constant.
The ARM code looks close to the conventional syntax, albeit
with some differences:
- it looks reversed due to Reverse Polish Notation, which comes from the Forth syntax; there would be parentheses if it was LISP!
- there are no commas
- some immediate values are followed by the imm word as a way to distinguish them from register operands
- labels use variables so they require a definition; this is perhaps the most annoying part of the assembler, although it is fixable in various ways (see conclusion)
- data transfer instructions use a [] word as a way to distinguish between pre-indexed and post-indexed addressing; it doesn't differ much from the conventional syntax as it still uses the same elements
Some instructions may look verbose as there are no
optional arguments (e.g. can't do strb r0,[r1]). This can be
fixed by detecting [] or by looking at the stack depth.
Throwaway unnamed labels could easily be made by
storing here on the stack.
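As a minimal sketch of that last idea (not taken from the assembler sources), a backward branch only needs the target address left on the stack by here and the address of the branch itself. The words branch-field and bcc-opcode below are hypothetical names introduced for illustration; the only assumptions are the standard ARM branch layout (condition in bits 31-28, 24-bit word offset relative to the branch address plus 8) and plain addresses in place of here so the block can be run on its own.

```forth
\ Hypothetical helpers, not part of the article's assembler.
\ The offset is counted in words, relative to the branch address + 8.
: branch-field ( target branch-addr -- u )
  8 + -  2 rshift  $ffffff and ;

\ bcc = condition CC (%0011) + branch class bits (%101).
: bcc-opcode ( target branch-addr -- opcode )
  branch-field $3a000000 or ;

\ Intended use inside a code block (l, is the article's emitting word):
\   here              \ unnamed label: loop start stays on the stack
\   ...               \ loop body
\   here bcc-opcode l, \ branch back to the saved address

hex 10 34 bcc-opcode u. decimal  \ label at offset 16, bcc at offset 52
```

With the offsets of the sample program above (label at byte 16, bcc at byte 52) the field works out to -11 words, truncated to 24 bits.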
Raw Opcode output
Here is a way to output the raw opcodes to the console :
: ARM32_OPCODE ( addr len -- ) \ print each 32-bit word of the region in hex
hex 0 ?do dup i + l@ 8 u.r cr 4 +loop drop decimal
;
\ ARM32_OPCODE can be replaced by the "dump" word as well
ARCHISMALL_ARMv2_CODE ARCHISMALL_ARMv2_CODE_END ARCHISMALL_ARMv2_CODE - cr ARM32_OPCODE
Implementation
The implementation maps each ARMv2 mnemonic to a Forth word
(one mnemonic = one word) so it looks quite close to the
conventional syntax. For mnemonics that just flip some bits
(e.g. andeq vs andeqs), the definitions only differ by a single
word.
90% of the assembler code is code duplication and
merging bits together with OR!
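To illustrate what "merging bits together with OR" means in practice, here is a small standalone sketch that builds the opcode of a register-to-register mov. The helper names (cond-field, op-field, rd-field, s-bit, mov-reg-opcode) are hypothetical and are not the ARM2_* words of the actual sources; only the field positions come from the ARM data processing format.

```forth
\ Hypothetical field helpers; positions follow the ARM data processing format.
: cond-field ( cond -- bits )    28 lshift ;   \ condition code, bits 31-28
: op-field   ( opcode -- bits )  21 lshift ;   \ operation, bits 24-21
: rd-field   ( rd -- bits )      12 lshift ;   \ destination register, bits 15-12

$e constant COND_AL   \ "always" condition
$d constant OP_MOV    \ data processing opcode for MOV

\ A variant such as "movs" only flips one more bit (the S flag, bit 20).
: s-bit ( opcode -- opcode' )  1 20 lshift or ;

\ Build "mov rd, rm" (plain register operand, no shift).
: mov-reg-opcode ( rm rd -- opcode )
  rd-field or              \ rm | rd<<12
  OP_MOV op-field or       \ ... | MOV<<21
  COND_AL cond-field or ;  \ ... | AL<<28

hex 5 4 mov-reg-opcode u. decimal   \ mov r4, r5
```

The operand order (source first, destination last) mirrors the assembler's RPN style. Applying s-bit to the result gives the flag-setting variant, which is the kind of one-word difference mentioned above.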
Words
Here are the Forth words used by the implementation:
- stack: swap over rot dup drop exit
- flow control: do loop unloop if else then i
- arithmetic: lshift rshift * - + invert negate
- memory access: allot here l@ l!
- logic: or 0= 0> u<
- misc: variable (useful for labels; implies a create word unless a simpler / builtin scheme is used)
Most of them are trivial to implement as they map nearly 1:1
to the corresponding CPU instructions, and most of them can be
defined directly in Forth as well.
Loop support is optional (dropping it may remove 3 words) although
useful for a macro assembler (loop unrolling etc.). There are only
two small loops, for immediate encoding and for the block data
transfer instructions, and they can be unrolled without compromises.
Words such as here or allot are also
optional: they are useful to emit opcodes into definitions, but
opcodes could be emitted elsewhere as well (a buffer etc.).
Some refactoring may help to further reduce the number of
words used by the implementation.
Mnemonics
Mnemonic definitions are very short and look like
this:
: mov ARM2_DPI ARM2_AL ARM2_MOV l, ; immediate
They all end with l, which emits the computed
opcode into the current definition, then advances by 4 bytes:
: l, here l! $4 allot ;
Most of the "heavy" computation of the assembler happens in the
encoding of data transfer instructions through the
ARM2_SDT (especially) and ARM2_MDT words, and in
immediate value encoding. Single data transfer instructions have
many options and a varying number of arguments, to support either an
immediate operand or a register which can also be shifted.
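For readers unfamiliar with immediate value encoding: an ARM data processing immediate is an 8-bit value rotated right by an even amount (a 4-bit rotate field times two). The sketch below is one possible way to search for that encoding with a small loop. It is not the code from the sources; rol32 and encode-imm are hypothetical names, and unlike the actual assembler (which has no error handling) this version returns a flag when the value cannot be encoded.

```forth
\ Rotate a 32-bit value left by n bits (0 <= n < 32).
: rol32 ( x n -- x' )
  dup 0= if drop exit then
  2dup lshift >r  32 swap - rshift  r> or  $ffffffff and ;

\ Look for the first rotate field (0..15) that makes the value fit in 8 bits.
\ On success the 12-bit operand field is rot<<8 | imm8.
: encode-imm ( u -- field true | u false )
  16 0 do
    dup i 2* rol32            \ undo a rotate right by 2*i
    dup $100 u< if            \ fits in 8 bits: encodable
      nip i 8 lshift or
      true unloop exit
    then drop
  loop false ;

hex 140 encode-imm . u. decimal   \ flag, then the field for #320
```

For #320 the first hit is a rotate field of 13 with an 8-bit value of 5, since 5 rotated right by 26 bits is $140. A value such as $101 has set bits too far apart and yields false.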
Here are four types of single data transfer instructions:
[!] $18 imm r7 r6 strb        \ strb r6,[r7,#24]!
[] $18 lsr r2 r7 r6 strb      \ strb r6,[r7,r2,lsr #24]
$18 imm [] r7 r6 strb         \ strb r6,[r7],#24
$18 lsr [] r2 r7 r6 strb      \ strb r6,[r7],r2,lsr #24
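At the opcode level, the pre-indexed, write-back and post-indexed forms differ only by two bits (P and W) of the single data transfer format. The following sketch shows this for the immediate offset case only. It does not reproduce ARM2_SDT: strb-imm and the MODE-* constants are hypothetical names, and the mode is passed as an explicit argument here instead of being derived from the position of [] on the stack.

```forth
\ Bit positions from the ARM single data transfer format.
1 24 lshift constant P-BIT   \ 1 = pre-indexed, 0 = post-indexed
1 23 lshift constant U-BIT   \ 1 = offset is added to the base
1 22 lshift constant B-BIT   \ 1 = byte transfer
1 21 lshift constant W-BIT   \ 1 = write the address back to the base

$e4000000 constant STR-BASE  \ condition AL + transfer class bits, store, immediate offset

\ Hypothetical addressing modes (the real assembler uses [] and [!] instead).
P-BIT           constant MODE-PRE      \ [rn,#offset]
P-BIT W-BIT or  constant MODE-PRE-WB   \ [rn,#offset]!
0               constant MODE-POST     \ [rn],#offset

: strb-imm ( offset rn rd mode -- opcode )
  >r 12 lshift swap 16 lshift or or    \ offset | rn<<16 | rd<<12
  r> or  STR-BASE or  U-BIT or  B-BIT or ;

hex
18 7 6 MODE-PRE-WB strb-imm u. cr   \ strb r6,[r7,#24]!
18 7 6 MODE-POST   strb-imm u. cr   \ strb r6,[r7],#24
decimal
```

The register offset forms additionally set the I bit and replace the 12-bit offset by a shifted register field, which is where the extra stack arguments of the second and fourth examples come from.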
The block data transfer words consume a list of registers as
operands:
{!} r4 r3 r2 r1 r0 stmia      \ equivalent to: stmia r0!,{r1-r4}
Here the ARM2_MDT word (which is used by
stmia) consumes stack values until 16 values have been consumed or
a value isn't in the [0,15] range. (This is the case for the value
left on the stack by {}, which acts as an exit
condition.)
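The consumption rule described above can be sketched as a small loop that folds register numbers into the 16-bit register list of the opcode. This is only an illustration of the rule, not the ARM2_MDT word itself: reg-mask is a hypothetical name, register words are assumed to push plain numbers 0..15, and -1 stands in for whatever out-of-range value {} or {!} actually leaves on the stack.

```forth
\ Fold register numbers from the stack into a bit mask.
\ Stops after 16 values, or as soon as the next value is not in [0,15];
\ that value (the marker) is left on the stack below the mask.
: reg-mask ( marker reg_n ... reg_1 -- marker mask )
  0                                    \ running mask
  16 0 do
    over 16 u< 0= if unloop exit then  \ not a register: stop here
    swap 1 swap lshift or              \ set the bit of that register
  loop ;

\ Register list of "stmia r0!,{r1-r4}" (the base register is handled separately).
hex -1 4 3 2 1 reg-mask u. drop decimal
```

Using u< for the range check means a single comparison rejects both negative markers and values above 15.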
Quirks
There are some implementation quirks, as my goal was to keep it
pragmatic / straightforward:
- I had to define my own and because the Forth word collides with the corresponding assembly mnemonic
- no support for ARM supervisor stuff (I don't need it), although the missing bits are easy to add
- lots of redundancy in the definitions of some instructions; this could have been avoided by generating them, but it is still okay as the ARMv2 set is tiny
- I didn't go for individual square brackets as they may collide with some predefined words and would also use one more argument
- I don't support the range syntax for block data transfer, although it is easy to add
- no error handling at all, even for immediate value encoding (it will produce an underflow)
- many pseudo instructions are not supported yet, such as using a label with regular instructions
- some symbols such as imm do not map to the expected symbol # due to collisions again (same issue as and, but less critical and an easy fix if needed)
- I didn't implement data definitions yet (I use l, directly instead); this is an easy change though (define .word or dcd that maps to l, for example)
Conclusion
This approach probably dates back to early Forth assemblers
(the Gforth ARMv4 assembler looks similar) and demonstrates how
useful Forth can be as a way to quickly get a nice and powerful
macro assembler working with a minimal amount of effort, from a tiny
Forth core and with nearly no parsing. The assembler can then be
used to extend the lower level Forth primitives in a comfy way.
It is also quite cool that the assembler syntax doesn't depart much
from the conventional syntax: it still looks readable (in an RPN
way) although a bit verbose. (More optional arguments may be a quick
way to fix this.)
It also shows the elegance of the ARMv2 subset, a simple but
working subset for most tasks, with powerful bits!
The only gotcha is that I couldn't find a way to make
labels look like conventional labels in Gforth (they shouldn't
require a variable definition), as the label name would get
compiled into a new word and thus break the inlining. It is
certainly doable (opcodes could go into a buffer instead of
a definition) but it departed too much from the simplicity
goal. It may be easier with a custom Forth, by hacking it to
allow multiple dictionary spaces for example.
