From: zrzm0370@rusmv1.rus.uni-stuttgart.de (Joerg Scheurich)
Newsgroups: comp.sys.acorn
Subject: integer division
Message-ID: <1991Jun10.112536.15897@rusmv1.rus.uni-stuttgart.de>
Date: 10 Jun 91 11:25:36 GMT
Sender: zrzm0370@rusmv1.rus.uni-stuttgart.de (Joerg Scheurich)
Organization: Comp.Center (RUS), U of Stuttgart, FRG
Lines: 92

Posting-number: Volume 01, Issue 07
Submitted-by: MUFTI
Archive-name: Division Macros/part01

Hi 

As far as I know, the ARM has no instruction for integer division.
A fast division routine is therefore of general interest.

In an earlier posting I wrote about the poor, quickly hacked division routine
in the demo of my formula tool (math. formula ==> assembler macros).

I have now written a better routine, but I don't know whether a better
algorithm exists.

The shorter, slower 32-bit/32-bit division routine takes
382 cycles in 100 bytes, using four 32-bit registers and 4 bytes of memory.
With the loop unrolled,
the faster, longer 32-bit/32-bit division routine takes
182 cycles in 692 bytes, using four 32-bit registers.
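
For readers who want to see the looped-versus-unrolled trade-off in concrete
form, here is a minimal C sketch of the textbook shift-and-subtract
(restoring) division. It is an assumption that the posted ARM macros follow
this general scheme; the names udiv32 and udiv32_result are hypothetical and
do not come from the posting.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type; the names are illustrative only. */
typedef struct {
    uint32_t quot;
    uint32_t rem;
    int      ok;    /* 0 if the divisor was zero, 1 otherwise */
} udiv32_result;

/* Unsigned 32-bit/32-bit division, producing one quotient bit per pass. */
udiv32_result udiv32(uint32_t n, uint32_t d)
{
    udiv32_result r = {0, 0, 0};
    if (d == 0)
        return r;                   /* refuse instead of producing garbage */

    uint32_t quot = 0;
    uint64_t rem = 0;               /* 64 bits so the shift cannot overflow */

    for (int i = 31; i >= 0; i--) {
        /* Bring down the next dividend bit, most significant first. */
        rem = (rem << 1) | ((n >> i) & 1u);
        quot <<= 1;
        if (rem >= d) {             /* divisor fits: subtract, set the bit */
            rem -= d;
            quot |= 1u;
        }
    }

    assert(rem < d);                /* invariant of restoring division */
    r.quot = quot;
    r.rem = (uint32_t)rem;
    r.ok = 1;
    return r;
}
```

Unrolling means writing the loop body out 32 times, which removes the
counter and branch overhead at the price of code size. That is the same kind
of trade as the 100-byte versus 692-byte figures above, though the exact
cycle counts depend on the ARM code, not on this sketch.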
  
This is much faster than the old routine (which needs 2**33 cycles in the
worst case, and all the time in the universe if there is a division
by zero ... )
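
A worst case of 2**33 cycles and an endless loop on a zero divisor are what
one would expect from division by repeated subtraction. Whether the old
routine actually worked this way is an assumption, since it is not shown
here. The C sketch below uses a hypothetical function udiv_by_subtraction
that counts its loop passes and adds the zero check that such a loop needs.

```c
#include <assert.h>
#include <stdint.h>

/* Division by repeated subtraction.
 * Returns 0 (and writes nothing) if d is zero, 1 otherwise.
 * *steps receives the number of loop passes, to make the cost visible. */
int udiv_by_subtraction(uint32_t n, uint32_t d,
                        uint32_t *quot, uint32_t *rem, uint64_t *steps)
{
    assert(quot && rem && steps);   /* output pointers must be valid */

    if (d == 0)
        return 0;   /* without this check, n >= 0 holds forever */

    uint32_t q = 0;
    uint64_t s = 0;
    while (n >= d) {                /* one pass per unit of the quotient */
        n -= d;
        q++;
        s++;
    }

    *quot = q;
    *rem = n;
    *steps = s;
    return 1;
}
```

The number of passes equals the quotient, so dividing the largest 32-bit
value by 1 takes 2**32 - 1 passes, whereas a shift-and-subtract routine always
takes 32.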

Since the built-in instruction of the Intel 8086
for a 32-bit/16-bit division takes
165-184 cycles in 2-4 bytes, using three 16-bit registers,
I think the result is OK.

The result is nothing compared with the ARM's built-in instruction
for a 32-bit*32-bit multiplication, which takes
17 cycles in 4 bytes, using 1-3 32-bit registers,
but my professor of digital electronics used to say:
"If a program contains a lot of divisions, the programmer is an idiot" ...

Does anyone know a better (faster or shorter) algorithm?

so long
MUFTI

( internet:    zrzm0111@helpdesk.rus.uni-stuttgart.de
  ( from janet: zrzm0111%de.uni-stuttgart.rus.helpdesk@NFS-RELAY )
  bitnet:      ZRZM  AT DS0RUS1I )

This is not a binary posting. Because the readable routines contain a great
many linefeeds and spaces, I am sending them in submit-extract format (41 lines).

