The ModulaTor logo 7KB

The ModulaTor

Oberon-2 and Modula-2 Technical Publication


The ModulaTor
Erlangen's First Independent Modula-2 Journal! Nr. 1/Aug-1990 
_____________________________________________________________

Does Modula-2 Generate Racehorses? 

Comparison of Compiler Generated Code Quality 

for Floating Point Arithmetic 

by Guenter Dotzel, ModulaWare 

A simple example serves to demonstrate the code quality generated by two different Modula-2 compilers. In 
this paper the two compilers are called the Zuerich/Hamburger and the Erlanger. 

1. The Zuerich/Hamburger: ETH-Zuerich's DEC VAX/VMS implementation of the Modula/R compiler V1.2 from 
the LIDAS distribution kit of 1986, which itself is an extension of the DEC VAX/VMS Modula-2 compiler 
implementation V3.2 of the University of Hamburg. While the scanner and parser of Zuerich's compiler is 
based upon the Lilith Modula/R version B20, it uses the same machine code generator as the Hamburger. 

2. The Erlanger: ModulaWare's Modula-2 & Modula/R compiler is based upon the Zuerich/Hamburger. The 
current release for DEC VAX/VMS V5.3 is called MVR V2.12 and is the result of a three years work with code 
generator improvements, extensive maintenance and many language extensions in accordance with the 
upcoming ISO/ANSI/BSI/DIN-Modula-2 Standard. The Erlanger allows standard functions in module and 
procedure declaration part (constant expression), record- and array-constructors (aggregates) and has 
more standard functions, namely CAST, FLOATD, FLOATG, FLOATH, LENGTH, MAX, MIN, and SHIFT. 

First, the Modula-2 source program is given in Listing 1. Listing 2 shows an excerpt of the machine code 
listing generated by the Erlanger and Listing 3 shows that of the Zuerich/Hamburger. Then the differences in 
code quality concerning instruction count, code size and execution speed is discussed and summarized. 
Listing 4 shows the main program module used for the benchmarks. The paper closes with a short 
description of the development history of Modula-2 and Modula/R. 

Listing 1: 

    1   DEFINITION MODULE DoSetALoop;
    2     TYPE rCoord = RECORD x,y: REAL END;
    3     PROCEDURE InnerLoop(p: rCoord; iterations: INTEGER): INTEGER;
    4   END DoSetALoop.

    1   IMPLEMENTATION MODULE DoSetALoop;
    2   (* Mandelbrot set calculations inner loop *)
    3   
    4   PROCEDURE InnerLoop (p: rCoord; iterations: INTEGER): INTEGER;
    5   VAR a, a2: rCoord; count: INTEGER;
    6   BEGIN
    7     a := p; a2.x := a.x * a.x; a2.y := a.y * a.y; count:=0;
    8     REPEAT
    9       a.y := 2. * a.x * a.y + p.y;
   10       a.x := a2.x - a2.y + p.x; 
   11       a2.x := a.x * a.x;
   12       a2.y := a.y * a.y; 
   13       INC(count);
   14     UNTIL (a2.x + a2.y >= 4.0) OR (count>=iterations);
   15     RETURN count;
   16   END InnerLoop;
   17   
   18   END DoSetALoop.

Listing 2: Code generated by the Erlanger 

The numbers with a colon at the end of the line indicate the line numbers of the Modula-2 source listing 
above. Note, that the parameters p and iterations are defined to be value parameters and hence have to 
be copied on the stack. By default the VMS procedure calling standard passes a parameter by reference if no 
prefix %IMMED is specified for the formal parameter. To be compatible with VMS, actually first the address of 
iterations and then the address of p is pushed on the stack before calling the procedure InnerLoop 
using the instruction CALLS #2,DoSetALoop.InnerLoop. This explains the displacement deferred 
addressing mode, indicated by @ in the first operand of the MOVQ and MOVL instructions. These instructions 
are marked by an * in Listing 1 and 2. 

All other local variables a, a2 and count are also allocated on the stack. The function result is returned by 
register R0 which is the VMS standard method. The procedure InnerLoop uses only one working register 
R10. The literals #16 and #24 are encoded short floating literals with the values 2.0 and 4.0. All parameters 
are accessed via argument pointer AP and the local variables are accessed via frame pointer FP. The offset 
+4 addresses the y-component of the variables of record type rCoord, whereas the x-component has 
offset 0. 

                                         .TITLE DoSetALoop
                                 ; Program code section
                                         .PSECT  MODULA2.$CODE$,PIC,REL,SHR,RD,EXE,LONG
                                 ; Symbols for procedure InnerLoop
                 FFFFFFF8                p = -8
                 FFFFFFF4                iterations = -12
                 FFFFFFEC                a = -20
                 FFFFFFE4                a2 = -28
                 FFFFFFE0                count = -32
                          00000  InnerLoop:
                     0400 00000          .ENTRY  DoSetALoop.InnerLoop,^M<R10>              ;6
                 5E 20 C2 00002          SUBL2   #32, SP
           F8 AD 04 BC 7D 00005        * MOVQ    @4(AP), p(FP)
           F4 AD 08 BC D0 0000A        * MOVL    @8(AP), iterations(FP)
           EC AD F8 AD 7D 0000F          MOVQ    p(FP), a(FP)                              ;7
     E4 AD EC AD EC AD 45 00014          MULF3   a(FP), a(FP), a2(FP)
     E8 AD F0 AD F0 AD 45 0001B          MULF3   a+4(FP), a+4(FP), a2+4(FP)
                 E0 AD D4 00022          CLRL    count(FP)
           5A 10 EC AD 45 00025  2$:     MULF3   a(FP), #16, R10                           ;9
              5A F0 AD 44 0002A          MULF2   a+4(FP), R10
        F0 AD 5A FC AD 41 0002E          ADDF3   p+4(FP), R10, a+4(FP)
        5A E4 AD E8 AD 43 00034          SUBF3   a2+4(FP), a2(FP), R10                     ;10
        EC AD 5A F8 AD 41 0003A          ADDF3   p(FP), R10, a(FP)
     E4 AD EC AD EC AD 45 00040          MULF3   a(FP), a(FP), a2(FP)                      ;11
     E8 AD F0 AD F0 AD 45 00047          MULF3   a+4(FP), a+4(FP), a2+4(FP)                ;12
                 E0 AD D6 0004E          INCL    count(FP)                                 ;13
        5A E4 AD E8 AD 41 00051          ADDF3   a2+4(FP), a2(FP), R10                     ;14
                 18 5A 51 00057          CMPF    R10, #24
                    09 18 0005A          BGEQ    3$
           F4 AD E0 AD D1 0005C          CMPL    count(FP), iterations(FP)
                    02 18 00061          BGEQ    3$
                    C0 11 00063  1$:     BRB     2$
              50 E0 AD D0 00065  3$:     MOVL    count(FP), R0                             ;15
                       04 00069          RET     

Listing 3: Code generated by Zuerich/Hamburger 

                                         .TITLE DoSetALoop
                                 ; Program code section
                                         .PSECT  MODULA2.$CODE$,PIC,REL,SHR,RD,EXE,LONG
                                 ; Symbols for procedure InnerLoop
                 FFFFFFF8                p = -8
                 FFFFFFF4                iterations = -12
                 FFFFFFEC                a = -20
                 FFFFFFE4                a2 = -28
                 FFFFFFE0                count = -32
                          00000  InnerLoop:
                     0000 00000          .ENTRY  DoSetALoop.InnerLoop,^M<>                 ; 6
                 5E 20 C2 00002          SUBL2   #32,SP
           F8 AD 04 BC 7D 00005        * MOVQ    @4(AP),p(FP)
           F4 AD 08 BC D0 0000A        * MOVL    @8(AP),iterations(FP)
           EC AD F8 AD 7D 0000F          MOVQ    p(FP),a(FP)                               ;7
              7E EC AD 50 00014          MOVF    a(FP),-(SP)
              7E EC AD 50 00018          MOVF    a(FP),-(SP)
                 6E 8E 44 0001C          MULF2   (SP)+,(SP)
              E4 AD 8E 50 0001F          MOVF    (SP)+,a2(FP)
              7E F0 AD 50 00023          MOVF    a+4(FP),-(SP)
              7E F0 AD 50 00027          MOVF    a+4(FP),-(SP)
                 6E 8E 44 0002B          MULF2   (SP)+,(SP)
              E8 AD 8E 50 0002E          MOVF    (SP)+,a2+4(FP)
                 E0 AD D4 00032          CLRL    count(FP)                                 ;8
           00004100 8F DD 00035  2$:     PUSHL   #16640                                    ;9
              7E EC AD 50 0003B          MOVF    a(FP),-(SP)
                 6E 8E 44 0003F          MULF2   (SP)+,(SP)
              7E F0 AD 50 00042          MOVF    a+4(FP),-(SP)
                 6E 8E 44 00046          MULF2   (SP)+,(SP)
              7E FC AD 50 00049          MOVF    p+4(FP),-(SP)
                 6E 8E 40 0004D          ADDF2   (SP)+,(SP)
              F0 AD 8E 50 00050          MOVF    (SP)+,a+4(FP)
              7E E4 AD 50 00054          MOVF    a2(FP),-(SP)                              ;10
              7E E8 AD 50 00058          MOVF    a2+4(FP),-(SP)
                 6E 8E 42 0005C          SUBF2   (SP)+,(SP)
              7E F8 AD 50 0005F          MOVF    p(FP),-(SP)
                 6E 8E 40 00063          ADDF2   (SP)+,(SP)
              EC AD 8E 50 00066          MOVF    (SP)+,a(FP)
              7E EC AD 50 0006A          MOVF    a(FP),-(SP)                               ;11
              7E EC AD 50 0006E          MOVF    a(FP),-(SP)
                 6E 8E 44 00072          MULF2   (SP)+,(SP)
              E4 AD 8E 50 00075          MOVF    (SP)+,a2(FP)
              7E F0 AD 50 00079          MOVF    a+4(FP),-(SP)                             ;12
              7E F0 AD 50 0007D          MOVF    a+4(FP),-(SP)
                 6E 8E 44 00081          MULF2   (SP)+,(SP)
              E8 AD 8E 50 00084          MOVF    (SP)+,a2+4(FP)
                 E0 AD D6 00088          INCL    count(FP)                                 ;13
              7E E4 AD 50 0008B          MOVF    a2(FP),-(SP)                              ;14
              7E E8 AD 50 0008F          MOVF    a2+4(FP),-(SP)
                 6E 8E 40 00093          ADDF2   (SP)+,(SP)
           00004180 8F DD 00096          PUSHL   #16768
                 8E 8E 51 0009C          CMPF    (SP)+,(SP)+
                    09 15 0009F          BLEQ    3$
           F4 AD E0 AD D1 000A1          CMPL    count(FP),iterations(FP)
                    02 18 000A6          BGEQ    3$
                    8B 11 000A8  1$:     BRB     2$
              50 E0 AD D0 000AA  3$:     MOVL    count(FP),R0                              ;15
                       04 000AE          RET     

Discussion of the differences in code quality 

The Zuerich/Hamburger code generator used no working register at all and all expressions were evaluated on 
the stack. The VAX-11 architecture allows certain floating point constant values to be coded in one byte and 
to be used as immediate operand in floating point instructions as shown in Listing 2. The Erlanger makes use 
of short floating literals where possible, whereas the Zuerich/Hamburger always needed 4 bytes and pushed 
the constants and variables on the stack before use. While the code generated by the Erlanger has 14 
instructions (program counter 065H-025H) for the statement sequence of the repeat construct (source line 
numbers 9 to 14), the Zuerich/Hamburger generated 32 (program counter 0AAH-035H). The code size of the 
Erlanger is 64 bytes versus 117 bytes. The Erlanger generated 7 three-operand and 1 two-operand floating 
point instructions, whereas the Zuerich/Hamburger generated no three-operand and 8 two-operand floating 
point instructions. 

Summary 

The improvement of the Erlanger compared to the Zuerich/Hamburger is 56% less instructions and 45% less 
code. To be able to produce such tight code is due to the genuine VAX-11 architecture which allows easy 
code generation for high-level languages. Since the Erlanger produces reliable code in a straight-forward 
fashion without applying special optimizing techniques, like assigning machine registers for temporary 
variables, the VMS debugger can still be used to examine those variables. To get an instruction count of less 
than 13 (one branch instruction could be saved at label 1$: in Listing 2) for the repeat-construct of Listing 1 is 
a challenge for an experienced assembly programmer. 

Using the Mandelbrot set starting at coordinate (-0.75, -0.23) over a range of 0.04 with 256 iterations and 
196608 calls of InnerLoop (512 * 384 pixels), the code generated by the Erlanger runs about 44% faster. This 
was measured on a DEC VAXstationII, where the total running time is 907 seconds for the Erlanger versus 
1633 seconds using the Zuerich/Hamburger - a speed-up of 1.8. 

Listing 4: Main program DoSetA 

For those trying to compete using their favorite compiler, the main program DosetA is also given here. In 
order to do a fair comparison, note, that the parameters of InnerLoop are passed by value and not by 
reference which, in the case of Fortran or C would be the default parameter passing mechanism. 

    1   MODULE DoSetA;
    2   FROM DoSetALoop IMPORT InnerLoop, rCoord;
    3   
    4   TYPE iCoord = RECORD x,y: INTEGER END;
    5   VAR  c, size, step: iCoord;
    6        p, start, range, gap: rCoord;
    7        iterations, count: INTEGER;
    8   
    9   BEGIN
   10     size := iCoord [1024, 768];
   11     step := iCoord [2, 2]; 
   12     iterations := 255;
   13     start := rCoord [-0.75, -0.23];
   14     range := rCoord [0.04, 0.04];
   15     gap := rCoord [range.x/FLOAT(size.x)*FLOAT(step.x), 
   16       range.y/FLOAT(size.y)*FLOAT(step.y)];
   17   
   18     p.y := start.y + range.y;
   19     c.y:=1; WHILE c.y<=size.y DO
   20       p.x := start.x;
   21       c.x:=1; WHILE c.x<=size.x DO
   22         count:=InnerLoop(p, iterations);
   23         INC(p.x, gap.x);
   24         INC(c.x, step.x);
   25       END;
   26       DEC(p.y, gap.y);
   27       INC(c.y, step.y);
   28     END;
   29   END DoSetA.

The module DoSetA is input and output-free. The iteration count is only assigned to the variable count 
which may be used to draw a dot on the screen, e.g. by 

DrawColouredDot (drawing_mode := replace, color := count, position_x := c.x, position_y := 
c.y) 

The example uses so-called record-constructors (aggregates). The notation for an aggregates is 

type_ident "[" expression_list "]" 

which can be used to construct aggregates of user defined record or array type. The statement 

gap := rCoord [range.x/FLOAT(size.x)*FLOAT(step.x), range.y/FLOAT(size.y)*FLOAT(step.y)]; 

for example is equivalent to the statement sequence 

gap.x := range.x/FLOAT(size.x)*FLOAT(step.x);
 

gap.y := range.y/FLOAT(size.y)*FLOAT(step.y); 

History of the programming languages Modula-2 and Modula/R 

Modula-2 was designed by N. Wirth at the Institut fuer Informatik, ETH-Zuerich in Switzerland as the logical 
successor of Pascal.
 

Modula/R is a database programming language and is a true superset of Modula-2. Modula/R extends 
Modula-2 by fully embedded relational database language constructs such as data structure RELATION, by 
a predicate-oriented selection mechanism for relation elements by altering operators and by a transaction 
concept. The extension was designed and first implemented by the LIDAS Group of ETH-Zuerich at the 
Institut fuer Informatik (J. Koch, M. Mall, P. Putfarken, M. Reimer, J.W. Schmidt, C.A. Zehnder) with the 
intention not to introduce too many additional concepts into the language Modula-2. The origin of Modula/R is 
Pascal/R. Pascal/R is a database language designed at the University of Hamburg by J.W. Schmidt that 
extends Pascal by very similar concepts. A group under direction of J.W. Schmidt also derived the first 
Modula-2 Compiler for the VAX from the ETH-Zuerich's PDP-11 implementation. 


IMPRESSUM: The ModulaTor is an unrefereed journal. Technical papers are to be taken as working papers and personal rather than organizational statements. Items are printed at the discretion of the Editor based upon his judgement on the interest and relevancy to the readership. Letters, announcements, and other items of professional interest are selected on the same basis. Office of publication: The Editor of The ModulaTor is Guenter Dotzel; he can be reached by tel/fax: [removed due to abuse] or by mailto:[email deleted due to spam]
  ModulaWare home page   The ModulaTor download    [Any browser]

Webdesign by www.otolo.com/webworx, 14-Jul-1998