
The ModulaTor

Ubaye's First Independent Modula-2 & Oberon-2 Journal! Nr. 8, Sep 1996


64 Bit Address Extension of the Alpha Oberon-2 Compiler

by Guenter Dotzel and Hartmut Goebel, ModulaWare, 11th ed. 26-Nov-1997.

The Alpha Oberon-2 compiler A2O V3.0 supports OpenVMS 7.0's 64 bit addressing capabilities for dynamic memory. A2O V3.0 was released in the 4th Q'97.

A2O supported 64 bit signed integer arithmetic (SYSTEM.SIGNED_64) from the start, but pointers and addresses were always 32 bits wide. Addresses were always sign extended (fat sign bit), as already required by pre-V7.0 OpenVMS.

Implementation Notes

To take advantage of the OpenVMS Very Large Memory (VLM64) feature for stand-alone Oberon-2 programs, the Alpha Oberon-2 compiler got a new compilation option /pointersize=x, whose value x, either 32 or 64, determines the size in bits of the pervasive type LONGINT and of pointer and procedure types.

/pointersize=32:



  SHORTINT           8 bit  =  SYSTEM.SIGNED_8
  INTEGER           16 bit  =  SYSTEM.SIGNED_16
  LONGINT           32 bit  =  SYSTEM.SIGNED_32
                    64 bit     SYSTEM.SIGNED_64
  SYSTEM.PTR        32 bit  =  SYSTEM.ADDRESS_32
                    64 bit  =  SYSTEM.ADDRESS_64

The integer type hierarchy is SHORTINT <= INTEGER <= LONGINT <= SIGNED_64.

/pointersize=64:


  SHORTINT           8 bit  =  SYSTEM.SIGNED_8
  INTEGER           16 bit  =  SYSTEM.SIGNED_16
                    32 bit     SYSTEM.SIGNED_32
  LONGINT           64 bit  =  SYSTEM.SIGNED_64
                    32 bit  =  SYSTEM.ADDRESS_32
  SYSTEM.PTR        64 bit  =  SYSTEM.ADDRESS_64

The integer type hierarchy is SHORTINT <= INTEGER <= LONGINT.

Although we are not happy with the gap in the pervasive integer types in 64 bit pointer mode (SYSTEM.SIGNED_32 has no pervasive name), this seems to be the most effective solution, because the Oberon-2 language report does not need any changes.
32 bit integers are only needed (1) for system specific data structures, which are system dependent anyway, (2) when reading/writing 32 bit integers from/to a file to map the original file-record structures, (3) to save memory in large arrays of integers.
To avoid importing SYSTEM where 32 bit integers are needed, another pervasive integer type such as

MIDDLEINT = SYSTEM.SIGNED_32

could be introduced. However, since the aim is to avoid too many changes to existing Oberon-2 programs, a new type such as MIDDLEINT wouldn't help much. Also, leaving LONGINT at 32 bit and introducing a new pervasive type

HUGEINT = SYSTEM.SIGNED_64

would not solve the problem that the size of LONGINT must match the type size of address and pointer.

Integer literals: Independent of the pointer size, the range of integer literals in decimal notation and of constant integer expressions is [MIN (SYSTEM.SIGNED_64) .. MAX (SYSTEM.SIGNED_64)], i.e.: [-9223372036854775808 .. 9223372036854775807].

Unfortunately, a syntax extension is needed for 64 bit hexadecimal notation, because in all OP2 based 32 bit Oberon-2 compilers the scanner evaluates hex literals in the range of 80000000H to 0FFFFFFFFH as negative numbers, assuming that bit 31 is the sign-bit. So 0FFFFFFFFH is evaluated to the constant value -1. Simply extending the maximal number of hex digits for a literal from 8 to 16 would mean that 0FFFFFFFFH is equal to 4294967295. This constant value would be of type SIGNED_64. In order not to invalidate existing programs (e.g.: CONST mone* = 0FFFFFFFFH; VAR i: LONGINT; BEGIN i := mone;), new syntax for 64 bit hex literals is needed.


  integer = digit {digit} | digit {hexDigit} "H".

is changed to


  integer = digit {digit} | digit {hexDigit} ("H" | "S").

The "H" marks a 32 bit signed hexadecimal literal. The "S" marks a 64 bit signed hexadecimal literal. The "S" is taken from the German word "Sedezimal", which stands for base 16 notation. Examples of hexadecimal number literals:


   80000000H = MIN(SIGNED_32)         80000000S = MAX(SIGNED_32)+1  
  0FFFFFFFFH = -1                    0FFFFFFFFS = 4294967295
                             0FFFFFFFFFFFFFFFFS = -1
   7FFFFFFFH = MAX(SIGNED_32) 7FFFFFFFFFFFFFFFS = MAX (SYSTEM.SIGNED_64)
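The sign interpretation of the two suffixes can be modeled outside the compiler. The following is a minimal sketch in Python, not the A2O scanner itself: the function name eval_hex_literal, the handling of leading zeros and the error messages are assumptions made for illustration. It reinterprets the top bit of the literal's width (32 for "H", 64 for "S") as the sign bit and reproduces the values in the table above.

```python
def eval_hex_literal(text):
    """Evaluate a hex literal with an 'H' (32 bit) or 'S' (64 bit) suffix.

    Hypothetical helper; only the value semantics follow the article.
    """
    suffix = text[-1]
    if suffix == "H":
        bits = 32
    elif suffix == "S":
        bits = 64
    else:
        raise ValueError("expected an 'H' or 'S' suffix")
    # Leading zeros do not count towards the digit limit (assumption).
    digits = text[:-1].lstrip("0") or "0"
    if len(digits) > bits // 4:
        raise ValueError("too many hex digits for a %d bit literal" % bits)
    value = int(digits, 16)
    # Treat the top bit of the literal's width as the sign bit.
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


print(eval_hex_literal("0FFFFFFFFH"))          # -1
print(eval_hex_literal("0FFFFFFFFS"))          # 4294967295
print(eval_hex_literal("0FFFFFFFFFFFFFFFFS"))  # -1
```

The same digit string therefore denotes different values depending on the suffix, which is why existing programs that rely on 0FFFFFFFFH = -1 stay valid.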

Sets: Independent of the pointer size, the set size does not change. The size of the pervasive type SET is 32 bit. SET is a synonym for SYSTEM.SET_32. MAX (SET) is 31.
A new pervasive type LONGSET with size 64 bit is introduced. LONGSET is a synonym for SYSTEM.SET_64. MAX (LONGSET) is 63. Set constructors for the pervasive type SET are not compatible with type LONGSET.
A set constructor can optionally be prefixed with an identifier of a set type. This is the second of a total of two syntax extensions needed to cope with 64 bit in addition to 32 bit items. Examples for the declaration of a negated empty set s and a negated empty longset l:


  CONST s = -SET{}; l = -LONGSET{};

or


  TYPE SET = LONGSET; CONST s = -SYSTEM.SET_32{}; CONST l = -SET{};

The Oberon-2 syntax for Set changes from


  Set =             "{" [Element {"," Element}] "}".

to


  Set = [Qualident] "{" [Element {"," Element}] "}".

The syntax extension in Set is necessary in order to distinguish the set size of constant set values. Otherwise, with type-less set constructors it would not be possible to evaluate for example the constant value of the negated empty set:

Is -{} = {0..MAX(SET)} or is -{} = {0..MAX(LONGSET)}?
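The ambiguity can be made concrete by modeling sets as bit masks. This is a minimal Python sketch under that assumption; negate_set and members are hypothetical helper names, and the widths 32 and 64 correspond to SET and LONGSET as described above. The complement of the empty set depends entirely on the width chosen.

```python
def negate_set(mask, width):
    """Complement a bit mask within a set type of `width` bits."""
    full = (1 << width) - 1
    if mask & ~full:
        raise ValueError("element out of range for a %d bit set" % width)
    return ~mask & full


def members(mask):
    """List the element numbers contained in a bit mask."""
    return [i for i in range(mask.bit_length()) if mask >> i & 1]


s = negate_set(0, 32)  # models -SET{}
l = negate_set(0, 64)  # models -LONGSET{}
print(max(members(s)), max(members(l)))  # 31 63
```

Without a type prefix on the constructor, a compiler has no way to choose between these two results at compile time.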

1. Consequences of the LONGINT size change

When compiling with /pointersize=64, LONGINT is identical to SYSTEM.SIGNED_64 and in consequence the semantics of the following standard functions and procedures change, because they have either an argument or a function result of type LONGINT:

2. Pointers and dynamic memory

- The Pointer type size is 64 bits with /pointersize=64.

- The number of elements per dimension of a dynamic array created with NEW is restricted to MAX(SIGNED_64).

- Mixed pointers: Import of 32 bit pointer types is allowed in modules compiled in 64 bit mode. Import of 64 bit pointer types is allowed in modules compiled in 32 bit mode. Mixed pointers are allowed for stand-alone programs only. The compilation qualifier /oberon_loadfile implies /pointersize=64.

- Dynamic memory: There are static and dynamic objects. A dynamic object is a record or array that is referenced through a pointer and allocated on the heap with NEW. The heap is allocated at run-time in dynamic memory. Dynamic memory is requested from the underlying operating system (OpenVMS) with LIB$GET_VM, which returns the start address of a contiguous area in the 32 bit virtual address space. Since OpenVMS V7.0, there is a new service (LIB$GET_VM_64), returning a 64 bit virtual address in "basadr". The procedure returns a status as function result (e.g. done or not enough memory available).


      PROCEDURE LIB$GET_VM_64* (
           numbyt: SYSTEM.SIGNED_64;
       VAR basadr: SYSTEM.ADDRESS_64): SYSTEM.SIGNED_32;

The "numbyt" parameter of LIB$GET_VM_64 is of type SIGNED_64, so it is theoretically possible to allocate more than MAX(UNSIGNED_32) bytes.

The AlphaOberon compiler automatically maps allocation of
- 32 bit pointers to Storage64.ALLOCATE_32 (=LIB$GET_VM), which always allocates memory below MAX(SIGNED_32).
- 64 bit pointers to Storage64.ALLOCATE (=LIB$GET_VM_64)
for both NEW() and SYSTEM.NEW().

- Assignment compatibility: 32 bit pointers may be assigned to 64 bit pointer variables, but not vice-versa.

3. Procedure types and variables

In Oberon-2, there are procedure declarations, procedure types and procedure variables. In A2O, their sizes are determined as follows:

If the module which contains the declaration of a procedure P is compiled with /pointersize=x, the size of P is x DIV 8.

If the module which contains the declaration of a procedure type T is compiled with /pointersize=x, the size of T is x DIV 8.

If the module which contains the declaration of a procedure variable V:T is compiled with /pointersize=x, the size of V is SIZE(T).

A procedure variable on OpenVMS Alpha is a pointer to a procedure linkage pair. A procedure linkage pair consists of two entries, a pointer to the procedure descriptor and an entry address, which are both 64 bit addresses, even with /pointersize=32. Because of an OpenVMS V7.0 linker restriction, static data, like the linkage section, is still forced to reside in the 32 bit virtual address space. This means that procedure variables with 32 bit pointers would still work with stand-alone programs, even with VLM64, but that would prohibit loading static data, i.e.: procedure linkage pairs, into dynamic memory (see Alpha Oberon System below).

4. String descriptors

String descriptors are used for string parameters when calling a foreign procedure. The data structure of a formal parameter passed by string descriptor to a foreign procedure changes from 8 bytes in size


    RECORD strLen:  SYSTEM.SIGNED_16;
           ident:   SYSTEM.SIGNED_16; (* := 0; *)
           address: SYSTEM.SIGNED_32
    END

to a self-identifying structure of 24 bytes in size, if either the strLen is greater than 65535 or if the string address does not fulfil the MBSE (must be sign extended) condition:


    RECORD selfId1: SYSTEM.SIGNED_16; (* :=  1; must be one (MBO) *)
           ident:   SYSTEM.SIGNED_16; (* :=  0; *)
           selfId2: SYSTEM.SIGNED_32; (* := -1; must be minus one (MBMO) *)
           strLen:  SYSTEM.SIGNED_64;
           address: SYSTEM.SIGNED_64
    END

If it is not known at compile time that both strLen and address will fit into an 8 byte string descriptor, a run-time check is performed to decide whether an 8 byte or a 24 byte string descriptor is to be constructed. This is also true when compiled with /pointersize=32. The advantage is that even code compiled with /pointersize=32 is able to handle 64 bit string descriptors.
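The run-time decision between the two descriptor forms can be sketched as follows. This is an illustrative Python model, not the code A2O generates: the names is_sign_extended and build_descriptor are hypothetical, addresses are represented as unsigned 64 bit integers, the 16 bit length is packed as an unsigned value because the text allows lengths up to 65535, and little-endian layout without padding is assumed.

```python
import struct


def is_sign_extended(addr64):
    """True if the address equals its low 32 bits sign extended (MBSE)."""
    low = addr64 & 0xFFFFFFFF
    if low & 0x80000000:
        low |= 0xFFFFFFFF00000000
    return low == addr64


def build_descriptor(str_len, address):
    """Return the packed 8 byte or 24 byte string descriptor."""
    if str_len <= 0xFFFF and is_sign_extended(address):
        # 8 byte form: strLen, ident = 0, 32 bit address
        return struct.pack("<HHI", str_len, 0, address & 0xFFFFFFFF)
    # 24 byte form: selfId1 = 1, ident = 0, selfId2 = -1,
    # followed by 64 bit strLen and 64 bit address
    return struct.pack("<hhiQQ", 1, 0, -1, str_len, address)


print(len(build_descriptor(5, 0x1000)))          # 8
print(len(build_descriptor(70000, 0x1000)))      # 24
print(len(build_descriptor(5, 0x100000000)))     # 24
```

A reader of the descriptor can tell the two forms apart from the first 8 bytes alone, since selfId1 = 1 together with selfId2 = -1 marks the long form.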

When compiled with /pointersize=32, string descriptors and SYSTEM.ADR are the only places where, under certain conditions, more code is generated than with A2O V1.x.

5. Symbol files

The default symbol file extension is .syn64, both when a new symbol file is to be generated and for the consistency check with the old symbol file. This prevents overwriting an existing symbol file which was generated with /pointersize=32, thus avoiding version incompatibilities. For file lookup when importing modules, the compiler first tries the symbol file extension .syn64 and only if no file is found, .syn is used. The object file extension is .obj64.
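The lookup order for imported modules can be sketched in a few lines. This is a hedged Python illustration only: find_symbol_file is a hypothetical name, and searching a single directory is an assumption, since the article does not describe the compiler's search path.

```python
import os


def find_symbol_file(module_name, directory):
    """Return the symbol file for a module, preferring .syn64 over .syn."""
    for ext in (".syn64", ".syn"):
        candidate = os.path.join(directory, module_name + ext)
        if os.path.isfile(candidate):
            return candidate
    return None  # no symbol file found for this module
```

With this order, a module that exists in both variants is always imported through its 64 bit symbol file, while modules compiled only with /pointersize=32 remain importable.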

When using the object model compilation qualifier /fine_grained_symfile (see ModulaTor nr. 72), where the pointer size is always 64, the symbol file extension is always .syo and the oberon load file extension is .olf.

6. Implementation restrictions

- OpenVMS allows virtual regions to be allocated above MAX (SIGNED_64). This means that address calculations must either be performed with unsigned 64 bit arithmetic or the allocated address range must be restricted. Usually LONGINT is used for address calculations, because there is no unsigned whole number type available in the Oberon-2 language. Fortunately, OpenVMS allows the address range of virtual regions to be restricted. The storage handler for Oberon-2 (NEW) restricts by default the highest dynamic address allocated to MAX (SIGNED_64).

- The record type size and the total size of global or local variables can not exceed the size of MAX(SIGNED_32).

- The number of elements per dimension of a static array is restricted to MAX(SIGNED_32).
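The first restriction above exists because addresses with bit 63 set turn negative when held in a LONGINT. The following Python sketch models this with a hypothetical helper as_signed_64; the two sample addresses are arbitrary values chosen for illustration.

```python
MAX_SIGNED_64 = 2**63 - 1


def as_signed_64(bits):
    """Reinterpret a 64 bit pattern as a two's complement signed integer."""
    bits &= 2**64 - 1
    return bits - 2**64 if bits > MAX_SIGNED_64 else bits


low_addr = 0x7FFFFFFFFFFFFFF0   # just below MAX(SIGNED_64)
high_addr = 0x8000000000000010  # just above it, bit 63 is set

# As unsigned bit patterns the addresses are ordered as expected ...
print(low_addr < high_addr)                              # True
# ... but viewed as signed values the order is reversed.
print(as_signed_64(low_addr) < as_signed_64(high_addr))  # False
```

Keeping all dynamic addresses at or below MAX (SIGNED_64) avoids this reversal, so comparisons and differences computed with LONGINT stay meaningful.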

7. Mixed pointer environment

Regardless of the compilation mode, formal parameters which are passed to a procedure by reference, always have 64 bit addresses (OpenVMS standard calling conventions). Even in 32 bit mode, the references passed have always a fat sign bit (see above) propagated from bit 31 into bit 63. So parameter addresses are valid 64 bit references.

This means that, except for static data which is known to reside in the 32 bit virtual address space, all address calculations (parameter copy, dereferencing, selection, indexing) must be done with 64 bit arithmetic, regardless of the compilation mode.

In a mixed 32/64 bit pointer environment, 64 bit pointer values can be propagated to a module, which was compiled with /pointersize=32. One example is a procedure with a variable parameter:


  MODULE ptr32;
  
  TYPE T* = RECORD x: INTEGER... END;
  
  PROCEDURE proc* (VAR t: T);
  BEGIN
     t.x := 0
  END proc;
  
  END ptr32.
  ------------------------------------
  MODULE ptr64;
  IMPORT ptr32;
  
  VAR p: POINTER TO ptr32.T;
  
  BEGIN
    NEW(p);
    ptr32.proc(p^);
  END ptr64.

Since all address calculations are performed with 64 bit arithmetic, even when compiled with /pointersize=32, the only problem is the function procedure SYSTEM.ADR, which in 32 bit mode requires that the argument address be sign extended (MBSE), because the function result type is LONGINT (SIGNED_32).

So in 32 bit mode, SYSTEM.ADR now emits code to check the MBSE condition of the argument designator in two cases: (1) for a formal variable parameter, (2) for a 64 bit pointer dereference. If the argument address is not sign extended, a run-time error trap is signaled (SS$_ARG_GTR_32_BITS; this status code was introduced in OpenVMS 7.0).
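The check described above can be modeled as follows. This is a Python sketch of the semantics only: adr_32 and the exception class ArgGtr32Bits are hypothetical names, the exception merely stands in for the SS$_ARG_GTR_32_BITS trap, and addresses are represented as unsigned 64 bit integers.

```python
class ArgGtr32Bits(Exception):
    """Stands in for the SS$_ARG_GTR_32_BITS run-time trap."""


def adr_32(address64):
    """Model SYSTEM.ADR in 32 bit mode: return a SIGNED_32 value or trap."""
    address64 &= 2**64 - 1
    low = address64 & 0xFFFFFFFF
    # Interpret the low 32 bits as a signed value.
    signed_low = low - 2**32 if low & 0x80000000 else low
    # MBSE: sign extending the low half must give back the full address.
    if signed_low & (2**64 - 1) != address64:
        raise ArgGtr32Bits(hex(address64))
    return signed_low


print(adr_32(0x1000))              # 4096
print(adr_32(0xFFFFFFFF80000000))  # -2147483648
```

An address such as 0x100000000, which lies outside the sign extended 32 bit range, raises the exception instead of being silently truncated.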

This approach allows a single module library implementation to serve callers in either mode (32 or 64 bit) for most modules, namely those where no variable parameters of pointer type are contained in any part of the module interface. For example, procedure declarations like


TYPE anyPointerType* = POINTER TO ...

PROCEDURE Get* (p: anyPointerType);

PROCEDURE ReadString*(VAR a: ARRAY OF CHAR);

PROCEDURE WriteString*(a: ARRAY OF CHAR);

can accept actual parameters from anywhere in the 64 bit virtual address space, when compiled with /pointersize=64.

Whereas with the declaration


PROCEDURE Get* (VAR p: anyPointerType);

the pointer type size of the actual parameter must match the formal.

The type safety of the Oberon-2 language pays off here already at compile time: it prevents errors resulting from address truncation and from writing to unallocated memory, which can cause data corruption.

According to the standard OpenVMS calling conventions, all parameters are passed by reference, even if the formal parameter is a value parameter. In the case of a value parameter, the callee is responsible for making a copy on the stack. So in either case, be it a value or variable parameter, if the parameter pointer sizes match, only the parameter's address is passed.

If the formal parameter is a 64 bit pointer value parameter and the actual parameter is a 32 bit pointer, the pointer value is pushed on the stack as a 64 bit address in canonical form and the stack address is passed as reference. But it is not possible to pass a 32 bit pointer for a formal variable parameter of 64 bit pointer type, because the called procedure can not know the caller's pointer type size.
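The compatibility rules for pointer parameters can be summarized in a small decision function. This Python sketch uses the hypothetical name pointer_param_ok. The rule for variable parameters follows the text above; rejecting a 64 bit actual for a 32 bit value parameter is an assumption derived from the assignment compatibility rule in section 2 (32 bit pointers may be assigned to 64 bit pointer variables, but not vice versa).

```python
def pointer_param_ok(formal_bits, actual_bits, is_var):
    """Decide whether a pointer actual may be passed for a pointer formal."""
    if is_var:
        # Variable parameter: the callee cannot know the caller's
        # pointer size, so the sizes must match exactly.
        return formal_bits == actual_bits
    # Value parameter: a narrower pointer can be widened to canonical
    # 64 bit form; narrowing is rejected (assumption, see lead-in).
    return actual_bits <= formal_bits


print(pointer_param_ok(64, 32, is_var=False))  # True
print(pointer_param_ok(64, 32, is_var=True))   # False
```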

8. Alpha Oberon System 64 Bit Port (AOS_64)

To take advantage of 64 bit integers and 64 bit addressing capabilities in an application requiring VLM64 under the Alpha Oberon System (AOS), all components of AOS need to be compiled with /pointersize=64. Otherwise an application running under AOS can not use most parts of the Oberon System's module library. Many procedures of the library have objects as parameters. Many lower-level routines have buffer address parameters (of type LONGINT=SIGNED_32).

Unfortunately, the Oberon System is in many places inherently tied to the fact that address and longint have identical size and are 32 bit wide.

Although the garbage collector (GC), exception handler and array/record type descriptors of AOS were already prepared to work with 64 bit addresses when porting the Oberon System to OpenVMS Alpha, the rest of the Oberon System assumes 32 bits. Furthermore, literal constants (like ..., -8, -4, 4, 8, 12, ...) are used in many places in the lower level part of the ETH Oberon System to allocate storage or increment/decrement addresses and for SYSTEM.GET/PUT. Other candidates are SYSTEM.ADR, SYSTEM.VAL, SHORT and LONG.

In order to bootstrap the AlphaOberon System, where LONGINT and addresses are 64 bit, all places in the Oberon System and applications need to be inspected, where 32 bit wide addresses and longints are assumed.

The goal of AOS_64 is to outperform OpenVMS stand-alone programs, for which the linker requires that static data (i.e.: the global data, string and linkage sections) and program code reside in 32 bit memory. Stand-alone programs may use VLM64 only for dynamic memory (heap).

AOS contains a dynamic module linker/loader and the OpenVMS linker is not used. So everything, including program code, can reside in VLM64. If OpenVMS allowed it, even the stack of coroutines (threads) could be allocated on the heap. Except for the main- and coroutine-stack, there are no 32 bit restrictions in AOS_64.

When calling OpenVMS I/O system service routines, the file and record access blocks are still required to reside in 32 bit memory with VLM64. AOS_64 automatically allocates these data structures in 32 bit memory, but there is no need to copy I/O-buffers, since the extended record management services (RMS) allow 64 bit values for user buffer address and buffer size. The extended RMS data structures are self-identifying, i.e.: a technique similar to the one shown with string descriptors above is used: if the 32 bit user buffer address is -1, the 64 bit buffer address is stored in a record field which is appended to the end of the old access block.


ModulaTor Forum

Book Recommendation

Ayn Rand: "Atlas Shrugged", Signet, Penguin Books, New York, 1957. (The latest edition is available as hardcover, special hardcover edition, or paperback). 1074 pages.
With this acclaimed work and its immortal query "Who is John Galt?", Ayn Rand found the perfect artistic form to express her vision of existence. This is the book that made her not only one of the most popular novelists of our century, but also one of its most influential thinkers.

German translation of Atlas Shrugged: "Wer ist John Galt?", Gesellschaft fuer Erfahrungswissenschaft und Sozialforschung, Oct-1997. This book can be ordered at http://www.der-markt.com/bm/angebotautor.asp?INDEX_AUTOR=12 (deutsche Ausgabe).


IMPRESSUM: The ModulaTor is an unrefereed journal. Technical papers are to be taken as working papers and personal rather than organizational statements. Items are printed at the discretion of the Editor based upon his judgement on the interest and relevancy to the readership. Letters, announcements, and other items of professional interest are selected on the same basis. Office of publication. The Editor of The ModulaTor is Günter Dotzel; he can be reached at mailto:[email deleted due to spam]


