
MEMORY ALLOCATION UNDER Z88DK


This file describes several dynamic and static memory allocation schemes that
work well under z88dk.  All assume a 64K memory space model, but a thing or
two will be said about >64K model memory allocation.

Besides what is described here, z88dk does have a far memory model concept
which uses 24-bit memory pointers to access up to 16MB of memory.  This has
been implemented for the z88 target and you should consult other documentation
in this directory to learn more about it.

Before getting into it, let's briefly investigate how C's variables are
treated by the compiler.


1. AUTOMATIC VARIABLES

Local variables declared in a C program are allocated on the stack as
a function is entered and are popped off the stack as functions are
exited.  They are dynamic (in that memory is allocated and deallocated
from the stack as required) and transient (in that referring to their
addresses has no meaning after the function terminates).

Example:  accessing local variables from assembler embedded in a
C function.

   int myfunc(char b, unsigned char *p)
   {
      #asm
   
      ld hl,2
      add hl,sp              ; skip over return address on stack
      ld e,(hl)
      inc hl
      ld d,(hl)              ; de = p
      inc hl
      ld a,(hl)              ; a = b, "char b" occupies 16 bits on stack
                             ;  but only the LSB is relevant
      ...

      ld hl,1                ; hl holds the return value
   
      #endasm
   }

Example:  z88dk's special __FASTCALL__ qualifier can be used if
the C function has just one parameter.  In that case, the parameter
is passed in the HL register pair instead of being allocated on
the stack.

   int __FASTCALL__ myfunc(unsigned char *p)
   {
      #asm
   
      ; hl = p

      ...
   
      ; return value in hl
      
      #endasm
   }


2. STATIC VARIABLES

In this document, static variables are those variables declared outside
the scope of any function.  They are global by default (they can be
referred to from anywhere) and can be made local to a particular
C file by using the "static" modifier in their declaration.

Because they are declared outside the scope of any function, they
exist for the duration of the entire program.  For this reason they
cannot be allocated on the stack but instead are assigned their own
absolute locations in memory.  The assembler used by z88dk does not,
at this time, have any concept of separate code or data segments.
Therefore the compiler simply generates a single monolithic binary
that contains compiled code and space for statically declared variables.

This can be a problem if ROMable code must be generated because these
globally defined static variables will be mixed in with code in the
final binary, and will become constants when they are burned into
a ROM.

An example of a static variable being accessed from assembler:

   int a;
   
   ...
   
   #asm
   
   ld hl,(_a)
   
   #endasm

The C variable has sufficient space reserved by the compiler and
is given an assembler label which is constructed by prefixing
the C name with a "_".

There are several workarounds for accommodating ROMable code.  One
is to declare variables as pointers to fixed memory locations in RAM.
If your program statically declares an int, make it statically
declare a pointer to int instead with the pointer initialized to a
constant memory address in RAM.  This constant value is safely ROMable.
There are two ways to place a pointer at an arbitrary memory address.

The first method uses a bit of embedded assembler for assistance:

   extern int *p;
   #asm
   ._p defw 60000
   #endasm

The extern qualifier tells the compiler that the variable is
declared someplace else so that no memory is reserved for it
when the declaration is encountered.  Recall that the C compiler
refers to this variable from assembler using the C name with a
prefixed "_".  The embedded assembler supplies that label and
reserves two bytes holding a constant address safely in RAM.
Those two bytes, labelled "_p" and holding the value 60000, are
compiled into the final binary and burned into ROM.  References
to "p" from the C program read the pointer value 60000, and
dereferencing "p" reads or writes an int at address 60000.

The second method accomplishes the same thing but perhaps uses
a cleaner declaration style:

   extern int *p @ 60000;   // FIXME

Another frequently used method in some of z88dk's libraries is
to have the user declare locations in memory of various necessary
data structures.  For example, the im2 library requires a single label
to be defined in the program that holds the location of the interrupt
vector table.  Consult the im2.h header file for further information.


3. STATIC MEMORY ALLOCATION

There are two reasons why you may want to remove static variables
from the final binary.  One is that you want to make the final
binary smaller for loading into the final target by pruning out
RAM variables.  Another is that the binary must be ROMable
so that static variables must be removed and placed in RAM locations.

The methods mentioned in the last section may not be entirely
satisfactory if you have a lot of static variables to worry
about.  Each time your program changes the static variables
it uses, you will have to modify by hand the locations of each
pointer variable.

Another alternative is to write a static memory allocator:

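   unsigned int static_free = 50000;    /* next free address in RAM */

   void *static_alloc(unsigned int size)
   {
      unsigned int temp;
   
      temp = static_free;
      static_free += size;               /* memory is never returned */
      return (void *)temp;               /* convert address to pointer */
   }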

The variable "static_free" holds the address of freely available
memory in RAM, and is adjusted upward as static memory is allocated
during the initialization of the program.

All static variables are made into pointers and are allocated
memory for their data from the static allocator.

Example:

   int a;
   char *table[100];

Is turned into this:

   int *a;
   char **table;

   main()
   {
      a = static_alloc(sizeof(int));
      table = static_alloc(100*sizeof(char *));
   }

This will reduce the size of the final binary that needs to be
loaded into the target (the original static table array reserved
200 bytes in the final binary but after the change only 2 bytes
in the final binary would be reserved for the pointer) but it is
*still* not ROMable code.  Space for each pointer variable
(2 bytes) will be allocated within the output binary destined for
the ROM.  One must place the location of each pointer variable at
a specific memory address in RAM using the special pointer
declaration syntax mentioned in the last section so that they can
be assigned to from a call to static_alloc.
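
The allocator shown earlier in this section never checks whether it has
run past the end of the RAM region set aside for it.  What follows is a
minimal sketch of a bounded variant, written in portable C so that it
can be tried on a host compiler.  The names "pool" and POOL_SIZE and the
choice to return NULL on exhaustion are assumptions made for
illustration; on a z88dk target the pool would be a fixed RAM address
such as 50000 rather than an array.  The sketch ignores alignment, which
the z80 does not require.

```c
#include <stddef.h>

#define POOL_SIZE 1024u                 /* hypothetical size of the RAM region */

static unsigned char pool[POOL_SIZE];   /* stands in for RAM at a fixed address */
static size_t static_used = 0;          /* bytes handed out so far */

/* Hand out the next 'size' bytes of the pool; memory is never returned. */
void *static_alloc(size_t size)
{
   void *p;

   if (size > POOL_SIZE - static_used)
      return NULL;                      /* region exhausted */

   p = &pool[static_used];
   static_used += size;
   return p;
}
```

Successive calls return adjacent addresses, and a request larger than
the space remaining returns NULL instead of silently overrunning the
region.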


4. DYNAMIC MEMORY ALLOCATION

There are currently two dynamic memory allocators in z88dk that
are independent from each other and can be used together if desired.

A. Standard C Malloc and Free

The standard C malloc functions manage a single monolithic
chunk of memory (called the "heap"): malloc() allocates from it
and free() returns memory chunks to it for reuse.  Chunks of any
size can be requested, and a block of sufficient size is returned
if the heap has enough contiguous free space.

In your program you must include the header malloc.h and link
to the malloc library in your compile command with "-lmalloc".
In main() you must initialize the heap with a call to heapinit().

In the tiny memory space of the z80, deciding where to place the
heap is a problem.  z88dk's malloc implementation requires the user
program to declare the location of the heap in one of two ways.

The first way is to invoke the macro supplied in the header file
at global scope, alongside the rest of the static variable
declarations in your main.c file:

   HEAPSIZE(bytes)
   
where "bytes" is the size of the heap in bytes.

This macro actually declares a static array of the given size,
which is something that will be compiled as part of the final
binary.  This is not good if your binary will need to be ROMable
or if you want to shrink the size of the final binary.

Instead you can declare the location of the heap using some
embedded assembler:

   #asm
      PUBLIC _heap
      DEFC _heap = 50000
   #endasm

This will place the heap at address 50000.  The initialization
call to heapinit() takes the size of the heap as its parameter
and formats a block of memory of that size at that address.
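
As a minimal usage sketch, the routine below shows the usual
allocate, check, use and release pattern.  It is written against the
standard C library so that it can be tried on a host compiler; under
z88dk, per the text above, the program would include malloc.h and call
heapinit() in main() before the first allocation.  The helper name
copy_string is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Return a heap-allocated copy of s, or NULL if the heap cannot satisfy
   the request.  The caller releases the copy with free(). */
char *copy_string(const char *s)
{
   size_t len = strlen(s) + 1;          /* include the terminating zero */
   char *copy = malloc(len);

   if (copy == NULL)
      return NULL;                      /* always check: a small heap fills quickly */

   memcpy(copy, s, len);
   return copy;
}
```

Checking the result of malloc() matters more than usual here, since a
heap of a few kilobytes can be exhausted by a handful of requests.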

B. The Dynamic Block Allocator

The standard malloc() and free() implementation is very flexible in
that it allows memory blocks of any size to be allocated and returned
to the heap.  It also has some drawbacks, namely it suffers from
speed issues (relatively speaking) and external fragmentation.

Because memory blocks of any size can be requested from the heap, a
substantial amount of overhead is incurred keeping track of which
pieces of the heap are still available.  This makes it comparatively
slow.

Again because memory blocks of any size can be allocated, the heap
can suffer from a condition known as external fragmentation.  Over
the course of running a program, many blocks of varying sizes are
allocated and freed.  This can leave a lot of small free segments
scattered through the heap.  A memory request may then fail because
no single free segment is large enough, even though the total amount
of free space in the heap (found by adding up all the little segments)
would be sufficient.  If the heap is small (quite likely given the
64K space available), programs that make use of the heap may grind
to a halt after running for long periods because memory requests can
no longer be satisfied.
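
The gap between total free space and usable free space can be made
concrete with a small model.  The sketch below is not part of z88dk; it
assumes a heap described by a map with one byte per cell, nonzero
meaning "in use", and the function names are hypothetical.

```c
#include <stddef.h>

/* Count free cells in a heap map where map[i] != 0 means cell i is in use. */
size_t total_free(const unsigned char *map, size_t n)
{
   size_t i, count = 0;

   for (i = 0; i < n; ++i)
      if (!map[i])
         ++count;
   return count;
}

/* Length of the longest run of adjacent free cells, i.e. the largest
   single request the heap could still satisfy. */
size_t largest_free_run(const unsigned char *map, size_t n)
{
   size_t i, run = 0, best = 0;

   for (i = 0; i < n; ++i) {
      run = map[i] ? 0 : run + 1;
      if (run > best)
         best = run;
   }
   return best;
}
```

With every other cell in use, as in the map {1,0,1,0,1,0,1,0}, four
cells are free in total but the largest free run is one cell, so a
request for two cells fails.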

A solution to both these problems is the dynamic block memory
allocator.

A block memory allocator maintains several memory queues, each of
which maintains a list of free blocks of fixed size.  Queue #0 might
contain 6-byte blocks, queue #1 14-byte blocks, etc.  Memory requests
are made from a specific queue, retrieving a memory block of known size.
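Because the free memory blocks are maintained in a list, allocation and
freeing is very fast.  Because every block in a queue is the same size,
any free block can satisfy a request from that queue, so external
fragmentation does not arise.  Another important advantage this allocator
has over the heap-oriented one is that scattered free memory from all over
the memory map can be made available through the block allocator.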
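
To show the idea, here is a minimal sketch of a fixed-size block
allocator in portable C.  It is not z88dk's balloc implementation: the
names blk_add_mem, blk_alloc and blk_free, the value of NUM_QUEUES and
the requirement that the caller name the queue when freeing are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_QUEUES 4                     /* hypothetical number of queues */

static void *queue_head[NUM_QUEUES];     /* one free-list head per queue; NULL = empty */

/* Push one free block onto queue q.  The link to the next free block is
   stored inside the block itself, so free blocks cost no extra memory. */
void blk_free(unsigned q, void *block)
{
   assert(q < NUM_QUEUES);
   memcpy(block, &queue_head[q], sizeof(void *));
   queue_head[q] = block;
}

/* Carve 'count' blocks of 'size' bytes out of the region at 'addr' and add
   them to queue q.  Regions may come from anywhere in the memory map. */
void blk_add_mem(unsigned q, void *addr, size_t size, size_t count)
{
   unsigned char *p = addr;

   if (size < sizeof(void *))
      return;                            /* too small to hold the link */
   while (count--) {
      blk_free(q, p);
      p += size;
   }
}

/* Pop a block from queue q, or return NULL if the queue is empty. */
void *blk_alloc(unsigned q)
{
   void *block;

   assert(q < NUM_QUEUES);
   block = queue_head[q];
   if (block != NULL)
      memcpy(&queue_head[q], block, sizeof(void *));
   return block;
}
```

Allocation and freeing each touch only the head of one list, which is
why they are fast, and two unrelated regions added to the same queue
with blk_add_mem are indistinguishable to the caller afterwards.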

To use the block memory allocator, you must include the header balloc.h
and link to the balloc library with "-lballoc" in your compile command.

In your C program you must declare the location of the queue table
which is 2*numQ bytes in size where numQ is the number of memory queues
you want.  You can do this in one of two ways.

One way is to invoke the macro from the header file alongside the rest of
your static variables in the main.c file:

   BAQTBL(numQ)

where numQ is the number of memory queues desired.  Again, this will
declare a static array that will be compiled as part of your final
binary, something not suitable for ROMable code.

The other method is to declare the location of the queue table using
some embedded assembler:

   #asm
      PUBLIC _ba_qtbl
      DEFC _ba_qtbl = 50000
   #endasm

which will place the queue table at address 50000.

You must initialize the queues prior to use with a call to
ba_Init().  After that call the queues hold no blocks, so you must
add memory to the queues with calls to ba_AddMem().

The header files are a source of further documentation on both dynamic
memory allocators.


5. IMPLICIT DYNAMIC MEMORY ALLOCATION BY SOME Z88DK LIBRARIES

Several of z88dk's libraries need to allocate dynamic memory for their
own purposes.  The abstract data types library (see adt.h), for example,
has a linked list data type that needs to allocate containers for each
node in a linked list.  These libraries do not make assumptions about
how your program does memory allocation (i.e. whether it uses the standard
heap or the block allocator or another scheme of your own creation).
Instead they call two functions that you must provide in your C program
to perform the memory allocation on their behalf.

These two functions are declared thus:

   /* u_malloc must return with the carry flag set if allocation succeeded */
   void __FASTCALL__ *u_malloc(uint size) {
      return(malloc(size));   // lib function malloc sets carry
   }

   /* u_free must ignore addr == 0 */
   void __FASTCALL__ u_free(void *addr) {
      free(addr);             // lib function free ignores 0
   }

In this example, the standard malloc() and free() are used to
satisfy all requests, but the block memory allocator is probably
the best choice for the library functions since they frequently
allocate and free small fixed size blocks.

Here are the same two functions written in assembler:

   PUBLIC _u_malloc
   ._u_malloc                   ; HL = size in bytes
      EXTERN malloc
      jp malloc                 ; return in HL address of memory block
   
   PUBLIC _u_free
   ._u_free                     ; HL = address of memory block
      EXTERN free
      jp free

Whether or not a library makes use of implicit memory allocation
will be mentioned in its header file and the sizes of those
requests should be clear from the data structures declared there.
If you have a certain amount of programming experience, it shouldn't
be a mystery as to which libraries might need to dynamically allocate
memory.
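
To illustrate the arrangement from the library's side, the sketch below
shows a hypothetical linked list routine that obtains and releases its
node containers only through u_malloc and u_free.  It is portable C
meant for a host compiler, not code from adt.h: the struct layout and
the names list_push and list_pop are assumptions, the __FASTCALL__
qualifier is omitted, and success is judged by the returned pointer
because the carry flag convention cannot be expressed in portable C.

```c
#include <stdlib.h>

/* User-supplied hooks: here they simply forward to the standard heap. */
void *u_malloc(unsigned int size) { return malloc(size); }
void u_free(void *addr)           { free(addr); }   /* free(NULL) is a no-op */

/* A hypothetical library data type: a singly linked list that obtains its
   node containers through the hooks rather than calling malloc directly. */
struct node {
   struct node *next;
   int          value;
};

/* Push value on the front of the list; returns 0 if no memory was available. */
int list_push(struct node **head, int value)
{
   struct node *n = u_malloc(sizeof *n);

   if (n == NULL)
      return 0;
   n->value = value;
   n->next  = *head;
   *head    = n;
   return 1;
}

/* Remove the front node and pass its container back through u_free. */
int list_pop(struct node **head, int *value)
{
   struct node *n = *head;

   if (n == NULL)
      return 0;
   *value = n->value;
   *head  = n->next;
   u_free(n);
   return 1;
}
```

Because every request is for sizeof(struct node) bytes, swapping the
bodies of the two hooks for a fixed-size block scheme would change
nothing in the list code.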


6. WHAT IF YOU HAVE MORE THAN 64K?

Aside from the far pointers mentioned at the beginning of this article,
there are two things that can be done to take advantage of this extra
memory space.

By far the easiest thing you can do is access the extra memory through
a RAMdisk model, i.e. copy data between the extra banks and the main
64K bank where your program is designed to run, as needed.
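
A minimal sketch of the RAMdisk model follows, in portable C for a host
compiler.  The extra memory is simulated with an array; on real
hardware the two routines would page the bank in, copy, and page it out
again, in a way that depends on the target.  NUM_BANKS, BANK_SIZE,
bank_write and bank_read are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

#define NUM_BANKS 4                      /* hypothetical number of extra banks */
#define BANK_SIZE 16384u                 /* hypothetical bank size in bytes */

/* Stands in for the extra memory; real hardware would page banks in and out. */
static unsigned char bank[NUM_BANKS][BANK_SIZE];

/* Return nonzero if [offset, offset+len) lies inside bank b. */
static int bank_range_ok(unsigned b, size_t offset, size_t len)
{
   return b < NUM_BANKS && offset <= BANK_SIZE && len <= BANK_SIZE - offset;
}

/* Copy len bytes from main memory into a bank.  Returns 0 if out of range. */
int bank_write(unsigned b, size_t offset, const void *src, size_t len)
{
   if (!bank_range_ok(b, offset, len))
      return 0;
   memcpy(bank[b] + offset, src, len);
   return 1;
}

/* Copy len bytes from a bank into main memory.  Returns 0 if out of range. */
int bank_read(unsigned b, size_t offset, void *dst, size_t len)
{
   if (!bank_range_ok(b, offset, len))
      return 0;
   memcpy(dst, bank[b] + offset, len);
   return 1;
}
```

The program itself only ever works on the copy in main memory, so none
of its pointers need to know that banks exist.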

Another is to make use of the block memory allocator.  Separate queues
can be maintained for available memory in different memory banks.
Subroutines within your program can be confined to certain banks, so that
they only access data in those banks and those banks are paged in whenever
the subroutines run.  Those subroutines can use the block allocator
for their memory allocation needs, making sure they allocate from the
appropriate queues.

That's the end of the overview... Any specific questions can be addressed
to the z88dk users mailing list.

- aralbrec 04.2006
