
Bringing across selected design documents from mminfo for open sourcing.

Copied from Perforce
 Change: 29883
 ServerID: perforce.ravenbrook.com
This commit is contained in:
Richard Brooksby 2002-06-07 14:22:54 +01:00
commit eaf57387cd
50 changed files with 12371 additions and 0 deletions

446
mps/design/arena/index.txt Normal file

@ -0,0 +1,446 @@
THE DESIGN OF THE MPS ARENA
design.mps.arena
incomplete design
pekka 1997-08-11
INTRODUCTION
.intro: This is the design of the arena structure.
.readership: MM developers.
Document History
.hist.0: Version 0 is a different document.
.hist.1: First draft written by Pekka P. Pirinen 1997-08-11, based on
design.mps.space(0) and mail.richard.1997-04-25.11-52(0).
.hist.2: Updated for separation of tracts and segments. tony 1999-04-16
OVERVIEW
.overview: The arena serves two purposes: it is the top-level state of the MPS,
and as such contains a lot of fields which are considered "global"; and it
provides raw memory to pools.
An arena is of a particular arena class; the class is selected when the arena
is created. Classes encapsulate both policy (such as how pools' placement
preferences map into actual placement) and mechanism (such as where the memory
originates: OS VM, client provided, via malloc). Some behaviour (mostly in the
former "top-level datastructure" category) is implemented by generic arena
code, some by arena class code. To some extent the arena coordinates placement
policies between different pools active in the same arena; however, this
functionality is likely to be replaced by something more modular that does a
better job: the Locus Manager.
DEFINITIONS
.def.tract: Pools request memory from the arena (using ArenaAlloc) as a block
comprising a contiguous sequence of units. The units are known as tracts. A
tract has a specific size (the arena alignment, which often corresponds to the
OS page size) and all tracts are aligned to that size. The term is also used to
mean the datastructure used to manage tracts.
REQUIREMENTS
[copied from design.mps.arena.vm(1) and edited slightly -- drj 1999-06-23]
[Where do these come from? Need to identify and document the sources of
requirements so that they are traceable to client requirements. Most of these
come from the architectural design (design.mps.architecture) or the fix
function design (design.mps.fix). -- richard 1995-08-28]
These requirements are the responsibility of the class implementations as well
as the generic arena. However, some classes (ANSI arena, arenaan.c, in
particular) are not intended for production use so do not have to meet all the
speed and space requirements.
Block Management
.req.fun.block.alloc: The Arena Manager must provide allocation of contiguous
blocks of memory.
.req.fun.block.free: It must also provide freeing of contiguously allocated
blocks owned by a pool - whether or not the block was allocated via a single
request.
.req.attr.block.size.min: The Arena Manager must support management of blocks
down to the size of the grain (page) provided by the virtual mapping interface
if a VM interface is being used, a comparable size otherwise.
.req.attr.block.size.max: It must also support management of blocks up to the
maximum size allowed by the combination of operating system and architecture.
This is derived from req.dylan.attr.obj.max (at least).
.req.attr.block.align.min: The alignment of blocks shall not be less than
MPS_PF_ALIGN (defined in "mpstd.h" included via "config.h") for the
architecture. This is so that pool classes can conveniently guarantee pool
allocated blocks are aligned to MPS_PF_ALIGN. (A trivial requirement)
.req.attr.block.grain.max: The granularity of allocation shall not be more than
the grain size provided by the virtual mapping interface.
Address Translation
.req.fun.trans: The Arena must provide a translation from any address to either
an indication that the address is not in any tract (if that is so) or the
following data associated with the tract containing that address:
.req.fun.trans.pool: The pool that allocated the tract.
.req.fun.trans.arbitrary: An arbitrary pointer value that the pool can
associate with the tract at any time.
.req.fun.trans.white: The tracer whiteness information, i.e. a bit for each
active trace that indicates whether this tract is white (contains white
objects). This is required so that the tracer resolve / preserve (aka "Fix")
protocol can run very quickly.
.req.attr.trans.time: The translation shall take no more than @@@@ [something
not very large -- drj 1999-06-23]
Iteration Protocol
.req.iter: er, there's a tract iteration protocol which is presumably required
for some reason?
Arena Partition
.req.fun.set: The Arena Manager must provide a method for approximating sets of
addresses.
.req.fun.set.time: The determination of membership shall take no more than ????
[something very small indeed]. (the non-obvious solution is refsets)
Constraints
.req.attr.space.overhead: req.dylan.attr.space.struct implies that the arena
must limit the space overhead. The arena is not the only part that introduces
an overhead (pool classes being the next most obvious), so multiple parts must
cooperate in order to meet the ultimate requirements.
.req.attr.time.overhead: Time overhead constraint? [how can there be a time
"overhead" on a necessary component? drj 1999-06-23]
ARCHITECTURE
Statics
.static: There is no higher-level data structure than an arena, so in order to
support several arenas, we have to have some static data in impl.c.arena. See
impl.c.arena.static.
.static.init: All the static data items are initialized when the first arena is
created.
.static.serial: arenaSerial is a static Serial, containing the serial number of
the next arena to be created. The serial of any existing arena is less than
this.
.static.ring: arenaRing is the sentinel of the ring of arenas.
.static.ring.init: arenaRingInit is a bool showing whether the ring of arenas
has been initialized.
.static.ring.lock: The ring of arenas has to be locked when traversing the
ring, to prevent arenas being added or removed. This is achieved by using the
(non-recursive) global lock facility, provided by the lock module.
.static.check: The statics are checked each time any arena is checked.
Arena Classes
.class: The Arena datastructure is designed to be subclassable (see
design.mps.protocol(0)). Clients can select what arena class they'd like when
instantiating one with mps_arena_create(). The arguments to mps_arena_create
are class dependent.
.class.init: However, the generic ArenaInit is called from the class-specific
method, rather than vice versa, because the method is responsible for
allocating the memory for the arena descriptor and the arena lock in the first
place. Likewise, ArenaFinish is called from the finish method.
.class.fields: The alignment (for tract allocations) and zoneShift (for
computing zone sizes and what zone an address is in) fields in the arena are
the responsibility of each class, and are initialized by the init method.
The responsibility for maintaining the commitLimit, spareCommitted,
spareCommitLimit fields is shared between the (generic) arena and the arena
class. commitLimit (see .commit-limit below) is changed by the generic arena
code, but arena classes are responsible for ensuring the semantics. For
spareCommitted and spareCommitLimit see .spare-committed below.
.class.abstract: The basic arena class (AbstractArenaClass) is abstract and
must not be instantiated. It provides little useful behaviour, and exists
primarily as the root of the tree of arena classes. Each concrete class must
specialize each of the class method fields, with the exception of the describe
method (which has a trivial implementation) and the extend, retract and
spareCommitExceeded methods which have non-callable methods for the benefit of
arena classes which don't implement these features.
.class.abstract.null: The abstract class does not provide dummy implementations
of those methods which must be overridden. Instead each abstract method is
initialized to NULL.
Tracts
.tract: The arena allocation function (ArenaAlloc) allocates a block of memory
to pools, of a size which is aligned to the arena alignment. Each alignment
unit (grain) of allocation is represented by an object called a Tract. Tracts
are the hook on which the segment module is implemented. Pools which don't use
segments may use tracts for associating their own data with each allocation
grain.
.tract.structure: The tract structure definition looks as follows:-
typedef struct TractStruct {    /* Tract structure */
  Pool pool;                    /* MUST BE FIRST (design.mps.arena.tract.field.pool) */
  void *p;                      /* pointer for use of owning pool */
  Addr base;                    /* Base address of the tract */
  TraceSet white : TRACE_MAX;   /* traces for which tract is white */
  unsigned int hasSeg : 1;      /* does tract have a seg in p? */
} TractStruct;
.tract.field.pool: The pool field indicates to which pool the tract has been
allocated (.req.fun.trans.pool). Tracts are only valid when they are allocated
to pools. When tracts are not allocated to pools, arena classes are free to
reuse tract objects in undefined ways. A standard technique is for arena class
implementations to internally describe the objects as a union type of
TractStruct and some private representation, and to set the pool field to NULL
when the tract is not allocated. The pool field must come first so that the
private representation can share a common prefix with TractStruct. This permits
arena classes to determine from their private representation whether such an
object is allocated or not, without requiring an extra field.
.tract.field.p: The p field is used by pools to associate tracts with other
data (.req.fun.trans.arbitrary). It's used by the segment module to indicate
which segment a tract belongs to. If a pool doesn't use segments it may use
the p field for its own purposes. This field has the non-specific type (void *)
so that pools can use it for any purpose.
.tract.field.hasSeg: The hasSeg bit-field is a boolean which indicates whether
the p field is being used by the segment module. If this field is TRUE, then
the value of p is a Seg. hasSeg is typed as an unsigned int, rather than a
Bool. This ensures that there won't be sign conversion problems when converting
the bit-field value.
.tract.field.base: The base field contains the base address of the memory
represented by the tract.
.tract.field.white: The white bit-field indicates for which traces the tract is
white (.req.fun.trans.white). This information is also stored in the segment,
but is duplicated here for efficiency during a call to TraceFix (see
design.mps.trace.fix).
.tract.limit: The limit of the tract's memory may be determined by adding the
arena alignment to the base address.
.tract.iteration: Iteration over tracts is described in
design.mps.arena.tract-iter(0).
.tract.if.tractofaddr: Function TractOfAddr finds the tract corresponding to an
address in memory. (.req.fun.trans).
Bool TractOfAddr(Tract *tractReturn, Arena arena, Addr addr);
If addr is an address which has been allocated to some pool, then returns TRUE,
and sets *tractReturn to the tract corresponding to that address. Otherwise,
returns false. This function is similar to TractOfBaseAddr (see
design.mps.arena.tract-iter.if.contig-base) but serves a more general purpose
and is less efficient.
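As an illustrative sketch (not part of the original document), a caller might
use TractOfAddr as follows; TractPool is assumed here as an accessor for the
tract's pool field:
  Tract tract;
  if (TractOfAddr(&tract, arena, addr)) {
    Pool pool = TractPool(tract);  /* assumed accessor for the pool field */
    /* addr lies within memory allocated to pool */
  } else {
    /* addr is not in any tract */
  }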
.tract.if.TRACT_OF_ADDR: TRACT_OF_ADDR is a macro version of TractOfAddr. It's
provided for efficiency during a call to TraceFix (see
design.mps.trace.fix.tractofaddr)
Control Pool
.pool: Each arena has a "control pool", arena->controlPoolStruct, which is used
for allocating MPS control data structures (using ControlAlloc()).
Polling
.poll: ArenaPoll is called "often" by other MM code (for instance, on buffer
fill or allocation). It is the entry point for doing tracing work. If the
polling clock exceeds a set threshold, and we're not already doing some tracing
work (i.e., insidePoll is not set), it calls TracePoll on all busy traces.
.poll.size: The actual clock is arena->fillMutatorSize. This is because
internal allocation is only significant when copy segments are being allocated,
and we don't want the pause times to shrink because of that. (There is
no current requirement for the trace rate to guard against running out of
memory. [clearly it really ought to though, we have a requirement to not run
out of memory (req.dylan.prot.fail-alloc, req.dylan.prot.consult), emergency
tracing should not be our only story. drj 1999-06-22]) BufferEmpty is not
taken into account, because the splinter will rarely be usable for allocation
and we are wary of the clock running backward.
.poll.clamp: Polling is disabled when the arena is "clamped", in which case
arena->clamped is TRUE. Clamping the arena prevents background tracing work,
and further new garbage collections from starting. Clamping and releasing are
implemented by the ArenaClamp and ArenaRelease methods.
.poll.park: The arena is "parked" by clamping it, then polling until there are
no active traces. This finishes all the active collections and prevents
further collection. Parking is implemented by the ArenaPark method.
Commit Limit
.commit-limit: The arena supports a client configurable "commit limit" which is
a limit on the total amount of committed memory. The generic arena structure
contains a field to hold the value of the commit limit and the implementation
provides two functions for manipulating it (ArenaCommitLimit to read it and
ArenaSetCommitLimit to set it). Actually abiding by the contract of not
committing more memory than the commit limit is left up to the individual arena
classes.
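As a hedged sketch (not from this document), the check an arena class might
make on its allocation path could look like the following, where 'committed'
stands for the class's own account of currently committed bytes (a name
assumed for illustration):
  if (committed + size > arena->commitLimit)
    return ResCOMMIT_LIMIT;  /* refuse rather than exceed the limit */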
.commit-limit.err: When allocation from the arena would otherwise succeed but
would cause the MPS to use more committed memory than specified by the commit
limit, ArenaAlloc should refuse the request and return ResCOMMIT_LIMIT.
.commit-limit.err.multi: In the case where an ArenaAlloc request cannot be
fulfilled for more than one reason, including exceeding the commit limit, class
implementations should strive to return a result code other than
ResCOMMIT_LIMIT (i.e. ResCOMMIT_LIMIT should only be returned if the _only_
reason for failing the ArenaAlloc request is that the commit limit would be
exceeded). However, the (client) documentation allows implementations to be
ambiguous with respect to which result code is returned in such a situation.
Spare Committed (aka "hysteresis")
.spare-committed: (See symbol.mps.c.mps_arena_spare_committed(0)) The generic
arena structure contains two fields for the spare committed memory fund:
spareCommitted records the total number of spare committed bytes;
spareCommitLimit records the limit (set by the user) on the amount of spare
committed memory. spareCommitted is modified by the arena class but its value
is used by the generic arena code. There are two uses: it is returned by a
getter function provided through the MPS interface (mps_arena_spare_committed),
and it is used by the ArenaSetSpareCommitLimit function to determine whether
the amount of spare committed memory needs to be reduced.
spareCommitLimit is manipulated by generic arena code; however, the associated
semantics are the responsibility of the class. It is the class's responsibility
to ensure that it doesn't use more spare committed bytes than the value in
spareCommitLimit.
.spare-commit-limit: The function ArenaSetSpareCommitLimit sets the
spareCommitLimit field. If the limit is set to a value lower than the amount
of spare committed memory (stored in spareCommitted) then the class specific
function spareCommitExceeded is called.
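A minimal sketch of that behaviour (the spelling of the class-method dispatch
is assumed for illustration):
  void ArenaSetSpareCommitLimit(Arena arena, Size limit)
  {
    arena->spareCommitLimit = limit;
    if (arena->spareCommitted > limit)
      /* ask the class to release spare committed memory */
      (*arena->class->spareCommitExceeded)(arena);
  }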
Locks
.lock.ring: ArenaAccess is called when we fault on a barrier. The first thing
it does is claim the non-recursive global lock to protect the arena ring (see
design.mps.lock(0)).
.lock.arena: After the arena ring lock is claimed,
ArenaEnter is called on one or more arenas. This claims the lock for that
arena. When the correct arena is identified or we run out of arenas, the lock
on the ring is released.
.lock.avoid: Deadlocking is avoided as follows:
.lock.avoid.mps: Firstly we require the MPS not to fault (i.e., when any of
these locks are held by a thread, that thread does not fault).
.lock.avoid.thread: Secondly, we require that in a multi-threaded system,
memory fault handlers do not suspend threads (although the faulting thread
will, of course, wait for the fault handler to finish).
.lock.avoid.conflict: Thirdly, we avoid conflicting deadlock between the arena
and global locks by ensuring we never claim the arena lock when the recursive
global lock is already held, and we never claim the binary global lock when the
arena lock is held.
Location Dependencies
.ld: Location dependencies use fields in the arena to maintain a history of
summaries of moved objects, and to keep a notion of time, so that the staleness
of a location dependency can be determined.
Finalization
.final: There is a pool which is optionally (and dynamically) instantiated to
implement finalization. The fields finalPool and isFinalPool are used.
IMPLEMENTATION
Tract Cache
.tract.cache: When tracts are allocated to pools (by ArenaAlloc), the first
tract of the block and its base address are cached in arena fields lastTract
and lastTractBase. The function TractOfBaseAddr (see
design.mps.arena.tract-iter.if.block-base(0)) checks against these cached
values and only calls the class method on a cache miss. This optimizes for the
common case where a pool allocates a block and then iterates over all its
tracts (e.g. to attach them to a segment).
.tract.uncache: When blocks of memory are freed by pools, ArenaFree checks to
see if the cached value for the most recently allocated tract (see
.tract.cache) is being freed. If so, the cache is invalid, and must be reset.
The lastTract and lastTractBase fields are set to NULL.
Control Pool
.pool.init: The control pool is initialized by a call to PoolInit() during
ArenaCreate().
.pool.ready: All the other fields in the arena are made checkable before
calling PoolInit(), so PoolInit can call ArenaCheck(arena). The pool itself
is, of course, not checkable, so we have a field arena->poolReady, which is
false until after the return from PoolInit. ArenaCheck only checks the pool
if(poolReady).
Traces
.trace: arena->trace[ti] is valid if and only if
TraceSetIsMember(arena->busyTraces, ti).
.trace.create: Since the arena created by ArenaCreate has arena->busyTraces =
TraceSetEMPTY, none of the traces are meaningful.
.trace.invalid: Invalid traces have signature SigInvalid, which can be checked.
Polling
.poll.fields: There are three fields of an arena used for polling:
pollThreshold, insidePoll, and clamped (see above). pollThreshold is the
threshold for the next poll: it is set at the end of ArenaPoll to the current
polling time plus ARENA_POLL_MAX.
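Putting .poll, .poll.clamp and .poll.fields together, the polling logic is
roughly as follows. This is only a schematic sketch: the loop over busy traces
is omitted and the exact types are assumed.
  void ArenaPoll(Arena arena)
  {
    double size;
    if (arena->clamped || arena->insidePoll)
      return;
    size = arena->fillMutatorSize;          /* the polling clock (.poll.size) */
    if (size < arena->pollThreshold)
      return;
    arena->insidePoll = TRUE;
    /* ... call TracePoll on each busy trace ... */
    arena->pollThreshold = size + ARENA_POLL_MAX;
    arena->insidePoll = FALSE;
  }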
Location Dependencies
.ld.epoch: arena->epoch is the "current epoch". This is the number of 'flips'
of traces in the arena since the arena was created. From the mutator's point
of view locations change atomically at flip.
.ld.history: arena->history is an array of ARENA_LD_LENGTH RefSets. These are
the summaries of moved objects since the last ARENA_LD_LENGTH epochs. If e is
one of these recent epochs, arena->history[e % ARENA_LD_LENGTH] is a summary of
(the original locations of) objects moved since epoch e.
.ld.prehistory: arena->prehistory is a RefSet summarizing the original
locations of all objects ever moved. When considering whether a really old
location dependency is stale, it is compared with this summary.
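A hedged sketch of how these fields might be combined when judging the
staleness of a dependency created at epoch e with reference summary rs (the
actual interface lives in design.mps.ld, not here):
  RefSet moved;
  if (arena->epoch - e >= ARENA_LD_LENGTH)
    moved = arena->prehistory;                    /* dependency is very old */
  else
    moved = arena->history[e % ARENA_LD_LENGTH];  /* moved since epoch e */
  /* the dependency may be stale iff rs intersects moved */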
Roots
.root-ring: The arena holds the sentinel of the ring of roots in the arena. It
also holds an incrementing serial, which is the serial of the next root.


@ -0,0 +1,167 @@
VIRTUAL MEMORY ARENA
design.mps.arena.vm
incomplete doc
drj 1996-07-16
INTRODUCTION
.intro: This document describes the detailed design of the Virtual Memory Arena
Class of the Memory Pool System. The VM Arena Class is just one class
available in the MPS. The generic arena part is described in design.mps.arena.
OVERVIEW
.overview: VM arenas provide blocks of memory to all other parts of the MPS in
the form of "tracts" using the virtual mapping interface (design.mps.vm) to the
operating system. The VM Arena Class is not expected to be provided on
platforms that do not have virtual memory (like MacOS, os.s7(1)).
.overview.gc: The VM Arena Class provides some special services on these blocks
in order to facilitate garbage collection:
.overview.gc.zone: Allocation of blocks with specific zones. This means that
the generic fix function (design.mps.fix) can use a fast refset test to
eliminate references to addresses that are not in the condemned set. This
assumes that a pool class that uses this placement appropriately is being used
(such as the generation placement policy used by AMC, see design.mps.poolamc(1))
and that the pool selects the condemned sets to coincide with zone stripes.
.overview.gc.tract: A fast translation from addresses to tract. (See
design.mps.arena.req.fun.trans)
NOTES
.note.refset: Some of this document simply assumes that RefSets (see the
horribly incomplete design.mps.refset) have been chosen as the solution for
design.mps.arena.req.fun.set. It's a lot simpler that way. Both to write and
understand.
REQUIREMENTS
Most of the requirements are in fact on the generic arena (see
design.mps.arena.req.*). However, many of those requirements can only be met by
a suitable arena class design.
Requirements particular to this arena class:
Placement
.req.fun.place: It must be possible for pools to obtain tracts at particular
addresses. Such addresses shall be declared by the pool specifying what refset
zones the tracts should lie in and what refset zones the tracts should not lie
in. It is acceptable for the arena to not always honour the request in terms
of placement if it has run out of suitable addresses.
Arena Partition
.req.fun.set: See design.mps.arena.req.fun.set. The approximation to sets of
addresses must cooperate with the placement mechanism in the way required by
.req.fun.place (above).
ARCHITECTURE
.arch.memory: The underlying memory is obtained from whatever Virtual Memory
interface the platform provides (see design.mps.vm). Explain why this is used ###
SOLUTION IDEAS
.idea.grain: Set the arena granularity to the grain provided by the virtual
mapping module.
.idea.mem: Get a single large contiguous address area from the virtual mapping
interface and divide that up.
.idea.table: Maintain a table with one entry per grain in order to provide fast
mapping (shift and add) between addresses and table entries.
.idea.table.figure:
.idea.map: Store the pointers (.req.fun.trans) in the table directly for every
grain.
.idea.zones: Partition the managed address space into zones (see idea.zones)
and provide the set approximation as a reference signature.
.idea.first-fit: Use a simple first-fit allocation policy for tracts within
each zone (.idea.zones). Store the freelist in the table (.idea.table).
.idea.base: Store information about each contiguous area (allocated or free) in
the table entry (.idea.table) corresponding to the base address of the area.
.idea.shadow: Use the table (.idea.table) as a "shadow" of the operating
system's page table. Keep information such as last access, protection, etc. in
this table, since we can't get at this information otherwise.
.idea.barrier: Use the table (.idea.table) to implement the software barrier.
Each segment can have a read and/or write barrier placed on it by each
process. (.idea.barrier.bits: Store a bit-pattern which remembers which
process protected what.) This will give a fast translation from a
barrier-protected address to the barrier handler via the process table.
.idea.demand-table: For a 1Gb managed address space with a 4Kb page size, the
table will have 256K-entries. At (say) four words per entry, this is 4Mb of
table. Although this is only a 0.4% overhead, the table shouldn't be
preallocated: initially it would be an infinite overhead, and with 1Mb active it
would be a 300% overhead! The address space for the table should be reserved,
but the pages for it mapped and unmapped on demand. By storing the table in a
tract, the status of the table's pages can be determined by looking at its own
entries in itself, and thus the translation lookup (.req.fun.trans) is slowed to
two lookups rather than one.
.idea.pool: Make the Arena Manager a pool class. Arena initialization becomes
pool creation. Tract allocation becomes PoolAlloc. Other operations become
class-specific operations on the "arena pool".
DATA STRUCTURES
.tables: There are two table data structures: a page table, and an alloc table.
.table.page.map: Each page in the VM has a corresponding page table entry.
.table.page.linear: The table is a linear array of PageStruct entries; there is
a simple mapping between the index in the table and the base address in the VM
(viz. base-address = arena-base + (index * page-size), one way, index =
(base-address - arena-base) / page-size, the other).
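A minimal sketch of the two directions of that mapping, assuming the usual
AddrAdd/AddrOffset address arithmetic helpers and passing the illustrative
arenaBase and pageSize values explicitly:
  Addr PageIndexBase(Addr arenaBase, Size pageSize, Index i)
  {
    return AddrAdd(arenaBase, i * pageSize);        /* base = arena-base + index * page-size */
  }

  Index AddrPageIndex(Addr arenaBase, Size pageSize, Addr addr)
  {
    return AddrOffset(arenaBase, addr) / pageSize;  /* index = (addr - arena-base) / page-size */
  }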
.table.page.partial: The table is partially mapped on an "as-needed" basis.
The function unusedTablePages identifies entirely unused pages occupied by the
page table itself (i.e. those pages of the page table which are occupied by
PageStructs which all describe free pages). Tract allocation and freeing use
this function to map and unmap the page table with no hysteresis. (There is a
restriction on the parameters you may pass to unusedTablePages.)
.table.page.tract: Each page table entry contains a tract, which is only valid
if it is allocated to a pool. If it is not allocated to a pool, the fields of
the tract are used for other purposes. (See design.mps.arena.tract.field.pool)
.table.alloc: The alloc table is a simple bit table (implemented using the BT
module, design.mps.bt).
.table.alloc.map: Each page in the VM has a corresponding alloc table entry.
.table.alloc.semantics: The bit in the alloc table is set iff the corresponding
page is allocated (to a pool).
NOTES
.fig.page: How the pages in the arena area are represented in the tables.
.fig.count: How a count table can be used to partially map the page table, as
proposed in request.dylan.170049.sol.map.
- arenavm diagrams
ATTACHMENT
"arenavm diagrams"

610
mps/design/bt/index.txt Normal file

@ -0,0 +1,610 @@
BIT TABLES
design.mps.bt
draft doc
drj 1997-03-04
INTRODUCTION
.readership: Any MPS developer.
.intro: This is the design of the Bit Tables module. A Bit Table is a linear
array of bits. A Bit Table of length n is indexed using an integer from 0 to
(but not including) n. Each bit in a Bit Table can hold either the value 0
(aka FALSE) or 1 (aka TRUE). A variety of operations are provided including:
set, reset, and retrieve individual bits; set and reset a contiguous range of
bits; search for a contiguous range of reset bits; making a "negative image"
copy of a range.
HISTORY
.history.0-3: The history for versions 0-3 is lost pending possible
reconstruction.
.history.4: Prepared for review. Added full requirements section. Made
notation more consistent throughout. Documented all functions. drj 1999-04-29
DEFINITIONS
.def.set: Set. Used as a verb meaning to assign the value 1 or TRUE to a bit.
Used descriptively to denote a bit containing the value 1. Note 1 and TRUE are
synonyms in MPS C code (see design.mps.type(0).bool.value).
.def.reset: Reset. Used as a verb meaning to assign the value 0 or FALSE to a
bit. Used descriptively to denote a bit containing the value 0. Note 0 and
FALSE are synonyms in MPS C code (see design.mps.type(0).bool.value).
[consider using "fill/empty" or "mark/clear" instead of "set/reset", set/reset
is probably a hangover from drj's z80 hacking days -- drj 1999-04-26]
.def.bt: Bit Table. A Bit Table is a mapping from [0,n) to {0,1} for some n
represented as a linear array of bits.
.def.bt.justify: They are called bit tables because a single bit is used to
encode whether the image of a particular integer under the map is 0 or 1.
.def.range: Range. A contiguous sequence of bits in a Bit Table. Ranges are
typically specified as a base--limit pair where the range includes the position
specified by the base, but excludes that specified by the limit. The
mathematical interval notation for half-open intervals, [base, limit), is used.
REQUIREMENTS
.req.bit: The storage for a Bit Table of n bits shall take no more than a small
constant addition to the storage required for n bits. .req.bit.why: This is so
that clients can make some predictions about how much storage their algorithms
use. A small constant is allowed over the minimal for two reasons: inevitable
implementation overheads (such as only being able to allocate storage in
multiples of 32 bits), extra storage for robustness or speed (such as signature
and length fields).
.req.create: A means to create Bit Tables. .req.create.why: Obvious.
.req.destroy: A means to destroy Bit Tables. .req.destroy.why: Obvious.
.req.ops: The following operations shall be supported:
.req.ops.get: Get. Get the value of a bit at a specified index.
.req.ops.set: Set. Set a bit at a specified index.
.req.ops.reset: Reset. Reset a bit at a specified index.
.req.ops.minimal.why: Get, Set, Reset, are the minimal operations. All
possible mappings can be created and inspected using these operations.
.req.ops.set.range: SetRange. Set a range of bits. .req.ops.set.range.why:
It's expected that clients will often want to set a range of bits; providing
this operation allows the implementation of the BT module to make the operation
efficient.
.req.ops.reset.range: ResetRange. Reset a range of bits.
.req.ops.reset.range.why: as for SetRange, see .req.ops.set.range.why.
.req.ops.test.range.set: IsSetRange. Test whether a range of bits are all
set. .req.ops.test.range.set.why: Mostly for checking. For example, often
clients will know that a range they are about to reset is currently all set,
they can use this operation to assert that fact.
.req.ops.test.range.reset: IsResetRange. Test whether a range of bits are
all reset. .req.ops.test.range.reset.why: As for IsSetRange, see
.req.ops.test.range.set.why.
.req.ops.find: Find a range (which we'll denote [i,j)) of at least L reset
bits that lies in a specified subrange of the entire Bit Table. Various find
operations are required according to the (additional) properties of the
required range:
.req.ops.find.short.low: FindShortResetRange. Of all candidate ranges,
find the range with least j (find the leftmost range that has at least L reset
bits and return just enough of that). .req.ops.find.short.low.why: Required by
client and VM arenas to allocate segments. The arenas implement definite
placement policies (such as lowest addressed segment first) so they need the
lowest (or highest) range that will do. It's not currently useful to allocate
segments larger than the requested size, so finding a short range is
sufficient.
.req.ops.find.short.high: FindShortResetRangeHigh. Of all candidate
ranges, find the range with greatest i (find the rightmost range that has at
least L reset bits and return just enough of that).
.req.ops.find.short.high.why: Required by arenas to implement a specific
segment placement policy (highest addressed segment first).
.req.ops.find.long.low: FindLongResetRange. Of all candidate ranges,
identify the ranges with least i and of those find the one with greatest j
(find the leftmost range that has at least L reset bits and return all of it).
.req.ops.find.long.low.why: Required by the mark and sweep Pool Classes (AMS,
AWL, LO) for allocating objects (filling a buffer). It's more efficient to
fill a buffer with as much memory as is conveniently possible. There's no
strong reason to find the lowest range but it's bound to have some beneficial
(small) cache effect and makes the algorithm more predictable.
.req.ops.find.long.high: FindLongResetRangeHigh. Provided, but not
required, see .non-req.ops.find.long.high.
.req.ops.copy: Copy a range of bits from one Bit Table to another Bit Table.
Various copy operations are required:
.req.ops.copy.simple: Copy a range of bits from one Bit Table to the same
position in another Bit Table. .req.ops.copy.why: Required to support copying
of the tables for the "low" segment during segment merging and splitting, for
pools using tables (e.g. PoolClassAMS).
.req.ops.copy.offset: Copy a range of bits from one Bit Table to an offset
position in another Bit Table. .req.ops.copy.why: Required to support copying
of the tables for the "high" segment during segment merging and splitting, for
pools which support this (currently none, as of 2000-01-17).
.req.ops.copy.invert: Copy a range of bits from one Bit Table to the same
position in another Bit Table inverting all the bits in the target copy.
.req.ops.copy.invert.why: Required by colour manipulation code in PoolClassAMS
and PoolClassLO.
.req.speed: Operations shall take no more than a few memory operations per bit
manipulated. .req.speed.why: Any slower would be gratuitous.
.req.speed.fast: The following operations shall be very fast:
.req.speed.fast.find.short:
FindShortResRange (the operation used to meet .req.ops.find.short.low)
FindShortResRangeHigh (the operation used to meet .req.ops.find.short.high)
.req.speed.fast.find.short.why: These two are used by the client arena
(design.mps.arena.client) and the VM arena (design.mps.arena.vm) for finding
segments in page tables. The operation will be used sufficiently often that
its speed will noticeably affect the overall speed of the MPS. They will be
called with a length equal to the number of pages in a segment. Typical values
of this length depend on the pool classes used and their configuration, but we
can expect length to be small (1 to 16) usually. We can expect the Bit Table
to be populated densely where it is populated at all, that is set bits will
tend to be clustered together in subranges.
.req.speed.fast.find.long:
FindLongResRange (the operation used to meet .req.ops.find.long.low)
.req.speed.fast.find.long.why:
Used in the allocator for PoolClassAWL (design.mps.poolawl(1)), PoolClassAMS
(design.mps.poolams(2)), PoolClassEPVM (design.mps.poolepvm(0)). Of these AWL
and EPVM have speed requirements. For AWL the length of range to be found will
be the length of a Dylan table in words. According to
mail.tony.1999-05-05.11-36(0), only <entry-vector> objects are allocated in AWL
(though not all <entry-vector> objects are allocated in AWL), and the mean
length of an <entry-vector> object is 486 Words. No data for EPVM alas.
.req.speed.fast.other.why: We might expect mark and sweep pools to make use of
Bit Tables; the MPS has general requirements to support efficient mark and
sweep pools, so that imposes general speed requirements on Bit Tables.
NON REQUIREMENTS
The following are not requirements but the current design could support them
with little modification or does support them. Often they used to be
requirements, but are no longer, or were added speculatively or experimentally
but aren't currently used.
.non-req.ops.test.range.same: RangesSame. Test whether two ranges that
occupy the same positions in different Bit Tables are the same. This used to
be required by PoolClassAMS, but is no longer. Currently (1999-05-04) the
functionality still exists.
.non-req.ops.find.long.high: FindLongResetRangeHigh. (see .req.ops.find) Of
all candidate ranges, identify the ranges with greatest j and of those find the
one with least i (find the rightmost range that has at least L reset bits and
return all of it). Provided for symmetry but only currently used by the BT
tests and cbstest.c.
BACKGROUND
.background: Originally Bit Tables were used and implemented by PoolClassLO
(design.mps.poollo). It was decided to lift them out into a separate module
when designing the Pool to manage Dylan Weak Tables which is also a mark and
sweep pool and will make use of Bit Tables (see design.mps.poolawl).
.background.analysis: analysis.mps.bt(0) contains some of the analysis of the
design decisions that were and were not made in this document.
CLIENTS
.clients: Bit Tables are used throughout the MPS but the important uses are: In
the client and VM arenas (design.mps.arena.client(0) and
design.mps.arena.vm(1)) a bit table is used to record whether each page is free
or not; several pool classes (PoolClassLO, PoolClassEPVM, PoolClassAMS) use bit
tables to record which locations are free and also to store colour.
OVERVIEW
.over: Mostly, the design is as simple as possible. The significant
complications are iteration (see .iteration below) and searching (see
.fun.find-res-range below) because both of these are required to be fast.
INTERFACE
.if.representation.abstract: A Bit Table is represented by the type BT.
.if.declare: The module declares a type BT and a prototype for each of the
functions below. The type is declared in impl.h.mpmtypes, the prototypes are
declared in impl.h.mpm. Some of the functions are in fact implemented as
macros in the usual way (doc.mps.ref-man.if-conv(0).macro.std).
.if.general.index: Many of the functions specified below take indexes. If
otherwise unspecified an index must be in the interval [0,n) (note, up to, but
not including, n) where n is the number of bits in the relevant Bit Table (as
passed to the BTCreate function). .if.general.range: Where a range is
specified by two indexes base and limit, base, which specifies the beginning of
the range, must be in the interval [0,n), limit, which specifies the end of the
range, must be in the interval [1,n] (note can be n), and base must be strictly
less than limit (empty ranges are not allowed). Sometimes i and j are used
instead of base and limit.
.if.create:
Res BTCreate(BT *btReturn, Arena arena, Count n)
Attempts to create a table of length n in the arena control pool, putting the
table in '*btReturn'. Returns ResOK if and only if the table is created OK.
The initial values of the bits in the table are undefined (so the client should
probably call BTResRange on the entire range before using the BT). Meets
.req.create.
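For example, a typical creation sequence might look like this (a sketch; error
handling abbreviated):
  BT bt;
  Res res;

  res = BTCreate(&bt, arena, n);
  if (res != ResOK)
    return res;            /* table could not be created */
  BTResRange(bt, 0, n);    /* initial bit values are undefined, so clear them */
  /* ... use the table ... */
  BTDestroy(bt, arena, n); /* n must match the value passed to BTCreate */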
.if.destroy:
void BTDestroy(BT t, Arena arena, Count n);
Destroys the table t, which must have been created with BTCreate (.if.create).
The value of argument n must be same as the value of the argument passed to
BTCreate. Meets .req.destroy.
.if.size:
size_t BTSize(unsigned long n);
BTSize(n) returns the number of bytes needed for a Bit Table of n bits. It is
a checked error (an assertion will fail) for n to exceed ULONG_MAX -
MPS_WORD_WIDTH + 1. This is used by clients that allocate storage for the BT
themselves. Before BTCreate and BTDestroy were implemented that was the only
way to allocate a Bit Table, but is now deprecated.
.if.get:
int BTGet(BT t, Index i);
BTGet(t, i) returns the ith bit of the table t (i.e. the image of i under the
mapping). Meets .req.ops.get.
.if.set:
void BTSet(BT t, Index i);
BTSet(t, i) sets the ith bit of the table t (to 1). BTGet(t, i) will now
return 1. Meets .req.ops.set.
.if.res:
void BTRes(BT t, Index i);
BTRes(t, i) resets the ith bit of the table t (to 0). BTGet(t, i) will now
return 0. Meets .req.ops.reset.
.if.set-range:
void BTSetRange(BT t, Index base, Index limit);
BTSetRange(t, base, limit) sets the range of bits [base, limit) in the table
t. BTGet(t, x) will now return 1 for base<=x<limit. Meets .req.ops.set.range.
.if.res-range:
void BTResRange(BT t, Index base, Index limit);
BTResRange(t, base, limit) resets the range of bits [base, limit) in the table
t. BTGet(t, x) will now return 0 for base<=x<limit. Meets .req.ops.reset.range.
.if.test.range.set:
Bool BTIsSetRange(BT bt, Index base, Index limit);
Returns TRUE if all the bits in the range [base, limit) are set, FALSE
otherwise. Meets .req.ops.test.range.set.
.if.test.range.reset:
Bool BTIsResRange(BT bt, Index base, Index limit);
Returns TRUE if all the bits in the range [base, limit) are reset, FALSE
otherwise. Meets .req.ops.test.range.reset.
.if.test.range.same:
Bool BTRangesSame(BT BTx, BT BTy, Index base, Index limit);
returns TRUE if BTGet(BTx,i) equals BTGet(BTy,i) for i in [base, limit), and
FALSE otherwise. Meets .req.ops.test.range.same.
.if.find.general: There are four functions (below) to find reset ranges. All
the functions have the same prototype (for symmetry):
Bool find(Index *baseReturn, Index *limitReturn,
BT bt,
Index searchBase, Index searchLimit,
unsigned long length);
bt is the Bit Table in which to search. searchBase and searchLimit specify a
subset of the Bit Table to use; the functions will only find ranges that are
subsets of [searchBase, searchLimit) (when set *baseReturn will never be less
than searchBase and *limitReturn will never be greater than searchLimit).
searchBase, searchLimit specify a range that must conform to the general range
requirements for a range [i,j), as per .if.general.range modified
appropriately. length is the number of contiguous reset bits to find; it must
not be bigger than searchLimit - searchBase (that would be silly). If a
suitable range cannot be found the function returns FALSE (0) and leaves
*baseReturn and *limitReturn untouched. If a suitable range is found then the
function returns the range's base in *baseReturn and its limit in *limitReturn
and returns TRUE (1).
.if.find-short-res-range:
Bool BTFindShortResRange(Index *baseReturn, Index *limitReturn,
BT bt,
Index searchBase, Index searchLimit,
unsigned long length);
BTFindShortResRange(&base, &limit, table, searchBase, searchLimit, length)
finds a range of reset bits in the table, starting at searchBase and working
upwards. This function is intended to meet .req.ops.find.short.low so it will
find the leftmost range that will do, and never finds a range longer than the
requested length (the intention is that it will not waste time looking).
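A sketch of typical use, for instance finding and claiming a run of 'length'
free pages in an arena's alloc table (the follow-up call to BTSetRange is
illustrative only):
  Index base, limit;
  if (BTFindShortResRange(&base, &limit, bt, 0, n, length)) {
    /* [base, limit) is the leftmost run of 'length' reset bits */
    BTSetRange(bt, base, limit);   /* mark the pages as allocated */
  } else {
    /* no suitable run in [0, n) */
  }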
.if.find-short-res-range-high:
Bool BTFindShortResRangeHigh(Index *baseReturn, Index *limitReturn,
BT bt,
Index searchBase, Index searchLimit,
unsigned long length);
BTFindShortResRangeHigh(&base, &limit, table, searchBase, searchLimit, length)
finds a range of reset bits in the table, starting at searchLimit and working
downwards. This function is intended to meet .req.ops.find.short.high so it
will find the rightmost range that will do, and never finds a range longer than
the requested length.
.if.find-long-res-range:
Bool BTFindLongResRange(Index *baseReturn, Index *limitReturn,
BT bt,
Index searchBase, Index searchLimit,
unsigned long length);
BTFindLongResRange(&base, &limit, table, searchBase, searchLimit, length) finds
a range of reset bits in the table, starting at searchBase and working
upwards. This function is intended to meet .req.ops.find.long.low so it will
find the leftmost range that will do and returns all of that range (which can
be longer than the requested length).
.if.find-long-res-range-high:
Bool BTFindLongResRangeHigh(Index *baseReturn, Index *limitReturn,
BT bt,
Index searchBase, Index searchLimit,
unsigned long length);
BTFindLongResRangeHigh(&base, &limit, table, searchBase, searchLimit, length)
finds a range of reset bits in the table, starting at searchLimit and working
downwards. This function is intended to meet .req.ops.find.long.high so it
will find the rightmost range that will do and returns all that range (which
can be longer than the requested length).
.if.copy-range:
extern void BTCopyRange(BT fromBT, BT toBT, Index base, Index limit);
overwrites the ith bit of toBT with the ith bit of fromBT, for all i in [base,
limit). Meets .req.ops.copy.simple.
.if.copy-offset-range:
extern void BTCopyOffsetRange(BT fromBT, BT toBT, Index fromBase,
                              Index fromLimit, Index toBase, Index toLimit);
overwrites the ith bit of toBT with the jth bit of fromBT, for all i in
[toBase, toLimit) and corresponding j in [fromBase, fromLimit). Each of these 2
ranges must be the same size. This might be significantly less efficient than
BTCopyRange. Meets .req.ops.copy.offset.
.if.copy-invert-range:
extern void BTCopyInvertRange(BT fromBT, BT toBT, Index base, Index limit);
overwrites the ith bit of toBT with the inverse of the ith bit of fromBT, for
all i in [base, limit). Meets .req.ops.copy.invert.
DETAILED DESIGN
Data Structures
.datastructure: Bit Tables will be represented as (a pointer to) an array of
Words. A plain array is used instead of the more usual design convention of
implementing an ADT as a structure with a signature etc (see
guide.impl.c.adt(0)).
.datastructure.words.justify: Words are used as these will probably map to the
object that can be most efficiently accessed on any particular platform.
.datastructure.non-adt.justify: The usual ADT conventions are not followed
because a) the initial designer (drj) was lazy, b) Bit Tables are more likely
to come in convenient powers of two without the extra one or two words of
overhead. However, the loss of checking is severe. Perhaps it would be better
to use the usual ADT style.
Functions
.fun.size: BTSize.
Since Bit Tables are an array of Words, the size of a Bit Table of n bits is
simply the number of Words that it takes to store n bits times the number of
bytes in a Word. This is ceiling(n/MPS_WORD_WIDTH)*sizeof(Word).
.fun.size.justify: Since there can be at most MPS_WORD_WIDTH-1 unused bits in
the entire table, this satisfies .req.bit.
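A sketch of that computation with the usual integer-ceiling idiom (the check
described in .if.size guards the addition against overflow):
  size_t BTSize(unsigned long n)
  {
    /* ceiling(n / MPS_WORD_WIDTH) words, each of sizeof(Word) bytes */
    return ((n + MPS_WORD_WIDTH - 1) / MPS_WORD_WIDTH) * sizeof(Word);
  }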
.index: The designs for the following functions use a decomposition of a
bit-index, i, into two parts, iw, ib. .index.word: iw is the "word-index"
which is the index into the word array of the word that contains the bit
referred to by the bit-index. iw = i / MPS_WORD_WIDTH. Since MPS_WORD_WIDTH
is a power-of-two, this is the same as iw = i >> MPS_WORD_SHIFT. The latter
expression is used in the code. .index.word.justify: The compiler is more
likely to generate good code without the divide. .index.sub-word: ib is the
"sub-word-index" which is the index of the bit referred to by the bit-index in
the above word. ib = i % MPS_WORD_WIDTH. Since MPS_WORD_WIDTH is a
power-of-two, this is the same as ib = i & ~((Word)-1<<MPS_WORD_SHIFT). The
latter expression is used in the code. .index.sub-word.justify: The compiler
is more likely to generate good code without the modulus.
.index.justify.dubious: The above justifications are dubious; gcc 2.7.2 (with
-O2) running on a sparc (zaphod) produces identical code for the following two
functions:
unsigned long f(unsigned long i)
{ return i/32 + i%32; }
unsigned long g(unsigned long i)
{ return (i>>5) + (i&31); }
.iteration: Many of the following functions involve iteration over ranges in a
Bit Table. This is performed on whole words rather than individual bits,
whenever possible (to improve speed). This is implemented internally by the
macros ACT_ON_RANGE & ACT_ON_RANGE_HIGH for iterating over the range forwards
and backwards respectively. These macros do not form part of the interface of
the module, but are used extensively in the implementation. The macros are
often used even when speed is not an issue because it simplifies the
implementation and makes it more uniform. The iteration macros take the
parameters (base, limit, single_action, bits_action, word_action).
base, limit are of type Index and define the range of the iteration.
single_action is the name of a macro which will be used for iterating over
bits in the table individually. This macro must take a single Index parameter
corresponding to the index for the bit. The macro must not use break or
continue because it will be called from within a loop from the expansion of
ACT_ON_RANGE.
bits_action is the name of a macro which will be used for iterating over
part-words. This macro must take parameters (wordIndex, base, limit) where
wordIndex is the index into the array of words, and base & limit define a range
of bits within the indexed word.
word_action is the name of a macro which will be used for iterating over
whole-words. This macro must take the parameter (wordIndex) where wordIndex is
the index of the whole-word in the array. The macro must not use break or
continue because it will be called from within a loop from the expansion of
ACT_ON_RANGE.
.iteration.exit: The code in the single_action, bits_action, and word_action
macros is allowed to use 'return' or 'goto' to terminate the iteration early.
This is used by the test (.fun.test.*) and find (.fun.find.*) operations.
.iteration.small: If the range is sufficiently small only the single_action
macro will be used as this is more efficient in practice. The choice of what
constitutes a small range is made entirely on the basis of experimental
performance results (and currently, 1999-04-27, a "small range" is 6 bits or
fewer. See change.mps.epcore.brisling.160181 for some justification).
Otherwise (for a bigger range) bits_action is used on the part words at either
end of the range (or the whole of the range if it fits in a single word),
and word_action is used on the words that comprise the inner portion of the
range.
The implementation of ACT_ON_RANGE (and ACT_ON_RANGE_HIGH) is simple enough.
It decides which macros it should invoke and invokes them. single_action and
word_action are invoked inside loops.
.fun.get: BTGet.
The bit-index will be converted in the usual way, see .index. The relevant
Word will be read out of the Bit Table and shifted right by the sub-Word index
(this brings the relevant bit down to the least significant bit of the Word),
the Word will then be masked with 1 producing the answer.
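A minimal sketch of that description, using the decomposition from .index (the
real operation is implemented as a macro, see .if.declare):
  int BTGet(BT t, Index i)
  {
    Index iw = i >> MPS_WORD_SHIFT;                /* word-index */
    Index ib = i & ~((Word)-1 << MPS_WORD_SHIFT);  /* sub-word-index */
    return (int)((t[iw] >> ib) & (Word)1);
  }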
.fun.set: BTSet
.fun.res: BTRes
In both BTSet and BTRes a mask is constructed by shifting 1 left by the
sub-word-index (see .index). For BTSet the mask is ORed into the relevant word
(thereby setting a single bit). For BTRes the mask is inverted and ANDed into
the relevant word (thereby resetting a single bit).
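Sketches of the two operations as just described (again, the real operations
are macros):
  void BTSet(BT t, Index i)
  {
    Word mask = (Word)1 << (i & ~((Word)-1 << MPS_WORD_SHIFT));
    t[i >> MPS_WORD_SHIFT] |= mask;   /* OR the mask in: set a single bit */
  }

  void BTRes(BT t, Index i)
  {
    Word mask = (Word)1 << (i & ~((Word)-1 << MPS_WORD_SHIFT));
    t[i >> MPS_WORD_SHIFT] &= ~mask;  /* AND the inverted mask: reset a single bit */
  }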
.fun.set-range: BTSetRange
ACT_ON_RANGE (see .iteration above) is used with macros that set a single bit
(using BTSet), set a range of bits in a word, and set a whole word.
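A hedged sketch of how that might look; the action-macro names and the
part-word mask construction are invented for illustration (t is the underlying
array of Words, and the part-word limits satisfy 0 <= b < l <= MPS_WORD_WIDTH):
  #define SET_SINGLE(i)      BTSet(t, i)
  #define SET_BITS(wi, b, l) (t[wi] |= ((Word)-1 >> (MPS_WORD_WIDTH - (l))) \
                                       & ((Word)-1 << (b)))
  #define SET_WORD(wi)       (t[wi] = (Word)-1)

  ACT_ON_RANGE(base, limit, SET_SINGLE, SET_BITS, SET_WORD);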
.fun.res-range: BTResRange
This is implemented similarly to BTSetRange (.fun.set-range) except using BTRes
& reverse bit masking logic.
.fun.test.range.set: BTIsSetRange
ACT_ON_RANGE (see .iteration above) is used with macros that test whether all
the relevant bits are set; if some of the relevant bits are not set then
'return FALSE' is used to terminate the iteration early and return from the
BTIsSetRange function. If the iteration completes then TRUE is returned.
.fun.test.range.reset: BTIsResRange
As for BTIsSetRange (.fun.test.range.set above) but testing whether the bits
are reset.
.fun.test.range.same: BTRangesSame
As for BTIsSetRange (.fun.test.range.set above) but testing whether
corresponding ranges in the two Bit Tables are the same. Note there are no
speed requirements, but ACT_ON_RANGE is used for simplicity and uniformity.
.fun.find: The four external find functions (BTFindShortResRange,
BTFindShortResRangeHigh, BTFindLongResRange, BTFindLongResRangeHigh) simply
call through to one of the two internal functions: BTFindResRange,
BTFindResRangeHigh. BTFindResRange and BTFindResRangeHigh both have the
following prototype (with a different name obviously):
Bool BTFindResRange(Index *baseReturn, Index *limitReturn,
BT bt,
Index searchBase, Index searchLimit,
unsigned long minLength,
unsigned long maxLength)
There are two length parameters, one specifying the minimum length of the range
to be found, the other the maximum length. For BTFindShort* maxLength is equal
to minLength when passed; for BTFindLong* maxLength is equal to the maximum
possible range (searchLimit - searchBase).
.fun.find-res-range: BTFindResRange
Iterate within the search boundaries, identifying candidate ranges by searching
for a reset bit. The Boyer-Moore algorithm (reference please?) is used (it's
particularly easy when there are only two symbols, 0 and 1, in the alphabet).
For each candidate range, iterate backwards over the bits from the end of the
range towards the beginning. If a set bit is found, this candidate has failed
and a new candidate range is selected. If when scanning for the set bit a
range of reset bits was found before finding the set bit, then this (small)
range of reset bits is used as the start of the next candidate. Additionally
the end of this small range of reset bits (the end of the failed candidate
range) is remembered so that we don't have to iterate over this range again.
But if no reset bits were found in the candidate range, then iterate again
(starting from the end of the failed candidate) to look for one. If during the
backwards search no set bit is found, then we have found a sufficiently large
range of reset bits; now extend the valid range as far as possible up to the
maximum length by iterating forwards up to the maximum limit looking for a set
bit. The iterations make use of the ACT_ON_RANGE & ACT_ON_RANGE_HIGH macros,
using 'goto' to effect an early termination of the iteration when a set/reset
(as appropriate) bit is found. The macro ACTION_FIND_SET_BIT is used in the
iterations; it efficiently finds the first (that is, with lowest index or
weight) set bit in a word or subword.
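For illustration only, a naive way of finding that lowest set bit is shown
below; the real macro is described as doing this efficiently (for example by
binary chop or lookup tables) and also terminates the enclosing iteration:
  Index lowestSetBit(Word w)
  {
    Index i = 0;
    /* caller must ensure w has at least one set bit */
    while ((w & 1) == 0) {
      w >>= 1;
      ++i;
    }
    return i;
  }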
.fun.find-res-range.improve: Various other performance improvements have been
suggested in the past, including some from request.epcore.170534. Here is a
list of potential improvements which all sound plausible, but which have not
led to performance improvements in practice:
.fun.find-res-range.improve.step.partial: When the top index in a candidate
range fails, skip partial words as well as whole words, using
e.g. lookup tables.
.fun.find-res-range.improve.lookup: When testing a candidate run,
examine multiple bits at once (e.g. 8), using lookup tables for (e.g)
index of first set bit, index of last set bit, number of reset bits,
length of maximum run of reset bits.
.fun.find-res-range-high: BTFindResRangeHigh
Exactly the same algorithm as in BTFindResRange (see .fun.find-res-range
above), but moving over the table in the opposite direction.
.fun.copy-simple-range: BTCopyRange.
Uses ACT_ON_RANGE (see .iteration above) with the obvious implementation.
Should be fast.
.fun.copy-offset-range: BTCopyOffsetRange.
Uses a simple iteration loop, reading bits with BTGet and setting them with
BTSet. Doesn't use ACT_ON_RANGE because the two ranges will not, in general, be
similarly word-aligned.
.fun.copy-invert-range: BTCopyInvertRange.
Uses ACT_ON_RANGE (see .iteration above) with the obvious implementation.
Should be fast - although there are no speed requirements.
TESTING
.test: The following tests are available / have been used during development.
.test.btcv: MMsrc!btcv.c. This is supposed to be a coverage test, intended to
execute all of the module's code in at least some minimal way.
.test.cbstest: MMsrc!cbstest.c. This was written as a test of the CBS module
(design.mps.cbs(2)). It compares the functional operation of a CBS with that
of a BT so is a good functional test of either module.
.test.mmqa.120: MMQA_test_function!210.c. This is used because it has a fair
amount of segment allocation and freeing so exercises the arena code that uses
Bit Tables.
.test.bttest: MMsrc!bttest.c. This is an interactive test that can be used to
exercise some of the BT functionality by hand.
.test.dylan: It is possible to modify Dylan so that it uses Bit Tables more
extensively. See change.mps.epcore.brisling.160181 TEST1 and TEST2.

699
mps/design/buffer/index.txt Normal file

@ -0,0 +1,699 @@
ALLOCATION BUFFERS AND ALLOCATION POINTS
design.mps.buffer
incomplete design
richard 1996-09-02
INTRODUCTION
.scope: This document describes the design of allocation buffers and allocation
points.
.purpose: The purpose of this document is to record design decisions made
concerning allocation buffers and allocation points and justify those decisions
in terms of requirements.
.readership: The document is intended for reading by any Memory Management
Group developer.
HISTORY
.history.0-1: The history for versions 0-1 is lost pending possible
reconstruction.
.history.2: Added class hierarchy and subclassing information.
SOURCE
.source.mail: Much of the juicy stuff about buffers is only floating around in
mail discussions. You might like to try searching the archives if you can't
find what you want here.
.source.synchronize: For a discussion of the synchronization issues:
mail.richard.1995-05-24.10-18, mail.ptw.1995-05-19.19-15,
mail.richard.1995-05-19.17-10
[drj - I believe that the sequence for flip in ptw's message is incorrect. The
operations should be in the other order]
.source.interface: For a description of the buffer interface in C prototypes:
mail.richard.1997-04-28.09-25(0)
.source.qa: Discussions with QA were useful in pinning down the semantics and
understanding of some obscure but important boundary cases. See
mail.richard.tucker.1997-05-12.09-45(0) et seq (mail subject: "notes on our
allocation points discussion").
REQUIREMENTS
.req.fast: Allocation must be very fast.
.req.thread-safe: Must run safely in a multi-threaded environment.
.req.no-synch: Must avoid the use of thread synchronization (.req.fast).
.req.manual: Must support manual memory management.
.req.exact: Must support exact collectors.
.req.ambig: Must support ambiguous collectors.
.req.count: Must record (approximately) the amount of allocation (in bytes).
This is no longer an actual requirement, but it was once put forward as a Dylan
requirement. Bits of the code still reflect this requirement. See
request.dylan.170554.
CLASSES
.class.hierarchy: The Buffer datastructure is designed to be subclassable (see
design.mps.protocol).
.class.hierarchy.buffer: The basic buffer class (BufferClass) supports basic
allocation-point buffering, and is appropriate for those manual pools which
don't use segments (.req.manual). The Buffer class doesn't support reference
ranks (i.e. the buffers have RankSetEMPTY). Clients may use BufferClass
directly, or create their own subclasses (see .subclassing).
.class.hierarchy.segbuf: Class SegBufClass is also provided for the use of
pools which additionally need to associate buffers with segments. SegBufClass
is a subclass of BufferClass. Manual pools may find it convenient to use
SegBufClass, but it is primarily intended for automatic pools (.req.exact,
.req.ambig). An instance of SegBufClass may be attached to a region of memory
that lies within a single segment. The segment is associated with the buffer,
and may be accessed with the BufferSeg function. SegBufClass also supports
references at any rank set. Hence this class or one of its subclasses should be
used by all automatic pools (with the possible exception of leaf pools). The
rank sets of buffers and the segments they are attached to must match. Clients
may use SegBufClass directly, or create their own subclasses (see
.subclassing).
.class.hierarchy.rankbuf: Class RankBufClass is also provided as a subclass of
SegBufClass. The only way in which this differs from its superclass is that
the rankset of a RankBufClass is set during initialization to the singleton
rank passed as an additional parameter to BufferCreate. Instances of
RankBufClass are of the same type as instances of SegBufClass, i.e., SegBuf.
Clients may use RankBufClass directly, or create their own subclasses (see
.subclassing).
.class.create: The buffer creation functions (BufferCreate and BufferCreateV)
take a class parameter, which determines the class of buffer to be created.
.class.choice: Pools which support buffered allocation should specify a default
class for buffers. This class will be used when a buffer is created in the
normal fashion by MPS clients (for example by a call to mps_ap_create). Pools
specify the default class by means of the bufferClass field in the pool class
object. This should be a pointer to a function of type PoolBufferClassMethod.
The normal class "Ensure" function (e.g. Ensure*Buffer*Class) has the
appropriate type.
.subclassing: Pools may create their own subclasses of the standard buffer
classes. This is sometimes useful if the pool needs to add an extra field to
the buffer. The convenience macro DEFINE_BUFFER_CLASS may be used to define
subclasses of buffer classes. See design.mps.protocol.int.define-special.
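For a rough picture of what a subclass definition looks like (hypothetical
names throughout; the exact macros and class fields are defined in
design.mps.protocol and the buffer sources), a pool that wants an extra
per-buffer field might write something like:

  /* The subclass instance embeds its superclass instance first. */
  typedef struct CounterBufStruct {
    SegBufStruct segBufStruct;   /* superclass (SegBuf) fields */
    Count reserveCount;          /* extra class-specific field */
  } CounterBufStruct;

  DEFINE_BUFFER_CLASS(CounterBufClass, class)
  {
    INHERIT_CLASS(class, SegBufClass);       /* inherit SegBufClass behaviour */
    class->size = sizeof(CounterBufStruct);  /* instances carry the extra field */
    /* class-specific init/finish methods would be installed here */
  }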
.replay: To work with the allocation replayer (see
design.mps.telemetry.replayer), the buffer class has to emit an event for each
call to an external interface, containing all the parameters passed by the
user. If a new event type is required to carry this information, the replayer
(impl.c.eventrep) must then be extended to recreate the call.
.replay.pool-buffer: The replayer must also be updated if the association of
buffer class to pool or the buffer class hierarchy is changed.
.class.method: Buffer classes provide the following methods (these should not
be confused with the pool class methods related to the buffer protocol,
described in .method.*):
.class.method.init: "init" is a class-specific initialization method called
from BufferInitV. It receives the optional (vararg) parameters passed to
BufferInitV. Client-defined methods must call their superclass method (via a
next-method call) before performing any class-specific behaviour.
.replay.init: The init method should emit a BufferInit<foo> event (if there
aren't any extra parameters, <foo> = "").
.class.method.finish: "finish" is a class-specific finish method called from
BufferFinish. Client-defined methods must call their superclass method (via a
next-method call) after performing any class-specific behaviour.
.class.method.attach: "attach" is a class-specific method called whenever a
buffer is attached to memory, via BufferAttach. Client-defined methods must
call their superclass method (via a next-method call) before performing any
class-specific behaviour.
.class.method.detach: "detach" is a class-specific method called whenever a
buffer is detached from memory, via BufferDetach. Client-defined methods must
call their superclass method (via a next-method call) after performing any
class-specific behaviour.
.class.method.seg: "seg" is a class-specific accessor method which returns the
segment attached to a buffer (or NULL if there isn't one). It is called from
BufferSeg. Clients should not need to define their own methods for this.
.class.method.rankSet: "rankSet" is a class-specific accessor method which
returns the rank set of a buffer. It is called from BufferRankSet. Clients
should not need to define their own methods for this.
.class.method.setRankSet: "setRankSet" is a class-specific setter method which
sets the rank set of a buffer. It is called from BufferSetRankSet. Clients
should not need to define their own methods for this.
.class.method.describe: "describe" is a class-specific method called to
describe a buffer, via BufferDescribe. Client-defined methods must call their
superclass method (via a next-method call) before describing any class-specific
state.
NOTES
.logging.control: Buffers have a separate control for whether they are logged
or not, because they are particularly high volume. The control is a boolean
flag (bufferLogging) in the ArenaStruct.
.count: Counting the allocation volume is done by maintaining two fields in the
buffer struct: .count.fields: fillSize, emptySize. .count.monotonic: both of
these fields are monotonically increasing. .count.fillsize: fillSize is an
accumulated total of the size of all the fills (as a result of calling the
PoolClass BufferFill method) that happen on the buffer. .count.emptysize:
emptySize is an accumulated total of the size of all the empties that happen on
the buffer (which are notified to the pool using the PoolClass BufferEmpty
method). .count.generic: These fields are maintained by the generic buffer
code (in BufferAttach and BufferDetach).
.count.other: Similar count fields are maintained in the pool and the arena.
They are maintained on an internal (buffers used internally by MPS) and
external (buffers used for mutator APs) basis. The fields are also updated by
the buffer code. The fields are: in the pool,
{fill|empty}{Mutator|Internal}Size (4 fields); in the arena,
{fill|empty}{Mutator|Internal}Size and allocMutatorSize (5 fields).
.count.alloc.how: The amount of allocation in the buffer just after an empty
is (fillSize - emptySize). At other times this computation will include space
that the buffer has the use of (between base and init) but which may not get
allocated in (because the remaining space may be too large for the next reserve
so some or all of it may get emptied). The arena field allocMutatorSize is
incremented by the allocated size (between base and init) whenever a buffer is
detached. Symmetrically, this field is decremented by the pre-allocated size
(between base and init) whenever a buffer is attached. The overall count is
asymptotically correct.
.count.type: All the count fields are type double. .count.type.justify: This
is because double is the type most likely to give us enough precision. Because
of the lack of genuine requirements, the exact type isn't so important; it's
nice to have it more precise than long, which double usually is.
From the whiteboard:
REQ
atomic update of words
guarantee order of reads and write to certain memory locations.
FLIP
limit:=0
record init for scanner
COMMIT
init:=alloc
if(limit = 0) ...
L written only by MM
A \ written only by client (except during synchronized MM op)
I /
I read by MM during flip
States
BUSY
READY
TRAPPED
RESET
[drj: there are many more states]
Misc
.misc: During buffer ops all field values can change. Might trash perfectly
good ("valid"?) object if pool isn't careful.
Not from the whiteboard.
SYNCHRONIZATION
Buffers provide a loose form of synchronization between the mutator and the
collector.
The crucial synchronization issues are between the operation the pool performs
on flip and the mutator's commit operation.
Commit
  read init
  write init
  Memory Barrier
  read limit
Flip
  write limit
  Memory Barrier
  read init
Commit consists of two parts. The first is the update to init. This is a
declaration that the new object just before init is now correctly formatted and
can be scanned. The second is a check to see if the buffer has been
"tripped". The ordering of the two parts is crucial.
Note that the declaration that the object is correctly formatted is independent
of whether the buffer has been tripped or not. In particular a pool can scan
up to the init pointer (including the newly declared object) whether or not the
pool will cause the commit to fail. In the case where the pool scans the
object, but then causes the commit to fail (and presumably the allocation to
occur somewhere else), the pool will have scanned a "dead" object, but this is
just another example of conservatism in the general sense.
Note that the read of init in the Flip sequence can in fact be arbitrarily
delayed (as long as it is read before a buffered segment is scanned).
On processors with Relaxed Memory Order (such as the DEC Alpha), Memory
Barriers will need to be placed at the points indicated.
* DESIGN
*
* design.mps.buffer.
*
* An allocation buffer is an interface to a pool which provides
* very fast allocation, and defers the need for synchronization in
* a multi-threaded environment.
*
* Pools which contain formatted objects must be synchronized so
* that the pool can know when an object is valid. Allocation from
* such pools is done in two stages: reserve and commit. The client
* first reserves memory, then initializes it, then commits.
* Committing the memory declares that it contains a valid formatted
* object. Under certain conditions, some pools may cause the
* commit operation to fail. (See the documentation for the pool.)
* Failure to commit indicates that the whole allocation failed and
* must be restarted. When using a pool which introduces the
* possibility of commit failing, the allocation sequence could look
* something like this:
*
* do {
* res = BufferReserve(&p, buffer, size);
* if(res != ResOK) return res; // allocation fails, reason res
* initialize(p); // p now points at valid object
* } while(!BufferCommit(buffer, p, size));
*
* Pools which do not contain formatted objects can use a one-step
* allocation as usual. Effectively any random rubbish counts as a
* "valid object" to such pools.
*
* An allocation buffer is an area of memory which is pre-allocated
* from a pool, plus a buffer descriptor, which contains, inter
* alia, four pointers: base, init, alloc, and limit. Base points
* to the base address of the area, limit to the last address plus
* one. Init points to the first uninitialized address in the
* buffer, and alloc points to the first unallocated address.
*
* L . - - - - - . ^
* | | Higher addresses -'
* | junk |
* | | the "busy" state, after Reserve
* A |-----------|
* | uninit |
* I |-----------|
* | init |
* | | Lower addresses -.
* B `-----------' v
*
* L . - - - - - . ^
* | | Higher addresses -'
* | junk |
* | | the "ready" state, after Commit
* A=I |-----------|
* | |
* | |
* | init |
* | | Lower addresses -.
* B `-----------' v
*
* Access to these pointers is restricted in order to allow
* synchronization between the pool and the client. The client may
* only write to init and alloc, but in a restricted and atomic way
* detailed below. The pool may read the contents of the buffer
* descriptor at _any_ time. During calls to the fill and trip
* methods, the pool may update any or all of the fields
* in the buffer descriptor. The pool may update the limit at _any_
* time.
*
* Access to buffers by these methods is not synchronized. If a buffer
* is to be used by more than one thread then it is the client's
* responsibility to ensure exclusive access. It is recommended that
* a buffer be used by only a single thread.
*
* [Only one thread may use a buffer at once, unless the client
* places a mutual exclusion around the buffer access in the usual
* way. In such cases it is usually better to create one buffer for
* each thread.]
*
* Here are pseudo-code descriptions of the reserve and commit
* operations. These may be implemented in-line by the client.
* Note that the client is responsible for ensuring that the size
* (and therefore the alloc and init pointers) are aligned according
* to the buffer's alignment.
*
* Reserve(buf, size) ; size must be aligned to pool
* if buf->limit - buf->alloc >= size then
* buf->alloc +=size ; must be atomic update
* p = buf->init
* else
* res = BufferFill(&p, buf, size) ; buf contents may change
*
* Commit(buf, p, size)
* buf->init = buf->alloc ; must be atomic update
* if buf->limit == 0 then
* res = BufferTrip(buf, p, size) ; buf contents may change
* else
* res = True
* (returns True on successful commit)
*
* The pool must allocate the buffer descriptor and initialize it by
* calling BufferInit. The descriptor this creates will fall
* through to the fill method on the first allocation. In general,
* pools should not assign resources to the buffer until the first
* allocation, since the buffer may never be used.
*
* The pool may update the base, init, alloc, and limit fields when
* the fallback methods are called. In addition, the pool may set
* the limit to zero at any time. The effect of this is either:
*
* 1. cause the _next_ allocation in the buffer to fall through to
* the buffer fill method, and allow the buffer to be flushed
* and relocated;
*
* 2. cause the buffer trip method to be called if the client was
* between reserve and commit.
*
* A buffer may not be relocated under other circumstances because
* there is a race between updating the descriptor and the client
* allocation sequence.
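The same protocol is what the client interface exposes through allocation
points; a client-level sketch (using the public mps_reserve and mps_commit from
impl.h.mps, with a hypothetical make_object initializer) looks like:

  mps_addr_t p;
  mps_res_t res;
  do {
    res = mps_reserve(&p, ap, size);   /* size must be aligned for the pool */
    if (res != MPS_RES_OK)
      return res;                      /* allocation failed, reason res */
    make_object(p, size);              /* p now points at a valid object */
  } while (!mps_commit(ap, p, size));  /* retry if the buffer was tripped */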
.method.create:
BufferCreate
Create an allocation buffer in a pool.
The buffer is created in the "ready" state.
A buffer structure is allocated from the space control pool and partially
initialized (in particular, neither the signature nor the serial field is
initialized). The pool class's bufferCreate method is then called. This
method can update (some undefined subset of) the fields of the structure; it
should return with the buffer in the "ready" state (or fail). The remainder of
the initialization then occurs.
If and only if successful then a valid buffer is returned.
.method.destroy:
BufferDestroy
Destroy frees a buffer descriptor. The buffer must be in the "ready" state,
i.e. not between a Reserve and Commit. Allocation in the area of memory to
which the descriptor refers must cease after Destroy is called.
Destroying an allocation buffer does not affect objects which have been
allocated, it just frees resources associated with the buffer itself.
The pool class's bufferDestroy method is called and then the buffer structure
is uninitialized and freed.
.method.check:
BufferCheck
The check method is straightforward; the non-trivial dependencies checked are:
The ordering constraints between base, init, alloc, and limit.
The alignment constraints on base, init, alloc, and limit.
That the buffer's rank is identical to the segment's rank.
.method.set-reset:
/* BufferSet/Reset -- set/reset a buffer
*
* Set sets the buffer base, init, alloc, and limit fields so that
* the buffer is ready to start allocating in area of memory. The
* alloc field is a copy of the init field.
*
* Reset sets the seg, base, init, alloc, and limit fields to
* zero, so that the next reserve request will call the fill
* method.
*/
.method.set.unbusy: BufferSet must only be applied to buffers that are not busy.
.method.reset.unbusy: BufferReset must only be applied to buffers that are not
busy.
.method.accessors:
/* Buffer Information
*
* BufferIsReset returns TRUE if and only if the buffer is in the
* reset state, i.e. with base, init, alloc, and limit set to zero.
*
* BufferIsReady returns TRUE iff the buffer is not between a
* reserve and commit. The result is only reliable if the client is
* not currently using the buffer, since it may update the alloc and
* init pointers asynchronously.
*
* BufferAP returns the APStruct substructure of a buffer.
*
* BufferOfAP is a thread-safe (impl.c.mpsi.thread-safety) method of
* getting the buffer which owns an APStruct.
*
* BufferSpace is a thread-safe (impl.c.mpsi.thread-safety) method of
* getting the space which owns a buffer.
*
* BufferPool returns the pool to which a buffer is attached.
*/
.method.ofap:
.method.ofap.thread-safe:
BufferOfAP must be thread safe (see impl.c.mpsi.thread-safety). This is
achieved because the underlying operation involved is simply a subtraction.
.method.space:
.method.space.thread-safe:
BufferSpace must be thread safe (see impl.c.mpsi.thread-safety). This is
achieved simply because the underlying operation is a read of
shared-non-mutable data (see design.mps.thread-safety).
.method.reserve:
/* BufferReserve -- reserve memory from an allocation buffer
*
* This is a provided version of the reserve procedure described
* above. The size must be aligned according to the buffer
* alignment. Iff successful, ResOK is returned and
* *pReturn updated with a pointer to the reserved memory.
* Otherwise *pReturn is not touched. The reserved memory is not
* guaranteed to have any particular contents. The memory must be
* initialized with a valid object (according to the pool to which
* the buffer belongs) and then passed to the Commit method (see
* below). Reserve may not be applied twice to a buffer without a
* Commit in-between. In other words, Reserve/Commit pairs do not
* nest.
*/
Res BufferReserve(Addr *pReturn, Buffer buffer, Size size)
{
Addr next;
AVER(pReturn != NULL);
AVERT(Buffer, buffer);
AVER(size > 0);
AVER(SizeIsAligned(size, BufferPool(buffer)->alignment));
AVER(BufferIsReady(buffer));
/* Is there enough room in the unallocated portion of the buffer to */
/* satisfy the request? If so, just increase the alloc marker and */
/* return a pointer to the area below it. */
next = AddrAdd(buffer->ap.alloc, size);
if(next > buffer->ap.alloc && next <= buffer->ap.limit)
{
buffer->ap.alloc = next;
*pReturn = buffer->ap.init;
return ResOK;
}
/* If the buffer can't accommodate the request, fall through to the */
/* pool-specific allocation method. */
return BufferFill(pReturn, buffer, size);
}
.method.fill:
/* BufferFill -- refill an empty buffer
*
* If there is not enough space in a buffer to allocate in-line,
* BufferFill must be called to "refill" the buffer. (See the
* description of the in-line Reserve method in the leader comment.)
*/
Res BufferFill(Addr *pReturn, Buffer buffer, Size size)
{
Res res;
Pool pool;
AVER(pReturn != NULL);
AVERT(Buffer, buffer);
AVER(size > 0);
AVER(SizeIsAligned(size, BufferPool(buffer)->alignment));
AVER(BufferIsReady(buffer));
pool = BufferPool(buffer);
res = (*pool->class->bufferFill)(pReturn, pool, buffer, size);
AVERT(Buffer, buffer);
return res;
}
.method.commit:
/* BufferCommit -- commit memory previously reserved
*
* Commit notifies the pool that memory which has been previously
* reserved (see above) has been initialized with a valid object
* (according to the pool to which the buffer belongs). The pointer
* p must be the same as that returned by Reserve, and the size must
* match the size passed to Reserve.
*
* Commit may not be applied twice to a buffer without a reserve
* in-between. In other words, objects must be reserved,
* initialized, then committed only once.
*
* Commit returns TRUE iff successful. If commit fails and returns
* FALSE, the client may try to allocate again by going back to the
* reserve stage, and may not use the memory at p again for any
* purpose.
*
* Some classes of pool may cause commit to fail under rare
* circumstances.
*/
Bool BufferCommit(Buffer buffer, Addr p, Size size)
{
AVERT(Buffer, buffer);
AVER(size > 0);
AVER(SizeIsAligned(size, BufferPool(buffer)->alignment));
/* Buffer is "busy" */
AVER(!BufferIsReady(buffer));
/* See design.mps.collection.flip.
* If a flip occurs before this point, when the pool reads
* buffer->init it will point below the object, so it will be trashed
* and the commit must fail when trip is called. The pool will also
* read p (during the call to trip) which points to the invalid
* object at init.
*/
AVER(p == buffer->ap.init);
AVER(AddrAdd(buffer->ap.init, size) == buffer->ap.alloc);
/* Atomically update the init pointer to declare that the object */
/* is initialized (though it may be invalid if a flip occurred). */
buffer->ap.init = buffer->ap.alloc;
/* .improve.memory-barrier: Memory barrier here on the DEC Alpha
* (and other relaxed memory order architectures). */
/* If a flip occurs at this point, the pool will see init */
/* above the object, which is valid, so it will be collected. */
/* The commit must succeed when trip is called. The pointer */
/* p will have been fixed up. */
/* Trip the buffer if a flip has occurred. */
if(buffer->ap.limit == 0)
return BufferTrip(buffer, p, size);
/* No flip occurred, so succeed. */
return TRUE;
}
.method.trip:
BufferTrip -- act on a tripped buffer
The pool which owns a buffer may asynchronously set the buffer limit to zero in
order to get control over the buffer. If this occurs after a Reserve (but
before the corresponding Commit), then the Commit method calls BufferTrip and
returns BufferTrip's return value. (See the description of Commit.)
.method.trip.precondition:
At the time trip is called (see Commit), the following are true:
.method.trip.precondition.limit: limit == 0
.method.trip.precondition.init: init == alloc
.method.trip.precondition.p: p+size == alloc
.method.expose-cover:
Expose / Cover
BufferExpose/Cover are used by collectors that want to allocate in a forwarding
buffer. Since the forwarding buffer may be Shielded, the potential problem of
handling a recursive fault appears (the mutator causes a page fault; the
collector fixes some objects, causing allocation; the allocation takes place in
a protected area of memory, which causes a page fault). BufferExpose guarantees
that allocation can take place in the buffer without causing a page fault;
BufferCover removes this guarantee.
[The following paragraph is reverse-constructed conjecture]
BufferExpose puts the buffer in an "exposed" state, BufferCover puts the buffer
in a "covered" state. BufferExpose can only be called if the buffer is
"covered". BufferCover can only be called if the buffer is "exposed".
[Is this part of the "Protection/Suspension Protocol"? (see mail with that
subject)]
-------------------------------
Here are a number of diagrams showing how buffers behave. In general, the
horizontal axis corresponds to mutator action (reserve, commit) and the
vertical axis corresponds to collector action. I'm not sure which of the
diagrams are the same as each other, and which are best or most complete when
they are different, but they all attempt to show essentially the same
information. It's very difficult to get all the details in. These diagrams were
drawn by richard, rit, gavinm, &c, &c in April 1997. In general, the later
diagrams are, I suspect, more correct, complete and useful than the earlier
ones. I have put them all here for the record. rit 1998-02-09
Buffer Diagram:
Buffer States
Buffer States (3-column)
Buffer States (4-column)
Buffer States (gavinised)
Buffer States (interleaved)
Buffer States (richardized)
559
mps/design/cbs/index.txt Normal file
View file
@ -0,0 +1,559 @@
DESIGN FOR COALESCING BLOCK STRUCTURE
design.mps.cbs
incomplete doc
gavinm 1998-05-01
INTRODUCTION
.intro: This is the design for impl.c.cbs, which implements a data structure
for the management of non-intersecting memory ranges, with eager coalescence.
.readership: This document is intended for any MM developer.
.source: design.mps.poolmv2, design.mps.poolmvff.
.overview: The "coalescing block structure" is a set of addresses (or a subset
of address space), with provision for efficient management of contiguous
ranges, including insertion and deletion, high level communication with the
client about the size of contiguous ranges, and detection of protocol
violations.
Document History
.hist.0: This document was derived from the outline in design.mps.poolmv2(2).
Written by Gavin Matthews 1998-05-01.
.hist.1: Updated by Gavin Matthews 1998-07-22 in response to approval comments
in change.epcore.anchovy.160040 "There is too much fragmentation in trapping
memory".
.hist.2: Updated by Gavin Matthews (as part of change.epcore.brisling.160158:
MVFF cannot be instantiated with 4-byte alignment) to document new alignment
restrictions.
DEFINITIONS
.def.range: A (contiguous) range of addresses is a semi-open interval on
address space.
.def.isolated: A contiguous range is isolated with respect to some property it
has, if adjacent elements do not have that property.
.def.interesting: A block is interesting if it is of at least the minimum
interesting size specified by the client.
REQUIREMENTS
.req.set: Must maintain a set of addresses.
.req.fast: Common operations must have a low amortized cost.
.req.add: Must be able to add address ranges to the set.
.req.remove: Must be able to remove address ranges from the set.
.req.size: Must report concisely to the client when isolated contiguous ranges
of at least a certain size appear and disappear.
.req.iterate: Must support the iteration of all isolated contiguous ranges.
This will not be a common operation.
.req.protocol: Must detect protocol violations.
.req.debug: Must support debugging of client code.
.req.small: Must have a small space overhead for the storage of typical subsets
of address space and not have abysmal overhead for the storage of any subset of
address space.
.req.align: Must support an alignment (the alignment of all addresses
specifying ranges) of down to sizeof(void *) without losing memory.
INTERFACE
.header: CBS is used through impl.h.cbs.
External Types
.type.cbs: CBS is the main data-structure for manipulating a CBS. It is
intended that a CBSStruct be embedded in another structure. No convenience
functions are provided for the allocation or deallocation of the CBS.
typedef struct CBSStruct CBSStruct, *CBS;
.type.cbs.block: CBSBlock is the data-structure that represents an isolated
contiguous range held by the CBS. It is returned by the new and delete methods
described below.
typedef struct CBSBlockStruct CBSBlockStruct, *CBSBlock;
.type.cbs.method: The following methods are provided as callbacks to advise the
client of certain events. The implementation of these functions should not
cause any CBS function to be called on the same CBS. In this respect, the CBS
module is not re-entrant.
.type.cbs.change.size.method: CBSChangeSizeMethod is the function pointer type,
four instances of which are optionally registered via CBSInit.
typedef void (*CBSChangeSizeMethod)(CBS cbs, CBSBlock block, Size oldSize,
Size newSize);
These callbacks are invoked under CBSInsert, CBSDelete, or CBSSetMinSize in
certain circumstances. Unless otherwise stated, oldSize and newSize will both
be non-zero, and different. The accessors CBSBlockBase, CBSBlockLimit, and
CBSBlockSize may be called from within these callbacks, except within the
delete callback when newSize is zero. See .impl.callback for implementation
details.
.type.cbs.iterate.method: CBSIterateMethod is a function pointer type for a
client method which, when passed to the CBSIterate or CBSIterateLarge
functions, is invoked by the CBS module on every isolated contiguous range in
address order. The function returns a boolean indicating whether to continue
with the iteration.
typedef Bool (*CBSIterateMethod)(CBS cbs, CBSBlock block, void *closureP,
unsigned long closureS);
External Functions
.function.cbs.init: CBSInit is the function that initialises the CBS
structure. It performs allocation in the supplied arena. Four methods are
passed in as function pointers (see .type.* above), any of which may be NULL.
It receives a minimum size, which is used when determining whether to call the
optional methods. The mayUseInline boolean indicates whether the CBS may use
the memory in the ranges as a low-memory fallback (see .impl.low-mem). The
alignment indicates the alignment of ranges to be maintained. An initialised
CBS contains no ranges.
Res CBSInit(Arena arena, CBS cbs, CBSChangeSizeMethod new,
CBSChangeSizeMethod delete, CBSChangeSizeMethod grow, CBSChangeSizeMethod
shrink, Size minSize, Align alignment, Bool mayUseInline);
.function.cbs.init.may-use-inline: If mayUseInline is set, then alignment must
be at least sizeof(void *). In this mode, the CBS will never fail to insert or
delete ranges, even if memory for control structures becomes short. Note that,
in such cases, the CBS may defer notification of new/grow events, but will
report available blocks in CBSFindFirst and CBSFindLast. Such low memory
conditions will be rare and transitory. See .align for more details.
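For illustration, a client that wants no callbacks might embed and initialise a
CBS like this (a sketch; variable names are hypothetical):

  CBSStruct cbsStruct;   /* typically embedded in the client's own structure */
  Res res;

  res = CBSInit(arena, &cbsStruct, NULL, NULL, NULL, NULL,
                (Size)0, (Align)sizeof(void *), TRUE);
  if (res != ResOK)
    return res;
  /* ... CBSInsert/CBSDelete calls on &cbsStruct ... */
  CBSFinish(&cbsStruct);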
.function.cbs.finish: CBSFinish is the function that finishes the CBS structure
and discards any other resources associated with the CBS.
void CBSFinish(CBS cbs);
.function.cbs.insert: CBSInsert is the function used to add a contiguous range
specified by [base,limit) to the CBS. If any part of the range is already in
the CBS, then ResFAIL is returned, and the CBS is unchanged. This function may
cause allocation; if this allocation fails, and any contingency mechanism
fails, then ResMEMORY is returned, and the CBS is unchanged.
Res CBSInsert(CBS cbs, Addr base, Addr limit);
.function.cbs.insert.callback: CBSInsert will invoke callbacks as follows:
new: when a new block is created that is interesting. oldSize == 0; newSize
>= minSize.
new: when an uninteresting block coalesces to become interesting. 0 <
oldSize < minSize <= newSize.
delete: when two interesting blocks are coalesced. grow will also be invoked
in this case on the larger of the two blocks. newSize == 0; oldSize >= minSize.
grow: when an interesting block grows in size. minSize <= oldSize < newSize.
.function.cbs.delete: CBSDelete is the function used to remove a contiguous
range specified by [base,limit) from the CBS. If any part of the range is not
in the CBS, then ResFAIL is returned, and the CBS is unchanged. This function
may cause allocation; if this allocation fails, and any contingency mechanism
fails, then ResMEMORY is returned, and the CBS is unchanged.
Res CBSDelete(CBS cbs, Addr base, Addr limit);
.function.cbs.delete.callback: CBSDelete will invoke callbacks as follows:
delete: when an interesting block is entirely removed. newSize == 0; oldSize
>= minSize.
delete: when an interesting block becomes uninteresting. 0 < newSize <
minSize <= oldSize.
new: when a block is split into two blocks, both of which are interesting.
shrink will also be invoked in this case on the larger of the two blocks.
oldSize == 0; newSize >= minSize.
shrink: when an interesting block shrinks in size, but remains interesting.
minSize <= newSize < oldSize.
.function.cbs.iterate: CBSIterate is the function used to iterate all isolated
contiguous ranges in a CBS. It receives a pointer, unsigned long closure pair
to pass on to the iterator method, and an iterator method to invoke on every
range in address order. If the iterator method returns FALSE, then the
iteration is terminated.
void CBSIterate(CBS cbs, CBSIterateMethod iterate, void *closureP, unsigned
long closureS);
.function.cbs.iterate.large: CBSIterateLarge is the function used to iterate
all isolated contiguous ranges of size greater than or equal to the client
indicated minimum size in a CBS. It receives a pointer, unsigned long closure
pair to pass on to the iterator method, and an iterator method to invoke on
every large range in address order. If the iterator method returns FALSE, then
the iteration is terminated.
void CBSIterateLarge(CBS cbs, CBSIterateMethod iterate, void *closureP,
unsigned long closureS);
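For example, a client might log every isolated range as follows (a sketch only;
it uses printf from <stdio.h> for brevity, which real MPS code would not):

  #include <stdio.h>

  static Bool rangeLogger(CBS cbs, CBSBlock block,
                          void *closureP, unsigned long closureS)
  {
    (void)cbs; (void)closureP; (void)closureS;   /* unused in this example */
    printf("[%p, %p) size %lu\n",
           (void *)CBSBlockBase(block),
           (void *)CBSBlockLimit(block),
           (unsigned long)CBSBlockSize(block));
    return TRUE;                                 /* continue the iteration */
  }

  /* ... then, given a CBS "cbs": */
  CBSIterate(cbs, rangeLogger, NULL, (unsigned long)0);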
.function.cbs.set.min-size: CBSSetMinSize is the function used to change the
minimum size of interest in a CBS. This minimum size is used to determine
whether to invoke the client callbacks from CBSInsert and CBSDelete. This
function will invoke either the new or delete callback for all blocks that are
(in the semi-open interval) between the old and new values. oldSize and
newSize will be the same in these cases.
void CBSSetMinSize(CBS cbs, Size minSize);
.function.cbs.describe: CBSDescribe is a function that prints a textual
representation of the CBS to the given stream, indicating the contiguous ranges
in order, as well as the structure of the underlying splay tree
implementation. It is provided for debugging purposes only.
Res CBSDescribe(CBS cbs, mps_lib_FILE *stream);
.function.cbs.block.base: The CBSBlockBase function returns the base of the
range represented by the CBSBlock. This function may not be called from the
delete callback when the block is being deleted entirely.
Addr CBSBlockBase(CBSBlock block);
Note that the value of the base of a particular CBSBlock is not guaranteed to
remain constant across calls to CBSDelete and CBSInsert, regardless of whether
a callback is invoked.
.function.cbs.block.limit: The CBSBlockLimit function returns the limit of the
range represented by the CBSBlock. This function may not be called from the
delete callback when the block is being deleted entirely.
Addr CBSBlockLimit(CBSBlock block);
Note that the value of the limit of a particular CBSBlock is not guaranteed to
remain constant across calls to CBSDelete and CBSInsert, regardless of whether
a callback is invoked.
.function.cbs.block.size: The CBSBlockSize function returns the size of the
range represented by the CBSBlock. This function may not be called from the
delete callback when the block is being deleted entirely.
Size CBSBlockSize(CBSBlock block);
Note that the value of the size of a particular CBSBlock is not guaranteed to
remain constant across calls to CBSDelete and CBSInsert, regardless of whether
a callback is invoked.
.function.cbs.block.describe: The CBSBlockDescribe function prints a textual
representation of the CBSBlock to the given stream. It is provided for
debugging purposes only.
Res CBSBlockDescribe(CBSBlock block, mps_lib_FILE *stream);
.function.cbs.find.first: The CBSFindFirst function locates the first block (in
address order) within the CBS of at least the specified size, and returns its
range. If there are no such blocks, it returns FALSE. It optionally deletes
the top, bottom, or all of the found range, depending on the findDelete
argument (this saves a separate call to CBSDelete, and uses the knowledge of
exactly where we found the range).
Bool CBSFindFirst(Addr *baseReturn, Addr *limitReturn, CBS cbs, Size size,
CBSFindDelete findDelete);
enum {
CBSFindDeleteNONE, /* don't delete after finding */
CBSFindDeleteLOW, /* delete precise size from low end */
CBSFindDeleteHIGH, /* delete precise size from high end */
CBSFindDeleteENTIRE /* delete entire range */
};
.function.cbs.find.last: The CBSFindLast function locates the last block (in
address order) within the CBS of at least the specified size, and returns its
range. If there are no such blocks, it returns FALSE. Like CBSFindFirst, it
optionally deletes the range.
Bool CBSFindLast(Addr *baseReturn, Addr *limitReturn, CBS cbs, Size size,
CBSFindDelete findDelete);
.function.cbs.find.largest: The CBSFindLargest function locates the largest
block within the CBS, and returns its range. If there are no blocks, it
returns FALSE. Like CBSFindFirst, it optionally deletes the range (specifying
CBSFindDeleteLOW or CBSFindDeleteHIGH has the same effect as
CBSFindDeleteENTIRE).
Bool CBSFindLargest(Addr *baseReturn, Addr *limitReturn, CBS cbs,
CBSFindDelete findDelete);
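A sketch of typical use (a hypothetical helper, not part of the interface):

  /* Carve "size" bytes off the low end of the first free range that */
  /* is big enough, returning its base; FALSE if there is none.      */
  static Bool allocFromCBS(Addr *baseReturn, CBS cbs, Size size)
  {
    Addr base, limit;

    if (!CBSFindFirst(&base, &limit, cbs, size, CBSFindDeleteLOW))
      return FALSE;              /* no free range of at least "size" */
    AVER(AddrAdd(base, size) <= limit);
    /* CBSFindDeleteLOW has already deleted the low "size" bytes of  */
    /* the found range, so no separate CBSDelete call is needed.     */
    *baseReturn = base;
    return TRUE;
  }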
Alignment
.align: When mayUseInline is specified to permit inline data structures and
hence avoid losing memory in low memory situations, the alignments that the CBS
supports are constrained by three requirements:
- The smallest possible range (namely one that is the alignment in size) must
be large enough to contain a single void * pointer (see
.impl.low-mem.inline.grain);
- Any larger range (namely one that is at least twice the alignment in size)
must be large enough to contain two void * pointers (see
.impl.low-mem.inline.block);
- It must be valid on all platforms to access a void * pointer stored at the
start of an aligned range.
All alignments that meet these requirements are aligned to sizeof(void *), so
we take that as the minimum alignment.
IMPLEMENTATION
.impl: Note that this section is concerned with describing various aspects of
the implementation. It does not form part of the interface definition.
Size Change Callback Protocol
.impl.callback: The size change callback protocol concerns the mechanism for
informing the client of the appearance and disappearance of interesting
ranges. The intention is that each range has an identity (represented by the
CBSBlock). When blocks are split, the larger fragment retains the identity.
When blocks are merged, the new block has the identity of the larger fragment.
.impl.callback.delete: Consider the case when the minimum size is <minSize>,
and CBSDelete is called to remove a range of size <middle>. The two (possibly
non-existent) neighbouring ranges have (possibly zero) sizes <left> and
<right>. <middle> is part of the CBSBlock <middleBlock>.
.impl.callback.delete.delete: The delete callback will be called in this case
if and only if:
left + middle + right >= minSize && left < minSize && right < minSize
That is, the combined range is interesting, but neither remaining fragment is.
It will be called with the following parameters:
block: middleBlock
oldSize: left + middle + right
newSize: left >= right ? left : right
.impl.callback.delete.new: The new callback will be called in this case if and
only if:
left >= minSize && right >= minSize
That is, both remaining fragments are interesting. It will be called with the
following parameters:
block: a new block
oldSize: 0
newSize: left >= right ? right : left
.impl.callback.delete.shrink: The shrink callback will be called in this case
if and only if:
left + middle + right >= minSize && (left >= minSize || right >= minSize)
That is, at least one of the remaining fragments is still interesting. It will
be called with the following parameters:
block: middleBlock
oldSize: left + middle + right
newSize: left >= right ? left : right
.impl.callback.insert: Consider the case when the minimum size is <minSize>,
and CBSInsert is called to add a range of size <middle>. The two (possibly
non-existent) neighbouring blocks are <leftBlock> and <rightBlock>, and have
(possibly zero) sizes <left> and <right>.
.impl.callback.insert.delete: The delete callback will be called in this case
if and only if:
left >= minSize && right >= minSize
That is, both neighbours were interesting. It will be called with the
following parameters:
block: left >= right ? rightBlock : leftBlock
oldSize: left >= right ? right : left
newSize: 0
.impl.callback.insert.new: The new callback will be called in this case if and
only if:
left + middle + right >= minSize && left < minSize && right < minSize
That is, the combined block is interesting, but neither neighbour was. It will
be called with the following parameters:
block: left >= right ? leftBlock : rightBlock
oldSize: left >= right ? left : right
newSize: left + middle + right
.impl.callback.insert.grow: The grow callback will be called in this case if
and only if:
left + middle + right >= minSize && (left >= minSize || right >= minSize)
That is, at least one of the neighbours was interesting. It will be called
with the following parameters:
block: left >= right ? leftBlock : rightBlock
oldSize: left >= right ? left : right
newSize: left + middle + right
Splay Tree
.impl.splay: The CBS is principally implemented using a splay tree (see
design.mps.splay). Each splay tree node is embedded in a CBSBlock that
represents a semi-open address range. The key passed for comparison is the
base of another range.
.impl.splay.fast-find: CBSFindFirst and CBSFindLast use the update/refresh
facility of splay trees to store, in each CBSBlock, an accurate summary of the
maximum block size in the tree rooted at the corresponding splay node. This
allows rapid location of the first or last suitable block, and very rapid
failure if there is no suitable block.
.impl.find-largest: CBSFindLargest simply finds out the size of the largest
block in the CBS from the root of the tree (using SplayRoot), and does
SplayFindFirst for a block of that size. This is O(log(n)) in the size of the
free list, so it's about the best you can do without maintaining a separate
priority queue, just to do CBSFindLargest. Except that when the emergency lists
(see .impl.low-mem) are in use, they are also searched.
Low Memory Behaviour
.impl.low-mem: Low memory situations cause problems when the CBS tries to
allocate a new CBSBlock structure for a new isolated range as a result of
either CBSInsert or CBSDelete, and there is insufficient memory to allocate
the CBSBlock structure:
.impl.low-mem.no-inline: If mayUseInline is FALSE, then the range is not added
to the CBS, and the call to CBSInsert or CBSDelete returns ResMEMORY.
.impl.low-mem.inline: If mayUseInline is TRUE:
.impl.low-mem.inline.block: If the range is large enough to contain an inline
block descriptor consisting of two pointers, then it is kept on an emergency
block list. The CBS will eagerly attempt to add this block back into the splay
tree during subsequent calls to CBSInsert and CBSDelete. The CBS will also
keep its emergency block list in address order, and will coalesce this list
eagerly. Some performance degradation will be seen when the emergency block
list is in use. Ranges on this emergency block list will not be made available
to the CBS's client via callbacks. CBSIterate* will not iterate over ranges on
this list.
.impl.low-mem.inline.block.structure: The two pointers stored are to the next
such block (or NULL), and to the limit of the block, in that order.
.impl.low-mem.inline.grain: Otherwise, the range is still large enough to
contain an inline grain descriptor consisting of one pointer (see .align), and
it is kept on an emergency grain list. The CBS will eagerly attempt to add this
grain
back into either the splay tree or the emergency block list during subsequent
calls to CBSInsert and CBSDelete. The CBS will also keep its emergency grain
list in address order. Some performance degradation will be seen when the
emergency grain list is in use. Ranges on this emergency grain list will not
be made available to the CBS's client via callbacks. CBSIterate* will not
iterate over ranges on this list.
.impl.low-mem.inline.grain.structure: The pointer stored is to the next such
grain, or NULL.
The CBS Block
.impl.cbs.block: The block contains a base-limit pair and a splay tree node.
.impl.cbs.block.special: The base and limit may be equal if the block is
halfway through being deleted.
.impl.cbs.block.special.just: This conflates values and status, but is
justified because block size is very important.
TESTING
.test: The following testing will be performed on this module:
.test.cbstest: There is a stress test for this module in impl.c.cbstest. This
allocates a large block of memory and then simulates the allocation and
deallocation of ranges within this block using both a CBS and a BT. It makes
both valid and invalid requests, and compares the CBS response to the correct
behaviour as determined by the BT. It also iterates the ranges in the CBS,
comparing them to the BT. It also invokes the CBS describe method, but makes
no automatic test of the resulting output. It does not currently test the
callbacks.
.test.pool: Several pools (currently MV2 and MVFF) are implemented on top of a
CBS. These pools are subject to testing in development and QA, and are or will
be heavily exercised by customers.
NOTES FOR FUTURE DEVELOPMENT
.future.not-splay: The initial implementation of CBSs is based on splay trees.
It could be revised to use any other data structure that meets the requirements
(especially .req.fast).
.future.hybrid: It would be possible to attenuate the problem of .risk.overhead
(below) by using a single word bit set to represent the membership in a
(possibly aligned) word-width of grains. This might be used for block sizes
less than a word-width of grains, converting them when they reach all free in
the bit set. Note that this would make coalescence slightly less eager, by up
to (word-width - 1).
RISKS
.risk.overhead: Clients should note that the current implementation of CBSs has
a space overhead proportional to the number of isolated contiguous ranges. [
Four words per range. ] If the CBS contains every other grain in an area, then
the overhead will be large compared to the size of that area. [ Four words per
two grains. ] See .future.hybrid for a suggestion to solve this problem. An
alternative solution is to use CBSs only for managing long ranges.
---
The following relates to a pending re-design and does not yet relate to any
working source version. GavinM 1998-09-25
The CBS system provides its services by combining the services provided by
three subsidiary CBS modules:
- CBSST -- Splay Tree: Based on out-of-line splay trees; must allocate to
insert isolated, which may therefore fail.
- CBSBL -- Block List: Based on a singly-linked list of variable sized ranges
with inline descriptors; ranges must be at least large enough to store the
inline descriptor.
- CBSGL -- Grain List: Based on a singly-linked list of fixed size ranges
with inline descriptors; the ranges must be the alignment of the CBS.
The three sub-modules have a lot in common. Although their methods are not
invoked via a dispatcher, they have been given consistent interfaces, and
consistent internal appearance, to aid maintenance.
Methods supported by sub-modules (not all sub-modules support all methods):
- MergeRange -- Finds any ranges in the specific CBS adjacent to the supplied
one. If there are any, it extends the ranges, possibly deleting one of them.
This cannot fail, but should return FALSE if there is an intersection between
the supplied range and a range in the specific CBS.
- InsertIsolatedRange -- Adds a range to the specific CBS that is not
adjacent to any range already in there. Depending on the specific CBS, this
may be able to fail for allocation reasons, in which case it should return
FALSE. It should AVER if the range is adjacent to or intersects with a range
already there.
- RemoveAdjacentRanges -- Finds and removes from the specific CBS any ranges
that are adjacent to the supplied range. Should return FALSE if the supplied
range intersects with any ranges already there.
- DeleteRange -- Finds and deletes the supplied range from the specific CBS.
Returns a tri-state result:
- Success -- The range was successfully deleted. This may have involved
the creation of a new range, which should be done via CBSInsertIsolatedRange.
- ProtocolError -- Either some non-trivial strict subset of the supplied
range was in the specific CBS, or a range adjacent to the supplied range was in
the specific CBS. Either of these indicates a protocol error.
- NoIntersection -- The supplied range was not found in the CBS. This may or
may not be a protocol error, depending on the invocation context.
- FindFirst -- Returns the first (in address order) range in the specific CBS
that is at least as large as the supplied size, or FALSE if there is no such
range.
- FindFirstBefore -- As FindFirst, but only finds ranges prior to the
supplied address.
- FindLast -- As FindFirst, but finds the last such range in address order.
- FindLastAfter -- FindLast's equivalent of FindFirstBefore.
- Init -- Initialise the control structure embedded in the CBS.
- Finish -- Finish the control structure embedded in the CBS.
- InlineDescriptorSize -- Returns the aligned size of the inline descriptor.
- Check -- Checks the control structure embedded in the CBS.
The CBS supplies the following utilities:
- CBSAlignment -- Returns the alignment of the CBS.
- CBSMayUseInline -- Returns whether the CBS may use the memory in the ranges
stored.
- CBSInsertIsolatedRange -- Wrapper for CBS*InsertIsolatedRange.
Internally, the CBS* sub-modules each have an internal structure CBS*Block that
represents an isolated range within the module. It supports the following
methods (for sub-module internal use):
- BlockBase -- Returns the base of the associated range;
- BlockLimit
- BlockRange
- BlockSize
View file
@ -0,0 +1,73 @@
DESIGN OF CHECKING IN MPS
design.mps.check
incomplete design
gavinm 1996-08-05
INTRODUCTION
This document describes the design of structure checking within the MPS.
IMPLEMENTATION
.level: There are three levels of checking:
.level.sig: The lowest level checks only that the structure has a valid
Signature (see design.mps.sig).
.level.shallow: Shallow checking checks all local fields (including
signature) and also checks the signatures of any parent or child structures.
.level.deep: Deep checking checks all local fields (including signatures),
the signatures of any parent structures, and does full recursive checking on
any child structures.
.level.control: Control over the levels of checking is via the definition of
at most one of the macros TARGET_CHECK_SHALLOW (which if defined gives
.level.shallow) and TARGET_CHECK_DEEP (which if defined gives .level.deep). If
neither macro is defined then .level.sig is used. These macros are not
intended to be manipulated directly by developers; they should use the
interface in impl.h.target.
.order: Because deep checking (.level.deep) uses unchecked recursion, it is
important that child relationships are acyclic (.macro.down).
.fun: Every abstract data type which is a structure pointer should have a
function <type>Check which takes a pointer of type <type> and returns a Bool.
It should check all fields in order, using one of the macros in .macro, or
document why not.
.fun.omit: The only fields which should be omitted from a check function are
those for which there is no meaningful check (e.g. unlimited unsigned integer
with no relation to other fields).
.fun.return: Although the function returns a Bool, if the assert handler
returns (or there is no assert handler), then this is taken to mean "ignore and
continue", and the check function hence returns TRUE.
.macro: Checking is implemented by invoking four macros in impl.h.assert:
.macro.sig: CHECKS(type, val) checks the signature only, and should be called
precisely on type <type> and the received object pointer.
.macro.local: CHECKL(cond) checks a local field (depending on level (see
.level)), and should be called on each local field that is not an abstract data
type structure pointer itself (apart from the signature), with an appropriate
normally-true test condition.
.macro.up: CHECKU(type, val) checks a parent abstract data type structure
pointer, performing at most signature checks (depending on level (see
.level)). It should be called with the parent type and pointer.
.macro.down: CHECKD(type, val) checks a child abstract data type structure
pointer, possibly invoking <type>Check (depending on level (see .level)). It
should be called with the child type and pointer.
.full-type: CHECKS, CHECKD, and CHECKU all operate only on fully fledged
types. This means the type has to provide a function Bool TypeCheck(Type type)
where Type is substituted for the name of the type (e.g. PoolCheck), and the
expression obj->sig must be a valid value of type Sig whenever obj is a valid
value of type Type.
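For illustration, a check function for a hypothetical fully fledged type Foo,
with a parent Arena and a child of hypothetical type Bar, might look like this
(a sketch only, following .fun and .macro.*):

  Bool FooCheck(Foo foo)
  {
    CHECKS(Foo, foo);                /* .macro.sig: check the signature */
    CHECKL(foo->count <= foo->max);  /* .macro.local: a plain local field */
    CHECKU(Arena, foo->arena);       /* .macro.up: parent structure */
    CHECKD(Bar, foo->bar);           /* .macro.down: child structure */
    return TRUE;                     /* see .fun.return */
  }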
.type.no-sig: This tag is to be referenced in implementations whenever the form
CHECKL(ThingCheck(thing)) is used instead of CHECK{U,D}(Thing, thing) because
Thing is not a fully fledged type (.full-type).
View file
@ -0,0 +1,287 @@
THE COLLECTION FRAMEWORK
design.mps.collection
incomplete design
pekka 1998-03-20
INTRODUCTION
.intro: This document describes the Collection Framework. It's a framework for
implementing garbage collection techniques and integrating them into a system
of collectors that all cooperate in recycling garbage.
Document History
.hist.0: Version 0 was a different document.
.hist.1: Version 1 was a different document.
.hist.2: Written in January and February 1998 by Pekka P. Pirinen on the basis
of the current implementation of the MPS, analysis.async-gc, [that note on the
independence of collections] and analysis.tracer.
OVERVIEW
.framework: MPS provides a framework that allows the integration of many
different types of GC strategies and provides many of the basic services that
those strategies use. .framework.cover: The framework subsumes most major GC
strategies and allows many efficient techniques, like in-line allocation or
software barriers.
.framework.overhead: The overhead due to cooperation is low. [But not
non-existent. Can we say something useful about it?]
.framework.benefits: The ability to combine collectors contributes
significantly to the flexibility of the system. The reduction in code
duplication contributes to reliability and integrity. The services of the
framework make it easier to write new MM strategies and collectors.
.framework.mpm: The Collection Framework is merely a part of the structure of
the MPM. See design.mps.architecture and design.mps.arch [Those two documents
should be combined into one. Pekka 1998-01-15] for the big picture. Other
notable components that the MPM manages to integrate into a single framework
are manually-managed memory [another missing document here?] and finalization
services (see design.mps.finalize).
.see-also: This document assumes basic familiarity with the ideas of pool (see
design.mps.arch.pools) and segment (see design.mps.seg.over.*).
COLLECTION ABSTRACTIONS
Colours, scanning and fixing
.state: The framework knows about the three colours of the tri-colour
abstraction and free blocks. Recording the state of each object is the
responsibility of the pool, but the framework gets told about changes in the
states and keeps track of colours in each segment. Specifically, it records
whether a segment might contain white, grey and black objects wrt. each active
trace (see .tracer) [black not currently implemented -- Pekka 1998-01-04]. (A
segment might contain objects of all colours at once, or none.) This
information is approximate, because when an object changes colour, or dies, it
usually is too expensive to determine if it was the last object of its former
colour.
.state.transitions: The possible state transitions are as follows:
free ---alloc--> black (or grey) or white or none
none --condemn-> white
none --refine--> grey
grey ---scan---> black
white ----fix---> grey (or black)
black --revert--> grey
white --reclaim-> free
black --reclaim-> none
.none-is-black: Outside of a trace, objects don't really have colour, but
technically, the colour is black. Objects are only allocated grey or white
during a trace, and by the time the trace has finished, they are either dead or
black, like the other surviving objects. We might then reuse the colour field
for another trace, so it's convenient to set the colour to black when
allocating outside a trace. This means that refining the foundation
(analysis.tracer.phase.condemn.refine), actually turns black segments grey,
rather than vice versa, but the principle is the same.
.scan-fix: "Scanning" an object means applying the "fix" function to all
references in that object. Fixing is the generic name for the operation that
takes a reference to a white object and makes it non-white (usually grey, but
black is a possibility, and so is changing the reference as we do for weak
references). Typical examples of fix methods are copying the object into
to-space or setting its mark bit.
.cooperation: The separation of scanning and fixing is what allows different GC
techniques to cooperate. The scanning is done by a method on the pool that the
scanned object resides in, and the fixing is done by a method on the pool that
the reference points to.
.scan-all: Pools provide a method to scan all the grey objects in a segment.
Reference sets
.refsets: The cost of scanning can be significantly reduced by storing
remembered sets. We have chosen a very compact and efficient implementation,
called reference sets, or refsets for short (see idea.remember
[design.mps.refset is empty! Perhaps some of this should go there. -- Pekka
1998-02-19]). This makes the cost of maintaining them low, so we maintain them
for all references out of all scannable segments.
.refsets.approx: You might describe refsets as summaries of all references out
of an area of memory, so they are only approximations of remembered sets. When
a refset indicates that an interesting reference might be present in a segment,
we still have to scan the segment to find it.
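As an illustration (not the actual MPS code), a refset can be pictured as a
word-sized bit set with one bit per address "zone". This sketch assumes a
pointer fits in an unsigned long and that a zone is simply a few high-order
address bits; the actual zone size used by the MPS is not specified here.

#include <limits.h>

typedef unsigned long Word;
typedef Word RefSet;

#define ZONE_SHIFT 20            /* hypothetical: 1MiB zones */

/* Add the zone of a reference to a summary, e.g. while scanning a segment. */
static RefSet refSetAdd(RefSet summary, void *ref)
{
  Word zone = ((Word)ref >> ZONE_SHIFT) % (sizeof(Word) * CHAR_BIT);
  return summary | ((Word)1 << zone);
}

/* Might the summarized area contain a reference into the set summarized by
   "white"?  If not, the segment need not be scanned for that trace. */
static int refSetsIntersect(RefSet summary, RefSet white)
{
  return (summary & white) != 0;
}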
.refsets.scan: The refset information is collected during scanning. The scan
state protocol provides a way for the pool and the format scan methods to
cooperate in this, and to pass this information to the tracer module which
checks it and updates the segment (see design.mps.scan [Actually, there's very
little doc there. Pekka 1998-02-17]).
.refsets.maintain: The MPS tries to maintain the refset information when it
moves or changes objects.
.refsets.pollution: Ambiguous references and pointers outside the arena will
introduce spurious zones into the refsets. We put up with this to keep the
scanning costs down. Consistency checks on refsets have to take this into
account.
.refsets.write-barrier: A write-barrier is needed to keep the mutator from
invalidating the refsets when writing to a segment. We need one on any
scannable segment whose refset is not a superset of the mutator's (and that the
mutator can see). If we know what the mutator is writing and whether it's a
reference, we can just add that reference to the refset (figuring out whether
anything can be removed from the refset is too expensive). If we don't know or
if we cannot afford to keep the barrier up, the framework can union the
mutator's refset to the segment's refset.
.refset.mutator: The mutator's refset could be computed during root scanning in
the usual way, and then kept up to date by using a read-barrier. It's not a
problem that the mutator can create new pointers out of nothing behind the
read-barrier, as they won't be real references. However, this is probably not
cost-effective, since it would cause lots of barrier hits. We'd need a
read-barrier on every scannable segment whose refset is not a subset of the
mutator's (and that the mutator can see). So instead we approximate the
mutator's refset with the universal refset.
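The barrier decisions above can be summarized in a short sketch (invented
names, using the same word-sized refset picture as the earlier sketch; the
mutator's refset is approximated by the universal refset, per .refset.mutator):

typedef unsigned long RefSet;
#define REFSET_UNIV ((RefSet)~(RefSet)0)   /* the approximated mutator refset */

/* A write-barrier is wanted on a scannable, mutator-visible segment unless
   its summary is already a superset of the mutator's refset
   (.refsets.write-barrier). */
static int segWantsWriteBarrier(RefSet segSummary, RefSet mutatorSummary)
{
  return (segSummary & mutatorSummary) != mutatorSummary;
}

/* On a write-barrier hit where we can't tell what was written, union the
   mutator's refset into the segment's summary. */
static RefSet segSummaryAfterHit(RefSet segSummary, RefSet mutatorSummary)
{
  return segSummary | mutatorSummary;
}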
THE TRACER
.tracer: The tracer is an engine for implementing multiple garbage collection
processes. Each process (called a "trace") proceeds independently of the
others through five phases as described in analysis.tracer. The following
sections describe how the action of each phase fits into the framework. See
design.mps.trace for details [No, there's not much there, either. Possibly
some of this section should go there. Pekka 1998-02-18].
.combine: The tracer can also combine several traces for some actions, like
scanning a segment or a root. The methods the tracer calls to do the work get
an argument that tells them which traces they are expected to act for. [extend
this@@@@]
.trace.begin: Traces are started by external request, usually from a client
function or an action (see design.mps.action).
.trace.progress: The tracer gets time slices from the arena to work on a given
trace [This is just a provisional arrangement, in lieu of real progress
control. Pekka 1998-02-18]. In each slice, it selects a small amount of work
to do, based on the state of the trace, and does it, using facilities provided
by the pools. .trace.scan: A typical unit of work is to scan a single
segment. The tracer can choose to do this for multiple traces at once,
provided the segment is grey for more than one trace.
.trace.barrier: Barrier hits might also cause a need to scan a segment (see
.hw-barriers.hit). Again, the tracer can choose to combine traces, when it
does this.
.mutator-colour: The framework keeps track of the colour of the mutator
separately for each trace.
The Condemn Phase
.phase.condemn: The agent that creates the trace (see .trace.begin) determines
the condemned set and colours it white. The tracer then examines the refsets
on all scannable segments, and if it can deduce that a segment cannot refer to
the white set, that segment is immediately coloured black; otherwise the pool is asked to
grey any objects in the segment that might need to be scanned (in copying
pools, this is typically the whole segment).
.phase.condemn.zones: To get the maximum benefit from the refsets, we try to
arrange that the zones are a minimal superset (e.g., generations uniquely
occupy zones) and a maximal subset (there's nothing else in the zone) of the
condemned set. This needs to be arranged at allocation time (or when copying
during collection, which is much like allocation) [soon, this will be handled
by segment loci, see design.mps.locus].
.phase.condemn.mutator: At this point, the mutator might reference any objects,
i.e., it is grey. Allocation can be in any colour, most commonly white [more
could be said about this].
The Grey Mutator Phase
.phase.grey-mutator: Grey segments are chosen according to some sort of
progress control and scanned by the pool to make them black. Eventually, the
tracer either decides to flip or runs out of grey segments, and proceeds to the
next phase. [Currently, this phase has not been implemented; all traces flip
immediately after condemn. Pekka 1998-02-18]
.phase.grey-mutator.copy: At this stage, we don't want to copy condemned
objects, because we would need an additional barrier to keep the mutator's view
of the heap consistent (see analysis.async-gc.copied.pointers-and-new-copy).
.phase.grey-mutator.ambig: This is a good time to get all ambiguous scanning
out of the way, because we usually can't do any after the flip [write a
detailed explanation of this some day] and because it doesn't cause any copying.
The Flip Phase
.phase.flip: The roots (see design.mps.root) are scanned. This has to be an
atomic action as far as the mutator is concerned, so all threads are suspended
for the duration.
.phase.flip.mutator: After this, the mutator is black: if we use a strong
barrier (analysis.async-gc.strong), this means it cannot refer to white
objects. Allocation will be in black (could be grey as well, but there's no
point to it).
The Black Mutator Phase
.phase.black-mutator: Grey segments are chosen according to some sort of
progress control and scanned by the pool to make them black. Eventually, the
tracer runs out of segments that are grey for this trace, and proceeds to the
next phase.
.phase.black-mutator.copy: At this stage white objects can be relocated,
because the mutator cannot see them (as long as a strong barrier is used, as we
must do for a copying collection, see analysis.async-gc.copied.pointers).
The Reclaim Phase
.phase.reclaim: The tracer finds the remaining white segments and asks the pool
to reclaim any white objects in them.
.phase.reclaim.barrier: Once a trace has started reclaiming objects, the others
shouldn't try to scan any objects that are white for it, because they might
have dangling pointers in them [xref doc yet to be written]. [Currently, we
reclaim atomically, but it could be incremental, or even overlapped with a new
trace on the same condemned set. Pekka 1997-12-31]
BARRIERS
[An introduction and a discussion of general principles should go here. This
is a completely undesigned area.]
Hardware Barriers
.hw-barriers: Hardware barrier services cannot, by their very nature, be
independently provided to each trace. A segment is either protected or not,
and we have to set the protection on a segment if any trace needs a hardware
barrier on it.
.hw-barriers.supported: The framework currently supports segment-oriented
Appel-Ellis-Li barriers (analysis.async-gc.barrier.appel-ellis-li), and
write-barriers for keeping the refsets up-to-date. It would not be hard to add
Steele barriers (analysis.async-gc.barrier.steele.scalable).
.hw-barriers.hit: When a barrier hit happens, the arena determines which
segment it was on. The segment colour info is used to determine whether it had
trace barriers on it, and if so, the appropriate barrier action is performed,
using the methods of the owning pool. If the segment was write-protected, its
refset is unioned with the refset of the mutator [in practice, RefSetUNIV].
.hw-barriers.hit.multiple: Fortunately, if we get a barrier hit on a segment
with multiple trace barriers on it, we can scan it for all the traces that it
had a barrier for, see .combine.@@@@
Software barriers
[@@@@Have to say something about software barriers]

mps/design/config/index.txt
THE DESIGN OF MPS CONFIGURATION
design.mps.config
incomplete design
richard 1997-02-19
INTRODUCTION
.intro: This document describes how the MPS configuration is parameterized so
that it can target different architectures, operating systems, build
environments, varieties, and products.
.bg: For background see [build system mail, configuration mail,
meeting.general.something]
Document History
.hist.0: Initial draft created by Richard Brooksby <richard> on 1997-02-19
based on discussions of configuration at meeting.general.1997-02-05.
.hist.1: Various improvements and clarifications to the draft discussed between
Richard and Nick Barnes <nickb> at meeting.general.1997-02-19.
REQUIREMENTS
.req.arch: Allow architecture specific configurations of the MPS.
.req.os: Allow operating system specific configurations of the MPS.
.req.builder: Allow build environment (compiler, etc.) specific configurations
of the MPS.
.req.prod: Allow product specific configurations of the MPS.
.req.var: Allow configurations with different amounts of instrumentation
(assertions, metering, etc.).
.req.impact: The configuration system should have a minimal effect on
maintainability of the implementation.
.req.port: The system should be easy to port across operating systems.
.req.maint: Maintenance of the configuration and build system should not
consume much developer time.
DEFINITIONS
.def.platform: A platform is a combination of an architecture (.def.arch), an
operating system (.def.os), and a builder (.def.builder). The set of supported
platforms is platform.*.
.def.arch: An architecture is a processor type with its associated calling
conventions and other binary interface details.
.def.os: An operating system is the interface to external resources.
.def.builder: A builder is the tools (C compiler, etc.) used to make the target
(.def.target).
.def.var: A variety is a combination of annotations such as assertions,
metering, etc.
.def.prod: A product is the intended product into which the MPS will fit, e.g.
ScriptWorks, Dylan, etc.
.def.target: The target is the result of the build.
OVERVIEW
- No automatically generated code. Use only C compiler and linker.
- Simple build function (design.mps.buildsys.????)
- Avoid conditional code spaghetti in implementations.
- Dependency on a particular configuration should be minimized and localized
when developing code.
THE BUILD SYSTEM
Abstract Build Function
.build.fun: The MPS implementation assumes only a simple "build function" which
takes a set of sources, possibly in several languages, compiles them with a set
of predefined preprocessor symbols, and links the result with a set of
libraries to form the target:
target := build(<defs>, <srcs>, <libs>)
.build.sep: Separate compilation and linkage can be seen as a memoization of
this function, and is not strictly necessary for the build.
.build.cc: A consequence of this approach is that it should always be possible
to build a complete target with a single UNIX command line calling the compiler
driver (usually "cc" or "gcc"), for example:
cc -o main -DCONFIG_VAR_DF foo.c bar.c baz.s -lz
.build.defs: The "defs" are the set of preprocessor macros which are to be
predefined when compiling the module sources.
CONFIG_VAR_<variety-code>
CONFIG_PROD_<product-code>
The variety-codes are the 2 letter code that appears after "variety." in the
tag of the relevant variety document (see variety.*) converted to upper case.
Currently (1998-11-09): HI, CI, TI, HE, CE, WI, WE, II
The product-codes are currently (1998-11-09) MPS, DYLAN, EPCORE.
Exactly one CONFIG_VAR define must be present.
Exactly one CONFIG_PROD define must be present.
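For example, a complete invocation selecting the HI variety and the MPS
product might look like this (an illustrative command line, not taken from any
makefile; the source file names are invented):
cc -o mps-test -DCONFIG_VAR_HI -DCONFIG_PROD_MPS mpm.c pool.c mps-test.c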
.build.srcs: The "srcs" are the set of sources that must be compiled in order
to build the target. The set of sources may vary depending on the
configuration. For example, different sets of sources may be required to build
different products. [This is a dependency between the makefile (or whatever)
and the module configuration in config.h.]
.build.libs: The "libs" are the set of libraries to which the compiled sources
must be linked in order to build the target. For example, when building a test
program, it might include the ANSI C library and an operating system interface
library.
File Structure
.file.dir: Each product consists of a single directory (corresponding to a HOPE
compound) containing all the sources for the whole family of targets.
.file.base: The names of sources must be unique in the first eight characters
in order to conform to FAT filesystem naming restrictions. .file.ext: The
extension may be up to three characters and directly indicates the source
language.
[Where is the set of valid extensions and languages defined?]
Modules and Naming
.mod.unique: Each module has an identifier which is unique within the MPS.
.mod.impls: Each module has one or more implementations which may be in any
language supported by the relevant build environment. .mod.primary: The
primary implementation of a module is written in target-independent ANSI C in a
source file with the same name as the module. [This seems to be with an "an"
suffix now. GavinM 1997-08-07] .mod.secondary: The names of other
implementations should begin with the same prefix (the module id or a shortened
version of it) and be suffixed with one or more target parameter codes (defined
below). In particular, the names of assembly language sources must include the
target parameter code for the relevant architecture.
Build System Rationale
.build.rat: This simple design makes it possible to build the MPS using many
different tools. Microsoft Visual C++, Metrowerks Codewarrior, and other
graphical development tools do not support much in the way of generated
sources, staged building, or other such stuff. The Visual C and Metrowerks
"project" files correspond closely to a closure of the build function
(.build.fun). The simplicity of the build function has also made it easy to
set up builds using NMAKE (DOS), MPW (Macintosh), and to get the MPS up and
running on other platforms such as FreeBSD and Linux in very little time. The
cost of maintaining the build systems on these various platforms is also
reduced to a minimum, allowing the MM Group to concentrate on primary
development. The source code is kept simple and straightforward. When looking
at MPS sources you can tell exactly what is going to be generated with very
little context. The sources are not munged beyond the standard ANSI C
preprocessor.
.build.port: The portability requirement (.req.port) implies that the build
system must use only standard tools that will be available on all conceivable
target platforms. Experience of development environments on the Macintosh
(Metrowerks Codewarrior) and Windows NT (Visual C++) indicates that we cannot
assume much sophistication in the use of file structure by development
environments. The best that we can hope for is the ability to combine a fixed
list of source files, libraries, and predefined preprocessor symbols into a
single target.
.build.maint: The maintainability requirement (.req.maint) implies that we
don't spend time trying to develop a set of tools to support anything more
complicated than the simple build function described above. The effort in
constructing and maintaining a portable system of this kind is considerable.
Such efforts have failed in EP.
IMPLEMENTATION
[ Now in impl.h.config, many symbols out of date. GavinM 1997-08-07 ]
.impl: The two implementation files impl.h.config and impl.h.mpstd can be seen
as preprocessor programs which "accept" build parameters and "emit"
configuration parameters (.fig.impl). The build parameters are defined either
by the builder (in the case of target detection) or by the build function (in
the case of selecting the variety and product).
.fig.impl:
build parameters                    configuration parameters
CONFIG_VAR_DF      --> config.h --> MPS_VAR_DF, ASSERT_MPM, etc.
CONFIG_PROD_EPCORE --> config.h --> ARENA_CLIENT, PROT_NONE,
                                    JUNKBYTE=0x39, etc.
_WIN32             --> mpstd.h  --> MPS_OS_W3, etc.
.impl.dep: No source code, other than the directives in impl.h.config and
impl.h.mpstd, should depend on any build parameters. That is, identifiers
beginning "CONFIG_" should only appear in impl.h.config. Code may depend on
configuration parameters in certain, limited ways, as defined below (.conf).
Target Platform Detection
.pf: The target platform is "detected" by the preprocessor directives in
impl.h.mpstd.
.pf.form: This file consists of sets of directives of the form:
#elif <conjunction of builder predefinitions>
#define MPS_PF_<platform code>
#define MPS_OS_<operating system code>
#define MPS_ARCH_<architecture code>
#define MPS_BUILD_<builder code>
#define MPS_T_WORD <word type>
#define MPS_WORD_WIDTH <word width>
#define MPS_WORD_SHIFT <word shift>
#define MPS_PF_ALIGN <minimum alignment>
.pf.detect: The conjunction of builder predefinitions is a constant expression
which detects the target platform. It is a logical AND of expressions which
look for preprocessor symbols defined by the build environment to indicate the
target. These must be accompanied by a reference to the build tool
documentation from which the symbols came. For example:
/* Visual C++ 2.0, Books Online, C/C++ Book, Preprocessor Reference, */
/* Chapter 1: The Preprocessor, Macros, Predefined Macros. */
#elif defined(_MSC_VER) && defined(_WIN32) && defined(_M_IX86)
.pf.codes: The declarations of the platform, operating system, architecture,
and builder codes define preprocessor macros corresponding to the target
detected (.pf.detect). For example:
#define MPS_PF_W3I3MV
#define MPS_OS_W3
#define MPS_ARCH_I3
#define MPS_BUILD_MV
.pf.word: The declaration of MPS_T_WORD defines the unsigned integral type
which corresponds, on the detected target, to the machine word. It is used to
define the MPS Word type (design.mps.type.word). [Insert backwards ref
there.] For example:
#define MPS_T_WORD unsigned long
.pf.word-width: The declaration of MPS_WORD_WIDTH defines the number of bits in
the type defined by MPS_T_WORD (.pf.word) on the target. For example:
#define MPS_WORD_WIDTH 32
.pf.word-shift: The declaration of MPS_WORD_SHIFT defines the log to the base 2
of MPS_WORD_WIDTH. For example:
#define MPS_WORD_SHIFT 5
.pf.pf-align: The declaration of MPS_PF_ALIGN defines the minimum alignment
which must be used for a memory block to permit any normal processor memory
access. In other words, it is the maximum alignment required by the processor
for normal memory access. For example:
#define MPS_PF_ALIGN 4
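As an illustration of how such a constant is typically used (this macro is not
part of the interface; it assumes MPS_PF_ALIGN is a power of two, as in the
example value above), a size can be rounded up to the platform alignment:

#include <stddef.h>

#define MPS_PF_ALIGN 4   /* the example value above; really set in mpstd.h */

#define EXAMPLE_SIZE_ALIGN_UP(size) \
  (((size) + MPS_PF_ALIGN - 1) & ~((size_t)MPS_PF_ALIGN - 1))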
Target Varieties
.var: The target variety is handled by preprocessor directives in
impl.h.config. .var.form: The file contains sets of directives of the form:
#elif defined(CONFIG_VAR_DF)
#define MPS_VAR_DF
#define ASSERT_MPSI
#define ASSERT_MPM
etc.
.var.detect: The configured variety is one of the variety preprocessor
definitions passed to the build function (.build.defs), e.g. CONFIG_VAR_DF.
[These are decoupled so that it's possible to tell the difference between
overridden settings etc. Explain.]
.var.symbols: The directives should define whatever symbols are necessary to
control annotations. These symbols parameterize other parts of the code, such
as the declaration of assertions, etc. The symbols should all begin with the
prefix "MPS_VAR_".
Target Product
.prod: The target product is handled by preprocessor directives in
impl.h.config. .prod.form: The file contains sets of directives of the form:
#elif defined(CONFIG_PROD_EPCORE)
#define PROT_NONE
#define THREAD_NONE
#define ARENA_CLIENT
etc.
[Tidy this up:]
Note, anything which can be configured, is configured, even if it's just
configured to "NONE" meaning nothing. This makes sure that you can't choose
something by omission. Where these symbols are used there will be a #error to
catch the unused case.
[This is a general principle which applies to other configuration stuff too.]
SOURCE CODE CONFIGURATION
.conf: This section describes how the configuration may affect the source code
of the MPS.
.conf.limit: The form of dependency allowed is carefully limited to ensure that
code remains maintainable and portable (.req.impact).
.conf.min: The dependency of code on configuration parameters should be kept to
a minimum in order to keep the system maintainable (.req.impact).
Configuration Parameters
.conf.params: The compilation of a module is parameterized by:
MPS_ARCH_<arch-code>
MPS_OS_<os-code>
MPS_BUILD_<builder-code>
MPS_PF_<platform-code>
MPS_VAR_<variety-code>
MPS_PROD_<product-code>
Abstract and Concrete Module Interfaces
Basic principle: the caller mustn't be affected by configuration of a module.
This reduces complexity and dependency of configuration.
All callers use the same abstract interface. Caller code does not change.
Abstract interface includes:
- method definitions (logical function prototypes which may be macro methods)
- names of types
- names of constants
- names of structures and fields which form part of the interface, and
possibly their types, depending on the protocol defined
- the protocols
The abstract interface to a module may not be altered by a configuration
parameter. However, the concrete interface may vary.
Configuring Module Implementations
For example, this isn't allowed, because there is a change in the interface.
#if defined(PROT_FOO)
void ProtSpong(Foo foo, Bar bar);
#else
int ProtSpong(Bar bar, Foo foo);
#endif
These examples show how the concrete interface may vary:
#ifdef PROTECTION
void ProtSync(Space space);
/* more decls. */
#else /* PROTECTION not */
#define ProtSync(space) NOOP
/* more decls. */
#endif /* PROTECTION */
or
#if defined(PROT_FOO)
typedef struct ProtStruct {
int foo;
} ProtStruct;
#define ProtSpong(prot) X((prot)->foo)
#elif defined(PROT_BAR)
typedef struct ProtStruct {
float bar;
} ProtStruct;
#define ProtSpong(prot) Y((prot)->bar)
#else
#error "No PROT_* configured."
#endif
Configuration parameters may not be used to vary implementations in .c files.
For example, this sort of thing:
int map(void *base, size_t size)
{
#if defined(MPS_OS_W3)
VirtualAlloc(foo, bar, base, size);
#elif defined(MPS_OS_SU)
mmap(base, size, frob);
#else
#error "No implementation of map."
#endif
}
This leads to extreme code spaghetti. In effect, it's a "candy machine
interface" on source code. This kind of thing should be done by having several
implementations of the same interface in separate source files. If this leads
to duplication of code then that code should be placed in a separate, common
module.
PROCEDURES
[Adding an architecture, etc.]
NOTES
What about constants?
To do:
- Renaming of some stuff.
- Introduce product selection.
- Change makefiles.
- Eliminate mpmconf.h by moving stuff to config.h.
- Update files to refer to this design document.

FINALIZATION
design.mps.finalize
incomplete design
drj 1997-02-14
OVERVIEW:
Finalization is implemented internally using the Guardian Pool Class
(design.mps.poolmrg). Objects can be registered for finalization using an
interface function (called mps_finalize). Notification of finalization is
given to the client via the messaging interface. PoolClassMRG
(design.mps.poolmrg) implements a Message Class which implements the
finalization messages.
REQUIREMENTS:
.req: Currently only Dylan has requirements for finalization, see
req.dylan.fun.final.
ARCHITECTURE:
External Interface
.if.register:
mps_res_t mps_finalize(mps_arena_t arena, mps_addr_t obj);
increases the number of times that the object located at obj has been
registered for finalization by one. The object must have been allocated from
the arena (space). Any finalization messages that are created for this object
will appear on the arena's message queue. The MPS will attempt to finalize the
object that number of times.
.if.deregister:
void mps_definalize(mps_arena_t arena, mps_addr_t obj);
mps_definalize reduces the number of times that the object located at obj has
been registered for finalization by one. It is an error to definalize an object that has
not been registered for finalization.
.if.deregister.not: At the moment (1997-08-20) mps_definalize is not implemented
.if.get-ref:
void mps_message_finalization_ref(mps_addr_t *mps_addr_return,
mps_arena_t mps_arena,
mps_message_t mps_message)
mps_message_finalization_ref returns the reference to the finalized object
stored in the finalization message.
IMPLEMENTATION:
.int.over: Registering an object for finalization corresponds to allocating a
reference of rank FINAL to that object. This reference is allocated in a
guardian object in a pool of PoolClassMRG (see design.mps.poolmrg).
.int.arena.struct: The MRG pool used for managing final references is kept in
the Arena (Space), referred to as the "final pool". .int.arena.lazy: The pool
is lazily created: it will not be created until the first object is registered
for finalization. .int.arena.flag: There is a flag in the Arena that indicates
whether the final pool has been created yet or not.
.int.finalize:
Res ArenaFinalize(Arena arena, Ref addr)
.int.finalize.create: Creates the final pool if it has not been created yet.
.int.finalize.alloc: Allocates a guardian in the final pool.
.int.finalize.write: Writes a reference to the object into the guardian
object. .int.finalize.all: That's all. .int.finalize.error: if either the
creation of the pool or the allocation of the object fails then the error will
be reported back to the caller. .int.finalize.error.no-unwind: This function
does not need to do any unwinding in the error cases because the creation of
the pool is not something that needs to be undone.
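The flow above can be sketched as follows (the types and helper names are
invented for illustration; the real final pool is an MRG pool, see
design.mps.poolmrg):

typedef int Res;
enum { ResOK = 0 };
typedef void *Ref;
typedef struct ArenaStruct *Arena;
typedef struct GuardianStruct *Guardian;

extern int  arenaHasFinalPool(Arena arena);        /* .int.arena.flag */
extern Res  arenaCreateFinalPool(Arena arena);     /* .int.finalize.create */
extern Res  finalPoolAllocGuardian(Arena arena, Guardian *guardianReturn);
extern void guardianSetRef(Guardian guardian, Ref ref);

static Res arenaFinalizeSketch(Arena arena, Ref ref)
{
  Guardian guardian;
  Res res;

  if (!arenaHasFinalPool(arena)) {
    res = arenaCreateFinalPool(arena);             /* lazy creation */
    if (res != ResOK)
      return res;              /* .int.finalize.error: nothing to undo */
  }
  res = finalPoolAllocGuardian(arena, &guardian);  /* .int.finalize.alloc */
  if (res != ResOK)
    return res;
  guardianSetRef(guardian, ref);                   /* .int.finalize.write */
  return ResOK;                                    /* .int.finalize.all */
}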
.int.arena-destroy.empty: ArenaDestroy empties the message queue by calling
MessageEmpty.
.int.arena-destroy.final-pool: If the final pool has been created then
ArenaDestroy destroys the final pool.
.access: mps_message_finalization_ref needs to access the finalization message
to retrieve the reference and then write it to where the client asks. This
must be done carefully, in order to avoid breaking the invariants or creating a
hidden root.
.access.invariants: We protect the invariants by using special routines
ArenaRead and ArenaPoke to read and write the reference. This works as long as
there's no write-barrier collection. [Instead of ArenaPoke, we could put in an
ArenaWrite that would be identical to ArenaPoke, except for AVERring the
invariant (or it can just AVER there are no busy traces unflipped). When we
get write-barrier collection, we could change it to do the real thing, but in
the absence of a write-barrier, it's functionally identical to ArenaPoke.
Pekka 1997-12-09]

<table>
<tr valign="top">
<td> <code><a href="arena/index.txt">arena/index.txt</a></code> </td>
<td> The design of the MPS arena </td>
</tr>
<tr valign="top">
<td> <code><a href="arenavm/index.txt">arenavm/index.txt</a></code> </td>
<td> Virtual memory arena </td>
</tr>
<tr valign="top">
<td> <code><a href="bt/index.txt">bt/index.txt</a></code> </td>
<td> Bit tables </td>
</tr>
<tr valign="top">
<td> <code><a href="buffer/index.txt">buffer/index.txt</a></code> </td>
<td> Allocation buffers and allocation points </td>
</tr>
<tr valign="top">
<td> <code><a href="cbs/index.txt">cbs/index.txt</a></code> </td>
<td> Design for coalescing block structure </td>
</tr>
<tr valign="top">
<td> <code><a href="check/index.txt">check/index.txt</a></code> </td>
<td> Design of checking in MPS </td>
</tr>
<tr valign="top">
<td> <code><a href="collection/index.txt">collection/index.txt</a></code> </td>
<td> The collection framework </td>
</tr>
<tr valign="top">
<td> <code><a href="config/index.txt">config/index.txt</a></code> </td>
<td> The design of MPS configuration </td>
</tr>
<tr valign="top">
<td> <code><a href="finalize/index.txt">finalize/index.txt</a></code> </td>
<td> Finalization </td>
</tr>
<tr valign="top">
<td> <code><a href="interface-c/index.txt">interface-c/index.txt</a></code> </td>
<td> The design of the Memory Pool System interface to C </td>
</tr>
<tr valign="top">
<td> <code><a href="io/index.txt">io/index.txt</a></code> </td>
<td> The design of the MPS i/o subsystem </td>
</tr>
<tr valign="top">
<td> <code><a href="lib/index.txt">lib/index.txt</a></code> </td>
<td> The design of the Memory Pool System library interface </td>
</tr>
<tr valign="top">
<td> <code><a href="lock/index.txt">lock/index.txt</a></code> </td>
<td> The design of the lock module </td>
</tr>
<tr valign="top">
<td> <code><a href="locus/index.txt">locus/index.txt</a></code> </td>
<td> The design for the locus manager </td>
</tr>
<tr valign="top">
<td> <code><a href="message/index.txt">message/index.txt</a></code> </td>
<td> MPS to client message protocol </td>
</tr>
<tr valign="top">
<td> <code><a href="pool/index.txt">pool/index.txt</a></code> </td>
<td> The design of the pool and pool class mechanisms </td>
</tr>
<tr valign="top">
<td> <code><a href="poolamc/index.txt">poolamc/index.txt</a></code> </td>
<td> The design of the automatic mostly-copying memory pool class </td>
</tr>
<tr valign="top">
<td> <code><a href="poolams/index.txt">poolams/index.txt</a></code> </td>
<td> The design of the automatic mark-and-sweep pool class </td>
</tr>
<tr valign="top">
<td> <code><a href="poolawl/index.txt">poolawl/index.txt</a></code> </td>
<td> Automatic weak linked </td>
</tr>
<tr valign="top">
<td> <code><a href="poollo/index.txt">poollo/index.txt</a></code> </td>
<td> Leaf object pool class </td>
</tr>
<tr valign="top">
<td> <code><a href="poolmfs/index.txt">poolmfs/index.txt</a></code> </td>
<td> The design of the manual fixed small memory pool class </td>
</tr>
<tr valign="top">
<td> <code><a href="poolmrg/index.txt">poolmrg/index.txt</a></code> </td>
<td> Guardian poolclass </td>
</tr>
<tr valign="top">
<td> <code><a href="poolmv/index.txt">poolmv/index.txt</a></code> </td>
<td> The design of the manual variable memory pool class </td>
</tr>
<tr valign="top">
<td> <code><a href="poolmv2/index.txt">poolmv2/index.txt</a></code> </td>
<td> The design of a new manual-variable memory pool class </td>
</tr>
<tr valign="top">
<td> <code><a href="poolmvff/index.txt">poolmvff/index.txt</a></code> </td>
<td> Design of the manually-managed variable-size first-fit pool </td>
</tr>
<tr valign="top">
<td> <code><a href="prot/index.txt">prot/index.txt</a></code> </td>
<td> Generic design of the protection module </td>
</tr>
<tr valign="top">
<td> <code><a href="protan/index.txt">protan/index.txt</a></code> </td>
<td> ANSI implementation of protection module </td>
</tr>
<tr valign="top">
<td> <code><a href="protli/index.txt">protli/index.txt</a></code> </td>
<td> Linux implementation of protection module </td>
</tr>
<tr valign="top">
<td> <code><a href="protocol/index.txt">protocol/index.txt</a></code> </td>
<td> The design for protocol inheritance in MPS </td>
</tr>
<tr valign="top">
<td> <code><a href="protsu/index.txt">protsu/index.txt</a></code> </td>
<td> SunOS 4 implementation of protection module </td>
</tr>
<tr valign="top">
<td> <code><a href="pthreadext/index.txt">pthreadext/index.txt</a></code> </td>
<td> Design of the Posix thread extensions for MPS </td>
</tr>
<tr valign="top">
<td> <code><a href="reservoir/index.txt">reservoir/index.txt</a></code> </td>
<td> The design of the low-memory reservoir </td>
</tr>
<tr valign="top">
<td> <code><a href="ring/index.txt">ring/index.txt</a></code> </td>
<td> The design of the ring data structure </td>
</tr>
<tr valign="top">
<td> <code><a href="root/index.txt">root/index.txt</a></code> </td>
<td> The design of the root manager </td>
</tr>
<tr valign="top">
<td> <code><a href="scan/index.txt">scan/index.txt</a></code> </td>
<td> The design of the generic scanner </td>
</tr>
<tr valign="top">
<td> <code><a href="seg/index.txt">seg/index.txt</a></code> </td>
<td> The design of the MPS segment data structure </td>
</tr>
<tr valign="top">
<td> <code><a href="sig/index.txt">sig/index.txt</a></code> </td>
<td> The design of the Memory Pool System signature system </td>
</tr>
<tr valign="top">
<td> <code><a href="splay/index.txt">splay/index.txt</a></code> </td>
<td> Design of splay trees </td>
</tr>
<tr valign="top">
<td> <code><a href="sso1al/index.txt">sso1al/index.txt</a></code> </td>
<td> Stack scanner for Digital Unix / Alpha systems </td>
</tr>
<tr valign="top">
<td> <code><a href="telemetry/index.txt">telemetry/index.txt</a></code> </td>
<td> The design of the MPS telemetry mechanism </td>
</tr>
<tr valign="top">
<td> <code><a href="trace/index.txt">trace/index.txt</a></code> </td>
<td> Tracer </td>
</tr>
<tr valign="top">
<td> <code><a href="type/index.txt">type/index.txt</a></code> </td>
<td> The design of the general MPS types </td>
</tr>
<tr valign="top">
<td> <code><a href="version-library/index.txt">version-library/index.txt</a></code> </td>
<td> Design of the MPS library version mechanism </td>
</tr>
<tr valign="top">
<td> <code><a href="version/index.txt">version/index.txt</a></code> </td>
<td> Design of MPS software versions </td>
</tr>
<tr valign="top">
<td> <code><a href="vm/index.txt">vm/index.txt</a></code> </td>
<td> The design of the virtual mapping interface </td>
</tr>
<tr valign="top">
<td> <code><a href="vman/index.txt">vman/index.txt</a></code> </td>
<td> ANSI fake VM </td>
</tr>
<tr valign="top">
<td> <code><a href="vmo1/index.txt">vmo1/index.txt</a></code> </td>
<td> VM Module on DEC Unix </td>
</tr>
<tr valign="top">
<td> <code><a href="vmso/index.txt">vmso/index.txt</a></code> </td>
<td> VM Design for Solaris </td>
</tr>
<tr valign="top">
<td> <code><a href="writef/index.txt">writef/index.txt</a></code> </td>
<td> The design of the MPS writef function </td>
</tr>
</table>

THE DESIGN OF THE MEMORY POOL SYSTEM INTERFACE TO C
design.mps.interface.c
incomplete doc
richard 1996-07-29
INTRODUCTION
Scope
.scope: This document is the design for the Memory Pool System (MPS) interface
to the C Language, impl.h.mps.
Background
.bg: See mail.richard.1996-07-24.10-57.
Document History
.hist.0: The first draft of this document was generated in response to
review.impl.h.mps.10 which revealed the lack of a detailed design document and
also the lack of conventions for external interfaces. The aim of the draft was
to record this information, even if it isn't terribly well structured.
ANALYSIS
Goals
.goal.c: The file impl.h.mps is the C external interface to the MPS. It is the
default interface between client code written in C and the MPS. .goal.cpp:
impl.h.mps is not specifically designed to be an interface to C++, but should
be usable from C++.
Requirements
.req: The interface must provide an interface from client code written in C to
the functionality of the MPS required by the product (see req.product), Dylan
(req.dylan), and the Core RIP (req.epcore).
mps.h may not include internal MPS header files such as "pool.h" etc.
It is essential that the interface cope well with change, in order to avoid
restricting possible future MPS developments. This means that the interface
must be "open ended" in its definitions. This accounts for some of the
apparently tortuous methods of doing things (fmt_A, for example). The
requirement is that the MPS should be able to add new functionality, or alter
the implementation of existing functionality, without affecting existing client
code. A stronger requirement is that the MPS should be able to change without
_recompiling_ client code. This is not always possible.
[.naming.global was presumably done in response to an unwritten requirement
regarding the use of the name spaces in C, perhaps something like
".req.name.iso: The interface shall not conflict in terms of naming with any
interfaces specified by ISO C and all reasonable future versions" and
".req.name.general: The interface shall use a documented and reasonably small
portion of the namespace so that clients can interoperate easily" drj
1998-10-01]
ARCHITECTURE
.fig.arch: The architecture of the MPS Interface
Just behind mps.h is the file mpsi.c, the "MPS interface layer" which does the
job of converting types and checking parameters before calling through to the
MPS proper, using internal MPS methods.
GENERAL CONVENTIONS
.naming: The external interface names should adhere to the documented interface
conventions; these are found in doc.mps.ref-man.if-conv(0).naming.
(paraphrased/recreated here) .naming.unixy: The external interface does not
follow the same naming conventions as the internal code. The interface is
designed to resemble a more conventional C, Unix, or Posix naming convention.
.naming.case: Identifiers are in lower case, except non-function-like macros,
which are in upper case. .naming.global: All publicised identifiers are
prefixed "mps_" or "MPS_". .naming.all: All identifiers defined by the MPS
should begin "mps_" or "MPS_" or "_mps_". .naming.type: Types are suffixed
"_t". .naming.struct: Structure types and tags are suffixed "_s".
.naming.union: Union types and tags are suffixed "_u".
.naming.scope: The naming conventions apply to all identifiers (see ISO C
clause 6.1.2); this includes names of functions, variables, types (through
typedef), structure and union tags, enumeration members, structure and union
members, macros, macro parameters, labels. .naming.scope.labels: labels (for
goto statements) should be rare, only in special block macros and probably not
even then. .naming.scope.other: The naming convention would also extend to
enumeration types and parameters in function prototypes, but both of those are
prohibited from having names in an interface file.
.type.gen: The interface defines memory addresses as "void *" and sizes as
"size_t" for compatibility with standard C (in particular, with malloc etc.).
These types must be binary compatible with the internal types "Addr" and "Size"
respectively. Note that this restricts the definitions of the internal types
"Addr" and "Size" when the MPS is interfaced with C, but does not restrict the
MPS in general.
.type.opaque: Opaque types are defined as pointers to structures which are
never defined. These types are cast to the corresponding internal types in
mpsi.c.
.type.trans: Some transparent structures are defined. The client is expected
to read these, or poke about in them, under restrictions which should be
documented. The most important is probably the allocation point (mps_ap_s)
which is part of allocation buffers. The transparent structures must be binary
compatible with corresponding internal structures. For example, the fields of
mps_ap_s must correspond with APStruct internally. This is checked by mpsi.c
in mps_check().
.type.pseudo: Some pseudo-opaque structures are defined. These only exist so
that code can be inlined using macros. The client code shouldn't mess with
them. The most important case of this is the scan state (mps_ss_s) which is
accessed by the in-line scanning macros, MPS_SCAN_* and MPS_FIX*.
.type.enum: There should be no enumeration types in the interface. Note that
enum specifiers (to declare integer constants) are fine as long as no type is
declared. See guide.impl.c.misc.enum.type.
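For illustration (the constant name here is invented, not part of the
interface), the distinction is between these two forms:

enum { MPS_EXAMPLE_LIMIT = 42 };  /* allowed: integer constant, no type declared */
/* typedef enum { ... } mps_example_t;  -- not allowed: declares an enumeration type */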
.type.fun: Whenever function types or derived function types (such as pointer
to function) are declared a prototype should be used and the parameters to the
function should not be named. This includes the case where you are declaring
the prototype for an interface function. .type.fun.example: So use "extern
mps_res_t mps_alloc(mps_addr_t *, mps_pool_t, size_t, ...);" rather than
"extern mps_res_t mps_alloc(mps_addr_t *addr_return, mps_pool_t pool , size_t
size, ...);" and "typedef mps_addr_t (*mps_fmt_class_t)(mps_addr_t);" rather
then "typedef mps_addr_t (*mps_fmt_class_t)(mps_addr_t object);". See
guide.impl.c.misc.prototype.parameters.
Checking
.check.space: When the space needs to be recovered from a parameter, it is checked
using AVER(CHECKT(Foo, foo)) before any attempt to access FooSpace(foo).
CHECKT (impl.h.assert) performs simple thread-safe checking of foo, so it can
be called outside of SpaceEnter/SpaceLeave. [perhaps this should be a special
macro. "AVER(CHECKT(" can look like the programmer made a mistake. drj
1998-11-05]
.check.types: We use definitions of types in both our external interface and
our internal code, and we want to make sure that they are compatible. (The
external interface changes less often and hides more information.) At first,
we were just checking their sizes, which wasn't very good, but I've come up
with some macros which check the assignment compatibility of the types too.
This is a sufficiently useful trick that I thought I'd send it round. It may
be useful in other places where types and structures need to be checked for
compatibility at compile time.
These macros don't generate warnings on the compilers I've tried.
This macro checks the assignment compatibility of two lvalues. The hack is
that it uses sizeof to guarantee that the assignments have no effect.
#define check_lvalues(_e1, _e2) \
(sizeof((_e1) = (_e2)), sizeof((_e2) = (_e1)), sizeof(_e1) == sizeof(_e2))
This macro checks that two types are compatible and equal in size. The hack
here is that it generates an lvalue for each type by casting zero to a pointer
to the type.
#define check_types(_t1, _t2) check_lvalues(*((_t1 *)0), *((_t2 *)0))
This macro just checks that the offset and size of two fields are the same.
#define check_fields_approx(_s1, _f1, _s2, _f2) \
(offsetof(_s1, _f1) == offsetof(_s2, _f2) && \
sizeof(((_s1 *)0)->_f1) == sizeof(((_s2 *)0)->_f2))
This macro checks the offset, size, and compatibility of fields.
#define check_fields(_s1, _f1, _s2, _f2) \
(check_lvalues(((_s1 *)0)->_f1, ((_s2 *)0)->_f2) && \
check_fields_approx(_s1, _f1, _s2, _f2))
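As a usage sketch (the structure and field names here are invented; they are
not MPS types), one might verify that an external structure and its internal
counterpart stay binary compatible like this:

#include <stddef.h>

typedef struct mps_example_s { void *base; size_t size; } mps_example_s;
typedef struct ExampleStruct { void *base; size_t size; } ExampleStruct;

/* Non-zero when the two layouts are compatible; could be wrapped in a
   compile-time or run-time assertion. */
#define EXAMPLE_COMPATIBLE \
  (check_types(mps_example_s, ExampleStruct) && \
   check_fields(mps_example_s, base, ExampleStruct, base) && \
   check_fields(mps_example_s, size, ExampleStruct, size))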
Binary Compatibility Issues
As noted in .type.enum, enumeration types are not allowed (see mail.richard.1995-09-08.09-28).
There are two main aspects to run-time compatibility: binary interface and
protocol. The binary interface is all the information needed to correctly use
the library, and includes external symbol linkage, calling conventions, type
representation compatibility, structure layouts, etc. The protocol is how the
library is actually used by the client code -- whether this is called before
that -- and determines the semantic correctness of the client with respect to
the library.
The binary interface is determined completely by the header file and the
target. The header file specifies the external names and the types, and the
target platform specifies calling conventions and type representation. There
is therefore a many-to-one mapping between the header file version and the
binary interface.
The protocol is determined by the implementation of the library.
Constraints
.cons: The MPS C Interface constrains the MPS in order to provide useful memory
management services to a C or C++ program.
.cons.addr: The interface constrains the MPS address type, Addr
(design.mps.type.addr), to being the same as C's generic pointer type, void *,
so that the MPS can manage C objects in the natural way.
.pun.addr: We pun the type of mps_addr_t (which is void *) into Addr (an
incomplete type, see design.mps.type.addr). This happens in the call to the
scan state's fix function, for example.
.cons.size: The interface constrains the MPS size type, Size
(design.mps.type.size), to being the same as C's size type, size_t, so that the
MPS can manage C objects in the natural way.
.pun.size: We pun the type of size_t in mps.h into Size in the MPM, as an
argument to the format methods. We assume this works.
.cons.word: The MPS assumes that Word (design.mps.type.word) and Addr
(design.mps.type.addr) are the same size, and the interface constrains Word to
being the same size as C's generic pointer type, void *.
NOTES
The file mpstd.h is the MPS target detection header. It decodes
preprocessor symbols which are predefined by build environments in order to
determine the target platform, and then defines uniform symbols, such as
MPS_ARCH_I3, for use internally by the MPS.
There is a design document for the MPS interface, design.mps.interface, but
it was written before we had the idea of having a C interface layer. It is
quite relevant, though, and could be updated. We should use it during the
review.
All exported identifiers and file names should begin with mps_ or mps so
that they don't clash with other systems.
We should probably have a specialized set of rules and a special checklist
for this interface.
.fmt.extend: This paragraph should be an explanation of why mps_fmt_A is so
called. The underlying reason is future extensibility.
.thread-safety: Most calls through this interface lock the space and therefore
make the MPM single-threaded. In order to do this they must recover the space
from their parameters. Methods such as ThreadSpace() must therefore be
callable when the space is _not_ locked. These methods are tagged with the tag
of this note.
.lock-free: Certain functions inside the MPM are thread-safe and do not need to
be serialized by using locks. They are marked with the tag of this note.
.form: Almost all functions in this implementation simply cast their arguments
to the equivalent internal types, and cast results back to the external type,
where necessary. Only exceptions are noted in comments.

mps/design/io/index.txt
THE DESIGN OF THE MPS I/O SUBSYSTEM
design.mps.io
incomplete design
richard 1996-08-30
INTRODUCTION
.intro: This document is the design of the MPS I/O Subsystem, a part of the
plinth.
.readership: This document is intended for MPS developers.
History
.hist.1: Document created from paper notes by Richard Brooksby, 1996-08-30.
.hist.2: Updated with mail.richard.1997-05-30.16-13 and subsequent discussion
in the Pool Hall at Longstanton. (See also mail.drj.1997-06-05.15-20.)
richard 1997-06-10
Background
.bg: This design is partly based on the design of the Internet User Datagram
Protocol (UDP). Mainly I used this to make sure I hadn't left out anything
which we might need.
PURPOSE
.purpose: The purpose of the MPS I/O Subsystem is to provide a means to
measure, debug, control, and test a memory manager built using the MPS.
.purpose.measure: Measurement consists of emitting data which can be collected
and analysed in order to improve the attributes of the application program, quite
possibly by adjusting parameters of the memory manager (see overview.mps.usage).
.purpose.control: Control means adjusting the behaviour of the MM dynamically.
For example, one might want to adjust a parameter in order to observe the
effect, then transfer that adjustment to the client application later.
.purpose.test: Test output can be used to ensure that the memory manager is
behaving as expected in response to certain inputs.
REQUIREMENTS
General
.req.fun.non-hosted: The MPM must be a host-independent system.
.req.attr.host: It should be easy for the client to set up the MPM for a
particular host (such as a washing machine).
Functional
.req.fun.measure: The subsystem must allow the MPS to transmit quantitative
measurement data to an external tool so that the system can be tuned.
.req.fun.debug: The subsystem must allow the MPS to transmit qualitative
information about its operation to an external tool so that the system can be
debugged.
.req.fun.control: The subsystem must allow the MPS to receive control
information from an external tool so that the system can be adjusted while it
is running.
.req.dc.env.no-net: The subsystem should operate in environments where there is
no networking available.
.req.dc.env.no-fs: The subsystem should operate in environments where there is
no filesystem available.
ARCHITECTURE
.arch.diagram:
- I/O Architecture Diagram
.arch.int: The I/O Interface is a C function call interface by which the MPM
sends and receives "messages" to and from the hosted I/O module.
.arch.module: The modules are part of the MPS but not part of the freestanding
core system (see design.mps.exec-env). The I/O module is responsible for
transmitting those messages to the external tools, and for receiving messages
from external tools and passing them to the MPM.
.arch.module.example: For example, the "file implementation" might just
send/write telemetry messages into a file so that they can be received/read
later by an off-line measurement tool.
.arch.external: The I/O Interface is part of interface to the freestanding core
system (see design.mps.exec-env). This is so that the MPS can be deployed in a
freestanding environment, with a special I/O module. For example, if the MPS
is used in a washing machine the I/O module could communicate by writing output
to the seven-segment display.
Example Configurations
.example.telnet: This shows the I/O Subsystem communicating with a telnet
client over a TCP/IP connection. In this case, the I/O Subsystem is
translating the I/O Interface into an interactive text protocol so that the
user of the telnet client can talk to the MM.
.example.file: This shows the I/O Subsystem dumping measurement data into a
file which is later read and analysed. In this case the I/O Subsystem is
simply writing out binary in a format which can be decoded.
.example.serial: This shows the I/O Subsystem communicating with a graphical
analysis tool over a serial link. This could be useful for a developer who has
two machines in close proximity and no networking support.
.example.local: In this example the application is talking directly to the I/O
Subsystem. This is useful when the application is a reflective development
environment (such as MLWorks) which wants to observe its own behaviour.
- MPS I/O Configuration Diagrams
INTERFACE
.if.msg: The I/O interface is oriented around opaque binary "messages" which
the I/O module must pass between the MPM and external tools. The I/O module
need not understand or interpret the contents of those messages.
.if.msg.opaque: The messages are opaque in order to minimize the dependency of
the I/O module on the message internals. It should be possible for clients to
implement their own I/O modules for unusual environments. We do not want to
reveal the internal structure of our data to the clients. Nor do we want to
burden them with the details of our protocols. We'd also like their code to be
independent of ours, so that we can expand or change the protocols without
requiring them to modify their modules.
.if.msg.dgram: Neither the MPM nor the external tools should assume that the
messages will be delivered in finite time, exactly once, or in order. This
will allow the I/O modules to be implemented using unreliable transport layers
such as the Internet User Datagram Protocol (UDP). It will also give the I/O
module the freedom to drop information rather than block on a congested
network, or stop the memory manager when the disk is full, or similar events
which really shouldn't cause the memory manager to stop working. The protocols
we need to implement at the high level can be designed to be robust against
lossage without much difficulty.
I/O Module State
.if.state: The I/O module may have some internal state to preserve. The I/O
Interface defines a type for this state, "mps_io_t", a pointer to an incomplete
structure "mps_io_s". The I/O module is at liberty to define this structure.
typedef struct mps_io_s *mps_io_t;
Message Types
.if.type: The I/O module must be able to deliver messages of several different
types. It will probably choose to send them to different destinations based on
their type: telemetry to the measurement tool, debugging output to the
debugger, etc.
typedef int mps_io_type_t;
enum {
MPS_IO_TYPE_TELEMETRY,
MPS_IO_TYPE_DEBUG
};
Limits
.if.message-max: The interface will define an unsigned integral constant
"MPS_IO_MESSAGE_MAX" which will be the maximum size of messages that the MPM
will pass to "mps_io_send" (.if.send) and the maximum size it will expect to
receive from "mps_io_receive".
Interface Set-up and Tear-down
.if.create: The MPM will call "mps_io_create" to set up the I/O module. On
success, this function should return "MPS_RES_OK". It may also initialize
"*mps_io_r" to a "state" value which will be passed to subsequent calls through
the interface.
extern mps_res_t mps_io_create(mps_io_t *mps_io_r);
.if.destroy: The MPM will call "mps_io_destroy" to tear down the I/O module,
after which it guarantees that the state value "mps_io" will not be used
again. The "state" parameter is the state previously returned by
"mps_io_create" (.if.create).
extern void mps_io_destroy(mps_io_t mps_io);
Message Send and Receive
.if.send: The MPM will call "mps_io_send" when it wishes to send a message to a
destination. The "state" parameter is the state previously returned by
"mps_io_create" (.if.create). The "type" parameter is the type (.if.type) of
the message. The "message" parameter is a pointer to a buffer containing the
message, and "size" is the length of that message, in bytes. The I/O module
must make an effort to deliver the message to the destination, but is not
expected to guarantee delivery. The function should return "MPS_RES_IO" only
if a serious error occurs that should cause the MPM to return with an error to
the client application. Failure to deliver the message does not count.
[Should there be a timeout parameter? What are the timing constraints?
mps_io_send shouldn't block.]
extern mps_res_t mps_io_send(mps_io_t state,
mps_io_type_t type,
void *message,
size_t size);
.if.receive: The MPM will call "mps_io_receive" when it wants to see if a
message has been sent to it. The "state" parameter is the state previously
returned by "mps_io_create" (.if.create). The "buffer_o" parameter is a
pointer to a value which should be updated with a pointer to a buffer
containing the message received. The "size_o" parameter is a pointer to a
value which should be updated with the length of the message received. If
there is no message ready for receipt, the length returned should be zero.
(Should we be able to receive truncated messages? How can this be done neatly?)
extern mps_res_t mps_io_receive(mps_io_t state,
void **buffer_o,
size_t *size_o);
I/O MODULE IMPLEMENTATIONS
Routeing
The I/O module must decide where to send the various messages. A file-based
implementation could put them in different files based on their types. A
network-based implementation must decide how to address the messages. In
either case, any configuration must either be statically compiled into the
module, or else read from some external source such as a configuration file.
NOTES
The external tools should be able to reconstruct stuff from partial info. For
example, you come across a fragment of an old log containing just a few old
messages. What can you do with it?
Here's some completely untested code which might do the job for UDP.
---
#include "mpsio.h"
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <errno.h>
typedef struct mps_io_s {
int sock;
struct sockaddr_in mine;
struct sockaddr_in telemetry;
struct sockaddr_in debugging;
} mps_io_s;
static mps_bool_t inited = 0;
static mps_io_s state;
mps_res_t mps_io_create(mps_io_t *mps_io_o)
{
int sock, r;
if(inited)
return MPS_RES_LIMIT;
state.mine = /* setup somehow from config */;
state.telemetry = /* setup something from config */;
state.debugging = /* setup something from config */;
/* Make a socket through which to communicate. */
sock = socket(AF_INET, SOCK_DGRAM, 0);
if(sock == -1) return MPS_RES_IO;
/* Set socket to non-blocking mode. */
r = fcntl(sock, F_SETFL, O_NDELAY);
if(r == -1) return MPS_RES_IO;
/* Bind the socket to some UDP port so that we can receive messages. */
r = bind(sock, (struct sockaddr *)&state.mine, sizeof(state.mine));
if(r == -1) return MPS_RES_IO;
state.sock = sock;
inited = 1;
*mps_io_o = &state;
return MPS_RES_OK;
}
void mps_io_destroy(mps_io_t mps_io)
{
assert(mps_io == &state);
assert(inited);
(void)close(state.sock);
inited = 0;
}
mps_res_t mps_io_send(mps_io_t mps_io, mps_io_type_t type,
void *message, size_t size)
{
struct sockaddr *toaddr;
assert(mps_io == &state);
assert(inited);
switch(type) {
case MPS_IO_TYPE_TELEMETRY:
toaddr = (struct sockaddr *)&state.telemetry;
break;
case MPS_IO_TYPE_DEBUG:
toaddr = (struct sockaddr *)&state.debugging;
break;
default:
assert(0);
return MPS_RES_UNIMPL;
}
(void)sendto(state.sock, message, size, 0, toaddr, sizeof(*toaddr));
return MPS_RES_OK;
}
mps_res_t mps_io_receive(mps_io_t mps_io,
void **message_o, size_t *size_o)
{
int r;
static char buffer[MPS_IO_MESSAGE_MAX];
assert(mps_io == &state);
assert(inited);
r = recvfrom(state.sock, buffer, sizeof(buffer), 0, NULL, NULL);
if(r == -1)
switch(errno) {
/* Ignore interrupted system calls, and failures due to lack */
/* of resources (they might go away.) */
case EINTR: case ENOMEM: case ENOSR:
r = 0;
break;
default:
return MPS_RES_IO;
}
*message_o = buffer;
*size_o = r;
return MPS_RES_OK;
}
ATTACHMENTS
"O Architecture Diagram"
"O Configuration Diagrams"
90
mps/design/lib/index.txt Normal file
View file
@ -0,0 +1,90 @@
THE DESIGN OF THE MEMORY POOL SYSTEM LIBRARY INTERFACE
design.mps.lib
incomplete design
richard 1996-09-03
INTRODUCTION
.intro: This document is the design of the MPS Library Interface, a part of the
plinth.
.readership: Any MPS developer. Any clients that are prepared to read this in
order to get documentation.
GOALS
.goal: The goals of the MPS library interface are:
.goal.host: To control the dependency of the MPS on the hosted ISO C library so
that the core MPS remains freestanding (see design.mps.exec-env).
.goal.free: To allow the core MPS convenient access to ISO C functionality that
is provided on freestanding platforms (see design.mps.exec-env.std.com.free).
DESCRIPTION
Overview
.overview.access: The core MPS needs to access functionality that could be
provided by an ISO C hosted environment.
.overview.hosted: The core MPS must not make direct use of any facilities in
the hosted environment (design.mps.exec-env). However, it is sensible to make
use of them when the MPS is deployed in a hosted environment.
.overview.hosted.indirect: The core MPS does not make any direct use of hosted
ISO C library facilities. Instead, it indirects through the MPS Library
Interface, impl.h.mpslib.
.overview.free: The core MPS can make direct use of freestanding ISO C library
facilities and does not need to include any of the header files <limits.h>,
<stdarg.h>, and <stddef.h> directly.
.overview.complete: The MPS Library Interface can be considered as the complete
"interface to ISO" (in that it provides direct access to facilities that we get
in a freestanding environment and equivalents of any functionality we require
from the hosted environment).
.overview.provision.client: In a freestanding environment the client is
expected to provide functions meeting this interface to the MPS.
.overview.provision.hosted: In a hosted environment, impl.c.mpsliban may be
used. It just maps impl.h.mpslib directly onto the ISO C library equivalents.
- MPS Library Interface Diagram
Outside the Interface
We provide impl.c.mpsliban to the client for two reasons:
1. the client can use it to connect the MPS to the ISO C library if one exists;
2. it serves as an example of an implementation of the MPS Library Interface.
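For illustration, a fragment in the style of impl.c.mpsliban might look like the
following. This is only a sketch: the mps_lib_* names, the opaque stream type,
and the exact signatures are assumptions made for the example, not the actual
contents of impl.h.mpslib.
/* Sketch of the mpsliban idea: each mps_lib_* function simply delegates */
/* to the hosted ISO C library. Names and signatures here are assumed. */
#include <stdio.h>
#include <stdlib.h>
typedef void mps_lib_FILE;              /* assumed opaque stream type */
mps_lib_FILE *mps_lib_get_stderr(void)
{
  return (mps_lib_FILE *)stderr;        /* map onto the hosted stream */
}
int mps_lib_fputs(const char *s, mps_lib_FILE *stream)
{
  return fputs(s, (FILE *)stream);      /* delegate to the ISO C library */
}
void mps_lib_abort(void)
{
  abort();                              /* hosted environment provides abort */
}
In a freestanding environment the client would instead supply implementations of
these functions in terms of whatever facilities the platform has
(.overview.provision.client).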
IMPLEMENTATION
.impl: The MPS Library Interface comprises a header file impl.h.mpslib
(mpslib.h) and some documentation.
.impl.decl: The header file defines the interface to definitions which parallel
those parts of the non-freestanding ISO headers which are used by the MPS.
.impl.include: The header file also includes the freestanding headers
<limits.h>, <stdarg.h>, and <stddef.h> (and not <float.h>, though perhaps it
should).
NOTES
.doc: User doc in doc.mps.guide.interface and doc.mps.guide.appendix-plinth.
ATTACHMENT
"MPS Library Interface Diagram"
143
mps/design/lock/index.txt Normal file
View file
@ -0,0 +1,143 @@
THE DESIGN OF THE LOCK MODULE
design.mps.lock
draft design
dsm 1995-11-21
PURPOSE
.purpose: Support the locking needs of the thread-safe design. In particular:
- Recursive locks
- Binary locks
- Recursive "global" lock that need not be allocated or initialized by the
client
- Binary "global" lock that need not be allocated or initialized by the client
.context: The MPS has to be able to operate in a multi-threaded environment.
The thread-safe design (design.mps.thread-safety) requires client-allocatable
binary locks, a global binary lock and a global recursive lock. An interface
to client-allocatable recursive locks is also present to support any potential
use, because of historic requirements, and because the implementation will
presumably be necessary anyway for the global recursive lock.
BACKGROUND
.need: In an environment where multiple threads are accessing shared data, the
threads which access that data need to cooperate with each other to maintain
consistency. Locks provide a simple mechanism for doing this.
.ownership: A lock is an object which may be "owned" by a single thread at a
time. By claiming ownership of a lock before executing some piece of code a
thread can guarantee that no other thread owns the lock during execution of
that code. If some other thread holds a claim on a lock, the thread trying to
claim the lock will suspend until the lock is released by the owning thread.
.data: A simple way of using this behaviour is to associate a lock with a
shared data structure. By claiming that lock around accesses to the data, a
consistent view of the structure can be seen by the accessing thread. More
generally, any set of operations which are required to be mutually exclusive may
be performed so by using locks.
OVERVIEW
.adt: There is an ADT "Lock" which points to a locking structure "LockStruct".
This structure is opaque to any client, although an interface is provided to
supply the size of the structure for any client wishing to make a new lock.
The lock is not allocated by the module as allocation itself may require
locking. LockStruct is implementation specific.
.simple-lock: There are facilities for claiming and releasing locks. "Lock" is
used for both binary and recursive locking.
.global-locks: "Global" locks are so called because they are used to protect
data in a global location (such as a global variable). The lock module provides
two global locks: one recursive and one binary. There are facilities for
claiming and releasing both of these locks. These global locks have the
advantage that they need not be allocated or atomically initialized by the
client, so they may be used for locking the initialization of the allocator
itself. The binary global lock is intended to protect mutable data, possibly
in conjunction with other local locking strategies. The recursive global lock
is intended to protect static read-only data during one-off initialization.
See design.mps.thread-safety.
.deadlock: This module does not provide any deadlock protection. Clients are
responsible for avoiding deadlock by using traditional strategies such as
ordering of locks. (See design.mps.thread-safety.deadlock.)
.single-thread: In the single-threaded configuration, locks are not needed, and
the claim/release interfaces are defined to be no-ops.
DETAILED DESIGN
.interface: The interface comprises the following functions:
- LockSize
Return the size of a LockStruct for allocation purposes.
- LockInit / LockFinish
After initialisation the lock is not owned by any thread. This must also be
the case before finalisation.
[ref convention?]
- LockClaim / LockRelease
LockClaim: claims ownership of a lock that was previously not held by current
thread.
LockRelease: releases ownership of a lock that is currently owned.
- LockClaimRecursive / LockReleaseRecursive
LockClaimRecursive: remembers the previous state of the lock with respect to
the current thread and claims the lock (if not already held).
LockReleaseRecursive: restores the previous state of the lock stored by
corresponding LockClaimRecursive call.
- LockClaimGlobal / LockReleaseGlobal
LockClaimGlobal: claims ownership of the binary global lock which was
previously not held by current thread.
LockReleaseGlobal: releases ownership of the binary global lock that is
currently owned.
- LockClaimGlobalRecursive / LockReleaseGlobalRecursive
LockClaimGlobalRecursive: remembers the previous state of the recursive
global lock with respect to the current thread and claims the lock (if not
already held).
LockReleaseGlobalRecursive: restores the previous state of the recursive
global lock stored by corresponding LockClaimGlobalRecursive call.
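As a minimal sketch of how a client of this module uses the interface above (the
lock functions are those listed; the ControlAlloc call and the table being
protected are assumptions made for the example):
#include "mpm.h"
static Lock tableLock;          /* protects some shared table (illustrative) */
static Res TableLockCreate(Arena arena)
{
  void *p;
  Res res;
  /* The module does not allocate the lock; the client provides storage */
  /* of size LockSize() (see .adt). */
  res = ControlAlloc(&p, arena, LockSize());   /* assumed allocation call */
  if (res != ResOK)
    return res;
  tableLock = (Lock)p;
  LockInit(tableLock);          /* after init, no thread owns the lock */
  return ResOK;
}
static void TableUpdate(void)
{
  LockClaim(tableLock);         /* binary claim: must not already be owned */
  /* ... manipulate the shared table ... */
  LockRelease(tableLock);       /* must currently be owned by this thread */
}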
.impl.recursive: For recursive claims, the list of previous states can be
simply implemented by keeping a count of the number of claims made by the
current thread so far. In the multi-threaded implementations below this is handled
by the operating system. A count is still kept and used to check correctness.
.impl.global: The binary and recursive global locks may actually be implemented
using the same mechanism as normal locks.
.impl.ansi: Single-Threaded Generic Implementation
- single-thread
- no need for locking
- locking structure contains count
- provides checking in debug version
- otherwise does nothing except keep count of claims
.impl.win32: Win32 Implementation
- supports Win32's threads
- uses Critical Sections [ref?]
- locking structure contains a Critical Section
- both recursive and non-recursive calls use the same Windows function
- also performs checking
.impl.linux: LinuxThreads Implementation (possibly suitable for all PThreads
implementations)
- supports LinuxThreads threads, which are an implementation of PThreads. (see
<URL:http://pauillac.inria.fr/~xleroy/linuxthreads/>)
- locking structure contains a mutex, initialized to check for recursive locking
- locking structure contains a count of the number of active claims
- non-recursive locking calls pthread_mutex_lock and expects success
- recursive locking calls pthread_mutex_lock and expects either success or
EDEADLK (indicating a recursive claim).
- also performs checking
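The following sketch shows how the scheme just described might look. It is
illustrative only, not the MPS source: the LockStruct layout and the use of an
error-checking mutex are assumptions consistent with the bullets above.
#include <assert.h>
#include <errno.h>
#include <pthread.h>
typedef struct LockStruct {
  pthread_mutex_t mut;          /* mutex that checks for recursive locking */
  unsigned long claims;         /* count of active claims by the owner */
} LockStruct, *Lock;
void LockInit(Lock lock)
{
  pthread_mutexattr_t attr;
  pthread_mutexattr_init(&attr);
  /* An error-checking mutex makes a recursive lock attempt return EDEADLK */
  /* instead of deadlocking. */
  pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
  pthread_mutex_init(&lock->mut, &attr);
  pthread_mutexattr_destroy(&attr);
  lock->claims = 0;
}
void LockClaim(Lock lock)
{
  int r = pthread_mutex_lock(&lock->mut);
  assert(r == 0);               /* non-recursive claim expects success */
  assert(lock->claims == 0);
  lock->claims = 1;
}
void LockClaimRecursive(Lock lock)
{
  int r = pthread_mutex_lock(&lock->mut);
  assert(r == 0 || r == EDEADLK);   /* EDEADLK indicates a recursive claim */
  ++lock->claims;
}
void LockReleaseRecursive(Lock lock)
{
  assert(lock->claims > 0);
  --lock->claims;
  if (lock->claims == 0)
    pthread_mutex_unlock(&lock->mut);
}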
435
mps/design/locus/index.txt Normal file
View file
@ -0,0 +1,435 @@
THE DESIGN FOR THE LOCUS MANAGER
design.mps.locus
incomplete design
gavinm 1998-02-27
INTRODUCTION
.intro: The locus manager coordinates between the pools and takes the burden of
having to be clever about tract/group placement away from the pools, preserving
trace differentiability and contiguity where appropriate.
.source: mail.gavinm.1998-02-05.17-52(0), mail.ptw.1998-02-05.19-53(0),
mail.pekka.1998-02-09.13-58(0), and mail.gavinm.1998-02-09.14-05(0).
Document History
.hist.0: Originally written as part of change.dylan.box-turtle.170569. Much
developed since. gavinm 1998-02-27
.hist.1: Pekka wrote the real requirements after some discussion. pekka
1998-10-28
.hist.2: Pekka deleted Gavin's design and wrote a new one. pekka 1998-12-15
DEFINITIONS
.note.cohort: We use the word "cohort" in its usual sense here, but we're
particularly interested in cohorts that have properties relevant to tract
placement. It is such cohorts that the pools will try to organize using the
services of the locus manager. Typical properties would be trace
differentiability or (en masse) death-time
predictability. Typical cohorts would be instances of a non-generational pool,
or generations of a collection strategy.
.def.trace.differentiability: Objects (and hence tracts) that are collected,
may or may not have "trace differentiability" from each other, depending on
their placement in the different zones. Objects (or pointers to them) can also
have trace differentiability (or not) from non-pointers in ambiguous
references; in practice, we will be worried about low integers, that may appear
to be in zones 0 or -1.
REQUIREMENTS
.req.cohort: Tract allocations must specify the cohort they allocate in. These
kind of cohorts will be called loci, and they will have such attributes as are
implied by the other requirements. Critical.
.req.counter.objects: As a counter-requirement, pools are expected to manage
objects. Objects the size of a tract allocation request (segment-sized) are
exceptional. Critical. .req.counter.objects.just: This means the locus
manager is not meant to solve the problems of allocating large objects, and it
isn't required to know what goes on in pools.
.req.contiguity: Must support a high level of contiguity within cohorts when
requested. This means minimizing the number of times a cohort is made aware of
discontiguity. Essential (as we've effectively renegotiated this in SW, down
to a vague hope that certain critical cohorts are not too badly fragmented).
.req.contiguity.just: TSBA.
.req.contiguity.specific: It should be possible to request another allocation
next to a specific tract on either side (or an extension in that direction, as
the case may be). Such a request can fail, if there's no space there. Nice.
[IWBN to have one for "next to the largest free block".]
.req.differentiable: Must support the trace differentiability of segments that
may be condemned separately. Due to the limited number of zones, it must be
possible to place several cohorts into the same zone. Essential.
.req.differentiable.integer: It must be possible to place collectable
allocations so that they are trace-differentiable from small integers.
Essential.
.req.disjoint: Must support the disjointness of pages that have different VM
properties (such as mutable/immutable, read-only/read-write, and different
lifetimes). Optional. [I expect the implementation will simply work at page
or larger granularity, so the problem will not arise, but Tucker insisted on
stating this as a requirement. pekka 1998-10-28]
.req.low-memory: The architecture of the locus manager must not prevent the
design of efficient applications that often use all available memory.
Critical. .req.low-memory.expl: This basically says it must be designed to
perform well in low-memory conditions, but that there can be configurations
where it doesn't do as well, as long as this is documented for the application
programmer. Note that it doesn't say all applications are efficient, only that
if you manage to design an otherwise efficient application, the locus manager
will not sink it.
.req.address: Must conserve address space in VM arenas to a reasonable extent.
Critical.
.req.inter-pool: Must support the association of sets of tracts in different
pools into one cohort. Nice.
.req.ep-style: Must support the existing EP-style of allocation whereby
allocation is from one end of address space either upwards or downwards (or a
close approximation thereto with the same behavior). .req.ep-style.just: We
cannot risk disrupting a policy with well-known properties when this technology
is introduced.
.req.attributes: There should be a way to inform the locus manager about
various attributes of cohorts that might be useful for placement: deathtime,
expected total size, [more in the future]. Optional. [It's a given that the
cohorts must then have these attributes, within the limits set in the contract
of the appropriate interface.] .req.attributes.action: The locus manager
should use the attributes to guide its placement decisions. Nice.
.req.blacklisting: There should be a way of maintaining at least one blacklist
for pages (or some other small unit), that can not/should not be allocated to
collectable pools. [How to do blacklist breaking for ambiguous refs?]
Optional.
.req.hysteresis: There should be a way to indicate which cohorts fluctuate in
size and by how much, to guide the arena hysteresis to hold on to suitable
pages. Optional.
ANALYSIS
.anal.sw: Almost any placement policy would be an improvement on the current SW
one.
.anal.cause-and-effect: The locus manager doesn't usually need to know _why_
things need to be differentiable, disjoint, contiguous, etc. Abstracting the
reason away from the interface makes it more generic, more likely to have
serendipitous new uses. Attributes described by a quantity (deathtime, size,
etc.) are an exception to this, because we can't devise a common measure.
.anal.stable: The strategy must be stable: it must avoid repeated
recomputation, especially the kind that switches between alternatives with a
short period (repeated "bites" out of the same region or flip-flopping between two
regions).
.anal.fragmentation: There's some call to avoid fragmentation in cohorts that
don't need strict contiguity, but this is not a separate requirement, since
fragmentation is a global condition, and can only be ameliorated if there's a
global strategy that clumps allocations together. .anal.deathtime: Cohorts
with good death-time clumping of their objects could use some locality of tract
allocation, because it increases the chances of creating large holes in the
address space (for other allocation to use). OTOH, many cohorts will not do
multiple frees in short succession, or at least cannot reasonably be predicted
to do so. This locality is not contiguity, nor is it low fragmentation, it's
just the requirement to place the new tracts next to the tract where the last
object was allocated in the cohort. Note that the placement of objects is
under the control of the pool, and the locus manager will not know it,
therefore this requirement should be pursued by requesting allocation next to a
particular tract (which we already have a requirement for).
.anal.asymmetrical: The strategy has to be asymmetrical with respect to cohorts
growing and shrinking. The reason for this asymmetry is that it can choose
where to grow, but it cannot choose where to shrink (except in a small way by
growing with good locality).
INTERFACE
.interface.locus: A cohort will typically reside on multiple tracts (and the
pools will avoid putting objects of other cohorts on them), so there should be
an interface to describe the properties of the cohort, and associate each
allocation request with the cohort. We shall call such an object, created to
represent a cohort, a locus (pl. loci).
.interface.locus.pool: Loci will usually be created by the pool that uses it.
Some of the locus attributes will be inherited from client-specified pool
attributes [this means there will be additional pool attributes].
.interface.detail: This describes interface in overview; for details, see
implementation section and code, or user doc.
Loci
.function.create: A function to create a locus:
Res LocusCreate(Locus *locusReturn, LocusAttrs attrs, ZoneGroup zg,
LocusAllocDesc adesc)
where adesc contains the information about the allocation sequences in the
locus, zg is used for zone differentiability, and attrs encodes the following:
.locus.contiguity: A locus can be contiguous, which means performing as
required in .req.contiguity; non-contiguous allocations can be freely placed
anywhere (but efficiency dictates that similar allocations are placed close
together and apart from others).
.locus.blacklist: Allocations in the locus will avoid blacklisted pages (for
collectable segments).
.locus.zero: Allocations in the locus are zero-filled.
[Other attributes will be added, I'm sure.]
.interface.zone-group: The locus can be made a member of a zone group. Passing
ZoneGroupNONE means it's not a member of any group (allocations will be placed
without regard to zone, except to keep them out of stripes likely to be needed
for some group). [I propose no mechanism for managing zone groups at this
time, since it's only used internally for one purpose. pekka 2000-01-17]
.interface.size: An allocation descriptor (LocusAllocDesc) contains various
descriptions of how the locus will develop over time (inconsistent
specifications are forbidden, of course):
.interface.size.typical-alloc: Size of a typical allocation in this locus, in
bytes. This will mainly affect the grouping of non-contiguous loci.
.interface.size.large-alloc: Typical large allocation that the manager should
try to allow for (this allows some relief from .req.counter.objects), in
bytes. This will mainly affect the size of gaps that will be allotted
adjoining this locus.
.interface.size.direction: Direction of growth: up/down/none. Only useful if
the locus is contiguous.
.interface.size.lifetime: Some measure of the lifetime of tracts (not
objects) in the cohort. [Don't know the details yet, probably only useful for
placing similar cohorts next to each other, so the details don't actually
matter. pekka 2000-01-17]
.interface.size.deathtime: Some measure of the deathtime of tracts (not
objects) in the cohort. [Ditto. pekka 2000-01-17]
.function.init: LocusInit is like LocusCreate, but without the allocation.
This is the usual interface, since most loci are embedded in a pool or something.
.function.alloc: ArenaAlloc to take a locus arg. ArenaAllocHere is like it,
plus it takes a tract and a specification to place the new allocation
immediately above/below a given tract; if that is not possible, it returns
ResFAIL (this will make it useful for realloc functionality).
.function.set-total: A function to tell the arena the expected number of
(non-miscible client) loci, and of zone groups:
ArenaSetTotalLoci(Arena arena, Size nLoci, Size nZoneGroups)
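As a sketch of how a pool might use these functions (all of this is
illustrative: the attribute constant, the descriptor field names, and the exact
ArenaAlloc signature are assumptions, since only the outline of the interface is
specified above):
/* Sketch only: LocusAttrCONTIGUOUS, LocusDirectionUP and the fields of */
/* LocusAllocDescStruct are invented names for the attributes described */
/* in .interface.size.*; LocusAllocDesc is assumed to be a pointer type. */
static Res myPoolLocusSetup(Arena arena, Locus *locusReturn, Addr *baseReturn)
{
  LocusAllocDescStruct adescStruct;
  Locus locus;
  Res res;
  adescStruct.typicalSize = ArenaAlign(arena);      /* typical tract request */
  adescStruct.largeSize = (Size)64 * 1024;          /* large requests to allow for */
  adescStruct.direction = LocusDirectionUP;         /* grows upwards */
  res = LocusCreate(&locus, LocusAttrCONTIGUOUS, ZoneGroupNONE, &adescStruct);
  if (res != ResOK)
    return res;
  /* Subsequent tract allocations name the locus (.function.alloc). */
  res = ArenaAlloc(baseReturn, arena, 16 * ArenaAlign(arena), locus);
  if (res != ResOK)
    return res;
  *locusReturn = locus;
  return ResOK;
}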
Peaks
.function.peak.create: A function to create a peak:
mps_res_t mps_peak_create(mps_peak_t*, mps_arena_t)
A newly-created peak is open, and will not be used to guide the strategy of the
locus manager.
.function.peak.add: A function to add a description of the state of one pool
into the peak:
mps_res_t mps_peak_describe_pool(mps_peak_t, mps_pool_t, mps_size_desc_t)
Calling this function again for the same peak and pool instance will replace
the earlier description. .function.peak.add.size: The size descriptor contains
a total size in bytes or % of arena size [@@@@is this right?].
.function.peak.add.remove: Specifying a NULL size will remove the pool from the
peak. The client is not allowed to destroy a pool that is mentioned in any
peak; it must be first removed from the peak, or the peak must be destroyed.
This is to ensure that the client adjusts the peaks in a manner that makes
sense to the application; the locus manager can't know how to do that.
.function.peak.close: A function to indicate that all the significant pools
have been added to the peak, and it can now be used to guide the locus manager:
mps_res_t mps_peak_close(mps_peak_t)
For any pool not described in the peak, the locus manager will take its current
size at any given moment as the best prediction of its size at the peak.
.function.peak.close.after: It is legal to add more descriptions to the peak
after closing, but this will reopen the peak, and it will have to be closed
before the locus manager will use it again. The locus manager uses the
previous closed state of the peak, while this is going on.
.function.peak.destroy: A function to destroy a peak:
void mps_peak_destroy(mps_peak_t)
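To illustrate the intended protocol, a client might describe a peak roughly as
follows (a sketch against the proposed interface above; the pool handles and
size descriptors are parameters assumed for the example):
/* Sketch of describing an expected peak to the locus manager. */
static mps_res_t describe_peak(mps_arena_t arena,
                               mps_pool_t string_pool, mps_size_desc_t string_size,
                               mps_pool_t table_pool, mps_size_desc_t table_size)
{
  mps_peak_t peak;
  mps_res_t res;
  res = mps_peak_create(&peak, arena);          /* a new peak starts out open */
  if (res != MPS_RES_OK)
    return res;
  /* Describe the pools expected to matter at the peak; pools that are not */
  /* described are taken at their current size (.function.peak.close). */
  res = mps_peak_describe_pool(peak, string_pool, string_size);
  if (res == MPS_RES_OK)
    res = mps_peak_describe_pool(peak, table_pool, table_size);
  /* Closing the peak allows the locus manager to use it for placement. */
  if (res == MPS_RES_OK)
    res = mps_peak_close(peak);
  if (res != MPS_RES_OK)
    mps_peak_destroy(peak);
  return res;
}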
.interface.ep-style: This satisfies .req.ep-style by allowing SW to specify
zero size for most pools (which will cause them to be placed next to other loci
with the same growth direction). [Not sure this is good enough, but we'll try
it first. pekka 2000-01-17]
ARCHITECTURE
Data Objects
.arch.locus: To represent the cohorts, we have locus objects. Usually a locus
is embedded in a pool instance, but generations are separate loci.
.arch.locus.attr: contiguity, blacklist, zg, current region, @@@@
.arch.locus.attr.exceptional: The client can define a typical large allocation
for the locus. Requests substantially larger than that are deemed exceptional.
.arch.zone-group: To satisfy .req.condemn, we offer zone groups. Each locus
can be a member of a zone group, and the locus manager will attempt to place
allocations in this locus in different zones from all the other zone groups. A
zone-group is represented as @@@@.
.arch.page-table: A page table is maintained by the arena, as usual to track
association between tracts, pools and segments, and mapping status for VM
arenas.
.arch.region: All of the address space is divided into disjoint regions,
represented by region objects. These objects store their current limits, and
high and low watermarks of currently allocated tracts (we hope there's usually
a gap of empty space between regions). The limits are actually quite porous
and flexible.
.arch.region.assoc: Each region is associated with one contiguous locus or any
number of non-contiguous loci (or none). We call the first kind of region
"contiguous". .arch.locus.assoc: Each locus remembers all regions where it has
tracts currently, excepting the badly-placed allocations (see below). It is
not our intention that any locus would have very many, or that loci that share
regions would have any reason to stop doing so.
.arch.region.more: Various quantities used by the placement computation are
also stored in the regions and the loci. Regions are created (and destroyed)
by the placement recomputation. Regions are located in stripes (if it's a
zoned region), but they can extend into neighboring stripes if an exceptionally
large tract allocation is requested (to allow for large objects).
.arch.chunk: Arenas may allocate more address space in additional chunks, which
may be disjoint from the existing chunks. Inter-chunk space will be
represented by dummy regions. There are also sentinel regions at both ends of
the address space.
Overview of Strategy
.arch.strategy.delay: The general strategy is to delay placement decisions
until they have to be made, but no later.
.arch.strategy.delay.until: Hence, the locus manager only makes placement
decisions when an allocation is requested (frees and other operations might set
a flag to cause the next allocation to redecide). This also allows the client
to change the peak and pool configuration in complicated ways without causing a
lot of recomputation, by doing all the changes without allocating in the middle
(unless the control pool needs more space because of the changes).
.arch.strategy.normal: While we want the placement to be sophisticated, we do
not believe it is worth the effort to consider all the data at each
allocation. Hence, allocations are usually just placed in one of the regions
used previously (see .arch.alloc) without reconsidering the issues.
.arch.strategy.normal.limit: However, the manager sets precautionary limits on
the regions to ensure that the placement decisions are revisited when an
irrevocable placement is about to be made.
.arch.strategy.create: The manager doesn't create new regions until they are
needed for allocation (but it might compute where they could be placed to
accommodate a peak).
Allocation
.arch.alloc: Normally, each allocation to a locus is placed in its current
region. New regions are only sought when necessary to fulfill an allocation
request or when there is reason to think the situation has changed
significantly (see .arch.significant).
.arch.alloc.same: An allocation is first attempted next to the previous
allocation in the same locus, respecting growth direction. If that is not
possible, a good place in the current region is sought. .arch.alloc.same.hole:
ATM, for finding a good place within a region, we just use the current
algorithm, limited to the region. In future, the placement within regions will
be more clever.
.arch.alloc.extend: If there's no adequate hole in the current region and the
request is not exceptional, the neighboring regions are examined to see if the
region could be extended at one border. (This will basically only be done if
the neighbor has shrunk since the last placement recomputation, because the
limit was set on sophisticated criteria, and should not be changed without
justification.) .arch.alloc.extend.here: When an allocation is requested next
to a specific tract (ArenaAllocHere), we try to extend a little harder [at
least for change_size, perhaps not for locality].
.arch.alloc.other: If no way can be found to allocate in the current region,
other regions used for this locus are considered in the same way, to see if
space can be found there. [Or probably look at other regions before trying to
extend anything?]
.arch.alloc.recompute: When no region of this locus has enough space for the
request, or when otherwise required, region placement is recomputed to find a
new region for the request (which might be the same region, after extension).
.arch.alloc.current: This region where the allocation was placed then becomes
the current region for this locus, except when the request was exceptional, or
when the region chosen was "bad" (see @@@@).
.arch.significant: Significant changes to the parameters affecting placement
are deemed to have happened at certain client calls and when the total
allocation has changed substantially since the last recomputation. Such
conditions set a flag that causes the next allocation to recompute even if its
current region is not full [possibly second-guess the decision to recompute
after some investigation of the current state?].
Deallocation
.arch.free: Deallocation simply updates the counters in the region and the
locus. For some loci, it will make the region of the deallocation the current
region.
.arch.free.remove: If a region becomes entirely empty, it is deleted (and the
neighbors' limits might be adjusted [quite tricky to get right, this]).
Region Placement Recomputation
.arch.gap: When doing placement computations, we view the arena as a sequence
of alternating region cores and gaps (which can be small, even zero-sized).
Initially, we'll take the core of a region to be the area between the high and
low watermark, but in the future we might be more flexible about that. [Edge
determination is actually a worthwhile direction to explore.]
.arch.reach: The gap between two cores could potentially end up being allocated
to either region, if they grow in that direction, or one or neither, if they
don't. The set of states that the region assignment could reach by assigning
the gaps to their neighbors is called the reach of the current configuration.
.arch.placement.object: The object of the recomputation is to find a
configuration of regions that is not too far from the current configuration and
that keeps all the peaks inside its reach; if that is not possible, keep the
nearest ones in the reach and then minimize the total distance from the rest.
.arch.placement.hypothetical: The configurations that are considered will
include hypothetical placements for new regions for loci that cannot fit in
their existing regions at the peak. This is necessary to avoid choosing a bad
alternative.
.arch.placement.interesting: The computation will only consider new regions of
loci that are deemed interesting, i.e., far from their peak state. This will
reduce the computational burden and avoid jittering near a peak.
[details missing]
IMPLEMENTATION
[missing]
NOTES
.idea.change: Even after the first segment, be prepared to change your mind, if
by the second segment a lot of new loci have been created.
.distance: If the current state is far from a peak, there's time to reassign
regions and for free space to appear (in fact, under the steady arena
assumption, enough free space _will_ appear).
.clear-pool: Need to have a function to deallocate all objects in a pool, so
that PoolDestroy won't have to be used for that purpose.
View file
@ -0,0 +1,363 @@
MPS TO CLIENT MESSAGE PROTOCOL
design.mps.message
incomplete doc
drj 1997-02-13
INTRODUCTION
.readership: Any MPS developer.
.intro: The MCMP provides a means by which clients can receive messages from
the MPS asynchronously. Typical messages may be low memory notification (or in
general low utility), finalization notification, soft-failure notification.
There is a general assumption that it should not be disastrous for the MPS
client to ignore messages, but that it is probably in the client's best interest
not to ignore messages. The justification for this is that the MPS cannot
force the MPS client to read and act on messages, so no message should be
critical [bogus, since we cannot force clients to check error codes either -
Pekka 1997-09-17].
.contents: This document describes the design of the external and internal
interfaces and concludes with a sketch of an example design of an internal
client. The example is that of implementing finalization using PoolMRG.
REQUIREMENTS
.req: The MPS/Client message protocol will be used for implementing
finalization (see design.mps.finalize and req.dylan.fun.final). It will also
be used for implementing the notification of various conditions (possibly
req.dylan.prot.consult is relevant here).
INTERFACE
External Interface
.if.queue:
Messages are presented as a single queue per arena. Various functions are
provided to inspect the queue and inspect messages in it (see below).
Functions
.if.fun:
The following functions are provided:
.if.fun.poll: Poll. Sees whether there are any messages pending.
mps_bool_t mps_message_poll(mps_arena_t arena);
Returns 1 only if there is a message on the queue of arena. Returns 0
otherwise.
.if.fun.enable: Enable. Enables the flow of messages of a certain type.
void mps_message_type_enable(mps_arena_t arena, mps_message_type_t type);
Enables the specified message type. The queue of messages of an arena will
contain only messages whose types have been enabled. Initially all message
types are disabled. Effectively this function allows the client to declare to
the MPS what message types the client understands. The MPS does not generate
any messages of a type that hasn't been enabled. This allows the MPS to add
new message types (in subsequent releases of a memory manager) without
confusing the client. The client will only be receiving the messages if they
have explicitly enabled them (and the client presumably only enables message
types when they have written the code to handle them).
.if.fun.disable: Disable. Disables the flow of messages of a certain type.
void mps_message_type_disable(mps_arena_t arena, mps_message_type_t type);
The antidote to mps_message_type_enable. Disables the specified message type.
Flushes any existing messages of that type on the queue, and stops any further
generation of messages of that type. This permits clients to dynamically
decline interest in a message type, which may help to avoid a memory leak or
bloated queue when the messages are only required temporarily.
.if.fun.get: begins a message "transaction".
mps_bool_t mps_message_get(mps_message_t *message_return, mps_arena_t arena,
mps_message_type_t type);
If there is a message of the specified type on the queue then the first such
message will be removed from the queue and a handle to it will be returned to
the client in *message_return; in this case the function will return TRUE.
Otherwise it will return FALSE. Having obtained a handle on a message in this
way, the client can use the type-specific accessors to find out about the
message. When the client is done with the message the client should call
mps_message_discard; failure to do so will result in a resource leak.
.if.fun.discard: ends a message "transaction".
void mps_message_discard(mps_arena_t arena, mps_message_t message);
Indicates to the MPS that the client is done with this message and its
resources may be reclaimed.
.if.fun.type.any: Determines the type of a message in the queue
mps_bool_t mps_message_queue_type(mps_message_type_t *type_return, mps_arena_t
arena);
Returns 1 only if there is a message on the queue of arena, and in this case
updates *type_return to be the type of a message in the queue. Otherwise
returns 0.
.if.fun.type: Determines the type of a message (that has already been got).
mps_message_type_t mps_message_type(mps_arena_t arena, mps_message_t message)
Return the type of the message. Only legal when inside a message transaction
(i.e. after mps_message_get and before mps_message_discard). Note that the
type will be the same as the type that the client passed in the call to
mps_message_get.
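A minimal sketch of a client using these functions follows. It assumes that the
finalization message type is obtained via a function
mps_message_type_finalization() and that the accessor
mps_message_finalization_ref has the signature shown; both are assumptions based
on the type descriptions below rather than a fixed part of this design.
#include "mps.h"
/* Called once, early on: declare which message types this client */
/* understands (.if.fun.enable). */
static void messages_init(mps_arena_t arena)
{
  mps_message_type_enable(arena, mps_message_type_finalization());
}
/* Called periodically by the client: drain pending finalization messages. */
static void messages_poll(mps_arena_t arena)
{
  mps_message_type_t final_type = mps_message_type_finalization();
  mps_message_t message;
  while (mps_message_get(&message, arena, final_type)) {
    mps_addr_t ref;
    /* The reference must be kept in scanned memory while it is used */
    /* (.type.finalization.ref.scan). */
    mps_message_finalization_ref(&ref, arena, message);
    /* ... application-specific clean-up for the object at ref ... */
    mps_message_discard(arena, message);    /* end the message transaction */
  }
}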
Types of messages
.type: The type governs the "shape" and meaning of the message.
.type.int: Types themselves will just be a scalar quantity, an integer.
.type.semantics: A type indicates the semantics of the message.
.type.semantics.interpret: The semantics of a message are interpreted by the
client by calling various accessor methods on the message. .type.accessor: The
type of a message governs which accessor methods are legal to apply to the
message.
.type.example: Some example types:
.type.finalization: There will be a finalization type. The type is abstractly:
FinalizationMessage(Ref).
.type.finalization.semantics: A finalization message indicates that an object
has been discovered to be finalizable (see design.mps.poolmrg.def.final.object
for a definition of finalizable). .type.finalization.ref: There is an accessor
to get the reference of the finalization message (i.e. a reference to the
object which is finalizable) called mps_message_finalization_ref.
.type.finalization.ref.scan: Note that the reference returned should be stored
in scanned memory.
.compatibility:
Compatibility issues
.compatibility.future.type-new: Notice that messages of a type that the client
doesn't understand are not placed on the queue; therefore the MPS can introduce
new types of message and existing clients will still function and will not leak
resources. This has been achieved by getting the client to declare the types
that the client understands (with mps_message_type_enable, .if.fun.enable).
.compatibility.future.type-extend: The information available in a message of a
given type can be extended by providing more accessor methods. Old clients
won't get any of this information but that's okay.
Internal Interface
.message.instance: Messages are instances of Message Classes.
.message.concrete: Concretely a Message is represented by a MessageStruct. A
MessageStruct has the usual signature field (see design.mps.sig). A
MessageStruct has a type field which defines its type, a ring node, which is
used to attach the message to the queue of pending messages, a class field,
which identifies a MessageClass object. .message.intent: The intention is that
a MessageStruct will be embedded in some richer object which contains
information relevant to that specific type of message.
.message.type:
typedef struct MessageStruct *Message;
.message.struct:
struct MessageStruct {
Sig sig;
MessageType type;
MessageClass class;
RingStruct node;
} MessageStruct;
.class: A message class is an encapsulation of methods. It encapsulates
methods that are applicable to all types of messages (generic) and methods that
are applicable to messages only of a certain type (type-specific).
.class.concrete: Concretely a message class is represented by a
MessageClassStruct (a struct). Clients of the Message module are expected to
allocate storage for and initialise the MessageClassStruct. It is expected
that such storage will be allocated and initialised statically.
.class.not-type: Note that message classes and message types are distinct.
.class.not-type.why: (see also mail.drj.1997-07-15.10-33(0) from which this is
derived) This allows two different implementations (ie classes) of messages
with the same meaning (ie type). This may be necessary because the (memory)
management of the messages may be different in the two implementations (which is
bogus). The case of having one class implement two types is not expected to be
so useful. .class.not-type.why.not: It's all pretty feeble justification
anyway.
.class.methods.generic: The generic methods are:
delete - used when the message is destroyed (by the client calling
mps_message_discard). The class implementation should finish the message (by
calling MessageFinish) and storage for the message should be reclaimed (if
applicable).
.class.methods.specific:
The type specific methods are:
.class.methods.specific.finalization:
Specific to MessageTypeFinalization
finalizationRef - returns a reference to the finalizable object represented by
this message.
.class.methods.specific.collectionstats:
Specific to MessageTypeCollectionStats
collectionStatsLiveSize - returns the number of bytes (of objects) that were
condemned but survived.
collectionStatsCondemnedSize - returns the number of bytes condemned in the
collection.
collectionStatsNotCondemnedSize - returns the number of bytes (of objects)
that are subject to a GC policy (ie collectable) but were not condemned in the
collection.
.class.type:
typedef struct MessageClassStruct *MessageClass;
.class.sig.double: The MessageClassStruct has a signature field at both ends.
This is so that if the MessageClassStruct changes size (by adding extra methods
for example) then any static initializers will generate errors from the
compiler (there will be a type error caused by initialising a non-sig type
field with a sig) unless the static initializers are changed as well.
.class.struct:
typedef struct MessageClassStruct {
Sig sig; /* design.mps.sig */
const char *name; /* Human readable Class name */
/* generic methods */
MessageDeleteMethod delete; /* terminates a message */
/* methods specific to MessageTypeFinalization */
MessageFinalizationRefMethod finalizationRef;
/* methods specific to MessageTypeCollectionStats */
MessageCollectionStatsLiveSizeMethod collectionStatsLiveSize;
MessageCollectionStatsCondemnedSizeMethod collectionStatsCondemnedSize;
MessageCollectionStatsNotCondemnedSizeMethod collectionStatsNotCondemnedSize;
Sig endSig; /* design.mps.message.class.sig.double */
} MessageClassStruct;
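For example, the class for a finalization message type might be initialized
statically along the following lines (a sketch: MessageClassSig, mrgDelete and
mrgFinalizationRef are assumed names, not actual MPS source):
/* The signature appears first and last (.class.sig.double), so any change */
/* to the number of fields breaks this initializer at compile time. */
static MessageClassStruct MRGMessageClassStruct = {
  MessageClassSig,              /* sig */
  "MRGFinalization",            /* name */
  mrgDelete,                    /* delete */
  mrgFinalizationRef,           /* finalizationRef */
  NULL,                         /* collectionStatsLiveSize: not this type */
  NULL,                         /* collectionStatsCondemnedSize */
  NULL,                         /* collectionStatsNotCondemnedSize */
  MessageClassSig               /* endSig */
};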
.space.queue: The arena structure is augmented with a structure for managing
the queue of pending messages. This is a ring in the ArenaStruct.
struct ArenaStruct
{
...
RingStruct messageRing;
...
}
Functions
.fun.init:
/* Initializes a message */
void MessageInit(Arena arena, Message message, MessageClass class);
Initializes the MessageStruct pointed to by message. The caller of this
function is expected to manage the store for the MessageStruct.
.fun.finish:
/* Finishes a message */
void MessageFinish(Message message);
Finishes the MessageStruct pointed to by message. The caller of this function
is expected to manage the store for the MessageStruct.
.fun.post:
/* Places a message on the client accessible queue */
void MessagePost(Arena arena, Message message);
This function places a message on the queue of an arena.
.fun.post.precondition: Prior to calling the function the node field of the
message must be a
singleton. After the call to the function the message will be available for
MPS client to access. After the call to the function the message fields must
not be manipulated except from the message's class's method functions (i.e.,
you mustn't poke about with the node field in particular).
.fun.empty:
void MessageEmpty(Arena arena);
Empties the message queue. This function has the same effect as discarding all
the messages on the queue. After calling this function there will be no
messages on the queue. .fun.empty.internal-only: This functionality is not
exposed to clients. We might want to expose this functionality to our clients
in the future.
Message Life Cycle
.life: A message will be allocated by a client of the message module, it will
be initialised by calling MessageInit. The client will eventually post the
message on the external queue (in fact most clients will create a message and
then immediately post it). The message module may then apply any of the
methods to the message. The message module will eventually destroy the message
by applying the Delete method to it.
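A sketch of this life cycle from the point of view of an internal client (the
allocation call and the message class are assumptions for the example):
static Res NotifySomething(Arena arena)
{
  void *p;
  Message message;
  Res res;
  /* The client of the message module manages the store (.fun.init). */
  res = ControlAlloc(&p, arena, sizeof(MessageStruct));   /* assumed call */
  if (res != ResOK)
    return res;
  message = (Message)p;
  MessageInit(arena, message, &SomethingMessageClassStruct);  /* assumed class */
  MessagePost(arena, message);      /* now visible on the external queue */
  /* From here on the message belongs to the queue; it is eventually */
  /* destroyed via its class's delete method (.class.methods.generic). */
  return ResOK;
}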
EXAMPLES
Finalization
[possibly out of date, see design.mps.finalize and design.mps.poolmrg instead
-- drj 1997-08-28]
This subsection is a sketch of how PoolMRG will use Messages for finalization
(see design.mps.poolmrg).
PoolMRG has guardians (see design.mps.poolmrg.guardian); guardians are used to
manage final references and detect when an object is finalizable.
The link part of a guardian will be expanded to include a MessageStruct; in
fact the link part of a guardian will be expanded so that it is exactly a
MessageStruct (or rather a structure with a single member that has the type
MessageStruct).
The MessageStruct is allocated when the final reference is created (which is
when the referred to object is registered for finalization). This avoids
allocating at the time when the message gets posted (which might be a tricky,
undesirable, or impossible, time to allocate).
The two queues of PoolMRG (the entry queue, and the exit queue) will use the
MessageStruct ring node. Before the object (referred to by the guardian) is
finalizable the MessageStruct is not needed by the Message system (there is no
message to send yet!), so it is okay to use the Message's ring node to attach
the guardian to the entry queue (see
design.mps.poolmrg.guardian.two-part.justify). The exit queue of MRG will
simply be the external message queue.
MRG Message class
del - frees both the link part and the reference part of the guardian.
44
mps/design/pool/index.txt Normal file
View file
@ -0,0 +1,44 @@
THE DESIGN OF THE POOL AND POOL CLASS MECHANISMS
design.mps.pool
incomplete doc
richard 1996-07-31
- This document must derive the requirements for pool.c etc. from the
architecture.
.def.outer-structure: The "outer structure" (of a pool) is a C object of type
PoolXXXStruct or the type struct PoolXXXStruct itself.
.def.generic-structure: The "generic structure" is a C object of type
PoolStruct (found embedded in the outer-structure) or the type struct
PoolStruct itself.
.align: When initialised, the pool gets the default alignment (ARCH_ALIGN).
.no: If a pool class doesn't implement a method, and doesn't expect it to be
called, it should use a non-method (PoolNo*) which will cause an assertion
failure if it is reached.
.triv: If a pool class supports a protocol but does not require any more than a
trivial implementation, it should use a trivial method (PoolTriv*) which will
do the trivial thing.
.outer-structure.sig: It is good practice to put the signature for the outer
structure at the end (of the structure). This is because there's already one
at the beginning (in the poolStruct) so putting it at the end gives some extra
fencepost checking.
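As a sketch of the convention (the class "XXX" and its fields are purely
illustrative):
typedef struct PoolXXXStruct {
  PoolStruct poolStruct;        /* generic structure, carries its own sig */
  Format format;                /* class-specific fields ... */
  Size extendBy;
  Sig sig;                      /* outer structure's signature, at the end, */
                                /* giving extra fencepost checking */
} PoolXXXStruct;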
REQUIREMENTS
[Placeholder]
.req.fix: PoolFix must be fast.
OTHER
Interface in mpm.h
Types in mpmst.h
See also design.mps.poolclass
View file
@ -0,0 +1,446 @@
THE DESIGN OF THE AUTOMATIC MOSTLY-COPYING MEMORY POOL CLASS
design.mps.poolamc
incomplete design
richard 1995-08-25
INTRODUCTION
.intro: This is the design of the AMC Pool Class. AMC stands for Automatic
Mostly-Copying. This design is highly fragmentary and some of it may even be
sufficiently old to be misleading.
.readership: The intended readership is any MPS developer.
OVERVIEW
.overview: This class is intended to be the main pool class used by Harlequin
Dylan. It provides garbage collection of objects (hence "automatic"). It uses
generational copying algorithms, but with some facility for handling small
numbers of ambiguous references. Ambiguous references prevent the pool from
copying objects (hence "mostly copying"). It provides incremental collection.
[ lot of this design is awesomely old -- drj 1998-02-04]
DEFINITIONS
.def.grain: Grain. A quantity of memory which is both aligned to the pool's
alignment and equal to the pool's alignment in size, i.e. the smallest amount of
memory worth talking about.
DESIGN
Segments
.seg.class: AMC allocates segments of class AMCSegClass, which is a subclass of
GCSegClass. Instances contain a segTypeP field, which is of type int*.
.seg.gen: AMC organizes the segments it manages into generations. .seg.gen.map:
Every segment is in exactly one generation. .seg.gen.ind: The segment's segTypeP
field indicates which generation the segment is in (an AMCGenStruct, see
below). .seg.typep: The segTypeP field actually points to either the type
field of a generation or to the type field of a nail board.
.seg.typep.distinguish: The type field (which can be accessed in either case)
determines whether the segTypeP field is pointing to a generation or to a nail
board. .seg.gen.get: The map from segment to generation is implemented by
AMCSegGen which deals with all this.
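The following sketch shows the idea (field names, tag values, and the helper are
illustrative only, not the actual amc.c source): because the type field sits at
the start of both structures, the pointer stored in segTypeP can be examined
before we know which structure it belongs to.
#include <stddef.h>
#define PARENT(type, field, p) \
  ((type *)((char *)(p) - offsetof(type, field)))   /* container-of macro */
enum { AMCPTypeGen, AMCPTypeNailBoard };             /* assumed tag values */
typedef struct AMCGenStruct {
  int type;                     /* == AMCPTypeGen */
  /* ... generation fields ... */
} AMCGenStruct, *AMCGen;
typedef struct AMCNailBoardStruct {
  int type;                     /* == AMCPTypeNailBoard */
  AMCGen gen;                   /* generation the nailed segment belongs to */
  /* ... mark bit table ... */
} AMCNailBoardStruct, *AMCNailBoard;
/* Given a segment's segTypeP pointer, recover its generation */
/* (cf. .seg.gen.get). */
static AMCGen GenOfSegTypeP(int *typeP)
{
  if (*typeP == AMCPTypeGen)
    return PARENT(AMCGenStruct, type, typeP);
  else
    return PARENT(AMCNailBoardStruct, type, typeP)->gen;
}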
Fixing and Nailing
[.fix.nail.* are placeholders for design rather than design really-- drj
1998-02-04]
.fix.nail:
.nailboard: AMC uses a nail board structure for recording ambiguous references
to segments. A nail board is a bit table with one bit per grain in the
segment. .nailboard.create: Nail boards are allocated dynamically whenever a
segment becomes newly ambiguously referenced. .nailboard.destroy: They are
deallocated during reclaim. Ambiguous fixes simply set the appropriate bit in
this table. This table is used by subsequent scans and reclaims in order to
work out what objects were marked.
.nailboard.emergency: During emergency tracing two things relating to nail
boards happen that don't normally: .nailboard.emergency.nonew: Nail boards
aren't allocated when we have new ambiguous references to segments
(.nailboard.emergency.nonew.justify: We could try and allocate a nail board,
but we're in emergency mode and so short of memory that it's unlikely to
succeed, and there would be additional code for yet another error path which
complicates things); .nailboard.emergency.exact: nail boards are used to record
exact references in order to avoid copying the objects.
.nailboard.hyper-conservative: Not creating new nail boards
(.nailboard.emergency.nonew above)
means that when we have a new reference to a segment during emergency tracing
then we nail the entire segment and preserve everything in place.
.fix.nail.states:
Partition the segment states into 4 sets:
white segment and not nailed (and has no nail board)
white segment and nailed and has no nail board
white segment and nailed and has nail board
the rest
.fix.nail.why: A segment is recorded as being nailed when either there is an
ambiguous reference to it, or there is an exact reference to it and the object
couldn't be copied off the segment (because there wasn't enough memory to
allocate the copy). In either of these cases reclaim cannot simply destroy the
segment (usually the segment will not be destroyed because it will have live
objects on it, though see .nailboard.limitations.middle below). If the segment
is nailed then we might be using a nail board to mark objects on the segment.
However, we cannot guarantee that being nailed implies a nail board, because we
might not be able to allocate the nail board. Hence all these states actually
occur in practice.
.fix.nail.distinguish: The nailed bits in the segment descriptor (SegStruct)
are used to record whether a segment is nailed or not. The segTypeP field of
the segment either points to (the "type" field of) an AMCGen or to an
AMCNailBoard, the type field can be used to determine which of these is the
case. (see .seg.typep above).
.nailboard.limitations.single: Just having a single nail board per segment
prevents traces from improving on the findings of each other: a later trace
could find that a nailed object is no longer nailed or even dead. Until the
nail board is discarded, that is. .nailboard.limitations.middle: An ambiguous
reference into the middle of an object will cause the segment to survive, even
if there are no surviving objects on it. .nailboard.limitations.reclaim:
AMCReclaimNailed could cover each block of reclaimed objects between two nailed
objects with a single padding object, speeding up further scans.
Emergency Tracing
.emergency.fix: AMCFixEmergency is at the core of AMC's emergency tracing
policy (unsurprisingly). AMCFixEmergency chooses exactly one of three options:
a) use the existing nail board structure to record the fix, b) preserve and
nail the segment in its entirety, c) snapout an exact (or high rank) pointer to
a broken heart to the broken heart's forwarding pointer. If the rank of the
reference is AMBIG then it either does a) or b) depending on whether there is an
existing nail board or not. Otherwise (the rank is exact or higher) if there
is a broken heart it is used to snapout the pointer. Otherwise it is as for an
AMBIG ref (we either do a) or b)).
.emergency.scan: This is basically as before, the only complication is that
when scanning a nailed segment we may need to do multiple passes, as
FixEmergency may introduce new marks into the nail board.
Buffers
.buffer.class: AMC uses buffers of class AMCBufClass (a subclass of SegBufClass).
.buffer.gen: Each buffer allocates into exactly one generation. .buffer.gen:
AMCBuf buffers contain a gen field which points to the generation that the
buffer allocates into. .buffer.fill.gen: AMCBufferFill uses the generation
(obtained from the gen field) to initialise the segment's segTypeP field which
is how segments get allocated in that generation.
.buffer.condemn: We condemn buffered segments, but not the contents of the
buffers themselves, because we can't reclaim uncommitted buffers (see
design.mps.buffer for details). If the segment has a forwarding buffer on it,
we detach it [why? @@@@ forwarding buffers are detached because they used to
cause objects on the same segment to not get condemned, hence caused retention
of garbage. Now that we condemn the non-buffered portion of buffered segments
this is probably unnecessary -- drj 1998-06-01 But it's probably more
efficient than keeping the buffer on the segment, because then the other stuff
gets nailed -- pekka 1998-07-10]. If the segment has a mutator buffer on it,
we nail the buffer. If the buffer cannot be nailed, we give up condemning,
since nailing the whole segment would make it survive anyway. The scan methods
skip over buffers and fix methods don't do anything to things that have already
been nailed, so the buffer is effectively black.
AMCStruct
.struct: AMCStruct is the pool class AMC instance structure. .struct.pool:
Like other pool class instances, it contains a PoolStruct containing the
generic pool fields.
.struct.format: The "format" field points to a Format structure describing the
object format of objects allocated in the pool. The field is initialized by
AMCInit from a parameter, and thereafter it is not changed until the pool is
destroyed. [actually the format field is in the generic PoolStruct these
days. drj 1998-09-21]
[lots more fields here]
Generations
.gen: Generations partition the segments that a pool manages (see .seg.gen.map
above). .gen.collect: Generations are more or less the units of condemnation
in AMC. They are also the granularity for forwarding (when copying objects during a
collection): all the objects which are copied out of a generation use the same
forwarding buffer for allocating the new copies, and a forwarding buffer
results in allocation in exactly one generation.
.gen.rep: Generations are represented using an AMCGenStruct structure.
.gen.create: All the generations are created when the pool is created (during
AMCInitComm).
.gen.manage.ring: An AMC's generations are kept on a ring attached to the
AMCStruct (the genRing field). .gen.manage.array: They are also kept in an
array which is allocated when the pool is created and attached to the AMCStruct
(the gens field holds the number of generations, the gen field points to an
array of AMCGen). [it seems to me that we could probably get rid of the ring
-- drj 1998-09-22]
.gen.number: There are AMCTopGen + 2 generations in total: "normal"
generations numbered from 0 to AMCTopGen inclusive, and an extra "ramp"
generation (see .gen.ramp below).
.gen.forward: Each generation has an associated forwarding buffer (stored in
the "forward" field of AMCGen). This is the buffer that is used to forward
objects out of this generation. When a generation is created in AMCGenCreate,
its forwarding buffer has a NULL p field, indicating that the forwarding buffer
has no generation to allocate in. The collector will assert out (in
AMCBufferFill where it checks that buffer->p is an AMCGen) if you try to
forward an object out of such a generation. .gen.forward.setup: All the
generations' forwarding buffers are associated with generations when the pool
is created (just after the generations are created in AMCInitComm).
Ramps
.ramp: Ramps usefully implement the begin/end mps_alloc_pattern_ramp interface.
.gen.ramp: To implement ramping (request.dylan.170423), AMC uses a special
"ramping mode", where promotions are redirected. .gen.ramp.before: While
ramping, objects promoted from a designated (AMCRampGenFollows) generation are
forwarded into a special "ramp generation", instead of their usual generation.
.gen.ramp.itself: The ramp generation is promoted into itself during ramping
mode; after this mode ends, it is promoted into the generation after
AMCRampGenFollows (even if ramping mode is immediately re-entered, but only
once in that case).
.ramp.mode: Ramping is controlled using the rampMode field of the pool. There
are five modes:
enum { outsideRamp, beginRamp, ramping, finishRamp, collectingRamp };
[These would perhaps be better if they all start Ramp* or AMCRamp* drj
1998-08-07]
.ramp.count: The pool just counts the number of ap's that have begun ramp mode
(and not ended). .ramp.begin: Basically, when the count goes up from zero, the
pool enters into beginRamp mode; however, that doesn't happen if it is already
in finishRamp mode, thereby ensuring at least one (decision to start a)
collection when leaving ramp mode even if a new ramp starts immediately (but
see .ramp.collect below). When a new GC begins in beginRamp mode, and a
segment in generation AMCRampGenFollows is condemned, AMCWhiten switches the
generations to forward in the ramping way (.gen.ramp); the pool enters ramping
mode. (This assumes that each generation is condemned together with all lower
generations.) .ramp.end: After the ramp count goes back to zero, the pool
enters finishRamp mode, or outsideRamp directly, if there's no ramp generation
(this means we never collected generation AMCRampGenFollows, and hence never
switched the promotion). When a new GC begins in finishRamp mode (this GC
will always collect the ramp generation, because we jig the benefits to ensure
that), and a segment in generation AMCRampGenFollows is condemned, AMCWhiten
switches the generations to forward in the usual way (.gen.ramp); the pool
enters collectingRamp mode. .ramp.collect: The purpose of collectingRamp mode
is to ensure the pool will switch back into ramping if the ramp count goes
immediately back up, but not before having collected the ramp generation once.
So this mode tells AMCReclaim to check the ramp count, and change the mode to
beginRamp or outsideRamp.
.ramp.collect-all: There are two flavours of ramp collections: the normal one
that collects the ramp generation and the younger ones, and the collect-all
flavour that does a full GC (this is a hack for producing certain Dylan
statistics). The collection will be of collect-all flavour, if any of the
RampBegins during the corresponding ramp asked for that. Ramp beginnings and
collections are asynchronous, so we need two fields to implement this
behaviour: collectAll to indicate whether the ramp collection that is about to
start should be collect-all, and collectAllNext to keep track of whether the
current ramp has any requests for it.
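The following sketch summarises the rampMode transitions driven by the ramp
count (.ramp.count, .ramp.begin, .ramp.end, .ramp.collect-all). It omits the
transitions made by AMCWhiten and AMCReclaim, and the function and field names
are illustrative rather than the actual AMC code:
  /* Sketch of the ramp count handling; not the real implementation. */
  static void amcRampBegin(AMC amc, Bool collectAll)
  {
    if (amc->rampCount == 0 && amc->rampMode != finishRamp)
      amc->rampMode = beginRamp;                /* .ramp.begin */
    if (collectAll)
      amc->collectAllNext = TRUE;               /* .ramp.collect-all */
    ++amc->rampCount;
  }

  static void amcRampEnd(AMC amc)
  {
    --amc->rampCount;
    if (amc->rampCount == 0) {
      if (amc->rampMode == ramping)
        amc->rampMode = finishRamp;             /* .ramp.end */
      else if (amc->rampMode == beginRamp)
        amc->rampMode = outsideRamp;            /* no ramp generation was created */
    }
  }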
Headers
.header: AMC supports a fixed-size header on objects, with the client pointers
pointing after the header rather than at the base of the memory block. See
format documentation for details of the interface. .header.client: The code
mostly deals in client pointers, only computing the base and limit of a block
when these are needed (such as when an object is copied). In several places,
the code gets a block of some sort, a segment or a buffer, and creates a client
pointer by adding the header length (pool->format->headerLength). .header.fix:
There are two versions of the fix method, due to its criticality, with
(AMCHeaderFix) and without (AMCFix) headers. The correct one is selected in
AMCInitComm, and placed in the pool's fix field. This is the main reason why
fix methods dispatch through the instance, rather than the class like all other
methods.
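A minimal sketch of the pointer arithmetic implied by .header.client, where
headerLength stands for pool->format->headerLength; the helper names are
invented for illustration and are not part of the MPS interface:
  /* Sketch only: client pointers point headerLength bytes after the base. */
  static void *clientFromBase(void *base, size_t headerLength)
  {
    return (char *)base + headerLength;
  }

  static void *baseFromClient(void *client, size_t headerLength)
  {
    return (char *)client - headerLength;
  }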
OLD AND AGING NOTES BELOW HERE:
AMCFinish
.finish:
.finish.forward:
103 /* If the pool is being destroyed it is OK to destroy */
104 /* the forwarding buffers, as the condemned set is about */
105 /* to disappear. */
AMCBufferEmpty
.flush: Removes the connexion between a buffer and a group, so that the group
is no longer buffered, and the buffer is reset and will cause a refill when
next used.
.flush.pad: The group is padded out with a dummy object so that it appears full.
.flush.expose: The buffer needs exposing before writing the padding object onto
it. If the buffer is being used for forwarding it might already be exposed, in
this case the segment attached to it must be covered when it leaves the
buffer. See .fill.expose.
.flush.cover: The buffer needs covering whether it was being used for
forwarding or not. See .flush.expose.
AMCBufferFill
.fill:
185 * Reserve was called on an allocation buffer which was reset,
186 * or there wasn't enough room left in the buffer. Allocate a group
187 * for the new object and attach it to the buffer.
188 *
.fill.expose:
189 * .fill.expose: If the buffer is being used for forwarding it may
190 * be exposed, in which case the group attached to it should be
191 * exposed. See .flush.cover.
AMCBufferTrip
.trip:
239 * A flip occurred between a reserve and commit on a buffer, and
240 * the buffer was "tripped" (limit set to zero). The object wasn't
241 * scanned, and must therefore be assumed to be invalid, so the
242 * reservation must be rolled back. This function detaches the
243 * buffer from the group completely. The next allocation in the
244 * buffer will cause a refill, and reach AMCFill.
AMCBufferFinish
.buffer-finish:
264 * Called from BufferDestroy, this function detaches the buffer
265 * from the group it's attached to, if any.
AMCFix
.fix:
281 * fix an ambiguous reference to the pool
282 *
283 * Ambiguous references lock down an entire segment by removing it
284 * from old-space and also marking it grey for future scanning.
285 *
286 * fix an exact, final, or weak reference to the pool
287 *
288 * These cases are merged because the action for an already
289 * forwarded object is the same in each case. After that
290 * situation is checked for, the code diverges.
291 *
292 * Weak references are either snapped out or replaced with
293 * ss->weakSplat as appropriate.
294 *
295 * Exact and final references cause the referenced object to be copied to
296 * new-space and the old copy to be forwarded (broken-heart installed)
297 * so that future references are fixed up to point at the new copy.
298 *
299 * .fix.exact.expose: In order to allocate the new copy the
300 * forwarding buffer must be exposed. This might be done more
301 * efficiently outside the entire scan, since it's likely to happen
302 * a lot.
303 *
304 * .fix.exact.grey: The new copy must be at least as grey as the old one,
305 * as it may have been grey for some other collection.
AMCGrey
.grey:
453 * Turns everything in the pool which is not condemned for a trace
454 * grey.
AMCSegScan
.seg-scan:
485 * .seg-scan.blacken: Once a group is scanned it is turned black, i.e.
486 * the ti is removed from the grey TraceSet. However, if the
487 * forwarding buffer is still pointing at the group it could
488 * make it grey again when something is fixed, and cause the
489 * group to be scanned again. We can't tolerate this at present,
490 * so the buffer is flushed. The solution might be to scan buffers
491 * explicitly.
.seg-scan.loop:
505 /* While the group remains buffered, scan to the limit of */
506 /* initialized objects in the buffer. Either it will be reached, */
507 /* or more objects will appear until the segment fills up and the */
508 /* buffer moves away. */
.seg-scan.finish:
520 /* If the group is unbuffered, or becomes so during scanning */
521 /* (e.g. if the forwarding buffer gets flushed) then scan to */
522 /* the limit of the segment. */
.seg-scan.lower:
540 /* The segment is no longer grey for this collection, so */
541 /* it no longer needs to be shielded. */
AMCScan
.scan:
556 * Searches for a group which is grey for the trace and scans it.
557 * If there aren't any, it sets the finished flag to true.
AMCReclaim
.reclaim:
603 * After a trace, destroy any groups which are still condemned for the
604 * trace, because they must be dead.
605 *
606 * .reclaim.grey: Note that this might delete things which are grey
607 * for other collections. This is OK, because we have conclusively
608 * proved that they are dead -- the other collection must have
609 * assumed they were alive. There might be a problem with the
610 * accounting of grey groups, however.
611 *
612 * .reclaim.buf: If a condemned group still has a buffer attached, we
613 * can't destroy it, even though we know that there are no live objects
614 * there. Even the object the mutator is allocating is dead, because
615 * the buffer is tripped.
AMCAccess
.access:
648 * This is effectively the read-barrier fault handler.
649 *
650 * .access.buffer: If the page accessed had and still has the
651 * forwarding buffer attached, then trip it. The group will now
652 * be black, and the mutator needs to access it. The forwarding
653 * buffer will be moved onto a fresh grey page.
654 *
655 * .access.error: @@@@ There really ought to be some error recovery.
656 *
657 * .access.multi: @@@@ It shouldn't be necessary to scan more than
658 * once. Instead, should use a multiple-fix thingy. This would
659 * require the ScanState to carry a _set_ of traces rather than
660 * just one.
OLD NOTES
Group Scanning
THE DESIGN OF THE AUTOMATIC MARK-AND-SWEEP POOL CLASS
design.mps.poolams
draft design
nickb 1997-08-14
INTRODUCTION:
This is the design of the AMS pool class.
.readership: MM developers.
.source: design.mps.buffer, design.mps.trace, design.mps.scan,
design.mps.action and design.mps.class-interface [none of these were actually
used -- pekka 1998-04-21]. No requirements doc [we need a req.mps that
captures the commonalities between the products -- pekka 1998-01-27].
Document History
.hist.0: Nick Barnes wrote down some notes on the implementation 1997-08-14.
Pekka P. Pirinen wrote the first draft design 1998-01-27.
.hist.1: Pekka edited on the basis of review.design.mps.poolams.0, and
redesigned the colour representation (results mostly in
analysis.non-moving-colour(0)).
.hist.2: Described subclassing and allocation policy. pekka 1999-01-04
OVERVIEW:
This document describes the design of the AMS pool class. The AMS pool is a
proof-of-concept design for a mark-sweep pool in the MPS. It's not meant to be
efficient, but it could serve as a model for an implementation of a more
advanced pool (such as EPVM).
REQUIREMENTS:
.req.mark-sweep: The pool must use a mark-and-sweep GC algorithm.
.req.colour: The colour representation should be as efficient as possible.
.req.incremental: The pool must support incremental GC.
.req.ambiguous: The pool must support ambiguous references to objects in it
(but ambiguous references into the middle of an object do not preserve the
object).
.req.format: The pool must be formatted, for generality.
.req.correct: The design and the implementation should be simple enough to be
seen to be correct.
.req.simple: Features not related to mark-and-sweep GC should initially be
implemented as simply as possible, in order to save development effort.
.not-req.grey: We haven't figured out how buffers ought to work with a grey
mutator, so we use .req.correct to allow us to design a pool that doesn't work
in that phase. This is acceptable as long as we haven't actually implemented
grey mutator collection.
ARCHITECTURE:
Subclassing
.subclass: Since we expect to have many mark-and-sweep pools, we build in some
protocol for subclasses to modify various aspects of the behaviour. Notably
there's a subclassable segment class, and a protocol for performing iteration.
Allocation
.align: We divide the segments into grains, each the size of the format
alignment. .alloc-bit-table: We keep track of allocated grains using a bit
table. This allows a simple implementation of allocation and freeing using the
bit table operators, satisfying .req.simple, and can simplify the GC routines.
Eventually, this should use some sophisticated allocation technique suitable
for non-moving automatic pools.
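For illustration, allocation from a bit table amounts to finding a run of
clear bits long enough for the request and setting them. The sketch below uses
a plain byte array rather than the MPS BT module, and all the names are
invented:
  /* Sketch: find 'grains' consecutive free grains in allocTable[0..limit)
     and mark them allocated; returns the first grain's index, or limit if
     no such run exists (grains > 0 assumed). */
  static size_t allocGrains(unsigned char *allocTable, size_t limit,
                            size_t grains)
  {
    size_t run = 0, i;
    for (i = 0; i < limit; ++i) {
      run = allocTable[i] ? 0 : run + 1;
      if (run == grains) {
        size_t base = i + 1 - grains, j;
        for (j = base; j <= i; ++j)
          allocTable[j] = 1;
        return base;
      }
    }
    return limit;
  }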
.buffer: We use buffered allocation, satisfying .req.incremental. The AMC
buffer technique is reused; although it is not suitable for non-moving pools,
.req.simple allows us to do that for now.
.extend: If there's no space in any existing segment, a new segment is
allocated. The actual class is allowed to decide the size of the new segment.
.no-alloc: Do not support PoolAlloc, because we can't support one-phase
allocation for a scannable pool (unless we disallow incremental collection).
For exact details, see design.mps.buffer.
.no-free: Do not support PoolFree, because automatic pools don't need explicit
free and having it encourages clients to use it (and therefore to have dangling
pointers, double frees, &c.)
Colours
.colour: Objects in a segment which is _not_ condemned (for some trace) take
their colour (for this trace) from the segment. .colour.object: Since we need
to implement a non-copying GC, we keep track of the colour of each object in a
condemned segment separately. For this, we use bit tables with a bit for each
grain. This format is fast to access, has better locality than mark bits in
the objects themselves, and allows cheap interoperation with the allocation bit
table. As to the details, we follow analysis.non-moving-colour(0), with the
analysis.non-moving-colour.free.black option [why?]. .colour.alloc-table:
We choose to keep a separate allocation table, for generality.
.ambiguous.middle: We will allow ambiguous references into the middle of an
object (as required by .req.ambiguous), using the trick in
analysis.non-moving-colour.interior.ambiguous-only to speed up scanning.
.interior-pointer: Note that non-ambiguous interior pointers are outlawed.
.colour.alloc: Objects are allocated black. This is the most efficient
alternative for traces in the black mutator phase, and .not-req.grey means
that's sufficient. [Some day, we need to think about allocating grey or white
during the grey mutator phase.]
Scanning
.scan.segment: The tracer protocol requires (for segment barrier hits) that
there is a method for scanning a segment and turning all grey objects on it
black. This cannot be achieved with a single sequential sweep over the
segment, since objects that the sweep has already passed may become grey as
later objects are scanned. .scan.graph: For a non-moving GC, it is more
efficient to trace along the reference graph than segment by segment [it would
also allow passing type information from fix to scan]. Currently, the tracer
doesn't offer this option when it's polling for work.
.scan.stack: Tracing along the reference graph cannot be done by recursive
descent, because we can't guarantee that the stack won't overflow. We can,
however, maintain an explicit stack of things to trace, and fall back on
iterative methods (.scan.iter) when it overflows and can't be extended.
.scan.iter: As discussed in .scan.segment, when scanning a segment, we need to
ensure that there are no grey objects in the segment when the scan method
returns. We can do this by iterating a sequential scan over the segment until
nothing is grey (see .marked.scan for details). .scan.iter.only: Some
iterative method is needed as a fallback for the more advanced methods, and as
this is the simplest way of implementing the current tracer protocol, we will
start by implementing it as the only scanning method.
.scan.buffer: We do not scan between ScanLimit and Limit of a buffer (see
.iteration.buffer), as usual [design.mps.buffer should explain why this works,
but doesn't. Pekka 1998-02-11].
.fix.to-black: When fixing a reference to a white object, if the segment does
not refer to the white set, the object cannot refer to the white set, and can
therefore be marked as black immediately (rather than grey).
ANALYSIS:
[This section intentionally left blank.]
IDEAS:
[This section intentionally left blank.]
IMPLEMENTATION:
Colour
.colour.determine: Following the plan in .colour, if SegWhite(seg) includes the
trace, the colour of an object is given by the bit tables. Otherwise if
SegGrey(seg) includes the trace, all the objects are grey. Otherwise all the
objects are black.
.colour.bits: As we only have searches for runs of zero bits, we use two bit
tables, the non-grey and non-white tables, but this is hidden beneath a layer
of macros talking about grey and white in positive terms.
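The positive-terms layer mentioned in .colour.bits might look like the
following sketch; the table field names and the use of BTGet/BTSet are
assumptions for illustration, not the actual AMS macros:
  /* Sketch only. */
  #define AMSIsGrey(seg, i)    (!BTGet((seg)->nongreyTable, (i)))
  #define AMSIsWhite(seg, i)   (!BTGet((seg)->nonwhiteTable, (i)))
  #define AMSBlacken(seg, i)   BTSet((seg)->nongreyTable, (i))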
.colour.single: We have only implemented a single set of mark and scan tables,
so we can only condemn a segment for one trace at a time. This is checked for
in condemnation. If we want to do overlapping white sets, each trace needs its
own set of tables.
.colour.check: The grey&white state is illegal, and free objects must be not
grey and not white as explained in analysis.non-moving-colour.free.black.
Iteration
.iteration: Scan, reclaim and other operations need to iterate over all objects
in a segment. We abstract this into a single iteration function, even though
we no longer use it for reclaiming and rarely for scanning.
.iteration.buffer: Iteration skips directly from ScanLimit to Limit of a
buffer. This is because this area may contain partially-initialized and
uninitialized data, which cannot be processed. [ScanLimit is used for reasons
which are not documented in design.mps.buffer.] Since the iteration skips the
buffer, callers need to take the appropriate action, if any, on it.
Scanning Algorithm
.marked: Each segment has a 'marksChanged' flag, indicating whether anything in
it has been made grey since the last scan iteration (.scan.iter) started. This
flag only concerns the colour of objects with respect to the trace for which
the segment is condemned, as this is the only trace for which objects in the
segment are being made grey by fixing. Note that this flag doesn't imply that
there are grey objects in the segment, because the grey objects might have been
subsequently scanned and blackened.
.marked.fix: The marksChanged flag is set TRUE by AMSFix when an object is made
grey.
.marked.scan: AMSScan must blacken all grey objects on the segment, so it must
iterate over the segment until all grey objects have been seen. Scanning an
object in the segment might grey another one (.marked.fix), so the scanner
iterates until this flag is FALSE, setting it to FALSE before each scan. It is
safe to scan the segment even if it contains nothing grey.
.marked.scan.fail: If the format scanner returns failure (see
protocol.mps.scanning [is that the best reference?]), we abort the scan in the
middle of a segment. So in this case the marksChanged flag is set back to
TRUE, because we may not have blackened all grey objects.
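Put together, .marked.scan and .marked.scan.fail amount to roughly the
following loop, where amsScanPass is a hypothetical helper that makes one
sequential pass over the segment (and may set marksChanged again via fix):
  /* Sketch only. */
  do {
    amsseg->marksChanged = FALSE;   /* cleared before each pass (.marked.scan) */
    res = amsScanPass(ss, pool, seg);
    if (res != ResOK) {
      amsseg->marksChanged = TRUE;  /* .marked.scan.fail: grey objects may remain */
      return res;
    }
  } while (amsseg->marksChanged);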
.marked.unused: The marksChanged flag is meaningless unless the segment is
condemned. We make it FALSE in these circumstances.
.marked.condemn: Condemnation makes all objects in a segment either black or
white, leaving nothing grey, so it doesn't need to set the marksChanged flag
which must already be FALSE.
.marked.reclaim: When a segment is reclaimed, it can contain nothing marked as
grey, so the marksChanged flag must already be FALSE.
.marked.blacken: When the tracer decides not to scan, but to call PoolBlacken,
we know that any greyness can be removed. AMSBlacken does this and resets the
marksChanged flag, if it finds that the segment has been condemned.
.marked.clever: AMS could be clever about not setting the marksChanged flag, if
the fixed object is ahead of the current scan pointer. It could also keep low-
and high-water marks of grey objects, but we don't need to implement these
improvements at first.
Allocation
.buffer-init: We take one init arg to set the Rank on the buffer, just to see
how it's done.
.no-bit: As an optimization, we won't use the alloc bit table until the first
reclaim on the segment. Before that, we just keep a high-water mark.
.fill: AMSBufferFill takes the simplest approach: it iterates over the segments
in the pool, looking for one which can be used to refill the buffer.
.fill.colour: The objects allocated from the new buffer must be black for all
traces (.colour.alloc), so putting it on a black segment (meaning one where
neither SegWhite(seg) nor SegGrey(seg) include the trace, see
.colour.determine) is obviously OK. White segments (where SegWhite(seg)
includes the trace) are also fine, as we can use the colour tables to make it
black (we don't actually have to adjust the tables, since free grains have the
same colour table encoding as black, see .colour.object). At first glance, it
seems we can't put it on a segment that is grey for some trace (one where
SegWhite(seg) doesn't include the trace, but SegGrey(seg) does), because the
new objects would become grey as the buffer's ScanLimit advanced. We could
switch the segment over to using colour tables, but this becomes very hairy
when multiple traces are happening, so in that case, we'd be better off either
not attaching to grey segments or allowing grey allocation, wasteful as it is
[@@@@ decide which].
.fill.slow: AMSBufferFill gets progressively slower as more segments fill up,
as it laboriously checks whether the buffer can be refilled from each segment,
by inspecting the allocation bitmap. This is helped a bit by keeping count of
free grains in each segment, but it still spends a lot of time iterating over
all the full segments checking the free size. Obviously, this can be much
improved (we could keep track of the largest free block in the segment and in
the pool, or we could keep the segments in some more efficient structure, or we
could have a real free list structure).
.fill.extend: If there's no space in any existing segment, the segSize method
is called to decide the size of the new segment to allocate. If that fails,
the code tries to allocate a segment that's just large enough to satisfy the
request.
.empty: AMSBufferEmpty makes the unused space free, since there's no reason not
to. We don't have to adjust the colour tables, since free grains have the same
colour table encoding as black, see .colour.object.
.reclaim.empty.buffer: Segments which after reclaim only contain a buffer could
be destroyed by trapping the buffer, but there's no point to this.
Initialization
.init: The initialization method AMSInit() takes one additional argument: the
format of objects allocated in the pool. The pool alignment is set equal to
the format alignment (see design.mps.align).
.init.internal: Subclasses call AMSInitInternal() to avoid the problems of
sharing va_list and emitting a superfluous PoolInitAMS event.
Condemnation
.action: We use PoolCollectAct to condemn the whole pool (except the buffers)
at once.
.condemn.buffer: Buffers are not condemned; instead they are coloured black, to
make sure that the objects allocated will be black, following .colour.alloc
(or, if you wish, because buffers are ignored like free space, so need the same
encoding).
.benefit.guess: The benefit computation is pulled out of a hat; any real pool
class will need a real benefit computation. It will return a positive value
when the allocated size of the pool is over one megabyte and more than twice
what it was when the last segment in this pool was reclaimed (we call this
lastReclaimedSize).
.benefit.repeat: We reset lastReclaimedSize when starting a trace in order to
avoid repeat condemnation (i.e., the next AMSBenefit returning 1.0 for the same
reason as the last). In the future we need to do better here.
Segment Merging and Splitting
.split-merge: We provide methods for splitting and merging AMS segments. The
pool implementation doesn't cause segments to be split or merged - but a
subclass might want to do this (see .stress.split-merge). The methods serve as
an example of how to implement this facility.
.split-merge.constrain: There are some additional constraints on what segments
may be split or merged:
.split-merge.constrain.align: Segments may only be split or merged at an
address which is aligned to the pool alignment as well as to the arena
alignment. .split-merge.constrain.align.justify: This constraint is implied by
the design of allocation and colour tables, which cannot represent segments
starting at unaligned addresses. The constraint only arises if the pool
alignment is larger than the arena alignment. There's no requirement to split
segments at unaligned addresses.
.split-merge.constrain.empty: The higher segment must be empty. I.e. the higher
segment passed to SegMerge must be empty, and the higher segment returned by
SegSplit must be empty. .split-merge.constrain.empty.justify: This constraint
makes the code significantly simpler. There's no requirement for a more complex
solution at the moment (as the purpose is primarily pedagogic).
.split-merge.fail: The split and merge methods are not proper anti-methods for
each other (see design.mps.seg.split-merge.fail.anti.no). Methods will not
reverse the side-effects of their counterparts if the allocation of the colour
and allocation bit tables should fail. Client methods which over-ride split and
merge should not be written in such a way that they might detect failure after
calling the next method, unless they have reason to know that the bit table
allocations will not fail.
TESTING:
.stress: There's a stress test, MMsrc!amsss.c, that does 800 KB of allocation,
enough for about three GCs. It uses a modified Dylan format, and checks for
corruption by the GC. Both ambiguous and exact roots are tested.
.stress.split-merge: There's also a stress test for segment splitting and
merging, MMsrc!segsmss.c. This is similar to amsss.c - but it defines a
subclass of AMS, and causes segments to be split and merged. Both buffered and
non-buffered segments are split / merged.
TEXT:
.addr-index.slow: Translating between an address and a grain index in a
segment uses macros such as AMSAddrIndex and AMSIndexAddr. These are slow
because they call SegBase on every translation.
.grey-mutator: To enforce the restriction set in .not-req.grey we check that
all the traces are flipped in AMSScan. It would be good to check in AMSFix as
well, but we can't do that, because it's called during the flip, and we can't
tell the difference between the flip and the grey mutator phases with the
current tracer interface.
AUTOMATIC WEAK LINKED
design.mps.poolawl
incomplete doc
drj 1997-03-11
INTRODUCTION
.readership: Any MPS developer
.intro: The AWL (Automatic Weak Linked) pool is used to manage Dylan Weak
Tables (see req.dylan.fun.weak). Currently the design is specialised for Dylan
Weak Tables, but it could be generalised in the future.
REQUIREMENTS
See req.dylan.fun.weak.
See meeting.dylan.1997-02-27(0) where many of the requirements for this pool
were first sorted out.
Must satisfy request.dylan.170123.
.req.obj-format: Only objects of a certain format need be supported. This
format is a subset of the Dylan Object Format. The pool uses the first slot in
the fixed part of an object to store an association. See
mail.drj.1997-03-11.12-05
DEFINITIONS
.def.grain: alignment grain, grain. A grain is a range of addresses where both
the base and the limit of the range are aligned and the size of range is equal
to the (same) alignment. In this context the alignment is the pool's alignment
(pool->alignment). The grain is the unit of allocation, marking, scanning, etc.
OVERVIEW
.overview:
.overview.ms: The pool is mark and sweep. .overview.ms.justify: Mark-sweep
pools are slightly easier to write (than moving pools), and there are no
requirements (yet) that this pool be high performance or moving or anything
like that. .overview.alloc: It is possible to allocate weak or exact objects
using the normal reserve/commit AP protocol. .overview.alloc.justify:
Allocation of both weak and exact objects is required to implement Dylan Weak
Tables. Objects are formatted; the pool uses format A. .overview.scan: The
pool handles the scanning of weak objects specially so that when a weak
reference is deleted the corresponding reference in an associated object is
deleted. The associated object is determined by using information stored in
the object itself (see .req.obj-format).
INTERFACE
.if.init: The init method takes one extra parameter in the vararg list. This
parameter should have type Format and be a format object that describes the
format of the objects to be allocated in this pool. The format should support
scan and skip methods. There is an additional restriction on the layout of
objects, see .req.obj-format.
.if.buffer: The BufferInit method takes one extra parameter in the vararg
list. This parameter should be either RankEXACT or RankWEAK. It determines
the rank of the objects allocated using that buffer.
DATASTRUCTURES
.sig: The signature for this pool will be 0x519bla3l (SIGPooLAWL).
.poolstruct: The class specific pool structure is
struct AWLStruct {
PoolStruct poolStruct;
Format format;
Shift alignShift;
ActionStruct actionStruct;
double lastCollected;
Serial gen;
Sig sig;
}
.poolstruct.format: The format field is used to refer to the object format.
The object format is passed to the pool during pool creation.
.poolstruct.alignshift: The alignShift field is the SizeLog2 of the pool's
alignment. It is computed and initialised when a pool is created. It is used
to compute the number of alignment grains in a segment, which is also the
number of bits needed in each of the segment's mark, scanned, and alloc bit
tables (see .awlseg.bt, .awlseg.mark, and .awlseg.alloc below). @@ clarify
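To illustrate: with alignShift set to the SizeLog2 of the pool alignment, the
grain count of a segment, and hence the length of each of its bit tables, is
the segment size shifted right by alignShift.
  /* Illustrative only. */
  awl->alignShift = SizeLog2(pool->alignment);
  awlseg->grains = SegSize(seg) >> awl->alignShift;  /* bits per table (.awlseg.bt) */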
.poolstruct.actionStruct: Contains an Action which is used to participate in
the collection benefit protocol. See .fun.benefit AWLBenefit below for a
description of the algorithm used for determining when to collect.
.poolstruct.lastCollected: Records the time (using the mutator total allocation
clock, ie that returned by ArenaMutatorAllocSize) of the most recent call to
either AWLInit or AWLTraceBegin for this pool. So this is the time of the
beginning of the last collection of this pool. Actually this isn't true
because the pool can be collected without AWLTraceBegin being called (I think)
as it will get collected by being in the same zone as another pool/generation
that is being collected (which it does arrange to be, see the use of the gen
field in .poolstruct.gen below and .fun.awlsegcreate.where below).
.poolstruct.gen: This is part of the mechanism by which the pool arranges to
be in a particular zone and arranges to be collected simultaneously with other
cohorts in the system. gen is the generation that is used in expressing a
generation preference when allocating a segment. The intention is that this
pool will get collected simultaneously with any other segments that are also
allocated using this generation preference (when using the VM arena, generation
preferences get mapped more or less to zones, each generation to a unique set
of zones in the ideal case). Whilst AWL is not generational, it is expected
that this mechanism will arrange for it to be collected simultaneously with
some particular generation of AMC.
.poolstruct.gen.1: At the moment the gen field is set for all AWL pools to be 1.
.awlseg: The pool defines a segment class AWLSegClass, which is a subclass of
GCSegClass (see design.mps.seg.over.hierarchy.gcseg). All segments allocated by
the pool are instances of this class, and are of type AWLSeg, for which the
structure is
struct AWLSegStruct {
GCSegStruct gcSegStruct;
BT mark;
BT scanned;
BT alloc;
Count grains;
Count free;
Count singleAccesses;
AWLStatSegStruct stats;
Sig sig;
}
.awlseg.bt: The mark, alloc, and scanned fields are bit-tables (see
design.mps.bt). Each bit in the table corresponds to a single alignment
grain in the pool.
.awlseg.mark: The mark bit table is used to record mark bits during a trace.
Condemn (see .fun.condemn below) sets all the bits of this table to zero. Fix
will read and set bits in this table. Currently there is only one mark bit
table. This means that the pool can only be condemned for one trace.
.awlseg.mark.justify: This is simple, and can be improved later when we want to
run more than one trace.
.awlseg.scanned: The scanned bit-table is used to note which objects have been
scanned. Scanning (see .fun.scan below) a segment will find objects that are
marked but not scanned, scan each object found and set the corresponding bits
in the scanned table.
.awlseg.alloc: The alloc bit table is used to record which portions of a
segment have been allocated. Ranges of bits in this table are set when a
buffer is attached to the segment. When a buffer is flushed (ie AWLBufferEmpty
is called) from the segment, the bits corresponding to the unused portion at
the end of the buffer are reset.
.awlseg.alloc.invariant: A bit is set in the alloc table <=> (the corresponding
address is currently being buffered || the corresponding address lies within
the range of an allocated object).
.awlseg.grains: The grains field is the number of grains that fit in the
segment. Strictly speaking this is not necessary as it can be computed from
SegSize and awl's alignment, however, precalculating it and storing it in the
segment makes the code simpler by avoiding lots of repeated calculations.
.awlseg.free: A conservative estimate of the number of free grains in the
segment. It is always guaranteed to be >= the number of free grains in the
segment, hence can be used during allocation to quickly pass over a segment.
Maintained by blah and blah. @@@@ Unfinished obviously.
FUNCTIONS
@@ How will pool collect? It needs an action structure.
External
.fun.init:
Res AWLInit(Pool pool, va_list arg);
AWLStruct has four fields, each one needs initializing.
.fun.init.poolstruct: The poolStruct field has already been initialized by
generic code (impl.c.pool).
.fun.init.format: The format will be copied from the argument list, checked,
and written into this field.
.fun.init.alignshift: The alignShift will be computed from the pool alignment
and written into this field.
.fun.init.sig: The sig field will be initialized with the signature for this
pool.
.fun.finish:
Res AWLFinish(Pool pool);
Iterates over all segments in the pool and destroys each segment (by calling
SegFree).
Overwrites the sig field in the AWLStruct. Finishing the generic pool
structure is done by the generic pool code (impl.c.pool).
.fun.alloc:
PoolNoAlloc will be used, as this class does not implement alloc.
.fun.free:
PoolNoFree will be used, as this class does not implement free.
.fun.fill:
Res AWLBufferFill(Seg *segReturn, Addr *baseReturn, Pool pool, Buffer buffer,
Size size);
This zips round all the segments applying AWLSegAlloc to each segment that
has the same rank as the buffer. AWLSegAlloc attempts to find a free range, if
it finds a range then it may be bigger than the actual request, in which case
the remainder can be used to "fill" the rest of the buffer. If no free range
can be found in an existing segment then a new segment will be created (which
is at least large enough). The range of buffered addresses is marked as
allocated in the segment's alloc table.
.fun.empty:
void AWLBufferEmpty(Pool pool, Buffer buffer);
Locates the free portion of the buffer, that is the memory between the init and
the limit of the buffer and records these locations as being free in the
relevant alloc table. The segment that the buffer is pointing at (which
contains the alloc table that needs to be dinked with) is available via
BufferSeg.
.fun.benefit: The benefit returned is the total amount of mutator allocation
minus the lastRememberedSize minus 10 megabytes, so the pool becomes an
increasingly good candidate for collection at a constant (mutator allocation)
rate, crossing the 0 line when there has been 10Mb of allocation since the
(beginning of the) last collection. So it gets collected approximately every
10Mb of allocation. Note that it will also get collected by virtue of being in
the same zone as some AMC generation (assuming there are instantiated AMC
pools), see .poolstruct.gen above.
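Read as arithmetic (and assuming the remembered value is the lastCollected
field described in .poolstruct.lastCollected), the computation is roughly:
  /* Sketch only: becomes positive once about 10 MB has been allocated since
     the beginning of the last collection of this pool. */
  benefit = (double)ArenaMutatorAllocSize(arena)
            - awl->lastCollected
            - 10.0 * 1024.0 * 1024.0;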
.fun.condemn:
Res AWLCondemn(Pool pool, Trace trace, Seg seg);
The current design only permits each segment to be condemned for one trace (see
.awlseg.mark). This function checks that the segment is not condemned for any
trace (seg->white == TraceSetEMPTY). The segment's mark bit-table is reset,
and the whiteness of the seg (seg->white) has the current trace added to it.
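A sketch of these steps; the calls follow names used elsewhere in this
document but should be read as assumptions rather than the exact interface:
  /* Sketch only, not the actual AWLCondemn. */
  AVER(SegWhite(seg) == TraceSetEMPTY);         /* one trace at a time (.awlseg.mark) */
  BTResRange(awlseg->mark, 0, awlseg->grains);  /* reset the mark bit-table */
  SegSetWhite(seg, TraceSetAdd(SegWhite(seg), trace));
  return ResOK;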
.fun.grey:
void AWLGrey(Pool pool, Trace trace, Seg seg);
If the segment is not condemned for this trace, the segment's mark table is set
to all 1s and the segment is recorded as being grey.
.fun.scan:
Res AWLScan(ScanState ss, Pool pool, Seg seg);
.fun.scan.overview: The scanner performs a number of passes over the segment,
scanning each marked and unscanned (grey) object that it finds.
.fun.scan.overview.finish: It keeps performing passes over the segment until it
is finished. .fun.scan.overview.finish.condition: A condition for finishing is
that no new marks got placed on objects in this segment during the pass.
.fun.scan.overview.finish.approximation: We use an even stronger condition for
finishing that assumes that scanning any object may introduce marks onto this
segment. It is finished when a pass results in scanning no objects (ie all
objects were either unmarked or both marked and scanned).
.fun.scan.overview.finished-flag: There is a flag called 'finished' which keeps
track of whether we should finish or not. We only ever finish at the end of a
pass. At the beginning of a pass the flag is set. During a pass if any
objects are scanned then the finished flag is reset. At the end of a pass if
the finished flag is still set then we are finished. No more passes take place
and the function returns.
.fun.scan.pass: A pass consists of a setup phase and a repeated phase.
.fun.scan.pass.buffer: The following assumes that in the general case the
segment is buffered; if the segment is not buffered then the actions that
mention buffers are not taken (they are unimportant if the segment is not
buffered).
.fun.scan.pass.p: The pass uses a cursor called 'p' to progress over the
segment. During a pass p will increase from the base address of the segment to
the limit address of the segment. When p reaches the limit address of the
segment, the pass is complete.
.fun.scan.pass.setup: p initially points to the base address of the segment.
.fun.scan.pass.repeat: The following comprises the repeated phase. The
repeated phase is repeated until the pass completion condition is true (ie p
has reached the limit of the segment, see .fun.scan.pass.p above and
.fun.scan.pass.repeat.complete below).
.fun.scan.pass.repeat.complete: if p is equal to the segment's limit then we
are done. We proceed to check whether any further passes need to be performed
(see .fun.scan.pass.more below).
.fun.scan.pass.repeat.free: if !alloc(p) (the grain is free) then increment p
and return to the beginning of the loop.
.fun.scan.pass.repeat.buffer: if p is equal to the buffer's ScanLimit (see
BufferScanLimit), then set p equal to the buffer's Limit (use BufferLimit) and
return to the beginning of the loop.
.fun.scan.pass.repeat.object-end: The end of the object is located using the
format->skip method.
.fun.scan.pass.repeat.object: if (mark(p) && !scanned(p)) then the object
pointed at is marked but not scanned, which means we must scan it, otherwise we
must skip it. .fun.scan.pass.repeat.object.dependent: To scan the object we
first have to determine if the object has a dependent object (see
.req.obj-format). .fun.scan.pass.repeat.object.dependent.expose: If it has a
dependent object then we must expose the segment that the dependent object is
on (only if the dependent object actually points to MPS managed memory) prior
to scanning and cover the segment subsequent to scanning.
.fun.scan.pass.repeat.object.dependent.summary: The summary of the dependent
segment must be set to RefSetUNIV to reflect the fact that we are allowing it
to be written to (and we don't know what gets written to the segment).
.fun.scan.pass.repeat.object.scan: The object is then scanned by calling the
format's scan method with base and limit set to the beginning and end of the
object (.fun.scan.scan.improve.single: A scan1 format method would make it
slightly simpler here). Then the finished flag is cleared and the bit in the
segment's scanned table is set.
.fun.scan.pass.repeat.advance: p is advanced past the object and we return to
the beginning of the loop.
.fun.scan.pass.more: At the end of a pass the finished flag is examined.
.fun.scan.pass.more.not: If the finished flag is set then we are done (see
.fun.scan.overview.finished-flag above), AWLScan returns.
.fun.scan.pass.more.so: Otherwise (the finished flag is reset) we perform
another pass (see .fun.scan.pass above).
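Putting .fun.scan.pass.* together, a single pass looks roughly like the
following sketch; alloc(), mark(), scanned(), setScanned(), grainSize and the
finished flag stand for the bit-table lookups and state described above and
are not real functions, and error handling and shield exposure are omitted:
  /* Sketch of one pass over the segment. */
  p = SegBase(seg);
  while (p < SegLimit(seg)) {
    if (!alloc(p)) {                        /* .fun.scan.pass.repeat.free */
      p = AddrAdd(p, grainSize);
      continue;
    }
    if (buffer != NULL && p == BufferScanLimit(buffer)) {
      p = BufferLimit(buffer);              /* .fun.scan.pass.repeat.buffer */
      continue;
    }
    q = (*format->skip)(p);                 /* .fun.scan.pass.repeat.object-end */
    if (mark(p) && !scanned(p)) {           /* .fun.scan.pass.repeat.object */
      /* handle any dependent object, then scan [p, q) with format->scan */
      finished = FALSE;
      setScanned(p);
    }
    p = q;                                  /* .fun.scan.pass.repeat.advance */
  }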
.fun.fix:
Res AWLFix(Pool pool, ScanState ss, Seg seg, Ref *refIO);
ss->wasMarked is set to TRUE (clear compliance with
design.mps.fix.protocol.was-marked.conservative).
If the rank (ss->rank) is RankAMBIG then fix returns immediately unless the
reference is aligned to the pool alignment.
If the rank (ss->rank) is RankAMBIG then fix returns immediately unless the
referenced grain is allocated.
The bit in the marked table corresponding to the referenced grain will be
read. If it is already marked then fix returns. Otherwise (the grain is
unmarked), ss->wasMarked is set to FALSE, and the remaining actions depend on
whether the rank (ss->rank) is Weak or not. If the rank is weak then the
reference is adjusted to 0 (see design.mps.weakness) and fix returns. If the
rank is something else then the mark bit corresponding to the referenced grain
is set, and the segment is greyed using TraceSegGreyen.
Fix returns.
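A sketch of the fix logic just described; alloc(), marked() and setMark()
stand for the bit-table operations on the referenced grain, and the exact
calls are assumptions rather than the real AWLFix:
  /* Sketch only. */
  ss->wasMarked = TRUE;
  if (ss->rank == RankAMBIG
      && (!AddrIsAligned(*refIO, pool->alignment) || !alloc(*refIO)))
    return ResOK;                  /* not a plausible reference; ignore it */
  if (marked(*refIO))
    return ResOK;                  /* already marked, nothing more to do */
  ss->wasMarked = FALSE;
  if (ss->rank == RankWEAK)
    *refIO = (Ref)0;               /* splat the weak reference */
  else {
    setMark(*refIO);               /* mark the referenced grain */
    /* ... and grey the segment for the trace using TraceSegGreyen ... */
  }
  return ResOK;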
.fun.reclaim:
void AWLReclaim(Pool pool, Trace trace, Seg seg);
This iterates over all allocated objects in the segment and frees objects that
are not marked.
When this iteration is complete the marked array is completely reset.
  p = base of segment
  while(p < SegLimit(seg)) {
    if(!alloc(p)) { ++p; continue; }
    q = skip(p);      /* q points to just past the object pointed at by p */
    if(!marked(p)) free(p, q);  /* free(p, q) resets the bits in the alloc
                                   table from p to q-1 inclusive */
    p = q;
  }
Reset the entire mark table using BTResRange.
.fun.reclaim.improve.pad: Consider filling free ranges with padding objects.
Now reclaim doesn't need to check that the objects are allocated before
skipping them. There may be a corresponding change for scan as well.
.fun.describe:
Res AWLDescribe(Pool pool, mps_lib_FILE *stream);
Internal:
.fun.awlsegcreate:
Res AWLSegCreate(AWLSeg *awlsegReturn, Size size);
Creates a segment of class AWLSegClass of size at least size.
.fun.awlsegcreate.size.round: size is rounded up to an ArenaAlign before
requesting the segment. .fun.awlsegcreate.size.round.justify: The arena
requires that all segment sizes are aligned to the ArenaAlign.
.fun.awlsegcreate.where: The segment is allocated using a generation
preference, using the generation number stored in the AWLStruct (the gen
field), see .poolstruct.gen above.
.fun.awlseginit:
Res awlSegInit(Seg seg, Pool pool, Addr base, Size size,
Bool reservoirPermit, va_list args)
Init method for AWLSegClass, called for SegAlloc whenever an AWLSeg is created
(see .fun.awlsegcreate above). .fun.awlseginit.tables: The segment's mark,
scanned, and alloc tables (see .awlseg.bt above) are allocated and initialised.
The segment's grains field is computed and stored.
.fun.awlsegfinish:
void awlSegFinish(Seg seg);
Finish method for AWLSegClass, called from SegFree. Will free the segment's
tables (see .awlseg.bt).
.fun.awlsegalloc:
Bool AWLSegAlloc(Addr *baseReturn, Addr *limitReturn, AWLSeg awlseg, AWL awl,
Size size);
Will search for a free block in the segment that is at least size bytes long.
The base address of the block is returned in *baseReturn, the limit of the
entire free block (which must be at least as large as size and may be bigger) is
returned in *limitReturn. The requested size is converted to a number of
grains, BTFindResRange is called to find a run of this length in the alloc
bit-table (.awlseg.alloc). The return results (if it is successful) from
BTFindResRange are in terms of grains, they are converted back to addresses
before returning the relevant values from this function.
.fun.dependent-object:
Bool AWLDependentObject(Addr *objReturn, Addr parent);
This function abstracts the association between an object and its linked
dependent (see .req.obj-format). It currently assumes that objects are Dylan
Object formatted according to design.dylan.container (see
analysis.mps.poolawl.dependent.abstract for suggested improvements). An object
has a dependent object iff the 2nd word of the object (ie (((Word *)parent)[1]))
is non-NULL. The dependent object is the object referenced by the 2nd word
and must be a valid object.
This function assumes objects are in Dylan Object Format (see
design.dylan.container). It will check that the first word looks like a dylan
wrapper pointer. It will check that the wrapper indicates that the wrapper has
a reasonable format (namely at least one fixed field).
If the second word is NULL it will return FALSE.
If the second word is non-NULL then the contents of it will be assigned to
*objReturn, and it will return TRUE.
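A minimal sketch of .fun.dependent-object under the Dylan object format
assumption (word 0 is the wrapper pointer, word 1 the dependent); the wrapper
validity checks described above are reduced to a comment:
  /* Sketch only; the real function also validates the wrapper. */
  Bool AWLDependentObject(Addr *objReturn, Addr parent)
  {
    Word *object = (Word *)parent;
    /* object[0] is the wrapper pointer; check that it looks like a wrapper
       with at least one fixed field before trusting object[1]. */
    if (object[1] == 0)
      return FALSE;
    *objReturn = (Addr)object[1];
    return TRUE;
  }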
TEST
must create dylan objects.
must create dylan vectors with at least one fixed field.
must allocate weak thingies.
must allocate exact tables.
must link tables together.
must populate tables with junk.
some junk must die.
Use an LO pool and an AWL pool.
3 buffers. One buffer for the LO pool, one exact buffer for the AWL pool, one
weak buffer for the AWL pool.
Initial test will allocate one object from each buffer and then destroy all
buffers and pools and exit
200
mps/design/poollo/index.txt Normal file
LEAF OBJECT POOL CLASS
design.mps.poollo
incomplete doc
drj 1997-03-07
INTRODUCTION
.readership: Any MPS developer.
.intro: The Leaf Object Pool Class (LO for short) is a pool class developed for
DylanWorks. It is designed to manage objects that have no references (leaf
objects) such as strings, bit tables, etc. It is a garbage collected pool (in
that objects allocated in the pool are automatically reclaimed when they are
discovered to be unreachable).
[Need to sort out issue of alignment. Currently lo grabs alignment from
format, almost certainly "ought" to use the greater of the format alignment and
the MPS_ALIGN value -- @@ drj 1997-07-02]
DEFINITIONS
.def.leaf: A "leaf" object is an object that contains no references, or an
object all of whose references refer to roots. That is, any references that
the object has must refer to a priori alive objects that are guaranteed not to
move, hence the references do not need fixing.
.def.grain: A grain (of some alignment) is a contiguous aligned area of memory
of the smallest size possible (which is the same size as the alignment).
REQUIREMENTS
.req.source: See req.dylan.fun.obj.alloc and req.dylan.prot.ffi.access.
.req.leaf: The pool must manage formatted leaf objects (see .def.leaf above for
a definition). This is intended to encompass Dylan and C leaf objects. Dylan
leaf objects have a reference to their wrapper, but are still leaf objects (in
the sense of .def.leaf) because the wrapper will be a root.
.req.nofault: The memory containing objects managed by the pool must not be
protected. The client must be allowed to access these objects without using
the MPS trampoline (the exception mechanism, q.v.).
OVERVIEW
.overview:
.overview.ms: The LO Pool is a non-moving mark-and-sweep collector.
.overview.ms.justify: mark-and-sweep pools are simpler than moving pools.
.overview.alloc: Objects are allocated in the pool using the reserve/commit
protocol on allocation points. .overview.format: The pool is formatted. The
format of the objects in the pool is specified at instantiation time, using a
format object derived from a format A variant (using variant A is overkill, see
.if.init below) (see design.mps.format for excuse about calling the variant
'variant A').
INTERFACE
.if.init:
.if.init.args: The init method for this class takes one extra parameter in the
vararg parameter list. .if.init.format: The extra parameter should be an
object of type Format and should describe the format of the objects that are to
be allocated in the pool. .if.init.format.use: The pool uses the skip and
alignment slots of the format. The skip method is used to determine the length
of objects (during reclaim). The alignment field is used to determine the
granularity at which memory should be managed. .if.init.format.a: Currently
only format variant A is supported, though clearly that is overkill as only
skip and alignment are used.
DATASTRUCTURES
.sig: The signature for the LO Pool Class is 0x51970b07 (SIGLOPOoL).
.poolstruct: The class specific pool structure is:
typedef struct LOStruct {
PoolStruct poolStruct; /* generic pool structure */
Format format; /* format for allocated objects */
Shift alignShift;
Sig sig; /* impl.h.misc.sig */
} LOStruct;
.poolstruct.format: This is the format of the objects that are allocated in the
pool.
.poolstruct.alignShift: This is the shift used in alignment computations. It is
SizeLog2(pool->alignment). It can be used on the right of a shift operator (<<
or >>) to convert between a number of bytes and a number of grains.
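For example, converting a size in bytes to a count of grains and back is just
a shift:
  /* Illustrative only. */
  grains = size >> lo->alignShift;            /* bytes to grains */
  size   = (Size)grains << lo->alignShift;    /* grains to bytes */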
.loseg: Every segment is an instance of segment class LOSegClass, a subclass of
GCSegClass, and is an object of type LOSegStruct.
.loseg.purpose: The purpose of the LOSeg structure is to associate the bit
tables used for recording allocation and mark information with the segment.
.loseg.decl: The declaration of the structure is as follows:
typedef struct LOSegStruct {
GCSegStruct gcSegStruct; /* superclass fields must come first */
LO lo; /* owning LO */
BT mark; /* mark bit table */
BT alloc; /* alloc bit table */
Count free; /* number of free grains */
Sig sig; /* impl.h.misc.sig */
} LOSegStruct;
.loseg.sig: The signature for a loseg is 0x519705E9 (SIGLOSEG).
.loseg.lo: The lo field points to the LO structure that owns this segment.
.loseg.bit: Bit Tables (see design.mps.bt) are used to record allocation and
mark information. This is relatively straightforward, but might be inefficient
in terms of space in some circumstances.
.loseg.mark: This is a Bit Table that is used to mark objects during a trace.
Each grain in the segment is associated with 1 bit in this table. When LOFix
(see .fun.fix below) is called the address is converted to a grain within the
segment and the corresponding bit in this table is set.
.loseg.alloc: This is a Bit Table that is used to record which addresses are
allocated. Addresses that are allocated and are not buffered have their
corresponding bit in this table set. If a bit in this table is reset then
either the address is free or is being buffered.
.loseg.diagram: The following diagram is now obsolete. It's also not very
interesting - but I've left the sources in case anyone ever gets around to
updating it. tony 1999-12-16
FUNCTIONS
External
.fun.init:
.fun.destroy:
.fun.buffer-fill:
[explain way in which buffers interact with the alloc table and how it could be
improved]
.fun.buffer-empty:
.fun.condemn:
.fun.fix:
static Res LOFix(Pool pool, ScanState ss, Seg seg, Ref *refIO)
[sketch]
Fix treats references of most ranks much the same. There is one mark table
that records all marks. A reference of rank RankAMBIG is first checked to see
if it is aligned to the pool alignment and discarded if not. The reference is
converted to a grain number within the segment (by subtracting the segment's
base from the reference and then dividing by the grain size). The bit (the one
corresponding to the grain number) is set in the mark table. The exception is a
weak reference (rank RankWEAK): the mark table is checked and the reference
is fixed to 0 if this address has not been marked; otherwise nothing happens.
Note that there is no check that the reference refers to a valid object
boundary (which wouldn't be a valid check in the case of ambiguous references
anyway).
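A sketch of the logic just described; the grain arithmetic uses the
alignShift field from .poolstruct.alignShift, and the exact calls are
assumptions rather than the real LOFix:
  /* Sketch only. */
  if (ss->rank == RankAMBIG && !AddrIsAligned(*refIO, pool->alignment))
    return ResOK;                             /* discard unaligned ambiguous refs */
  i = AddrOffset(SegBase(seg), *refIO) >> lo->alignShift;  /* grain number */
  if (ss->rank == RankWEAK) {
    if (!BTGet(loseg->mark, i))
      *refIO = (Ref)0;                        /* splat unmarked weak reference */
  } else {
    BTSet(loseg->mark, i);                    /* mark the grain */
  }
  return ResOK;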
.fun.reclaim:
static void LOReclaim(Pool pool, Trace trace, Seg seg)
LOReclaim derives the loseg from the seg, and calls loSegReclaim (see
.fun.segreclaim below).
Internal
.fun.segreclaim:
static void loSegReclaim(LOSeg loseg, Trace trace)
[sketch]
For all the contiguous allocated regions in the segment, it locates the
boundaries of all the objects in that region by repeatedly skipping (calling
format->skip) from the beginning of the region (the beginning of the region is
guaranteed to coincide with the beginning of an object). For each object it
examines the bit in the mark bit table that corresponds to the beginning of the
object. If that bit is set then the object has been marked as a result of a
previous call to LOFix, the object is preserved by doing nothing. If that bit
is not set then the object has not been marked and should be reclaimed; the
object is reclaimed by resetting the appropriate range of bits in the segment's
alloc bit table.
[special things happen for buffered segments]
[explain how the marked variable is used to free segments]
ATTACHMENT
"LOGROUP.CWK"
THE DESIGN OF THE MANUAL FIXED SMALL MEMORY POOL CLASS
design.mps.poolmfs
incomplete design
richard 1996-11-07
OVERVIEW:
MFS stands for "Manual Fixed Small". The MFS Pool Class manages objects that
are of a fixed size. It is intended to only manage small objects efficiently.
Storage is recycled manually by the client programmer.
A particular instance of an MFS Pool can manage objects only of a single size,
but different instances can manage objects of different sizes. The size of
object that an instance can manage is declared when the instance is created.
GUARDIAN POOLCLASS
design.mps.poolmrg
incomplete doc
drj 1997-02-03
INTRODUCTION
.readership: Any MPS developer.
.intro: This is the design of the Guardian PoolClass. The Guardian PoolClass
is part of the MPS. The Guardian PoolClass is internal to the MPS (has no
client interface) and is used to implement finalization.
.source: Some of the techniques in paper.dbe93 ("Guardians in a
Generation-Based Garbage Collector") were used in this design. Some analysis
of this design (including various improvements and some more in-depth
justification) is in analysis.mps.poolmrg. That document should be understood
before changing this document.
It is also helpful to look at design.mps.finalize and design.mps.message.
GOALS
.goal.final: The Guardian Pool should support all requirements pertaining to
finalization.
REQUIREMENTS
.req: We have only one requirement pertaining to finalization:
req.dylan.fun.finalization: Support the Dylan language-level implementation of
finalized objects: objects are registered, and are finalized in random order
when they would otherwise have died. Cycles are broken at random places.
There is no guarantee of promptness.
.req.general: However, finalization is a very common piece of functionality
that is provided by (sophisticated) memory managers, so we can expect other
clients to request this sort of functionality.
.anti-req: Is it required that the Guardian Pool return unused segments to the
arena? (PoolMFS does not do this) (PoolMRG will not do this in its initial
implementation)
TERMINOLOGY
.def.mrg: MRG: The Pool Class's identifier will be MRG. This stands for
"Manual Rank Guardian". The pool is manually managed and implements guardians
for references of a particular rank (currently just final).
.def.final.ref: final reference: A reference of rank final (see
design.mps.type.rank).
.def.final.object: finalizable object: An object is finalizable with respect to
a final reference if, since the creation of that reference, there was a point
in time when no references to the object of lower (that is, stronger) rank were
reachable from a root.
.def.final.object.note: Note that this means an object can be finalizable even
if it is now reachable from the root via exact references.
.def.finalize: finalize: To finalize an object is to notify the client that the
object is finalizable. The client is presumed to be interested in this
information (typically it will apply some method to the object).
.def.guardian: guardian: An object allocated in the Guardian Pool. A guardian
contains exactly one final reference, and some fields for the pool's internal
use. Guardians are used to implement a finalization mechanism.
OVERVIEW
.over: The Guardian Pool Class is a PoolClass in the MPS. It is intended to
provide the functionality of "finalization".
.over.internal: The Guardian PoolClass is internal to the MPM, it is not
intended to have a client interface. Clients are expected to access the
functionality provided by this pool (finalization) using a separate MPS
finalization interface (design.mps.finalize).
.over.one-size: The Guardian Pool manages objects of a single certain size,
each object contains a single reference of rank final.
.over.one-size.justify: This is all that is necessary to meet our requirements
for finalization. Whenever an object is registered for finalization, it is
sufficient to create a single reference of rank final to it.
.over.queue: A pool maintains a queue of live guardian objects, called (for
historical reasons) the "entry" queue. .over.queue.free: The pool also
maintains a queue of free guardian objects called the "free" queue.
.over.queue.exit.not: There used to be an "exit" queue, but this is now
historical and there shouldn't be any current references to it.
.over.alloc: When guardians are allocated, they are placed on the entry queue.
Guardians on the entry queue refer to objects that have not yet been shown to
be finalizable (either the object has references of lower rank than final to
it, or the MPS has not yet got round to determining that the object is
finalizable). .over.message.create: When a guardian is discovered to refer to
a finalizable object it is removed from the entry queue and becomes a message
on the space's queue of messages. .over.message.deliver: When the MPS client
receives the message the message system arranges for the message to be
destroyed and the pool reclaims the storage associated with the
guardian/message.
.over.scan: When the pool is scanned at rank final, each reference will be
fixed. If the reference is to an object that was in oldspace and still
unmarked (before the fix), then the object must now be finalizable. In this
case the containing
guardian will be removed from the entry queue and posted as a message.
.over.scan.justify: The scanning process is a crucial step necessary for
implementing finalization. It is the means by which the MPS detects that
objects are finalizable.
.over.message: PoolClassMRG implements a MessageClass (see
design.mps.message). All the messages are of one MessageType. This type is
MessageTypeFinalization. Messages are created when objects are discovered to
be finalizable and destroyed when the MPS client has received the message.
.over.message.justify: Messages provide a means for the MPS to communicate with
its client. Notification of finalization is just such a communication.
Messages allow the MPS to inform the client of finalization events when it is
convenient for the MPS to do so (i.e. not in PageFault context).
.over.manual: Objects in the Guardian Pool are manually managed.
.over.manual.alloc: They are allocated (by ArenaFinalize) when objects are
registered for finalization. .over.manual.free: They are freed when the
associated message is destroyed.
.over.manual.justify: The lifetime of a guardian object is very easy to
determine so manual memory management is appropriate.
PROTOCOLS
Object Registration
.protocol.register: There is a protocol by which objects can be registered for
finalization. This protocol is handled by the arena module on behalf of
finalization; see design.mps.finalize.int.finalize.
Finalizer Execution
.protocol.finalizer: If an object is proven to be finalizable then a message to
this effect will eventually be posted. A client can receive the message,
determine what to do about it, and do it. Typically this would involve calling
the finalization method for the object, and deleting the message. Once the
message is deleted, the object may become recyclable.
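By way of illustration, the sketch below shows how a client might drive this
protocol through a public interface. The function names (mps_finalize,
mps_message_get, and so on) are those of the MPS public message and
finalization interface and are assumptions here, not part of this design; the
finalization action is a hypothetical client hook.

  #include "mps.h"

  /* Hypothetical client hook applied to each finalizable object. */
  extern void my_finalization_action(mps_addr_t obj);

  /* Register an object for finalization (a guardian is allocated by
     ArenaFinalize behind this call). */
  static mps_res_t register_for_finalization(mps_arena_t arena, mps_addr_t obj)
  {
    mps_message_type_enable(arena, mps_message_type_finalization());
    return mps_finalize(arena, &obj);
  }

  /* Receive and act on any pending finalization messages. */
  static void poll_finalization(mps_arena_t arena)
  {
    mps_message_t message;
    while (mps_message_get(&message, arena, mps_message_type_finalization())) {
      mps_addr_t obj;
      mps_message_finalization_ref(&obj, arena, message);
      my_finalization_action(obj);
      mps_message_discard(arena, message);  /* guardian storage can now be freed */
    }
  }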
Setup / Destroy
.protocol.life: An instance of PoolClassMRG is needed in order to support
finalization; it is called the "final" pool and is attached to the arena (see
design.mps.finalize.int.arena.struct). .protocol.life.birth: The final pool is
created lazily by ArenaFinalize. .protocol.life.death: The final pool is
destroyed during ArenaDestroy.
DATA STRUCTURES
.guardian:
The guardian
.guardian.over: A guardian is an object used to manage the references and other
datastructures that are used by the pool in order to keep track of which
objects are registered for finalization, which ones have been finalized, and so
on. .guardian.state: A guardian can be in one of four states:
.guardian.state.enum: The states are Free, Prefinal, Final, PostFinal (referred
to as MRGGuardianFree, etc. in the implementation). .guardian.state.free: The
guardian is free, meaning that it is on the free list for the pool and
available for allocation. .guardian.state.prefinal: The guardian is allocated,
and refers to an object that has not yet been discovered to be finalizable.
.guardian.state.final: The guardian is allocated, and refers to an object that
has been shown to be finalizable; this state corresponds to the existence of a
message. .guardian.state.postfinal: This state is only used briefly and is
entirely internal to the pool; the guardian enters this state just after the
associated message has been destroyed (which happens when the client receives
the message) and will be freed immediately (whereupon it will enter the Free
state). This state is used for checking only (so that MRGFree can check that
only guardians in this state are being freed).
.guardian.life-cycle: Guardians go through the following state life-cycle:
Free -> Prefinal -> Final -> Postfinal -> Free.
.guardian.two-part: A guardian is a structure consisting abstractly of a link
part and a reference part. Concretely, the link part will be a LinkPartStruct,
and the reference part will be a Word. The link part is used by the pool, the
reference part forms the object visible to clients of the pool. The reference
part is the reference of Rank FINAL that refers to objects registered for
finalization and is how the MPS detects finalizable objects.
.guardian.two-part.union: The LinkPartStruct is a discriminated union of a
RingStruct and a MessageStruct. The RingStruct is used when the guardian is
either Free or Prefinal. The MessageStruct is used when the guardian is
Final. Neither part of the union is used when the guardian is in the Postfinal
state.
.guardian.two-part.justify: This may seem a little profligate with space, but
this is okay as we are not required to make finalization extremely space
efficient.
.guardian.parts.separate: The two parts will be stored in separate segments.
.guardian.parts.separate.justify: This is so that the data structures the pool
uses to manage the objects can be separated from the objects themselves. This
avoids the pool having to manipulate data structures that are on shielded
segments (analysis.mps.poolmrg.hazard.shield).
.guardian.assoc: The nth (from the beginning of the segment) ref part in one
segment will correspond with the nth link part in another segment. The
association between the two segments will be managed by the additional fields
in pool-specific segment subclasses (see .mrgseg). .guardian.ref: Guardians
that are either Prefinal or Final are live and have valid references (possibly
NULL) in their ref parts. Guardians that are free are dead and always have
NULL in their ref parts (see .free.overwrite and .scan.free)
.guardian.ref.free: When freeing an object, it is a pointer to the reference
part that will be passed (internally in the pool).
.guardian.init: Guardians are initialized when the pool is grown
(.alloc.grow). The initial state has the ref part NULL and the link part is
attached to the free ring. Freeing an object returns a guardian to its initial
state.
.poolstruct:
The pool structure, MRGStruct, will have:
.poolstruct.entry: the head of the entry queue.
.poolstruct.exit: the head of the exit queue.
.poolstruct.free: a free list.
.poolstruct.rings: The entry queue, the exit queue, and the free list will all
use Rings. Each Ring will be maintained using the link part of the guardian.
.poolstruct.rings.justify: This is because Rings are convenient to use and are
well tested. It is possible to implement all three lists using a singly linked
list, but the saving is certainly not worth making at this stage.
.poolstruct.refring: a ring of "ref" segments in use for links or messages (see
.mrgseg.ref.mrgring below).
.poolstruct.extend: a precalculated extendby field (see .init.extend). This
value is used to determine how large a segment should be requested from the
Arena for the reference part segment when the pool needs to grow (see
.alloc.grow.size). .poolstruct.extend.justify: Calculating a reasonable value
for this once and remembering it simplifies the allocation (.alloc.grow).
.poolstruct.init: poolstructs are initialized once for each pool instance by
MRGInit (.init). The initial state has all the rings initialized to singleton
rings, and the extendBy field initialized to some value (see .init.extend).
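A sketch of the pool structure described above; the field names are
illustrative, and the historical exit queue (.poolstruct.exit) is omitted in
line with .over.queue.exit.not:

  typedef struct MRGStruct {
    PoolStruct poolStruct;  /* generic pool fields */
    RingStruct entryRing;   /* head of the entry queue (.poolstruct.entry) */
    RingStruct freeRing;    /* free list of guardians (.poolstruct.free) */
    RingStruct refRing;     /* ring of ref-part segments (.poolstruct.refring) */
    Size extendBy;          /* size of new ref segments (.poolstruct.extend) */
    Sig sig;                /* checked by MRGCheck */
  } MRGStruct, *MRG;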
.mrgseg:
The pool defines two segment subclasses: MRGRefSegClass and MRGLinkSegClass.
Segments of the former class will be used to store the ref parts of guardians,
segments of the latter will be used to store the link parts of guardians (see
.guardian.two-part). Segments are always allocated in pairs, with one of each
class (by function MRGSegPairCreate). Each segment contains a link to its pair.
.mrgseg.ref: MRGRefSegClass is a subclass of GCSegClass. Instances are of type
MRGRefSeg, and contain:
.mrgseg.ref.mrgring: a field for the ring of ref part segments in the pool.
.mrgseg.ref.linkseg: a pointer to the paired link segment.
.mrgseg.ref.grey: a set describing the greyness of the segment for each trace.
.mrgseg.ref.init: A segment is created and initialized once every time the pool
is grown (.alloc.grow). The initial state has the segment ring node
initialized and attached to the pool's segment ring, the linkseg field points
to the relevant link segment, the grey field is initialized such that the
segment is not grey for all traces.
.mrgseg.link: MRGLinkSegClass is a subclass of SegClass. Instances are of type
MRGLinkSeg, and contain:
.mrgseg.link.refseg: a pointer to the paired ref segment. This may be NULL
during initialization, while the pairing is being established.
.mrgseg.link.init: The initial state has the linkseg field pointing to the
relevant ref segment.
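Sketches of the two segment subclass instance structures (again, the names and
exact layout are illustrative rather than authoritative):

  typedef struct MRGRefSegStruct *MRGRefSeg;
  typedef struct MRGLinkSegStruct *MRGLinkSeg;

  typedef struct MRGRefSegStruct {
    GCSegStruct gcSegStruct;  /* superclass fields (GCSegClass) */
    RingStruct mrgRing;       /* node on the pool's ring of ref segments */
    MRGLinkSeg linkSeg;       /* paired link segment (.mrgseg.ref.linkseg) */
    TraceSet grey;            /* greyness per trace (.mrgseg.ref.grey) */
    Sig sig;
  } MRGRefSegStruct;

  typedef struct MRGLinkSegStruct {
    SegStruct segStruct;      /* superclass fields (SegClass) */
    MRGRefSeg refSeg;         /* paired ref segment; NULL while pairing is set up */
    Sig sig;
  } MRGLinkSegStruct;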
FUNCTIONS
.check: MRGCheck
Will check the signatures, the class, and each field of the MRGStruct. Each
field is checked as being appropriate for its type. .check.justify: There are
no non-trivial invariants that can be easily checked.
.alloc: [these apply to MRGRegister now. - Pekka 1997-09-19]
.alloc.grow: If the free list is empty then two new segments will be allocated
and the free list filled up from them (note that the reference fields of the
new guardians will need to be overwritten with NULL, see .free.overwrite)
.alloc.grow.size: The size of the reference part segment will be the pool's
extendBy (.poolstruct.extend) value. The link part segment will be whatever
size is necessary to accommodate N link parts, where N is the number of
reference parts that fit in the reference part segment.
.alloc.error: If any of the requests for more resource (there are two; one for
each of two segments) fail then the successful requests will be retracted and
the result code from the failing request will be returned.
.alloc.pop: MRGAlloc will pop a ring node off the free list, and add it to the
entry queue.
.free: MRGFree
MRGFree will remove the guardian from the message queue and add it to the
free list. .free.push: The guardian will simply be added to the front of the
free list (i.e. no keeping the free list in address order or anything like
that). .free.inadequate: No attempt will be made to return unused free
segments to the Arena (although see analysis.mps.poolmrg.improve.free.* for
suggestions).
.free.overwrite:
MRGFree also writes over the reference with NULL. .free.overwrite.justify:
This is so that when the segment is subsequently scanned (.scan.free), the
reference that used to be in the object is not accidentally fixed.
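The following sketch shows the segment sizing used when the pool grows
(.alloc.grow.size) and the pop and push steps (.alloc.pop, .free.push,
.free.overwrite). All names are illustrative and the code that pairs and
initializes the segments is elided:

  /* Size of the link segment needed to shadow a ref segment of refSegSize bytes. */
  static Size mrgLinkSegSize(Arena arena, Size refSegSize)
  {
    Count guardians = refSegSize / sizeof(Word);  /* one ref part per guardian */
    return SizeAlignUp(guardians * sizeof(LinkStruct), ArenaAlign(arena));
  }

  /* .alloc.pop: move a guardian from the free ring onto the entry queue. */
  static void mrgGuardianAlloc(MRG mrg, Ring linkNode)
  {
    RingRemove(linkNode);
    RingAppend(&mrg->entryRing, linkNode);
  }

  /* .free.push, .free.overwrite: overwrite the reference with NULL and push
     the guardian onto the front of the free list. */
  static void mrgGuardianFree(MRG mrg, Word *refPart, Ring linkNode)
  {
    *refPart = (Word)0;  /* NULL, so a later scan does not fix a stale reference */
    RingInsert(&mrg->freeRing, linkNode);
  }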
.init: MRGInit
Has to initialize the two queues, the free ring, the ref ring, and the
extendBy field. .init.extend: The extendBy field is initialized to one
ArenaAlign() (usually a page). .init.extend.justify: This is adequate as the
pool is not expected to grow very quickly.
.finish: MRGFinish
Iterates over all the segments, returning them to the Arena.
.scan: MRGScan
.scan.trivial: Scan will do nothing (i.e. return immediately) if the tracing
rank is anything other than final. [This optimization is missing.
impl.c.trace.scan.conservative is not a problem because there are no faults on
these segs, because there are no references into them. But that's why
TraceScan can't do it. - Pekka 1997-09-19] .scan.trivial.justify: If the
rank is lower than final then scanning is detrimental, it will only delay
finalization. If the rank is higher than final there is nothing to do, the
pool only contains final references.
.scan.guardians: Scan will iterate over all guardians in the segment. Every
guardian's reference will be fixed (.scan.free: note that guardians that are on
the free list have NULL in their reference part). .scan.wasold: If the object
referred to had not been fixed previously (i.e. was unmarked) then the object
is not referenced by a reference of a lower rank (than FINAL) and hence is
finalizable. .scan.finalize: The guardian will be finalized. This entails
moving the guardian from state Prefinal to Final; it is removed from the entry
queue and initialized as a message and posted on the arena's message queue.
.scan.finalize.idempotent: In fact this will only happen if the guardian has
not already been finalized (which is determined by examining the state of the
guardian).
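A sketch of the per-guardian scan step; the helpers marked "hypothetical"
stand in for the real fix protocol and for the bookkeeping that locates a
guardian's link part and changes its state:

  static Res mrgRefSegScan(ScanState ss, Pool pool, Seg seg)
  {
    Word *refPart = (Word *)SegBase(seg);
    Count n = SegSize(seg) / sizeof(Word);   /* guardians in this segment */
    Count i;

    for (i = 0; i < n; ++i) {
      Bool wasOld;
      /* Free guardians hold NULL here, so fixing them is harmless (.scan.free). */
      wasOld = FixFinalRef(ss, &refPart[i]);            /* hypothetical helper */
      if (wasOld && guardianIsPrefinal(pool, seg, i))   /* hypothetical helper */
        guardianFinalize(pool, seg, i);  /* Prefinal -> Final; post a message */
    }
    return ResOK;
  }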
.scan.unordered: Because scanning occurs a segment at a time, the order in
which objects are finalized is "random" (it cannot be predicted by considering
only the references between objects registered for finalization). See
analysis.mps.poolmrg.improve.semantics for how this can be improved.
.scan.unordered.justify: Unordered finalization is all that is required.
(see analysis.mps.poolmrg.improve.scan.nomove for a suggested improvement that
avoids redundant unlinking and relinking).
.describe: MRGDescribe
Will print out the usual blurb.
Will iterate along each of the entry and exit queues and print out the
guardians in each. The location of the guardian and the value of the reference
in it will be printed out.
.functions.unused: BufferInit, BufferFill, BufferEmpty, BufferFinish,
TraceBegin, Condemn, Fix, Reclaim, TraceEnd, Benefit.
All of these will be unused.
.functions.trivial: The Grey method of the pool class will be PoolTrivGrey,
this pool has no further bookkeeping to perform for grey segments.
TRANSGRESSIONS
.trans.no-finish: The MRG pool does not trouble itself to tidy up its internal
rings properly when being destroyed.
.trans.free-seg: No attempt is made to release free segments to the arena. A
suggested strategy for this is as follows:
- Add a count of free guardians to each segment, and maintain it in
appropriate places.
- Add a free segment ring to the pool.
- In MRGRefSegScan, if the segment is entirely free, don't scan it, but
instead detach its links from the free ring, and move the segment to the free
segment ring.
- At some appropriate point (such as the end of MRGAlloc), destroy free
segments.
- In MRGAlloc, if there are no free guardians, check the free segment ring
before creating a new pair of segments.
Note that this algorithm would give some slight measure of segment hysteresis.
It is not the place of the pool to support general segment hysteresis.
FUTURE
.future.array: In future, for speed or simplicity, this pool could be rewritten
to use an array. See mail.gavinm.1997-09-04.13-08(0).
TESTS
.test: [This section is utterly out of date. -- Pekka 1997-09-19] The test
impl.c.finalcv is similar to the weakness test (see design.mps.weakness,
impl.c.weakcv [???]).
Functionality
This is the functionality to be tested:
.fun.alloc: Can allocate objects.
.fun.free: Can free objects that were allocated.
.prot.write: Can write a reference into an allocated object.
.prot.read: Can read the reference from an allocated object.
.promise.faithful: A reference stored in an allocated object will continue to
refer to the same object.
.promise.live: A reference stored in an allocated object will preserve the
object referred to.
.promise.unreachable: Any objects referred to in finalization messages are
not (at the time of reading the message) reachable via a chain of ambiguous or
exact references. (we will not be able to test this at first as there is no
messaging interface)
.promise.try: The Pool will make a "good faith" effort to finalize objects
that are not reachable via a chain of ambiguous or exact references.
Attributes
The following attributes will be tested:
.attr.none: There are no attribute requirements.
Implementation
[New test]
The new test will simply allocate a number of objects in the AMC pool and finalize
each one, throwing away the reference to the objects. Churn.
.test.mpm: The test will use the MPM interface (impl.h.mpm).
.test.mpm.justify: This is because it is not intended to provide an MPS
interface to this pool directly, and the MPS interface to finalization has not
been written yet (impl.h.mps). .test.mpm.change: Later on it may use the MPS
interface, in which case, where the following text refers to allocating objects
in the MRG pool it will need adjusting.
.test.two-pools: The test will use two pools, an AMC pool, and an MRG pool.
.test.alloc: A number of objects will be allocated in the MRG pool.
.test.free: They will then be freed. This will test .fun.alloc and .fun.free,
although not very much.
.test.rw.a: An object, 'A', will be allocated in the AMC pool, a reference to
it will be kept in a root. .test.rw.alloc: A number of objects will be
allocated in the MRG pool. .test.rw.write: A reference to A will be written
into each object. .test.rw.read: The reference in each object will be read and
checked to see if it refers to A. .test.rw.free: All the objects will be
freed. .test.rw.drop: The reference to A will be dropped. This will test
.prot.write and .prot.read.
.test.promise.fl.alloc: A number of objects will be allocated in the AMC
pool. .test.promise.fl.tag: Each object will be tagged uniquely.
.test.promise.fl.refer: a reference to it will be stored in an object allocated
in the MRG pool. .test.promise.fl.churn: A large amount of garbage will be
allocated in the AMC pool. Regularly, whilst this garbage is being allocated,
a check will be performed that all the objects allocated in the MRG pool refer
to valid objects and that they still refer to the same objects. All objects
from the MRG pool will then be freed (thus dropping all references to the AMC
objects). This will test .promise.faithful and .promise.live.
.test.promise.ut.not: The following part of the test has not been implemented.
This is because the messaging system has not yet been implemented.
.test.promise.ut.alloc: A number of objects will be allocated in the AMC
pool. .test.promise.ut.refer: Each object will be referred to by a root and
also referred to by an object allocated in the MRG pool.
.test.promise.ut.drop: References to a random selection of the objects from
the AMC pool will be
deleted from the root. .test.promise.ut.churn: A large amount of garbage will
be allocated in the AMC pool. .test.promise.ut.message: The message interface
will be used to receive finalization messages. .test.promise.ut.final.check:
For each finalization message received it will check that the object referenced
in the message is not referred to in the root. .test.promise.ut.nofinal.check:
After some amount of garbage has been allocated it will check to see if any
objects are not in the root and haven't been finalized. This will test
.promise.unreachable and .promise.try.
NOTES
.access.inadequate: PoolAccess will scan segments at rank exact. Really they
should be scanned at whatever the minimum rank of all grey segments is (the
trace rank phase); however, there is no way to find this out. As a consequence
we will sometimes scan pages at rank exact when the pages could have been
scanned at rank final. This means that finalization of some objects may
sometimes get delayed.

@ -0,0 +1,12 @@
THE DESIGN OF THE MANUAL VARIABLE MEMORY POOL CLASS
design.mps.poolmv
incomplete design
richard 1995-08-25
IMPLEMENTATION:
.lost: It is possible for MV to "lose" memory when freeing an object. This
happens when an extra block descriptor is needed (i.e. the interior of a block
is being freed) and the call to allocate that descriptor fails.

@ -0,0 +1,760 @@
THE DESIGN OF A NEW MANUAL-VARIABLE MEMORY POOL CLASS
design.mps.poolmv2
draft design
P T Withington 1998-02-13
INTRODUCTION:
This is a second-generation design for a pool that manually manages
variable-sized objects. It is intended as a replacement for poolmv (except in
its control pool role) and poolepdl, and it is intended to satisfy the
requirements of the Dylan "misc" pool and the product malloc/new drop-in
replacement.
[This form should include these fields, rather than me having to create them
"by hand"]
.readership: MM developers
.source: req.dylan(6), req.epcore(16), req.product(2)
.background: design.mps.poolmv(0), design.mps.poolepdl(0),
design.product.soft.drop(0), paper.wil95(1), paper.vo96(0), paper.grun92(1),
paper.beck82(0), mail.ptw.1998-02-25.22-18(0)
.hist.-1: Initial email discussion mail.ptw.1998-02-04.21-27(0), ff.
.hist.0: Draft created 1998-02-13 by P. T. Withington from email RFC
mail.ptw.1998-02-12.03-36, ff.
.hist.1: Revised 1998-04-01 in response to email RFC
mail.ptw.1998-03-23.20-43(0), ff.
.hist.2: Revised 1998-04-15 in response to email RFC
mail.ptw.1998-04-13.21-40(0), ff.
.hist.3: Erroneously incremented version number
.hist.4: Revised 1998-05-06 in response to review review.design.mps.poolmv2.2
(0)
DEFINITIONS
.def.alignment: Alignment is a constraint on an object's address, typically to
be a power of 2 (see also, glossary.alignment )
.def.bit-map: A bitmap is a boolean-valued vector (see also, glossary.bitmap ).
.def.block: A block is a contiguous extent of memory. In this document, block
is used to mean a contiguous extent of memory managed by the pool for the pool
client, typically a subset of a segment (compare with .def.segment).
.def.cartesian-tree: A cartesian tree is a binary tree ordered by two keys
(paper.stephenson83(0)).
.def.crossing-map: A mechanism that supports finding the start of an object
from any address within the object, typically only required on untagged
architectures (see also, glossary.crossing.map ).
.def.footer: A block of descriptive information describing and immediately
following another block of memory (see also .def.header).
.def.fragmentation: Fragmented memory is memory reserved to the program but not
usable by the program because of the arrangement of memory already in use (see
also, glossary.fragmentation ).
.def.header: A block of descriptive information describing and immediately
preceding another block of memory (see also, glossary.in-band.header ).
.def.in-band: From "in band signalling", when descriptive information about a
data structure is stored in the data structure itself (see also,
glossary.in-band.header ).
.def.out-of-band: When descriptive information about a data structure is stored
separately from the structure itself (see also, glossary.out-of-band.header ).
.def.refcount: A refcount is a count of the number of users of an object (see
also, glossary.reference.count ).
.def.segment: A segment is a contiguous extent of memory. In this document,
segment is used to mean a contiguous extent of memory managed by the MPS arena
(design.mps.arena(1)) and subdivided by the pool to provide blocks (see
.def.block) to its clients.
.def.splay-tree: A splay tree is a self-adjusting binary tree (paper.st85(0),
paper.sleator96(0)).
.def.splinter: A splinter is a fragment of memory that is too small to be
useful (see also, glossary.splinter )
.def.subblock: A subblock is a contiguous extent of memory. In this document,
subblock is used to mean a contiguous extent of memory managed by the client for
its own use, typically a subset of a block (compare with .def.block).
ABBREVIATIONS
.abbr.abq: ABQ = Available Block Queue
.abbr.ap: AP = Allocation Point
.abbr.cbs: CBS = Coalescing Block Structure
.abbr.mps: MPS = Memory Pool System
.abbr.mv: MV = Manual-Variable
.abbr.ps: PS = PostScript
OVERVIEW:
mv2 is intended to satisfy the requirements of the clients that need
manual-variable pools, improving on the performance of the existing
manual-variable pool implementations, and reducing the duplication of code that
currently exists. The expected clients of mv2 are: Dylan (currently for its
misc pool), EP (particularly the dl pool, but all pools other than the PS
object pool), and Product (initially the malloc/new pool, but also other manual
pool classes).
REQUIREMENTS:
.req.cat: Requirements are categorized per guide.req(2).
.req.risk: req.epcore(16) is known to be obsolete, but the revised document has
not yet been accepted.
Critical Requirements
.req.fun.man-var: The pool class must support manual allocation and freeing of
variable-sized blocks (source: req.dylan.fun.misc.alloc,
req.epcore.fun.{dl,gen,tmp,stat,cache,trap}.{alloc,free},
req.product.fun.{malloc,new,man.man}).
.non-req.fun.gc: There is not a requirement that the pool class support
formatted objects, scanning, or collection objects; but it should not be
arbitrarily precluded.
.req.fun.align: The pool class must support aligned allocations to
client-specified alignments. An individual instance need only support a single
alignment; multiple instances may be used to support more than one alignment
(source: req.epcore.attr.align).
.req.fun.reallocate: The pool class must support resizing of allocated blocks
(source req.epcore.fun.dl.promise.free, req.product.dc.env.{ansi-c,cpp}).
.non-req.fun.reallocate.in-place: There is not a requirement that blocks must be
resized in place (where possible); but it seems like a good idea.
.req.fun.thread: Each instance of the pool class must support multiple threads
of allocation (source req.epcore.fun.dl.multi, req.product.dc.env.{ansi-c,cpp}).
.req.attr.performance: The pool class must meet or exceed performance of
"competitive" allocators (source: rec.epcore.attr.{run-time,tp},
req.product.attr.{mkt.eval, perform}). [Dylan does not seem to have any
requirement that storage be allocated with a particular response time or
throughput, just so long as we don't block for too long. Clearly there is a
missing requirement.]
.req.attr.performance.time: By inference, the time overhead must be
competitive.
.req.attr.performance.space: By inference, the space overhead must be
competitive.
.req.attr.reliability: The pool class must have "rock-solid reliability"
(source: req.dylan.attr.rel.mtbf, req.epcore.attr.rel, req.product.attr.rel).
.req.fun.range: The pool class must be able to manage blocks ranging in size
from 1 byte to all of addressable memory
(req.epcore.attr.{dl,gen,tmp,stat,cache,trap}.obj.{min,max}). The range
requirement may be satisfied by multiple instances each managing a particular
client-specified subrange of sizes. [Dylan has requirements
req.dylan.attr.{capacity,obj.max}, but no requirement that such objects reside
in a manual pool.]
.req.fun.debug: The pool class must support debugging erroneous usage by client
programs (source: req.epcore.fun.{dc.variety, debug.support},
req.product.attr.{mkt.eval,perform}). Debugging is permitted to incur
additional overhead.
.req.fun.debug.boundaries: The pool class must support checking for accesses
outside the boundaries of live objects.
.req.fun.debug.log: The pool class must support logging of all allocations and
deallocations.
.req.fun.debug.enumerate: The pool class must support examining all allocated
objects.
.req.fun.debug.free: The pool class must support detecting incorrect,
overlapping, and double frees.
.req.fun.tolerant: The pool class must support tolerance of erroneous usage
(source req.product.attr.use.level.1).
Essential Requirements
.req.fun.profile: The pool class should support memory usage profiling (source:
req.product.attr.{mkt.eval, perform}).
.req.attr.flex: The pool class should be flexible so that it can be tuned to
specific allocation and freeing patterns (source:
req.product.attr.flex,req.epcore.attr.{dl,cache,trap}.typ). The flexibility
requirement may be satisfied by multiple instances each optimizing a specific
pattern.
.req.attr.adapt: The pool class should be adaptive so that it can accommodate
changing allocation and freeing patterns (source:
req.epcore.fun.{tmp,stat}.policy, req.product.attr.{mkt.eval,perform}).
Nice Requirements
.req.fun.suballocate: The pool class may support freeing of any aligned,
contiguous subset of an allocated block (source req.epcore.fun.dl.free.any,
req.product.attr.{mkt.eval,perform}).
ARCHITECTURE:
.arch.overview: The pool has several layers: client allocation is by Allocation
Points (APs). .arch.overview.ap: APs acquire storage from the pool
available-block queue (ABQ). .arch.overview.abq: The ABQ holds blocks of a
minimum configurable size: "reuse size". .arch.overview.storage: The ABQ
acquires storage from the arena or from the coalescing-block structure (CBS).
.arch.overview.storage.contiguous: The arena storage is requested to be
contiguous to maximize opportunities for coalescing (Loci will be used when
available). .arch.overview.cbs: The CBS holds blocks freed by the client until,
through coalescing, they have reached the reuse size, at which point they are
made available on the ABQ.
.arch.ap: The pool will use allocation points as the allocation interface to
the client. .arch.ap.two-phase: Allocation points will request blocks from the
pool and suballocate those blocks (using the existing AP, compare and
increment, 2-phase mechanism) to satisfy client requests. .arch.ap.fill: The
pool will have a configurable "fill size" that will be the preferred size block
used to fill the allocation point. .arch.ap.fill.size: The fill size should be
chosen to amortize the cost of refill over a number of typical reserve/commit
operations, but not so large as to exceed the typical object population of the
pool. .arch.ap.no-fit: When an allocation does not fit in the remaining space
of the allocation point, there may be a remaining fragment.
.arch.ap.no-fit.sawdust: If the fragment is below a configurable threshold
(minimum size), it will be left unused (but returned to the CBS so it will be
reclaimed when adjacent objects are freed); .arch.ap.no-fit.splinter:
otherwise, the remaining fragment will be (effectively) returned to the head of
the available-block queue, so that it will be used as soon as possible (i.e.,
by objects of similar birthdate). .arch.ap.no-fit.oversize: If the requested
allocation exceeds the fill size it is treated exceptionally (this may indicate
the client has either misconfigured or misused the pool and should either
change the pool configuration or create a separate pool for these exceptional
objects for best performance). .arch.ap.no-fit.oversize.policy: Oversize blocks
are assumed to have exceptional lifetimes, hence are allocated to one side and
do not participate in the normal storage recycling of the pool.
.arch.ap.refill.overhead: If reuse size is small, or becomes small due to
.arch.adapt, all allocations will effectively be treated exceptionally (the AP
will trip and an oldest-fit block will be chosen on each allocation). This mode
will be within a constant factor in overhead of an unbuffered pool.
.arch.abq: The available block queue holds blocks that have coalesced
sufficiently to reach reuse size. .arch.abq.reuse.size: A multiple of the
quantum of virtual memory is used as the reuse size (.anal.policy.size).
.arch.abq.fifo: It is a FIFO queue (recently coalesced blocks go to the tail of
the queue, blocks are taken from the head of the queue for reuse).
.arch.abq.delay-reuse: By thus delaying reuse, coalescing opportunities are
greater. .arch.abq.high-water: It has a configurable high water mark, which
when reached will cause blocks at the head of the queue to be returned to the
arena, rather than reused. .arch.abq.return: When the MPS supports it, the pool
will be able to return free blocks from the ABQ to the arena on demand.
.arch.abq.return.segment: .arch.abq.return can be guaranteed to be able to return a
segment by setting reuse size to twice the size of the segments the pool
requests from the arena.
.arch.cbs: The coalescing block structure holds blocks that have been freed by
the client. .arch.cbs.optimize: The data structure is optimized for coalescing.
.arch.cbs.abq: When a block reaches reuse size, it is added to the ABQ.
.arch.cbs.data-structure: The data structures are organized so that a block can
be on both the CBS and ABQ simultaneously to permit additional coalescing, up
until the time the block is removed from the ABQ and assigned to an AP.
.arch.fragmentation.internal: Internal fragmentation results from objects not
being allowed to cross segment boundaries. The pool will request large
segments from the arena to minimize this internal fragmentation.
.arch.modular: The architecture will be modular, to allow building variations
on the pool by assembling different parts. .arch.modular.example: For example,
it should be possible to build pools with any of the freelist mechanisms, with
in-band or out-of-band storage (where applicable), that do or do not support
derived object descriptions, etc.
.arch.modular.initial: The initial architecture will use
.sol.mech.free-list.splay-tree for the CBS, .sol.mech.storage.out-of-band,
.sol.mech.desc.derived, and .sol.mech.allocate.buffer.
.arch.segregate: The architecture will support segregated allocation through
the use of multiple allocation points. The client will choose the appropriate
allocation point either at run time, or when possible, at compile time.
.arch.segregate.initial: The initial architecture will segregate allocations
into two classes: large and small. This will be implemented by creating two
pools with different parameters.
.arch.segregate.initial.choice: The initial architecture will provide glue code
to choose which pool to allocate from at run time. If possible this glue code
will be written in a way that a good compiler can optimize the selection of
pool at compile time. Eventually this glue code should be subsumed by the
client or generated automatically by a tool.
.arch.debug: Debugging features such as tags, fenceposts, types, creators will
be implemented in a layer above the pool and APs. A generic pool debugging
interface will be developed to support debugging in this outer layer.
.arch.debug.initial: The initial architecture will have counters for
objects/bytes allocated/freed and support for detecting overlapping frees.
.arch.dependency.loci: The architecture depends on the arena being able to
efficiently provide segments of varying sizes without excessive fragmentation.
The locus mechanism should satisfy this dependency. (See .anal.strategy.risk)
.arch.dependency.mfs: The architecture internal data structures depend on
efficient manual management of small, fixed-sized objects (2 different sizes).
The MFS pool should satisfy this dependency.
.arch.contingency: Since the strategy we propose is new, it may not work.
.arch.contingency.pathological: In particular, pathological allocation patterns
could result in fragmentation such that no blocks recycle from the CBS to ABQ.
.arch.contingency.fallback: As a fallback, there will be a pool creation
parameter for a high water mark for the CBS.
.arch.contingency.fragmentation-limit: When the free space in the CBS as a
percentage of all the memory managed by the pool (a measure of fragmentation)
reaches that high water mark, the CBS will be searched oldest-fit before
requesting additional segments from the arena. .arch.contingency.alternative:
We also plan to implement .sol.mech.free-list.cartesian-tree as an alternative CBS,
which would permit more efficient searching of the CBS.
.arch.parameters: The architecture supports several parameters so that multiple
pools may be instantiated and tuned to support different object cohorts. The
important parameters are: reuse size, minimum size, fill size, ABQ high water
mark, CBS fragmentation limit (see .arch.contingency.fragmentation-limit).
.arch.parameters.client-visible: The client-visible parameters of the pool are
the minimum object size, the mean object size, the maximum object size, the
reserve depth and fragmentation limit. The minimum object size determines when
a splinter is kept on the head of the ABQ (.arch.ap.no-fit.splinter). The
maximum object size determines the fill size (.arch.ap.fill.size) and hence
when a block is allocated exceptionally (.arch.ap.no-fit.oversize). The mean
object size is the most likely object size. The reserve depth is a measure of
the hysteresis of the object population. The mean object size, reserve depth
and, maximum object size are used to determine the size of the ABQ
(.arch.abq.high-water). The fragmentation limit is used to determine when
contingency mode is used to satisfy an allocation request (.arch.contingency).
.arch.adapt: We believe that an important adaptation to explore is tying the
reuse size inversely to the fragmentation (as measured in
.arch.contingency.fragmentation-limit). .arch.adapt.reuse: By setting reuse
size low when fragmentation is high, smaller blocks will be available for
reuse, so fragmentation should diminish. .arch.adapt.overhead: This will result
in higher overhead as the AP will need to be refilled more often, so reuse size
should be raised again as fragmentation diminishes. .arch.adapt.oldest-fit: In
the limit, if reuse size goes to zero, the pool will implement an "oldest-fit"
policy: the oldest free block of sufficient size will be used for each
allocation.
.arch.adapt.risk: This adaptation is an experimental policy and should not be
delivered to clients until thoroughly tested.
ANALYSIS:
.anal.discard: We have discarded many traditional solutions based on experience
and analysis in paper.wil95(1). In particular, managing the free list as a
linear list arranged by address or size and basing policy on searching such a
linear list in a particular direction, from a particular starting point, using
fit and/or immediacy as criteria. We believe that none of these solutions is
derived from considering the root of the problem to be solved (as described in
.strategy), although their behavior as analyzed by Wilson gives several
insights.
.anal.strategy: For any program to run in the minimum required memory (with
minimal overhead -- we discard solutions such as compression for now),
fragmentation must be eliminated. To eliminate fragmentation, simply place
blocks in memory so that they die "in order" and can be immediately coalesced.
This ideal is not achievable, but we believe we can find object attributes that
correlate with deathtime and exploit them to approximate the ideal. Initially
we believe birth time and type (as approximated by size) will be useful
attributes to explore.
.anal.strategy.perform: To meet .req.attr.performance, the implementation of
.sol.strategy must be competitive in both time and space.
.anal.strategy.risk: The current MPS segment substrate can cause internal
fragmentation which an individual pool can do nothing about. We expect that
request.epcore.170193.sugg.loci will be implemented to remove this risk.
.anal.policy: Deferred coalescing, when taken to the extreme will not minimize
the memory consumption of a program, as no memory would ever be reused. Eager
reuse appears to lead to more fragmentation, whereas delayed reuse appears to
reduce fragmentation (paper.wil95(1)). The systems studied by Wilson did not
directly address deferring reuse. Our proposed policy is to reuse blocks when
they reach a (configurable) size. We believe that this policy along with the
policy of segregating allocations by death time, will greatly reduce
fragmentation. .anal.policy.risk: This policy could lead to pathological behavior
if allocations cannot be successfully segregated.
.anal.policy.allocate.segregate: This policy has some similarities to
CustomAlloc (paper.grun92(1)). CustomAlloc segregates objects by size classes,
and then within those classes chooses a different allocator depending on
whether that size class has a stable or unstable population. Classes with
stable population recycle storage within the class, whereas classes with
unstable populations return their storage to the general allocation pool for
possible reuse by another class. CustomAlloc, however, requires profiling the
application and tuning the allocator according to those profiles. Although we
intend to support such tuning, we do not want to require it.
.anal.policy.reallocate: For reallocation, .req.fun.suballocate can be used to free
the remainder if a block is made smaller. Doing so will cause the freed block
to obey .sol.policy.allocate [i.e., the freed block will not be treated
specially, it will be subject to the normal policy on reuse]. Copying can be
used if a block is made larger. paper.vo96(0) reports success in
over-allocating a block the first time it is resized larger, presumably because
blocks that are resized once tend to be resized again and over-allocating may
avoid a subsequent copy. If each object that will be reallocated can be given
its own allocation point until its final reallocation, the allocation point can
be used to hold released or spare storage.
.anal.policy.size: We believe that this will take advantage of the underlying
virtual memory system's ability to compact the physical memory footprint of the
program by discarding free fragments that align with the virtual memory
quantum. (In a VM system one can approximate compaction by sparse mapping. If
every other page of a segment is unused, the unused pages can be unmapped,
freeing up physical memory that can be mapped to a new contiguous vm range.)
.anal.mech.freelist: The literature (paper.grun92(1), paper.vo96(0)) indicates
that .sol.mech.free-list.cartesian-tree provides a space-efficient
implementation at some cost in speed. .sol.mech.free-list.splay-tree is faster
but less space-efficient. .sol.mech.free-list.bit-map is unstudied. Many of
the faster
allocators maintain caches of free blocks by size to speed allocation of
"popular" sizes. We intend to initially explore not doing so, as we believe
that policy ultimately leads to fragmentation by mixing objects of varying
death times. Instead we intend to use a free list mechanism to support fast
coalescing, deferring reuse of blocks until a minimum size has been reached.
.anal.mech.allocate.optimize-small: Wilson (paper.wil95(1)) notes that small
blocks typically have short lifetimes and that overall performance is improved
if you optimize the management of small blocks, e.g.,
.sol.mech.allocate.lookup-table for all small blocks. We believe that
.sol.mech.allocate.buffer does exactly that.
.anal.mech.allocate.optimize-new: Wilson (paper.wil95(1)) reports some benefit
from "preserving wilderness", that is, when a block of memory must be requested
from the system to satisfy an allocation, only the minimum amount of that block
is used, the remainder is preserved (effectively by putting it at the tail of
the free list). This mechanism may or may not implement .sol.policy.allocate.
We believe a better mechanism is to choose to preserve or not, based on
.sol.policy.allocate.
IDEAS:
.sol: Many solution ideas for manual management of variable-sized memory blocks
are enumerated by paper.wil95(1). Here we list the most promising, and some of
our own.
Strategy
.sol.strategy: To run a program in the minimal required memory, with minimal
overhead, utilize memory efficiently. Memory becomes unusable when fragmented.
Strategy is to minimize fragmentation. So place blocks where they won't cause
fragmentation later.
.sol.strategy.death: objects that will die together (in time) should be
allocated together (in space); thus they will coalesce, reducing fragmentation.
.sol.strategy.death.birth: assume objects allocated near each other in time
will have similar deathtimes (paper.beck82(0))
.sol.strategy.death.type: assume objects of different type may have different
deathtimes, even if born together
.sol.strategy.death.predict: find and use program features to predict deathtimes
.sol.strategy.reallocate: reallocation implies rebirth, or at least a change in
lifetime
.sol.strategy.debug: as much of the debugging functionality as possible should
be implemented as a generally available MPS utility; the pool will provide
support for debugging that would be expensive or impossible to allocate outside
the pool
Policy
[Policy is an implementable decision procedure, hopefully approximating the
strategy.]
.sol.policy.reuse: defer reusing blocks, to encourage coalescing
.sol.policy.split: when a block is split to satisfy an allocation, use the
remainder as soon as possible
.sol.policy.size: prevent .sol.policy.reuse from consuming all of memory by
choosing a (coalesced) block for reuse when it reaches a minimum size
.sol.policy.size.fixed: use the quantum of virtual memory (e.g., one page) as
minimum size
.sol.policy.size.tune: allow tuning minimum size
.sol.policy.size.adapt: adaptively change minimum size
.sol.policy.allocate: allocate objects with similar birthdate and lifetime
together
.sol.policy.allocate.segregate: segregate allocations by type
.sol.policy.allocate.segregate.size: use size as a substitute for type
.sol.policy.allocate.segregate.tune: permit tuning of segregation
.sol.policy.allocate.segregate.adapt: adaptively segregate allocations
.sol.policy.reallocate: implement reallocation in a central mechanism outside
of the pool, create a generic pool interface in support of same.
.sol.policy.debug: implement a pool debugging interface
.sol.policy.debug.counters: implement debugging counters in the pool that are
queried with a generic interface
.sol.policy.debug.verify: implement debugging error returns on overlapping frees
Mechanism
[Mechanisms are algorithms or data structures used to implement policy.]
.sol.mech.free-list: mechanisms that can be used to describe the free list
.sol.mech.free-list.cartesian-tree: Using address and size as keys supports
fast coalescing of adjacent blocks and fast searching for optimal-sized blocks.
Unfortunately, because the shape of the tree is constrained by the second key,
it can become unbalanced. This data structure is used in the SunOS 4.1 malloc
(paper.grun92(1)).
.sol.mech.free-list.splay-tree: The amortized cost of a splay tree is
competitive with balanced binary trees in the worst case, but can be
significantly better for regular patterns of access because recently-accessed
keys are moved to the root of the tree and hence can be re-accessed quickly.
This data structure is used in the System Vr4 malloc (paper.vo96(0)). (For a
complete analysis of the splay tree algorithm time bounds see paper.st85(0).)
.sol.mech.free-list.bit-map: Using address as an index and fix-sized blocks,
the booleans can represent whether a block is free or not. Adjacent blocks can
be used to construct larger blocks. Efficient algorithms for searching for runs
in a vector are known. This data structure is used in many file system disk
block managers.
.sol.mech.free-list.refcount: A count of the number of allocated but not freed
subblocks of a block can be used to determine when a block is available for
reuse. This is an extremely compact data structure, but does not support
subblock reuse.
.sol.mech.free-list.hybrid: Bitmaps appear suited particularly to managing
small, contiguous blocks. The tree structures appear suited particularly to
managing varying-sized, discontiguous blocks. A refcount can be very efficient
if objects can be placed accurately according to death time. A hybrid mechanism
may offer better performance for a wider range of situations.
.sol.mech.storage: methods that can be used to store the free list description
.sol.mech.storage.in-band: The tree data structures are amenable to being
stored in the free blocks themselves, minimizing the space overhead of
management. To do so imposes a minimum size on free blocks and reduces the
locality of the data structure.
.sol.mech.storage.out-of-band: The bit-map data structure must be stored
separately.
.sol.mech.desc: for an allocated block to be freed, its base and bound must be
known
.sol.mech.desc.derived: Most clients can supply the base of the block. Some
clients can supply the bound.
.sol.mech.desc.in-band: When the bound cannot be supplied, it can be stored as
an in-band "header". If neither the base nor bound can be supplied (e.g., the
client may only have an interior pointer to the block), a header and footer may
be required.
.sol.mech.desc.out-of-band: In un-tagged architectures, it may be necessary to
store the header and footer out-of-band to distinguish them from client data.
Out-of-band storage can improve locality and reliability. Any of the free-list
structures can also be used to describe allocated blocks out-of-band.
.sol.mech.desc.crossing-map: An alternative for untagged architectures is to
store a "crossing map" which records an encoding of the start of objects and
then store the descriptive information in-band.
.sol.mech.allocate: mechanisms that can be used to allocate blocks (these
typically sit on top of a more general free-list manager)
.sol.mech.allocate.lookup-table: Use a table of popular sizes to cache free
blocks of those sizes.
.sol.mech.allocate.buffer: Allocate from contiguous blocks using compare and
increment.
.sol.mech.allocate.optimize-small: Use a combination of techniques to ensure
the time spent managing a block is small relative to the block's lifetime;
assume small blocks typically have short lifetimes.
.sol.mech.allocate.optimize-new: When "virgin" memory is acquired from the
operating system to satisfy a request, try to preserve it (i.e., use only what
is necessary)
.sol.mech.allocate.segregate.size: use size as a substitute for type
.sol.mech.reallocate: use .req.fun.suballocate to return unused memory when a
block shrinks, but differentiate this from an erroneous overlapping free by
using separate interfaces.
IMPLEMENTATION:
The implementation consists of the following separable modules:
Coalescing Block Structure
.impl.c.cbs: The initial implementation will use .sol.mech.free-list.splay-tree
and .sol.mech.storage.out-of-band. For locality, this storage should be managed
as a linked free list of splay nodes suballocated from blocks acquired from a
pool shared by all CBS's. Must support creation and destruction of an empty
tree. Must support search, insert and delete by key of type Addr. Must support
finding left and right neighbors of a failed search for a key. Must support
iterating over the elements of the tree with reasonable efficiency. Must
support storing and retrieving a value of type Size associated with the key.
Standard checking and description should be provided. See design.mps.splay(0)
and design.mps.cbs(0).
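One way of expressing those requirements as a C interface is sketched below;
the actual interface is whatever design.mps.cbs(0) specifies, and all the
names here are illustrative paraphrases only:

  typedef struct CBSStruct *CBS;
  typedef void (*CBSVisitor)(Addr base, Size size, void *closure);

  extern Res  CBSInit(CBS cbs, Pool nodePool);   /* create an empty tree */
  extern void CBSFinish(CBS cbs);                /* destroy it */
  extern Res  CBSInsert(CBS cbs, Addr base, Size size);
  extern Res  CBSDelete(CBS cbs, Addr base);
  extern Bool CBSSearch(CBS cbs, Size *sizeReturn, Addr key);
  /* On a failed search, report the neighbouring blocks on either side. */
  extern Bool CBSNeighbours(CBS cbs, Addr *leftReturn, Addr *rightReturn, Addr key);
  /* Iterate over all blocks, calling visitor(base, size, closure) on each. */
  extern void CBSIterate(CBS cbs, CBSVisitor visitor, void *closure);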
Available Block Queue
.impl.c.abq: The initial implementation will be a queue of fixed size
(determined at pool creation time from the high water mark). Must support
creation and destruction of an empty queue. Must support insertion at the head
or tail of the queue (failing if full), peeking at the head of the queue, and
removal of the head (failing if empty) or any element of the queue (found by a
search). Standard checking and description should be provided.
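A minimal sketch of such a fixed-length queue as a circular buffer
(illustrative; insertion at the head and removal of an arbitrary element are
omitted for brevity):

  typedef struct ABQStruct {
    Count length;      /* capacity, fixed at creation */
    Index head, tail;  /* indices into the element array */
    Count count;       /* number of elements currently queued */
    Addr *elements;    /* block bases, array allocated at creation */
  } ABQStruct, *ABQ;

  /* Insert at the tail; fails (returns FALSE) if the queue is full. */
  static Bool ABQPush(ABQ abq, Addr block)
  {
    if (abq->count == abq->length)
      return FALSE;
    abq->elements[abq->tail] = block;
    abq->tail = (abq->tail + 1) % abq->length;
    ++abq->count;
    return TRUE;
  }

  /* Remove the head; fails (returns FALSE) if the queue is empty. */
  static Bool ABQPop(ABQ abq, Addr *blockReturn)
  {
    if (abq->count == 0)
      return FALSE;
    *blockReturn = abq->elements[abq->head];
    abq->head = (abq->head + 1) % abq->length;
    --abq->count;
    return TRUE;
  }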
Pool Implementation
.impl.c: The initial implementation will use the above modules to implement a
buffered pool. Must support creation and destruction of the pool. Creation
takes parameters: minimum size, mean size, maximum size, reserve depth and
fragmentation limit. Minimum, mean, and maximum size are used to calculate the
internal fill and reuse sizes. Reserve depth and mean size are used to
calculate the ABQ high water mark. Fragmentation limit is used to set the CBS
contingency mode. Must support buffer initialization, filling and emptying.
Must support freeing. Standard checking and description should be provided.
[Eventually, it should support scanning, so it can be used with collected
pools, but no manual pool currently does.]
.impl.c.future: The implementation should not preclude "buffered free"
(mail.ptw.1997-12-05.19-07(0), ff.) being added in the future.
.impl.c.parameters: The pool parameters are calculated as follows from the
input parameters: minimum, mean, and maximum size are taken directly from the
parameters. .impl.c.parameter.fill-size: The fill size is set to the maximum
size times the reciprocal of the fragmentation limit, aligned to the arena
alignment. .impl.c.parameter.reuse-size: The reuse size is set to twice the
fill size (see .arch.abq.return.segment, .impl.c.free.merge.segment).
.impl.c.parameter.abq-limit: The ABQ high-water limit is set to the reserve
depth times the mean size (that is, the queue should hold as many reuse blocks
as would take to cover the population hysteresis if the population consisted
solely of mean-sized blocks, see .arch.abq.high-water).
.impl.c.parameter.avail-limit: The CBS high-water limit is implemented by
comparing the available free space to an "available limit". The available
limit is updated each time a segment is allocated from or returned to the arena
by setting it to the total size of the pool times the fragmentation limit
divided by 100 (see .arch.contingency.fallback).
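As arithmetic, the calculations above come out roughly as in the sketch below;
it assumes the fragmentation limit is supplied as a percentage, and the names
are illustrative:

  static void mvtComputeParameters(Arena arena, Size meanSize, Size maxSize,
                                   Count reserveDepth, Count fragLimitPercent,
                                   Size *fillSizeReturn, Size *reuseSizeReturn,
                                   Size *abqHighWaterReturn)
  {
    Size fillSize = SizeAlignUp(maxSize * 100 / fragLimitPercent,
                                ArenaAlign(arena));
    *fillSizeReturn = fillSize;
    *reuseSizeReturn = 2 * fillSize;               /* .impl.c.parameter.reuse-size */
    *abqHighWaterReturn = reserveDepth * meanSize; /* .impl.c.parameter.abq-limit */
  }

  /* Recomputed whenever a segment is acquired from or returned to the arena
     (.impl.c.parameter.avail-limit):
       availLimit = poolTotalSize * fragLimitPercent / 100; */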
.impl.c.ap.fill: An AP fill request will be handled as follows:
o If the request is larger than fill size, attempt to request a segment from
the arena sufficient to satisfy the request
o Use any previously returned splinter (from .impl.c.ap.empty), if large enough
o Attempt to retrieve a free block from the head of the ABQ (removing it from
ABQ and CBS if found).
o If above fragmentation limit, attempt to find a block on the CBS, using
oldest-fit search
o Attempt to request a segment of fill size from the arena
o Attempt to find a block on the CBS, using oldest-fit search
o Otherwise, fail
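Expressed as C, the fill decision might look like the sketch below; every
helper named in it is hypothetical and merely stands for the corresponding
step above:

  static Res mvtBufferFill(Addr *baseReturn, Addr *limitReturn,
                           Pool pool, Size request)
  {
    /* Oversize requests go straight to the arena (.arch.ap.no-fit.oversize). */
    if (request > FillSize(pool))
      return SegAllocForRequest(baseReturn, limitReturn, pool, request);
    /* Use a previously returned splinter if it is large enough. */
    if (SplinterCovers(pool, request))
      return TakeSplinter(baseReturn, limitReturn, pool);
    /* Take the block at the head of the ABQ (also removing it from the CBS). */
    if (ABQTakeHead(baseReturn, limitReturn, pool))
      return ResOK;
    /* Over the fragmentation limit: oldest-fit search of the CBS. */
    if (OverFragLimit(pool)
        && CBSOldestFit(baseReturn, limitReturn, pool, request))
      return ResOK;
    /* Request a fill-size segment from the arena. */
    if (SegAllocForRequest(baseReturn, limitReturn, pool, FillSize(pool)) == ResOK)
      return ResOK;
    /* Fall back to an oldest-fit search of the CBS. */
    if (CBSOldestFit(baseReturn, limitReturn, pool, request))
      return ResOK;
    return ResRESOURCE;  /* otherwise, fail */
  }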
.impl.c.ap.empty: An AP empty request will be handled as follows:
o If remaining free is less than min size, return it to the CBS
o If the remaining free is larger than any previous splinter, return that
splinter to the CBS and save this one for use by a subsequent fill
o Otherwise return the remaining block to the CBS
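And the corresponding empty decision, with the same caveat that the helpers
are hypothetical names for the steps above:

  static void mvtBufferEmpty(Pool pool, Addr base, Addr limit)
  {
    Size leftover = AddrOffset(base, limit);
    if (leftover < MinSize(pool)) {
      /* Too small to be worth keeping: return it to the CBS. */
      CBSReturn(pool, base, limit);
    } else if (leftover > SplinterSize(pool)) {
      /* Larger than any previously kept splinter: swap them. */
      CBSReturnSplinter(pool);
      SplinterSave(pool, base, limit);
    } else {
      CBSReturn(pool, base, limit);
    }
  }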
.impl.c.free: When blocks are returned to the CBS a search is made for adjacent
blocks that can be merged. If none are found, the block is simply inserted in
the CBS. If
a merge occurs between two blocks on the ABQ, the ABQ must be adjusted to
reflect the merge. .impl.c.free.exception: Exceptional blocks are returned
directly to the arena.
.impl.c.free.merge: If a merge occurs and the merged block is larger than reuse
size:
o If the ABQ is full, remove the block at the head of the ABQ from the ABQ and
CBS and return it to the arena(*)
o Insert the newly merged block at the tail of the ABQ, leaving it on the CBS
for further merging
.impl.c.free.merge.segment: (*) Merged blocks may not align with arena
segments. If necessary, return the interior segments of a block to the arena
and return the splinters to the CBS. .impl.c.free.merge.segment.reuse: If the
reuse size (the size at which blocks recycle from the CBS to the ABQ) is at
least twice the fill size (the size of segments the pool allocates from the
arena), we can guarantee that there will always be a returnable segment in
every ABQ block. .impl.c.free.merge.segment.overflow: If the reuse size is set
smaller (see .arch.adapt), there may not be a returnable segment in an ABQ
block, in which case the ABQ has "overflowed". Whenever this occurs, the ABQ
will be refilled by searching the CBS for dropped reusable blocks when needed.
.impl.c.free.merge.segment.risk: The current segment structure does not really
support what we would like to do. Loci should do better: support reserving
contiguous address space and mapping/unmapping any portion of that address
space.
.impl.c.free.merge.alternative: Alternatively, if the MPS segment substrate
permitted mapping/unmapping of pages, the pool could use very large segments
and map/unmap pages as needed.
AP Dispatch
.impl.c.multiap: The initial implementation will be a glue layer that selects
among several AP's for allocation according to the predicted deathtime (as
approximated by size) of the requested allocation. Each AP will be filled from
a pool instance tuned to the range of object sizes expected to be allocated
from that AP. [For bonus points provide an interface that creates a batch of
pools and AP's according to some set of expected object sizes. Eventually
expand to understand object lifetimes and general lifetime prediction keys.]
.impl.c.multiap.sample-code: This glue code is not properly part of the pool or
MPS interface. It is a layer on top of the MPS interface, intended as sample
code for unsophisticated clients. Sophisticated clients will likely want to
choose among multiple AP's more directly.
TESTING:
.test.component: Components .impl.c.splay, .impl.c.cbs, and .impl.c.abq will be
subjected to individual component tests to verify their functionality.
.test.regression: All tests applied to poolmv (design.mps.poolmv(0)) and
poolepdl (design.mps.poolepdl(0)) will be applied to poolmv2 to ensure that mv2
is at least as functional as the pools it is replacing.
.test.qa: Once poolmv2 is integrated into the MPS, the standard MPS QA tests
will be applied to poolmv2 prior to each release.
.test.customer: Customer acceptance tests will be performed on a per-customer
basis before release to that customer (cf., proc.release.epcore(2).test)
TEXT:
Possible tweaks (from mail.pekka.1998-04-15.13-10(0)):
1. Try to coalesce splinters returned from AP's with the front (or any) block
on the ABQ.
2. Sort ABQ in some other way to minimize splitting/splinters. E.g., proximity
to recently allocated blocks.

@ -0,0 +1,106 @@
DESIGN OF THE MANUALLY-MANAGED VARIABLE-SIZE FIRST-FIT POOL
design.mps.poolmvff
incomplete doc
gavinm 1998-09-09
INTRODUCTION
.intro: The pool was created in response to a belief that EPDL/EPDR's first
fit policy is beneficial for some classes of client behaviour, but the
performance of a linear free list was unacceptable. This pool implements a
first (or last) fit policy for variable-sized manually-managed objects, with
control over first/last, segment preference high/low, and slot fit low/high.
Document History
.hist.0: GavinM wrote a list of methods and function plus some notes 1998-09-09.
.hist.1: Added overview, removed bogus ArenaEnter design, and described
buffered allocation. pekka 1999-01-06
.hist.2: Modified for the "Sunset On Segments" redesign of segments. Buffered
allocation is no longer limited to segment boundaries.
OVERVIEW
.over: This pool implements certain variants of the address-ordered first-fit
policy. The implementation allows allocation across segment boundaries.
.over.buffer: Buffered allocation is also supported, but in that case, the
buffer-filling policy is worst-fit. Buffered and unbuffered allocation can be
used at the same time, but in that case, the first ap must be created before
any allocations. .over.buffer.class: The pool uses the simplest buffer class,
BufferClass. This is appropriate since these buffers don't attach to segments,
and hence don't constrain buffered regions to lie within segment boundaries.
.over.segments: The pool uses the simplest segment class (SegClass). There's no
need for anything more complex.
METHODS
.method: The MVFF pool supports the following methods:
.method.init: Res MVFFInit(Pool pool, va_list arg)
This takes six vararg parameters:
- extendBy -- the segment size;
- avgSize -- the average object size;
- alignment -- the alignment of allocations and frees (must be at least
sizeof(void*));
- slotHigh -- whether to allocate objects at the end of free blocks found, as
opposed to at the start (for unbuffered allocation);
- arenaHigh -- whether to express SegPrefHIGH to the arena, as opposed to
SegPrefLOW;
- firstFit -- whether to use the suitable block of lowest address, as opposed
to the highest (for unbuffered allocation).
.method.init.epdl: To simulate the EPDL pool, specify extendBy, avgSize, and
maxSize as normal, and use slotHigh=FALSE, arenaHigh=FALSE, firstFit=TRUE.
.method.init.epdr: To simulate the EPDR pool, specify extendBy, avgSize, and
maxSize as normal, and use slotHigh=TRUE, arenaHigh=TRUE, firstFit=TRUE.
.method.init.other: The performance characteristics of other combinations are
unknown.
.method.finish: The usual PoolFinish method.
.method.alloc: Alloc and Free methods are supported, implementing the policy
set by the pool params (see .method.init).
.method.describe: The usual describe method.
.method.buffer: The buffer methods implement a worst-fit fill strategy.
EXTERNAL FUNCTIONS
.function: MVFF supports the following external functions:
.function.free-size: size_t mps_mvff_free_size(mps_pool_t pool)
This function returns the total size of free space in segments allocated to
the MVFF pool instance.
.function.size: size_t mps_mvff_size(mps_pool_t pool)
This function returns the total memory used by pool segments, whether free or
allocated.
.function.class: mps_class_t mps_class_mvff(void)
This function returns the class object for the pool class, to be used in pool
creation.
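For illustration, client code might create and query such a pool as follows.
The header name mpscmvff.h and the literal parameter values are assumptions of
this sketch; the argument order follows .method.init, so treat the fragment as
illustrative rather than definitive.
#include <stdio.h>
#include "mps.h"
#include "mpscmvff.h"   /* assumed header declaring mps_class_mvff */
/* Hypothetical client code: create an MVFF pool in EPDR-like mode
   (cf. .method.init.epdr) and report how much of it is free. */
static mps_res_t exampleMVFF(mps_arena_t arena)
{
  mps_pool_t pool;
  mps_res_t res;
  res = mps_pool_create(&pool, arena, mps_class_mvff(),
                        (size_t)65536,    /* extendBy */
                        (size_t)32,       /* avgSize */
                        sizeof(void *),   /* alignment */
                        1,                /* slotHigh */
                        1,                /* arenaHigh */
                        1);               /* firstFit */
  if (res != MPS_RES_OK)
    return res;
  printf("pool size %lu, free %lu\n",
         (unsigned long)mps_mvff_size(pool),
         (unsigned long)mps_mvff_free_size(pool));
  mps_pool_destroy(pool);
  return MPS_RES_OK;
}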
IMPLEMENTATION
.impl.free-list: The pool stores its free list in a CBS (see design.mps.cbs).
It uses the CBS's mayUseInline facility to avoid running out of memory to
store the free list. This is the reason for the alignment restriction above.
DETAILS
.design.seg-size: When adding a segment, we use extendBy as the segment size
unless the object won't fit, in which case we use the object size (in both
cases we align up).
.design.seg-fail: If allocating a segment fails, we try again with a segment
size just large enough for the object we're allocating. This is in response to
request.mps.170186.
GENERIC DESIGN OF THE PROTECTION MODULE
design.mps.prot
incomplete doc
drj 1997-04-02
INTRODUCTION
.readership: Any MPS developer.
.intro: This is the generic design of the Protection Module. The protection
module provides protection services to other parts of the MPS. It is expected
that different operating systems will have different implementations of this
module.
INTERFACE
.if.setup:
void ProtSetup(void);
ProtSetup will be called exactly once (per process). It will be called as part
of the initialization of the first space that is created. It should arrange
for the setup and initialization of any datastructures or services that are
necessary in order to implement the protection module. (On UNIX it is expected
that it will install a signal handler; on Windows it will do nothing.)
.if.set:
void ProtSet(Addr base, Addr limit, AccessSet mode)
ProtSet should set the protection of the memory between base and limit,
including base, but not including limit (ie the half-open interval
[base,limit)) to that specified by mode.
The mode parameter should have the AccessWrite bit set if write accesses to the
page are to be forbidden, and should have the AccessRead bit set if read
accesses to the page are to be forbidden. A request to forbid read accesses
(ie AccessRead is set) may also forbid write accesses, but read accesses will
not be forbidden unless AccessRead is set.
.if.tramp:
void ProtTramp(void **resultReturn, void *(*f)(void *, size_t), void *p, size_t
s);
.if.sync:
void ProtSync(Space space);
ProtSync is called to ensure that the actual protection of each segment (as
determined by the OS) is in accordance with the segment's pm field.
.if.context-type:
typedef struct MutatorFaultContextStruct *MutatorFaultContext;
This abstract type is implemented by the protection module (impl.c.prot*). It
represents the continuation of the mutator which is restored after a mutator
fault has been handled. The functions ProtCanStepInstruction (.if.canstep
below) and ProtStepInstruction (.if.step below) inspect and manipulate the
context.
.if.canstep:
Bool ProtCanStepInstruction(MutatorFaultContext context);
Examines the context to determine whether the protection module can single-step
the instruction which is causing the fault. Should return TRUE if and only if
the instruction can be single-stepped (ie ProtStepInstruction can be called).
.if.step:
Res ProtStepInstruction(MutatorFaultContext context);
Single-steps the instruction which is causing the fault. This function should
only be called if ProtCanStepInstruction applied to the context returned TRUE.
It should return ResUNIMPL if the instruction cannot be single-stepped. It
should return ResOK if the instruction is single-stepped. The mutator context
will be updated by the emulation/execution of the instruction such that
resuming the mutator will not cause the instruction which was causing the
fault to be executed.
ANSI IMPLEMENTATION OF PROTECTION MODULE
design.mps.protan
incomplete doc
drj 1997-03-19
INTRODUCTION
.readership: Any MPS developer
.intro: This is the design for the ANSI implementation of the Protection Module.
REQUIREMENTS
.req.test: This module is required for testing. Particularly on platforms
where no real implementation of the protection module exists.
.req.rapid-port: This module is required for rapid porting. It should enable a
developer to port a minimally useful configuration of the MPS to new platforms
very quickly.
OVERVIEW
.overview: Most of the functions in the module do nothing. The exception is
ProtSync which traverses over all segments in the arena and simulates an access
to each segment that has any protection on it. This means that this module
depends on certain fields in the segment structure.
.overview.noos: No operating system specific (or even ANSI hosted specific)
code is in this module. It can therefore be used on any platform, particularly
where no real implementation of the module exists. It satisfies .req.test and
.req.rapid-port in this way.
FUNCTIONS
.fun.protsetup:
ProtSetup
Does nothing, as there is nothing to do (under UNIX we might expect the
Protection Module to install one or more signal handlers at this point, but
that is not appropriate for the ANSI implementation). Of course, we can't have
an empty function body, so there is a NOOP; in the body.
.fun.sync:
ProtSync
.fun.sync.what:
ProtSync is called to ensure that the actual protection of each segment (as
determined by the OS) is in accordance with the segment's pm field. In the
ANSI implementation we have no way of changing the protection of a segment, so
instead we generate faults on all protected segments, on the assumption that
this will remove the protection from those segments.
.fun.sync.how:
Continually loops over all the segments until it finds that all segments have
no protection. .sync.seg: If it finds a segment that is protected then
PoolAccess is called on that segment's pool and with that segment. The call to
PoolAccess is wrapped with a ShieldEnter and ShieldLeave thereby giving the
pool the illusion that the fault was generated outside the MM. This depends on
being able to determine the protection of a segment (using the pm field), on
being able to call ShieldEnter and ShieldLeave, and on being able to call
PoolAccess.
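The loop might be sketched roughly as follows. The segment iteration and
field-access helpers below are stand-in names, and ShieldEnter, ShieldLeave
and PoolAccess are used with guessed signatures, so treat this purely as an
illustration of .fun.sync.how rather than as the actual source.
/* Illustrative only: SegFirst, SegNext, SegPM, SegPool and AccessSetEMPTY
   are stand-in names; the Shield and PoolAccess signatures are guesses. */
void ProtSync(Space space)
{
  Bool anyProtected;
  do {
    Seg seg;
    anyProtected = FALSE;
    for (seg = SegFirst(space); seg != NULL; seg = SegNext(space, seg)) {
      if (SegPM(seg) != AccessSetEMPTY) {    /* segment still protected? */
        anyProtected = TRUE;
        ShieldEnter(space);
        PoolAccess(SegPool(seg), seg);       /* simulate the access */
        ShieldLeave(space);
      }
    }
  } while (anyProtected);
}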
LINUX IMPLEMENTATION OF PROTECTION MODULE
design.mps.protli
incomplete doc
tony 2000-02-03
INTRODUCTION
.readership: Any MPS developer
.intro: This is the design of the Linux implementation of the protection
module. It makes use of various services provided by Linux. It is intended to
work with LinuxThreads.
REQUIREMENTS
.req.general: Required to implement the general protection interface defined in
design.mps.prot.if.*.
MISC
.improve.sigvec: Note 1 of ProtSetup notes that we can't honour the sigvec(2)
entries of the next handler in the chain. What if, when we want to pass on the
signal, instead of calling the handler directly we reinstall the old entry with
sigvec, use kill to send the signal to ourselves, and then restore our handler
using sigvec again? [need more detail and analysis here].
DATASTRUCTURES
.data.signext: This is static, because that is the only communications channel
available to signal handlers. [write a little more here]
FUNCTIONS
.fun.setup: ProtSetup installs a signal handler for the signal SIGSEGV to catch
and handle protection faults (this handler is the function sigHandle, see
.fun.sighandle). The previous handler is recorded (in the variable sigNext, see
.data.signext) so that it can be reached from sigHandle if it fails to handle
the fault. .fun.setup.problem: The problem with this approach is that we can't
honour the wishes of the sigvec(2) entry for the previous handler (in terms of
masks in particular).
.fun.set: ProtSet uses mprotect to adjust the protection for pages.
void ProtSet(Addr base, Addr limit, AccessSet mode)
.fun.set.convert: The requested protection (which is expressed in the mode
parameter, see design.mps.prot.if.set) is translated into an OS protection. If
read accesses are to be forbidden then all accesses are forbidden, this is done
by setting the protection of the page to PROT_NONE. If write accesses are to
be forbidden (and not read accesses) then write accesses are forbidden and read
accesses are allowed, this is done by setting the protection of the page to
PROT_READ|PROT_EXEC. Otherwise (all accesses are okay), the protection is set to
PROT_READ|PROT_WRITE|PROT_EXEC.
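A fragmentary sketch of this translation follows; AccessREAD and AccessWRITE
stand in for the read and write bits of the mode, so this is an illustration
rather than the actual source.
/* Illustrative fragment of ProtSet.  AccessREAD/AccessWRITE are stand-ins
   for the mode bits described in design.mps.prot.if.set. */
#include <sys/mman.h>
int flags;
if (mode & AccessREAD)          /* forbidding reads forbids everything */
  flags = PROT_NONE;
else if (mode & AccessWRITE)    /* forbid writes only */
  flags = PROT_READ | PROT_EXEC;
else                            /* no access is forbidden */
  flags = PROT_READ | PROT_WRITE | PROT_EXEC;
if (mprotect((void *)base, (size_t)((char *)limit - (char *)base), flags) != 0)
  NOTREACHED;                   /* see .fun.set.assume.mprotect */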
.fun.set.assume.mprotect: We assume that the call to mprotect always succeeds.
This is because we should always call the function with valid arguments
(aligned, references to mapped pages, and with an access that is compatible
with the access of the underlying object).
.fun.sync: ProtSync does nothing in this implementation as ProtSet sets the
protection without any delay.
void ProtSync(Space space);
.fun.tramp: The protection trampoline is trivial under Linux, as there is
nothing that needs to be done in the dynamic context of the mutator in order
to catch faults. (Contrast this with Win32 Structured Exception Handling.)
void ProtTramp(void **resultReturn, void *(*f)(void *, size_t), void *p, size_t
s);
THREADS
.threads: The design must operate in a multi-threaded environment (with
LinuxThreads) and cooperate with the Linux support for locks (see
design.mps.lock) and the thread suspension mechanism (see design.mps.pthreadext
).
.threads.suspend: The SIGSEGV signal handler does not mask out any signals, so
a thread may be suspended while the handler is active, as required by the
design (see design.mps.pthreadext.req.suspend.protection). The signal handlers
simply nest at top of stack.
.threads.async: POSIX (and hence Linux) imposes some restrictions on signal
handler functions (see design.mps.pthreadext.anal.signal.safety). Basically the
rules say the behaviour of almost all POSIX functions inside a signal handler
is undefined, except for a handful of functions which are known to be
"async-signal safe". However, if it's known that the signal didn't happen
inside a POSIX function, then it is safe to call arbitrary POSIX functions
inside a handler.
.threads.async.protection: If the signal handler is invoked because of an MPS
access, then we know the access must have been caused by client code (because
the client is not allowed to permit access to protectable memory to arbitrary
foreign code [need a reference for this]). In these circumstances, it's OK to
call arbitrary POSIX functions inside the handler.
.threads.async.other: If the signal handler is invoked for some other reason
(i.e. one we are not prepared to handle) then there is less we can say about
what might have caused the SEGV. In general it is not safe to call arbitrary
POSIX functions inside the handler in this case.
.threads.async.choice: The signal handler calls ArenaAccess to determine
whether the SEGV was the result of an MPS access. ArenaAccess will claim
various MPS locks (i.e. the arena ring lock and some arena locks). The code
calls no other POSIX functions in the case where the SEGV is not an MPS access.
The locks are implemented as mutexes and are claimed by calling
pthread_mutex_lock, which is not defined to be async-signal safe.
.threads.async.choice.ok: However, despite the fact that PThreads documentation
doesn't define the behaviour of pthread_mutex_lock in these circumstances, we
expect the LinuxThreads implementation will be well-behaved unless the SEGV
occurs while in the process of locking or unlocking one of the MPS locks
(see .threads.async.linux-mutex). But we can assume that a SEGV will not happen
then (because we use the locks correctly, and generally must assume that they
work). Hence we conclude that it is OK to call ArenaAccess directly from the
signal handler.
.threads.async.linux-mutex: A study of the LinuxThreads source code reveals
that mutex lock and unlock functions are implemented as a spinlock (using a
locked compare-and-exchange instruction) with a backup suspension mechanism
using sigsuspend. On locking, the spinlock code performs a loop which examines
the state of the lock, and then atomically tests that the state is unchanged
while attempting to modify it. This part of the code is reentrant (and hence
async-signal safe). Eventually, when locking, the spinlock code may need to
block, in which case it calls sigsuspend waiting for the manager thread to
unblock it. The unlocking code is similar, except that this code may need to
release another thread, in which case it calls kill. sigsuspend and kill are
both defined to be async-signal safe by POSIX. In summary, the mutex locking
functions use primitives which are entirely async-signal safe. They perform
side-effects which modify the fields of the lock structure only. This code may
be safely invoked inside a signal handler unless the interrupted function is in
the process of manipulating the fields of that lock structure.
.threads.async.improve: In future it would be preferable to not have to assume
reentrant mutex locking and unlocking functions. By making the assumption we
also assume that the implementation of mutexes in LinuxThreads will not be
completely re-designed in future (which is not wise for the long term). An
alternative approach would be necessary anyway when supporting another platform
which doesn't offer reentrant locks (if such a platform does exist).
.threads.async.improve.how: We could avoid the assumption if we had a means of
testing whether an address lies within an arena chunk without the need to claim
any locks. Such a test might actually be possible. For example, arenas could
update a global datastructure describing the ranges of all chunks, using atomic
updates rather than locks; the handler code would be allowed to read this
without locking. However, this is somewhat tricky; a particular consideration
is that it's not clear when it's safe to deallocate stale portions of the
datastructure.
.threads.sig-stack: We do not handle signals on a separate signal stack.
Separate signal stacks apparently don't work properly with Pthreads.
THE DESIGN FOR PROTOCOL INHERITANCE IN MPS
design.mps.protocol
incomplete doc
tony 1998-10-12
INTRODUCTION
.intro: This document explains the design of the support for class inheritance
in MPS. It is not yet complete. It describes support for single inheritance
of classes. Future extensions will describe multiple inheritance and the
relationship between instances and classes.
.readership: This document is intended for any MM developer.
.hist.0: Written by Tony 1998-10-12
PURPOSE
.purpose.code-maintain: The purpose of the protocol inheritance design is to
ensure that the MPS code base can make use of the benefits of OO class
inheritance to maximize code reuse, minimize code maintenance and minimize the
use of "boiler plate" code.
.purpose.related: For related discussion, see mail.tony.1998-08-28.16-26(0),
mail.tony.1998-09-01.11-38(0), mail.tony.1998-10-06.11-03(0) & other messages
in the same threads.
REQUIREMENTS
.req.implicit: The object system should provide a means for classes to inherit
the methods of their direct superclasses implicitly for all functions in the
protocol without having to write any explicit code for each inherited function.
.req.override: There must additionally be a way for classes to override the
methods of their superclasses.
.req.next-method: As a result of .req.implicit, classes cannot make static
assumptions about methods used by direct superclasses. The object system must
provide a means for classes to extend (not just replace) the behaviour of
protocol functions, such as a mechanism for invoking the "next-method".
.req.ideal.extend: The object system must provide a standard way for classes to
implement the protocol supported by their superclass and additionally add new
methods of their own which can be specialized by subclasses.
.req.ideal.multiple-inheritance: The object system should support multiple
inheritance such that sub-protocols can be "mixed in" with several classes
which do not themselves support identical protocols.
OVERVIEW
.overview.root: We start with the root of all conformant class hierarchies,
which is called "ProtocolClass". ProtocolClass is an "abstract" class (i.e. it
has no direct instances, but it is intended to have subclasses). To use Dylan
terminology, instances of its subclasses are "general" instances of
ProtocolClass. They look as follows:-
      Instance Object                    Class Object
    --------------------             --------------------
    | sig              |    +------->| sig              |
    --------------------    |        --------------------
    | class            |----+        | superclass       |
    --------------------             --------------------
    | ...              |             | coerceInst       |
    --------------------             --------------------
    | ...              |             | coerceClass      |
    --------------------             --------------------
    |                  |             | ...              |
.overview.inherit: Classes inherit the protocols supported by their
superclasses. By default they have the same methods as the class(es) from
which they inherit. .overview.inherit.specialize: Classes may specialize the
behaviour of their superclass. They do this by overriding methods or other
fields in the class object.
.overview.extend: Classes may extend the protocols supported by their
superclasses by adding new fields for methods or other data.
.overview.sig.inherit: Classes will contain (possibly several) signatures.
Classes must not specialize (i.e. override) the signature(s) they inherit from
their superclass(es).
.overview.sig.extend: If a class definition extends a protocol, it is normal
policy for the class definition to include a new signature as the last field in
the class object.
.overview.coerce-class: Each class contains a coerceClass field. This contains
a method which can find the part of the class object which implements the
protocols of a supplied superclass argument (if, indeed, the argument IS a
superclass). This function may be used for testing subclass/superclass
relationships, and it also provides support for multiple inheritance.
.overview.coerce-inst: Each class contains a coerceInst field. This contains a
method which can find the part of an instance object which contains the
instance slots of a supplied superclass argument (if, indeed, the argument IS a
superclass). This function may be used for testing whether an object is an
instance of a given class, and it also provides support for multiple
inheritance.
.overview.superclass: Each class contains a superclass field. This enables
classes to call "next-method", as well as enabling the coercion functions.
.overview.next-method: A specialized method in a class can make use of an
overridden method from a superclass by accessing the method from the
appropriate field in the superclass object and calling it. The superclass may
be accessed indirectly from the class's "Ensure" function when it is statically
known (see .overview.access). This permits "next-method" calls, and is fully
scalable in that it allows arbitrary length method chains. The SUPERCLASS
macro helps with this (see .int.static-superclass).
.overview.next-method.naive: In some cases it is necessary to write a method
which is designed to specialize an inherited method, needs to call the
next-method, and yet the implementation doesn't have static knowledge of the
superclass. This might happen because the specialized method is designed to be
reusable by many class definitions. The specialized method can usually locate
the class object from one of the parameters passed to the method. It can then
access the superclass through the "superclass" field of the class, and hence
call the next method. This technique has some limitations and doesn't support
longer method chains. It is also dependent on none of the class definitions
which use the method having any subclasses.
.overview.access: Classes must be initialized by calls to functions, since it
is these function calls which copy properties from superclasses. Each class
must provide an "Ensure" function, which returns the canonical copy of the
class. The canonical copy may reside in static storage, but no MPS code may
refer to that static storage by name.
.overview.naming: There are some strict naming conventions which must be
followed when defining and using classes. The use is obligatory because it is
assumed by the macros which support the definition and inheritance mechanism.
For every class SomeClass, we insist upon the following naming conventions:-
SomeClassStruct   - names the type of the structure for the protocol class.
                    This might be a typedef which aliases the type to the
                    type of the superclass, but if the class has extended
                    the protocols of the superclass then it will be a type
                    which contains the new class fields.
SomeClass         - names the type *SomeClassStruct. This might be a typedef
                    which aliases the type to the type of the superclass,
                    but if the class has extended the protocols of the
                    superclass then it will be a type which contains the
                    new class fields.
EnsureSomeClass   - names the function that returns the initialized class
                    object.
INTERFACE
Class Definition
.int.define-class: Class definition is performed by the macro
DEFINE_CLASS(className, var). A call to the macro must be followed by a body
of initialization code in braces {}. The parameter className is used to name
the class being defined. The parameter var is used to name a local variable of
type className, which is defined by the macro; it refers to the canonical
storage for the class being defined. This variable may be used in the
initialization code. (The macro doesn't just pick a name implicitly because of
the danger of a name clash with other names used by the programmer). A call to
DEFINE_CLASS(SomeClass, var) does the following:
- Defines the EnsureSomeClass function.
- Defines static storage for the canonical class object.
- Defines some other things to ensure the class gets initialized exactly once.
.int.define-alias-class: A convenience macro DEFINE_ALIAS_CLASS is provided
which both performs the class definition and defines the types SomeClass and
SomeClassStruct as aliases for some other class types. This is particularly
useful for classes which simply inherit, and don't extend protocols. The macro
call DEFINE_ALIAS_CLASS(className, superName, var) is exactly equivalent to the
following:
typedef superName className;
typedef superNameStruct classNameStruct;
DEFINE_CLASS(className, var)
.int.define-special: If classes are particularly likely to be subclassed
without extension, the class implementor may choose to provide a convenience
macro which expands into DEFINE_ALIAS_CLASS with an appropriate name for the
superclass. For example, there might be a macro for defining pool classes such
that the macro call DEFINE_POOL_CLASS(className, var) is exactly equivalent to
the macro call DEFINE_ALIAS_CLASS(className, PoolClass, var). It may also be
convenient to define a static superclass accessor macro at the same time (see
.int.static-superclass.special).
Single Inheritance
.int.inheritance: Class inheritance details must be provided in the class
initialization code (see .int.define-class). Inheritance is performed by the
macro INHERIT_CLASS(thisClassCoerced, parentClassName). A call to this macro
will make the class being defined a direct subclass of ParentClassName by
ensuring that all the fields of the parent class are copied into thisClass, and
setting the superclass field of thisClass to be the parent class object. The
parameter thisClassCoerced must be of type parentClassName. If the class
definition defines an alias class (see .int.define-alias-class), then the
variable named as the second parameter to DEFINE_CLASS will be appropriate to
pass to INHERIT_CLASS.
Specialization
.int.specialize: Class specialization details must be given explicitly in the
class initialization code (see .int.define-class). This must happen AFTER the
inheritance details are given (see .int.inheritance).
Extension
.int.extend: To extend the protocol when defining a new class, a new type must
be defined for the class structure. This must embed the structure for the
primarily inherited class as the first field of the structure. Class extension
details must be given explicitly in the class initialization code (see
.int.define-class). This must happen AFTER the inheritance details are given
(see .int.inheritance).
Introspection
.introspect.c-lang: The design includes a number of introspection functions for
dynamically examining class relationships. These functions are polymorphic and
accept arbitrary subclasses of ProtocolClass. C doesn't support such
polymorphism. So although these have the semantics of functions (and could be
implemented as functions in another language with compatible calling
conventions) they are actually implemented as macros. The macros are named as
method-style macros despite the fact that this arguably contravenes
guide.impl.c.macro.method. The justification for this is that this design is
intended to promote the use of polymorphism, and it breaks the abstraction for
the users to need to be aware of what can and can't be expressed directly in C
function syntax. These functions all end in "Poly" to identify them as
polymorphic functions.
.int.superclass: ProtocolClassSuperclassPoly(class) is an introspection
function which returns the direct superclass of class object class.
.int.static-superclass: SUPERCLASS(className) is an introspection macro which
returns the direct superclass given a class name, which must (obviously) be
statically known. The macro expands into a call to the ensure function for the
class name, so this must be in scope (which may require a forward
declaration). The macro is useful for next-method calls (see
.overview.next-method). The superclass is returned with type ProtocolClass so
it may be necessary to cast it to the type for the appropriate subclass.
.int.static-superclass.special: Implementors of classes which are designed to
be subclassed without extension may choose to provide a convenience macro which
expands into SUPERCLASS along with a type cast. For example, there might be a
macro for finding pool superclasses such that the macro call
POOL_SUPERCLASS(className) is exactly equivalent to
(PoolClass)SUPERCLASS(className). It's convenient to define these macros
alongside the convenience class definition macro (see .int.define-special).
.int.class: ClassOfPoly(inst) is an introspection function which returns the
class of which inst is a direct instance.
.int.subclass: IsSubclassPoly(sub, super) is an introspection function which
returns a boolean indicating whether sub is a subclass of super. I.e., it is a
predicate for testing subclass relationships.
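For illustration, given hypothetical classes SomeClass and SomeSubClass (the
latter defined with DEFINE_CLASS as a subclass of the former), the
introspection functions might be used like this:
/* Hypothetical usage; SomeClass and SomeSubClass are assumed to have been
   defined with DEFINE_CLASS, SomeSubClass inheriting from SomeClass. */
ProtocolClass sub = (ProtocolClass)EnsureSomeSubClass();
ProtocolClass super = ProtocolClassSuperclassPoly(sub);
AVER(super == (ProtocolClass)EnsureSomeClass());
AVER(IsSubclassPoly(sub, EnsureSomeClass()));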
Multiple inheritance
.int.mult-inherit: Multiple inheritance involves an extension of the protocol
(see .int.extend) and also multiple uses of the single inheritance mechanism
(see .int.inheritance). It also requires specialized methods for coerceClass
and coerceInst to be written (see .overview.coerce-class &
.overview.coerce-inst). Documentation on support for multiple inheritance is
under construction. This facility is not currently used. The basic idea is
described in mail.tony.1998-10-06.11-03(0).
Protocol guidelines
.guide.fail: When designing an extensible function which might fail, the design
must permit the correct implementation of the failure-case code. Typically, a
failure might occur in any method in the chain. Each method is responsible for
correctly propagating failure information supplied by superclass methods and
for managing its own failures.
.guide.fail.before-next: Dealing with a failure which is detected before any
next-method call is made is similar to a fail case in any non-extensible
function. See .example.fail below.
.guide.fail.during-next: Dealing with a failure returned from a next-method
call is also similar to a fail case in any non-extensible function. See
.example.fail below.
.guide.fail.after-next: Dealing with a failure which is detected after the next
methods have been successfully invoked is more complex. If this scenario is
possible, the design must include an "anti-function", and each class must
ensure that it provides a method for the anti-method which will clean up any
resources which are claimed after a successful invocation of the main method
for that class. Typically the anti-function would exist anyway for clients of
the protocol (e.g. "finish" is an anti-function for "init"). The effect of the
next-method call can then be cleaned up by calling the anti-method for the
superclass. See .example.fail below.
Example
.example.inheritance: The following example class definition shows both
inheritance and specialization. It shows the definition of the class
EPDRPoolClass, which inherits from EPDLPoolClass and has specialized values of
the name, init & alloc fields. The type EPDLPoolClass is an alias (typedef)
for PoolClass.
typedef EPDLPoolClass EPDRPoolClass;
typedef EPDLPoolClassStruct EPDRPoolClassStruct;
DEFINE_CLASS(EPDRPoolClass, this)
{
  INHERIT_CLASS(this, EPDLPoolClass);
  this->name = "EPDR";
  this->init = EPDRInit;
  this->alloc = EPDRAlloc;
}
.example.extension: The following (hypothetical) example class definition shows
inheritance, specialization and also extension. It shows the definition of the
class EPDLDebugPoolClass, which inherits from EPDLPoolClass, but also
implements a method for checking properties of the pool.
typedef struct EPDLDebugPoolClassStruct {
  EPDLPoolClassStruct epdl;
  DebugPoolCheckMethod check;
  Sig sig;
} EPDLDebugPoolClassStruct;
typedef EPDLDebugPoolClassStruct *EPDLDebugPoolClass;
DEFINE_CLASS(EPDLDebugPoolClass, this)
{
  EPDLPoolClass epdl = &this->epdl;
  INHERIT_CLASS(epdl, EPDLPoolClass);
  epdl->name = "EPDLDBG";
  this->check = EPDLDebugCheck;
  this->sig = EPDLDebugSig;
}
.example.fail: The following example shows the implementation of failure-case
code for an "init" method, making use of the "finish" anti-method:-
static Res mySegInit(Seg seg, Pool pool, Addr base, Size size,
                     Bool reservoirPermit, va_list args)
{
  SegClass super;
  MYSeg myseg;
  OBJ1 obj1;
  Res res;
  Arena arena;
  AVERT(Seg, seg);
  myseg = SegMYSeg(seg);
  AVERT(Pool, pool);
  arena = PoolArena(pool);
  /* Ensure the pool is ready for the segment */
  res = myNoteSeg(pool, seg);
  if(res != ResOK)
    goto failNoteSeg;
  /* Initialize the superclass fields first via next-method call */
  super = (SegClass)SUPERCLASS(MYSegClass);
  res = super->init(seg, pool, base, size, reservoirPermit, args);
  if(res != ResOK)
    goto failNextMethods;
  /* Create an object after the next-method call */
  res = ControlAlloc(&obj1, arena, sizeof(OBJ1Struct), reservoirPermit);
  if(res != ResOK)
    goto failObj1;
  myseg->obj1 = obj1;
  return ResOK;

failObj1:
  /* call the anti-method for the superclass */
  super->finish(seg);
failNextMethods:
  /* reverse the effect of myNoteSeg */
  myUnnoteSeg(pool, seg);
failNoteSeg:
  return res;
}
IMPLEMENTATION
.impl.derived-names: The DEFINE_CLASS macro derives some additional names from
the class name as part of it's implementation. These should not appear in the
source code - but it may be useful to know about this for debugging purposes.
For each class definition for class SomeClass, the macro defines the following:
extern SomeClass EnsureSomeClass(void);    /* The class accessor function.
                                              See .overview.naming */
static Bool protocolSomeClassGuardian;     /* A boolean which indicates
                                              whether the class has been
                                              initialized yet */
static void protocolEnsureSomeClass(SomeClass);  /* A function called by
                                              EnsureSomeClass.  All the class
                                              initialization code is actually
                                              in this function */
static SomeClassStruct protocolSomeClassStruct;  /* Static storage for the
                                              canonical class object */
.impl.init-once: Class objects only behave according to their definition after
they have been initialized, and class protocols may not be used before
initialization has happened. The only code which is allowed to see a class
object in a partially initialized state is the initialization code itself --
and this must take care not to pass the object to any other code which might
assume it is initialized. Once a class has been initialized, the class might
have a client. The class must not be initialized again when this has happened,
because the state is not necessarily consistent in the middle of an
initialization function. The initialization state for each class is stored in
a boolean "guardian" variable whose name is derived from the class name (see
.impl.derived-names). This ensures the initialization happens only once. The
path through the EnsureSomeClass function should be very fast for the common
case when this variable is TRUE, and the class has already been initialized, as
the canonical static storage can simply be returned in that case. However,
when the value of the guardian is FALSE, the class is not initialized. In this
case, a call to EnsureSomeClass must first execute the initialization code and
then set the guardian to TRUE. However, this must happen atomically (see
.impl.init-lock).
.impl.init-lock: There would be the possibility of a race condition if
EnsureSomeClass were called concurrently on separate threads before SomeClass
has been initialized. The class must not be initialized more than once, so the
sequence test-guard, init-class, set-guard must be run as a critical region.
It's not sufficient to use the arena lock to protect the critical region,
because the class object might be shared between multiple arenas. The
DEFINE_CLASS macro uses a global recursive lock instead. The lock is only
claimed after an initial unlocked access of the guard variable shows that the
class is not initialized. This avoids any locking overhead for the common case
where the class is already initialized. This lock is provided by the lock
module -- see design.mps.lock(0).
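As a rough illustration, the expansion could take the following shape; the
lock function names are placeholders for whatever the lock module actually
provides, and the body of protocolEnsureSomeClass stands for the class
initialization code supplied in the DEFINE_CLASS body.
/* Sketch of what DEFINE_CLASS(SomeClass, var) might expand to.
   LockClaimGlobalRecursive/LockReleaseGlobalRecursive are placeholder
   names for the lock module's global recursive lock (design.mps.lock). */
static Bool protocolSomeClassGuardian = FALSE;
static SomeClassStruct protocolSomeClassStruct;
static void protocolEnsureSomeClass(SomeClass var);
SomeClass EnsureSomeClass(void)
{
  if (!protocolSomeClassGuardian) {       /* unlocked fast-path test */
    LockClaimGlobalRecursive();
    if (!protocolSomeClassGuardian) {     /* re-test under the lock */
      protocolEnsureSomeClass(&protocolSomeClassStruct);
      protocolSomeClassGuardian = TRUE;
    }
    LockReleaseGlobalRecursive();
  }
  return &protocolSomeClassStruct;
}
static void protocolEnsureSomeClass(SomeClass var)
{
  /* the class initialization code from the DEFINE_CLASS body goes here,
     e.g. INHERIT_CLASS(var, SomeSuperClass); var->someMethod = ...; */
}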
SUNOS 4 IMPLEMENTATION OF PROTECTION MODULE
design.mps.protsu
incomplete doc
drj 1997-03-20
INTRODUCTION
.readership: Any MPS developer
.intro: This is the design of the SunOS 4 implementation of the protection
module. It is intended to be used only in SunOS 4 (os.su). It makes use of
various services provided by SunOS 4.
[largely unwritten]
REQUIREMENTS
.req.general: Required to implement the general protection interface defined in
design.mps.prot.if.*.
OVERVIEW
[uses mprotect]
MISC
.improve.sig-stack: Currently we do not handle signals on a separate signal
stack. If we handled signals on our own stack then we could guarantee not to
run out of stack while we were handling the signal. This would be useful (it
may even be required). We would have to use sigvec(2) rather than signal(3)
(set the SV_ONSTACK flag and use sigstack(2)). This has drawbacks as the
signal stack is not grown automatically, so we would have to frig the stacks
back if we wanted to pass on the signal to some other handler as that handler
may require arbitrary amounts of stack.
.improve.sigvec: Note 1 of ProtSetup notes that we can't honour the sigvec(2)
entries of the next handler in the chain. What if, when we want to pass on the
signal, instead of calling the handler directly we reinstall the old entry with
sigvec, use kill to send the signal to ourselves, and then restore our handler
using sigvec again? [need more detail and analysis here].
[Explain the assumption that mprotect never fails, and why. We also need a
policy here.]
DATASTRUCTURES
.data.signext: This is static, because that is the only communications channel
available to signal handlers. [write a little more here]
FUNCTIONS
.fun.setup:
ProtSetup
The setup involves installing a signal handler for the signal SIGSEGV to catch
and handle protection faults (this handler is the function sigHandle, see
.fun.sighandle). The previous handler is recorded (in the variable sigNext, see
.data.signext) so that it can be reached from sigHandle if it fails to handle
the fault.
The problem with this approach is that we can't honor the wishes of the
sigvec(2) entry for the previous handler (in terms of masks in particular).
Obviously it would be okay to always chain the previous signal handler onto
sigNext, however in the case where the previous handler is the one we've just
installed (ie, sigHandle) then it is not necessary to chain the handler, so we
don't.
.fun.set:
void ProtSet(Addr base, Addr limit, AccessSet mode)
.fun.set.convert: The requested protection (which is expressed in the mode
parameter, see design.mps.prot.if.set) is translated into an OS protection. If
read accesses are to be forbidden then all accesses are forbidden, this is done
by setting the protection of the page to PROT_NONE. If write accesses are to be
forbidden (and not read accesses) then write accesses are forbidden and read
accesses are allowed, this is done by setting the protection of the page to
PROT_READ|PROT_EXEC. Otherwise (all accesses are okay), the protection is set to
PROT_READ|PROT_WRITE|PROT_EXEC.
.fun.set.assume.mprotect: We assume that the call to mprotect always succeeds.
This is because we should always call the function with valid arguments
(aligned, references to mapped pages, and with an access that is compatible
with the access of the underlying object).
.fun.sync:
void ProtSync(Space space);
This does nothing in this implementation as ProtSet sets the protection without
any delay.
.fun.tramp:
void ProtTramp(void **resultReturn, void *(*f)(void *, size_t), void *p, size_t
s);
The protection trampoline is trivial under SunOS, as there is nothing that
needs to be done in the dynamic context of the mutator in order to catch
faults. (Contrast this with Win32 Structured Exception Handling.)
DESIGN OF THE POSIX THREAD EXTENSIONS FOR MPS
design.mps.pthreadext
draft doc
tony 2000-02-01
INTRODUCTION
.readership: Any MPS developer.
.intro: This is the design of the Pthreads extension module, which provides
some low-level threads support for use by MPS (notably suspend and resume).
DEFINITIONS
.pthreads: The term "Pthreads" means an implementation of the POSIX
1003.1c-1995 thread standard. (Or the Single UNIX Specification, Version 2, aka
SUSv2 or UNIX98.)
.context: The "context" of a thread is a (platform-specific) OS-defined
structure which describes the current state of the registers for that thread.
REQUIREMENTS
.req.suspend: A means to suspend threads, so that they don't make any progress.
.req.suspend.why: Needed by the thread manager so that other threads registered
with an arena can be suspended (see design.mps.thread-manager). Not directly
provided by Pthreads.
.req.resume: A means to resume suspended threads, so that they are able to make
progress again. .req.resume.why: Needed by the thread manager. Not directly
provided by Pthreads.
.req.suspend.multiple: Allow a thread to be suspended on behalf of one arena
when it has already been suspended on behalf of one or more other arenas.
.req.suspend.multiple.why: The thread manager contains no design for
cooperation between arenas to prevent this.
.req.resume.multiple: Allow requests to resume a thread on behalf of each arena
which had previously suspended the thread. The thread must only be resumed
when requests from all such arenas have been received.
.req.resume.multiple.why: A thread manager for an arena must not permit a
thread to make progress before it explicitly resumes the thread.
.req.suspend.context: Must be able to access the context for a thread when it
is suspended.
.req.suspend.protection: Must be able to suspend a thread which is currently
handling a protection fault (i.e., an arena access). Such a thread might even
own an arena lock.
.req.legal: Required to use Pthreads / POSIX APIs in a legal manner.
ANALYSIS
.anal.suspend: Thread suspension is inherently asynchronous. MPS needs to be
able to suspend another thread without prior knowledge of the code that thread
is running. (I.e., we can't rely on cooperation between threads.) The only
asynchronous communication available on POSIX is via signals - so the suspend
and resume mechanism must ultimately be built from signals.
.anal.signal.safety: POSIX imposes some restrictions on what a signal handler
function might do when invoked asynchronously (see
<URL:http://www.opengroup.org/onlinepubs/007908799/xsh/sigaction.html>, and
search for the string "reentrant"). In summary, a small number of POSIX
functions are defined to be "async-signal safe", which means they may be
invoked without restriction in signal handlers. All other POSIX functions are
considered to be unsafe. Behaviour is undefined if an unsafe function is
interrupted by a signal and the signal handler then proceeds to call another
unsafe function. See mail.tony.1999-08-24.15-40(0) and followups for some
further analysis.
.anal.signal.safety.implication: Since we can't assume that we won't attempt to
suspend a thread while it is running an unsafe function, we must limit the use
of POSIX functions in the suspend signal handler to those which are designed to
be "async-signal safe". One of the few such functions related to
synchronization is sem_post.
.anal.signal.example: An example of how to suspend threads in POSIX was posted
to newsgroup comp.programming.threads in August 1999. The code that was posted
was written by David Butenhof, and may be found here:
Some further discussion about the code in the newsgroup is recorded here:
.anal.signal.linux-hack: In the current implementation of Linux Pthreads, it
would be possible to implement suspend/resume using SIGSTOP and SIGCONT. This
is, however, nonportable and will probably stop working on Linux at some point.
.anal.component: There is no known way to meet the requirements above in a way
which cooperates with another component in the system which also provides its
own mechanism to suspend and resume threads. The best bet for achieving this
is to provide the functionality in a shared low-level component which may be
used by MPS and other clients. This will require some discussion with other
potential clients and/or standards bodies. NB, such cooperation is actually a
requirement for Dylan (req.dylan.dc.env.self), though this is not a problem,
since all the Dylan components share the MPS mechanism.
INTERFACE
.if.pthreadext.abstract: A thread is represented by the abstract type
PThreadext. A PThreadext object corresponds directly with a PThread (of type
pthread_t). There may be more than one PThreadext object for the same PThread.
.if.pthreadext.structure: The structure definition of PThreadext
(PThreadextStruct) is exposed by the interface so that it may be embedded in a
client datastructure (e.g. ThreadStruct). This means that all storage
management can be left to the client (which is important because there might be
multiple arenas involved). Clients may not access the fields of a
PThreadextStruct directly.
.if.init: Initializes a PThreadext object for a thread with the given id:
void PThreadextInit(PThreadext pthreadext, pthread_t id)
.if.check: Checks a PThreadext object for consistency:
Bool PThreadextCheck(PThreadext pthreadext)
Note that this function takes the mutex, so it must not be called with the
mutex held (doing so will probably deadlock the thread).
.if.suspend: Suspends a PThreadext object (puts it into a suspended state).
Meets .req.suspend.*. The object must not already be in a suspended state. If
the function returns ResOK, the context of the thread is returned in
contextReturn, and the corresponding PThread will not make any progress until
it is resumed:
Res PThreadextSuspend(PThreadext pthreadext, struct sigcontext **contextReturn)
.if.resume: Resumes a PThreadext object. Meets .req.resume.*. The object must
already be in a suspended state. Puts the object into a non-suspended state.
Permits the corresponding PThread to make progress again, (although that might
not happen immediately if there is another suspended PThreadext object
corresponding to the same thread):
Res PThreadextResume(PThreadext pthreadext)
.if.finish: Finishes a PThreadext object:
void PThreadextFinish(PThreadext pthreadext)
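A sketch of how a client such as the thread manager might drive this interface
follows; the ClientThreadStruct type and clientInspectThread function are
invented for the example, and the embedding follows .if.pthreadext.structure.
#include <pthread.h>
/* Hypothetical client structure embedding a PThreadextStruct.  The thread
   is assumed to have been registered earlier with
   PThreadextInit(&ct->pthreadext, id). */
typedef struct ClientThreadStruct {
  PThreadextStruct pthreadext;   /* see .if.pthreadext.structure */
  /* ... other client fields ... */
} ClientThreadStruct;
/* Suspend the thread, look at its context, then let it run again. */
static Res clientInspectThread(ClientThreadStruct *ct)
{
  struct sigcontext *context;
  Res res;
  res = PThreadextSuspend(&ct->pthreadext, &context);
  if (res != ResOK)
    return res;
  /* ... examine the suspended thread's registers via context ... */
  return PThreadextResume(&ct->pthreadext);
}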
IMPLEMENTATION
.impl.pthreadext: The structure definition for a PThreadext object is:
typedef struct PThreadextStruct {
  Sig sig;                          /* design.mps.sig */
  pthread_t id;                     /* Thread ID */
  struct sigcontext *suspendedScp;  /* sigcontext if suspended */
  RingStruct threadRing;            /* ring of suspended threads */
  RingStruct idRing;                /* duplicate suspensions for id */
} PThreadextStruct;
.impl.field.id: The id field shows which PThread the object corresponds to.
.impl.field.scp: The suspendedScp field contains the context when in a
suspended state. Otherwise it is NULL.
.impl.field.threadring: The threadRing field is used to chain the object onto
the suspend ring when it is in the suspended state (see .impl.suspend-ring).
When not in a suspended state, this ring is single.
.impl.field.idring: The idRing field is used to group the object with other
objects corresponding to the same PThread (same id field) when they are in the
suspended state. When not in a suspended state, or when this is the only
PThreadext object with this id in the suspended state, this ring is single.
.impl.global.suspend-ring: The module maintains a global suspend-ring - a ring
of PThreadext objects which are in a suspended state. This is primarily so
that it's possible to determine whether a thread is currently suspended anyway
because of another PThreadext object, when a suspend attempt is made.
.impl.global.victim: The module maintains a global variable which is used to
indicate which PThreadext is the current victim during suspend operations. This
is used to communicate information between the controlling thread and the
thread being suspended (the victim). The variable has value NULL at other times.
.impl.static.mutex: We use a lock (mutex) around the suspend and resume
operations. This protects the state data (suspend-ring etc., see .impl.global.*).
Since only one thread can be suspended at a time, there's no possibility of two
arenas suspending each other by concurrently suspending each other's threads.
.impl.static.semaphore: We use a semaphore to synchronize between the
controlling and victim threads during the suspend operation. See .impl.suspend
and .impl.suspend-handler).
.impl.static.init: The static data and global variables of the module are
initialized on the first call to PThreadextSuspend, using pthread_once to avoid
concurrency problems. We also enable the signal handlers at the same time (see
.impl.suspend-handler and .impl.resume-handler).
.impl.suspend: PThreadextSuspend first ensures the module is initialized (see
.impl.static.init). After this, it claims the mutex (see .impl.static.mutex).
It then checks to see whether the thread of the target PThreadext object has
already been suspended on behalf of another PThreadext object. It does this by
iterating over the suspend ring.
.impl.suspend.already-suspended: If another object with the same id is found on
the suspend ring, then the thread is already suspended. The context of the
target object is updated from the other object, and the other object is linked
into the idRing of the target.
.impl.suspend.not-suspended: If the thread is not already suspended, then we
forcibly suspend it using a technique similar to Butenhof's (see
.anal.signal.example): First we set the victim variable (see
.impl.global.victim) to indicate the target object. Then we send the signal
PTHREADEXT_SIGSUSPEND to the thread (see .impl.signals), and wait on the
semaphore for it to indicate that it has received the signal and updated the
victim variable with the context. If either of these operations fail (e.g.
because of thread termination) we unlock the mutex and return ResFAIL.
.impl.suspend.update: Once we have ensured that the thread is definitely
suspended, we add the target PThreadext object to the suspend ring, unlock the
mutex, and return the context to the caller.
.impl.suspend-handler: The suspend signal handler is invoked in the target
thread during a suspend operation, when a PTHREADEXT_SIGSUSPEND signal is sent
by the controlling thread (see .impl.suspend.not-suspended). The handler
determines the context (received as a parameter, although this may be
platform-specific) and stores this in the victim object (see
.impl.global.victim). The handler then masks out all signals except the one
that will be received on a resume operation (PTHREADEXT_SIGRESUME) and
synchronizes with the controlling thread by posting the semaphore. Finally the
handler suspends until the resume signal is received (using sigsuspend).
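The handler could look roughly like the sketch below. The global names, the
handler signature, and the way the context parameter arrives are all platform-
and implementation-specific, so everything here other than the POSIX calls is
a stand-in.
#include <signal.h>
#include <semaphore.h>
static PThreadext suspendingVictim;   /* stand-in for .impl.global.victim */
static sem_t suspendSemaphore;        /* stand-in for .impl.static.semaphore */
/* Illustrative suspend signal handler (see .impl.suspend-handler); the
   context-by-value second parameter is one platform-specific possibility. */
static void suspendSignalHandler(int sig, struct sigcontext scp)
{
  sigset_t mask;
  suspendingVictim->suspendedScp = &scp;   /* record the context */
  sigfillset(&mask);
  sigdelset(&mask, PTHREADEXT_SIGRESUME);  /* allow only the resume signal */
  sem_post(&suspendSemaphore);             /* tell the controlling thread */
  sigsuspend(&mask);                       /* block until resumed */
  /* Pthreads restores the previous signal mask when the handler returns. */
}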
.impl.resume: PThreadextResume first claims the mutex (see .impl.static.mutex).
It then checks to see whether the thread of the target PThreadext object has also
been suspended on behalf of another PThreadext object (in which case the id
ring of the target object will not be single).
.impl.resume.also-suspended: If the thread is also suspended on behalf of
another PThreadext, then the target object is removed from the id ring.
.impl.resume.not-also: If the thread is not also suspended on behalf of another
PThreadext, then the thread is resumed using the technique proposed by Butenhof
(see .anal.signal.example). I.e. we send it the signal PTHREADEXT_SIGRESUME
(see .impl.signals) and expect it to wake up. If this operation fails (e.g.
because of thread termination) we unlock the mutex and return ResFAIL.
.impl.resume.update: Once the target thread is in the appropriate state, we
remove the target PThreadext object from the suspend ring, set its context to
NULL and unlock the mutex.
.impl.resume-handler: The resume signal handler is invoked in the target thread
during a resume operation, when a PTHREADEXT_SIGRESUME signal is sent by the
controlling thread (see .impl.resume.not-also). The resume signal handler
simply returns. This is sufficient to unblock the suspend handler, which will
have been blocking the thread at the time of the signal. The Pthreads
implementation ensures that the signal mask is restored to the value it had
before the signal handler was invoked.
.impl.finish: PThreadextFinish supports the finishing of objects in the
suspended state, and removes them from the suspend ring and id ring as
necessary. It must claim the mutex for the removal operation (to ensure
atomicity of the operation). Finishing of suspended objects is supported so
that clients can dispose of resources if a resume operation fails (which
probably means that the PThread has terminated).
.impl.signals: The choice of which signals to use for suspend and restore
operations may need to be platform-specific. Some signals are likely to be
generated and/or handled by other parts of the application and so should not be
used (e.g. SIGSEGV). Some implementations of PThreads use some signals for
themselves, so they may not be used; e.g. LinuxThreads uses SIGUSR1 and SIGUSR2
for its own purposes. The design abstractly names the signals
PTHREADEXT_SIGSUSPEND and PTHREADEXT_SIGRESUME, so that they may be easily mapped
to appropriate real signal values. Candidate choices are SIGXFSZ and SIGPWR.
ATTACHMENTS
"posix.txt"
"susp.c"
THE DESIGN OF THE LOW-MEMORY RESERVOIR
design.mps.reservoir
incomplete design
tony 1998-07-30
INTRODUCTION:
The low-memory reservoir provides support for clients implementing handlers
for low-memory situations, where the handlers themselves need to allocate. The
reservoir is implemented inside the arena as a pool of unallocatable segments.
OVERVIEW:
This is just a placeholder at the moment.
ARCHITECTURE:
.adt: The reservoir interface looks (almost) like an abstract data type of type
Reservoir. It's not quite abstract because the arena embeds the structure of
the reservoir (of type ReservoirStruct) into its own structure, for simplicity
of initialization.
.align: The reservoir is implemented as a pool of available tracts, along with
a size and limit which must always be aligned to the arena alignment. The size
corresponds to the amount of memory currently maintained in the reservoir. The
limit is the maximum amount that it is desired to maintain.
.wastage: When the reservoir limit is set by the client, the actual limit
should be increased by an arena alignment amount for every active mutator
buffer.
.really-empty: When the reservoir limit is set to 0, assume that the client
really doesn't have a need for a reservoir at all. In this case, the client
won't even want an allowance to be made for wastage in active buffers.
IMPLEMENTATION:
.interface: The following functions comprise the interface to the reservoir
module:
.interface.check: ReservoirCheck checks the reservoir for consistency:
extern Bool ReservoirCheck(Reservoir reservoir);
.interface.init: ReservoirInit initializes the reservoir and its associated
pool, setting the size and limit to 0:
extern Res ReservoirInit(Reservoir reservoir, Arena arena);
.interface.finish: ReservoirFinish de-initializes the reservoir and its
associated pool:
extern void ReservoirFinish (Reservoir reservoir);
.interface.limit: ReservoirLimit returns the limit of the reservoir:
extern Size ReservoirLimit(Reservoir reservoir);
.interface.set-limit: ReservoirSetLimit sets the limit of the reservoir, making
an allowance for wastage in mutator buffers:
extern void ReservoirSetLimit(Reservoir reservoir, Size size);
.interface.available: ReservoirAvailable returns the available size of the
reservoir:
extern Size ReservoirAvailable(Reservoir reservoir);
.interface.ensure-full: ReservoirEnsureFull attempts to fill the reservoir with
memory from the arena, until it is full:
extern Res ReservoirEnsureFull(Reservoir reservoir);
.interface.deposit: ReservoirDeposit attempts to fill the reservoir with memory
in the supplied range, until it is full. This is called by the arena from
ArenaFree if the reservoir is not known to be full. Any memory which is not
added to the reservoir (because the reservoir is full) is freed via the arena
class's free method.
extern void ReservoirDeposit(Reservoir reservoir, Addr base, Size size);
.interface.withdraw: ReservoirWithdraw attempts to allocate memory of the
specified size to the specified pool from the reservoir. If no suitable memory
can be found it returns ResMEMORY.
extern Res ReservoirWithdraw(Addr *baseReturn, Tract *baseTractReturn,
Reservoir reservoir, Size size, Pool pool);
.interface.withdraw.align: Currently, ReservoirWithdraw can only withdraw
memory in chunks of the size of the arena alignment. This is because the
reservoir doesn't attempt to coalesce adjacent memory blocks. This deficiency
should be fixed in the future.
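.interface.withdraw.example: To illustrate how this interface might be used,
here is a minimal sketch (not MPS source) of a pool allocation path that falls
back to the reservoir when its normal request fails. The function myPoolAlloc
and the helper normalAlloc are hypothetical stand-ins; only the reservoir
functions declared above are assumed.

static Res myPoolAlloc(Addr *baseReturn, Tract *tractReturn,
                       Pool pool, Reservoir reservoir, Size size)
{
  Res res;

  /* "normalAlloc" stands for the pool's usual request to the arena. */
  res = normalAlloc(baseReturn, tractReturn, pool, size);
  if (res == ResOK)
    return ResOK;

  /* Fall back to the reservoir.  Note .interface.withdraw.align: only
   * chunks of the arena alignment can currently be withdrawn, so this
   * is only useful when size is a single arena grain. */
  return ReservoirWithdraw(baseReturn, tractReturn, reservoir, size, pool);
}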
.pool: The memory managed by the reservoir is owned by the reservoir pool.
This memory is never sub-allocated. Each tract belonging to the pool is linked
onto a list. The head of the list is in the Reservoir object. Links are
stored in the TractP fields of each tract object.

154
mps/design/ring/index.txt Normal file

@ -0,0 +1,154 @@
THE DESIGN OF THE RING DATA STRUCTURE
design.mps.ring
incomplete doc
richard 1996-09-26
INTRODUCTION
.source: rings are derived from the earlier use of Deques. See
design.mps.deque.
DESCRIPTION
.def.ring: Rings are circular doubly-linked lists of ring "nodes". The nodes
are fields of structures which are the "elements" of the ring.
Ring node structures (RingStruct) are in-lined in the structures on the ring,
like this:
typedef struct FooStruct *Foo;    /* the element type */
typedef struct FooStruct {        /* the element structure */
  int baz, bim;
  RingStruct ring;                /* the ring node */
  float bip, bop;
} FooStruct;
This arrangement means that they do not need to be managed separately. This is
especially useful in avoiding re-entrancy and bootstrapping problems in the
memory manager. Rings also provide flexible insertion and deletion because the
entire ring can be found from any node.
In the MPS, rings are used to connect a "parent" structure (such as a Space) to
a number of "child" structures (such as Pools), as shown in .fig.ring (note the
slight abuse of naming convention (in that barRing is not called
barRingStruct)).
.fig.ring: A ring of Bar objects owned by a Foo object.
.fig.empty: An empty ring of Bar objects owned by a Foo object.
.def.singleton: A "singleton" ring is a ring containing one node, whose
previous and next nodes are itself (see .fig.single).
.fig.single: A singleton Bar object not on any ring.
.fig.elt: How RING_ELT gets a parent pointer from a node pointer.
- Ring Diagrams
INIT / FINISH
.init: Rings are initialized with the RingInit function. They are initialized
to be a singleton ring (.def.singleton).
.finish: Rings are finished with the RingFinish function. A ring must be a
singleton ring before it can be finished (it is an error to attempt to finish a
non-singleton ring).
ITERATION
.for: A macro is used for iterating over the elements in a ring. This macro is
called RING_FOR. RING_FOR takes three arguments, the first is an iteration
variable: "node", the second is the "parent" element in the ring: "ring", the
third is a variable used by the iterator for working state (it holds a pointer
to the next node): "next". All arguments must be of type Ring. The "node" and
"next" variables must be declared and in scope already. All elements except
for the "parent" element are iterated over. The macro expands to a for
statement. During execution of the loop, the "node" variable (the first
argument to the macro) will be the value of successive elements in the Ring (at
the beginning of the statement in the body of the loop). .for.error: It is an
error (possibly unchecked) for the "node" and "next" variables to be modified
except implicitly by using this iterator. .for.safe: It is safe to delete the
current node during the iteration.
.for.ex:
An example:
Ring node, nextNode;

RING_FOR(node, &foo->barRing, nextNode) {
  Bar bar = RING_ELT(Bar, fooRing, node);
  frob(bar);
}
.for.ex.elt: Notice the idiomatic use of RING_ELT which is almost universal
when using RING_FOR.
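.for.expansion: For concreteness, a plausible expansion of RING_FOR (not
necessarily the exact MPS source, and assuming a RingNext accessor that
returns the next node on the ring) is:

#define RING_FOR(node, ring, next)                   \
  for(node = RingNext(ring), next = RingNext(node);  \
      node != (ring);                                \
      node = (next), next = RingNext(node))

Because the "next" pointer is taken before the loop body runs, the current
node may be removed from the ring without disturbing the iteration
(.for.safe), and the "parent" node itself is never visited.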
SUBCLASS
.elt: RING_ELT is a macro that converts a pointer to a ring structure to a
pointer to the enclosing parent structure.
RING_ELT has three arguments which are, in order:
type, the type of a pointer to the enclosing structure,
field, the name of the ring structure field within it,
ring, the ring node.
The result is a pointer to the enclosing structure.
[ Why does RING_ELT not use PARENT or even offsetof? Apparently it's so that
it can cope with arrays of rings. GavinM 1997-04-15]
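One plausible definition (not necessarily the MPS source) recovers the parent
pointer by subtracting the offset of the ring field within the parent
structure; writing the offset out as pointer arithmetic, rather than using
offsetof, allows the field argument to be an array element such as ring[i]:

#define RING_ELT(type, field, node) \
  ((type)((char *)(node) - (size_t)(&((type)0)->field)))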
APPEND / REMOVE
.append: RingAppend appends a singleton ring to a ring (such that the newly
added element will be last in the iteration sequence).
.insert: RingInsert adds a singleton ring to a ring (such that the newly added
element will be first in the iteration sequence).
.remove: RingRemove removes an element from a ring, the newly removed element
becomes a singleton ring. It is an error for the element to already be a
singleton.
.improve.join: It would be possible to add a RingJoin operation. This is not
done as it is not required.
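.append.example: A short usage sketch (with hypothetical FooStruct and
BarStruct types, following the conventions of .def.ring and .naming): a Foo
owning a ring of Bars.

FooStruct foo;                             /* parent, with a barRing field */
BarStruct bar;                             /* child, with a fooRing field */

RingInit(&foo.barRing);                    /* .init: the ring head */
RingInit(&bar.fooRing);                    /* the node starts as a singleton */
RingAppend(&foo.barRing, &bar.fooRing);    /* .append: bar is now last */
/* ... */
RingRemove(&bar.fooRing);                  /* .remove: bar is a singleton again */
RingFinish(&bar.fooRing);                  /* .finish */
RingFinish(&foo.barRing);                  /* the head must be a singleton too */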
NAMING
.naming: By convention, when one structure Foo contains one ring of Bar
structures, the field in Foo is usually known as barRing, and the field in Bar
is known as fooRing. If the Foo structure contains more than one ring of Bar
structures, then they will have names such as spongRing and frobRing.
DEQUES
This section documents where rings differ significantly from deques.
.head: Deques used a distinguished head structure for the head of the ring.
Rings still have a separate head structure, but it is not distinguished by type.
DEFECTS
This section documents known defects with the current design.
.app_for.misuse: It is easy to pass RingAppend and RING_FOR the arguments in
the wrong order, as all the arguments have the same type. See .head.
.check.improve: There is no method for performing a full integrity check. This
could be added.
ATTACHMENT
"Ring Diagrams"

65
mps/design/root/index.txt Normal file

@ -0,0 +1,65 @@
THE DESIGN OF THE ROOT MANAGER
design.mps.root
incomplete design
richard 1995-08-25
INTRODUCTION
.intro:
.readership:
BASICS
.root.def: The root node of the object graph is the node which defines whether
objects are accessible, and the place from which the mutator acts to change the
graph. In the MPS, a root is an object which describes part of the root node.
The root node is the total of all the roots attached to the space. [Note that
this combines two definitions of root: the accessibility is what defines a root
for tracing (see analysis.tracer.root.*) and the mutator action for barriers
(see analysis.async-gc.root). pekka 1998-03-20]
.root.repr: Functionally, roots are defined by their scanning functions. Roots
_could_ be represented as function closures, i.e. a pointer to a C function and
some auxiliary fields. The most general variant of roots is just that.
However, for reasons of efficiency, some special variants are separated out.
DETAILS
Creation
.create: A root becomes "active" as soon as it is created.
.create.col: The root inherits its colour from the mutator, since it can only
contain references copied there by the mutator from somewhere else. If the
mutator is grey for a trace when a root is created then that root will be used
to determine accessibility for that trace. More specifically, the root will be
scanned when that trace flips.
Destruction
.destroy: It's OK to destroy a root at any time, except perhaps concurrently
with scanning it, but that's prevented by the arena lock. If a root is
destroyed the references in it become invalid and unusable.
Invariants
.inv.white: Roots are never white for any trace, because they cannot be
condemned.
.inv.rank: Roots always have a single rank. A root without ranks would be a
root without references, which would be pointless. The tracer doesn't support
multiple ranks in a single colour.
Scanning
.method: Root scanning methods are provided by the client so that the MPS can
locate and scan the root set. See protocol.mps.root for details. [There are
some more notes about root methods in meeting.qa.1996-10-16.]
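.method.example: As a purely illustrative sketch (not part of this design), a
client root scanning function of the kind registered through the public
interface described in protocol.mps.root might look like the following. The
structure my_roots_s and the table size are hypothetical; the fix macros and
types are assumed to be those of the public scanning protocol in mps.h.

#include "mps.h"

typedef struct my_roots_s {
  mps_addr_t refs[16];              /* hypothetical table of references */
} my_roots_s;

static mps_res_t my_root_scan(mps_ss_t ss, void *p, size_t s)
{
  my_roots_s *roots = p;
  size_t i;
  MPS_SCAN_BEGIN(ss) {
    for (i = 0; i < sizeof(roots->refs) / sizeof(roots->refs[0]); ++i) {
      mps_addr_t ref = roots->refs[i];
      if (MPS_FIX1(ss, ref)) {        /* is this reference of interest? */
        mps_res_t res = MPS_FIX2(ss, &roots->refs[i]);
        if (res != MPS_RES_OK)
          return res;
      }
    }
  } MPS_SCAN_END(ss);
  return MPS_RES_OK;
}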

76
mps/design/scan/index.txt Normal file

@ -0,0 +1,76 @@
THE DESIGN OF THE GENERIC SCANNER
design.mps.scan
incomplete design
richard 1995-08-25
SUMMARIES
Scanned Summary
.summary.subset: The summary of references seen by scan (ss.unfixedSummary) is
a subset of the summary previously computed (SegSummary).
There are two reasons that it is not an equality relation:
1. If the segment has had objects forwarded onto it then its summary will get
unioned with the summary of the segment that the object was forwarded from.
This may increase the summary. The forwarded object of course may have a
smaller summary (if such a thing were to be computed) and so subsequent
scanning of the segment may reduce the summary. (The forwarding process may
erroneously introduce zones into the destination's summary).
2. A write barrier hit will set the summary to RefSetUNIV.
The reason that ss.unfixedSummary is always a subset of the previous summary is
due to an "optimization" which has not been made in TraceFix. See
impl.c.trace.fix.fixed.all.
Partial Scans
.clever-summary: With enough cleverness, it's possible to have partial scans of
condemned segments contribute to the segment summary. [We had a system which
nearly worked -- see MMsrc(MMdevel_poolams at 1997/08/14 13:02:55 BST), but it
did not handle the situation in which a segment was not under the write barrier
when it was condemned.]
.clever-summary.acc: Each time we partially scan a segment, we accumulate the
post-scan summary of the scanned objects into a field in the group, called
'summarySoFar'. The post-scan summary is (summary \ white) U fixed.
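.clever-summary.acc.example: A minimal sketch (not MPS source) of this
accumulation rule. RefSet is treated here as a plain bitset of zones so that
set difference can be written with bitwise operators; the Group type, field
and parameter names are illustrative only.

static void groupAccumulateSummary(Group group, RefSet segSummary,
                                   RefSet whiteSummary, RefSet fixedSummary)
{
  /* (summary \ white) U fixed */
  RefSet postScanSummary = (segSummary & ~whiteSummary) | fixedSummary;

  /* fold into the running total kept while the segment is condemned */
  group->summarySoFar = group->summarySoFar | postScanSummary;
}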
.clever-summary.acc.condemn: The cumulative summary is only meaningful while
the segment is condemned. Otherwise it is set to RefSetEMPTY (a value which we
can check).
.clever-summary.acc.reclaim: Then when we reclaim the segment, we set the
segment summary to the cumulative summary, as it is a post-scan summary of all
the scanned objects.
.clever-summary.acc.other-trace: If the segment is scanned by another trace
while it is condemned, the cumulative summary must be set to the post-scan
summary of this scan (otherwise it becomes out-of-date).
.clever-summary.scan: The scan summary is expected to be a summary of all
scanned references in the segment. We don't know this accurately until we've
scanned everything in the segment. So we add in the segment summary each time.
.clever-summary.scan.fix: TraceScan also expects the scan state fixed summary
to include the post-scan summary of all references which were white. Since we
don't scan all white references, we need to add in an approximation to the
summary of all white references which we didn't scan. This is the intersection
of the segment summary and the white summary.
.clever-summary.wb: If the cumulative summary is smaller than the mutator's
summary, a write-barrier is needed to prevent the mutator from invalidating it.
This means that sometimes we'd have to put the segment under the write-barrier
at condemn [this is not an operation currently available to pool class
implementations pekka 1998-02-26], which might not be very efficient.
.clever-summary.method.wb: We need a new pool class method, called when the
write barrier is hit (or possibly any barrier hit). The generic method will do
the usual TraceAccess work, the trivial method will do nothing.
.clever-summary.acc.wb: When the write barrier is hit, we need to correct the
cumulative summary to the mutator summary. This is approximated by setting the
summary to RefSetUNIV.

273
mps/design/seg/index.txt Normal file

@ -0,0 +1,273 @@
THE DESIGN OF THE MPS SEGMENT DATA STRUCTURE
design.mps.seg
incomplete design
drj 1997-04-03
INTRODUCTION
.intro: This document describes the MPS Segment data structure.
Document History
.hist.2: The initial draft (replacing various notes in revisions 0 and 1) was
drafted by Richard Brooksby <richard> on 1997-04-03 as part of editing
MMsrc!seg.c(MMdevel_action2.1).
.hist.3: Rewritten to separate segments and tracts, following
mail.tony.1998-11-02.10-26. tony 1999-04-16
OVERVIEW
.over.segments: Segments are the basic units of tracing and shielding. The MPM
also uses them as units of scanning and colour, although pool classes may
subdivide segments and be able to maintain colour on a finer grain (down to the
object level, for example).
.over.objects: The mutator's objects are stored in segments. Segments are
contiguous blocks of memory managed by some pool. .segments.pool: The
arrangement of objects within a segment is determined by the class of the pool
which owns the segment. The pool is associated with the segment indirectly via
the first tract of the segment.
.over.memory: The relationship between segments and areas of memory is
maintained by the segment module. Pools acquire tracts from the arena, and
release them back to the arena when they don't need them any longer. The
segment module can associate contiguous tracts owned by the same pool with a
segment. The segment module provides the methods SegBase, SegLimit, and SegSize
which map a segment onto the addresses of the memory block it represents.
.over.hierarchy: The Segment datastructure is designed to be subclassable (see
design.mps.protocol). The basic segment class (Seg) supports colour and
protection for use by the tracer, as well as support for a pool ring, and all
generic segment functions. Clients may use Seg directly, but will most probably
want to use a subclass with additional properties.
.over.hierarchy.gcseg: The segment module provides GCSeg - a subclass of Seg
which has full support for GC including buffering and the ability to be linked
onto the grey ring.
DATA STRUCTURE
typedef struct SegStruct {              /* segment structure */
  Sig sig;                              /* impl.h.misc.sig */
  SegClass class;                       /* segment class structure */
  Tract firstTract;                     /* first tract of segment */
  RingStruct poolRing;                  /* link in list of segs in pool */
  Addr limit;                           /* limit of segment */
  unsigned depth : SHIELD_DEPTH_WIDTH;  /* see impl.c.shield.def.depth */
  AccessSet pm : AccessMAX;             /* protection mode, impl.c.shield */
  AccessSet sm : AccessMAX;             /* shield mode, impl.c.shield */
  TraceSet grey : TRACE_MAX;            /* traces for which seg is grey */
  TraceSet white : TRACE_MAX;           /* traces for which seg is white */
  TraceSet nailed : TRACE_MAX;          /* traces for which seg has nailed objects */
  RankSet rankSet : RankMAX;            /* ranks of references in this seg */
} SegStruct;

typedef struct GCSegStruct {            /* GC segment structure */
  SegStruct segStruct;                  /* superclass fields must come first */
  RingStruct greyRing;                  /* link in list of grey segs */
  RefSet summary;                       /* summary of references out of seg */
  Buffer buffer;                        /* non-NULL if seg is buffered */
  Sig sig;                              /* design.mps.sig */
} GCSegStruct;
.field.rankSet: The "rankSet" field represents the set of ranks of the
references in the segment. It is initialized to empty by SegInit.
.field.rankSet.single: The Tracer only permits one rank per segment [ref?] so
this field is either empty or a singleton. .field.rankSet.empty: An empty
rankSet indicates that there are no references. If there are no references in
the segment then it cannot contain black or grey references.
.field.rankSet.start: If references are stored in the segment then it must be
updated, along with the summary (.field.summary.start).
.field.depth: The "depth" field is used by the Sheild (impl.c.shield) to manage
protection of the segment. It is initialized to zero by SegInit.
.field.sm: The "sm" field is used by the Shield (impl.c.shield) to manage
protection of the segment. It is initialized to AccessSetEMPTY by SegInit.
.field.pm: The "pm" field is used by the Shield (impl.c.shield) to manage
protection of the segment. It is initialized to AccessSetEMPTY by SegInit.
The field is used by both the shield and the ANSI fake protection
(impl.c.protan).
.field.black: The "black" field is the set of traces for which there may be
black objects (i.e. objects containing references, but no references to white
objects) in the segment. More precisely, if there is a black object for a
trace in the segment then that trace will appear in the "black" field. It is
initialized to TraceSetEMPTY by SegInit.
.field.grey: The "grey" field is the set of traces for which there may be grey
objects (i.e. containing references to white objects) in the segment. More
precisely, if there is a reference to a white object for a trace in the segment
then that trace will appear in the "grey" field. It is initialized to
TraceSetEMPTY by SegInit.
.field.white: The "white" field is the set of traces for which there may be
white objects in the segment. More precisely, if there is a white object for a
trace in the segment then that trace will appear in the "white" field. It is
initialized to TraceSetEMPTY by SegInit.
.field.summary: The "summary" field is an approximation to the set of all
references in the segment. If there is a reference R in the segment, then
RefSetIsMember(summary, R) is TRUE. The summary is initialized to RefSetEMPTY
by SegInit. .field.summary.start: If references are stored in the segment then
it must be updated, along with rankSet (.field.rankSet.start).
.field.buffer: The "buffer" field is either NULL, or points to the descriptor
structure of the buffer which is currently allocating in the segment. The
field is initialized to NULL by SegInit. .field.buffer.owner: This buffer must
belong to the same pool as the segment, because only that pool has the right to
attach it.
INTERFACE
Splitting and Merging
.split-and-merge: There is support for splitting and merging segments, to give
pools the flexibility to rearrange their tracts among segments as they see fit.
.split: Segments may be split with the function SegSplit
Res SegSplit(Seg *segLoReturn, Seg *segHiReturn, Seg seg, Addr at,
Bool withReservoirPermit, ...);
If successful, segment "seg" is split at address "at", yielding two segments
which are returned in segLoReturn and segHiReturn for the low and high segments
respectively. The base of the low segment is the old base of "seg". The limit
of the low segment is "at". The base of the high segment is "at". This limit of
the high segment is the old limit of "seg". "seg" is effectively destroyed
during this operation (actually, it might be reused as one of the returned
segments). Segment subclasses may make use of the optional arguments; the
built-in classes do not.
.split.invariants: The client must ensure some invariants are met before
calling SegSplit:
.split.inv.align: "at" must be appropriately aligned to the arena alignment,
and lie between the base and limit of "seg". Justification: the split segments
cannot be represented if this is not so.
.split.inv.buffer: If "seg" is attached to a buffer, the buffered region must
not include address "at". Justification: the segment module is not in a
position to know how (or whether) a pool might wish to split a buffer. This
permits the buffer to remain attached to just one of the returned segments.
.split.state: Except as noted above, the segments returned have the same
properties as "seg". I.e. their colour, summary, rankset, nailedness etc. are
set to the values of "seg".
.merge: Segments may be merged with the function SegMerge
Res SegMerge(Seg *mergedSegReturn, Seg segLo, Seg segHi,
Bool withReservoirPermit, ...);
If successful, segments "segLo" and "segHi" are merged together, yielding a
segment which is returned in mergedSegReturn. "segLo" and "segHi" are
effectively destroyed during this operation (actually, one of them might be
reused as the merged segment). Segment subclasses may make use of the optional
arguments; the built-in classes do not.
.merge.invariants: The client must ensure some invariants are met before
calling SegMerge:
.merge.inv.abut: The limit of "segLo" must be the same as the base of "segHi".
Justification: the merged segment cannot be represented if this is not so.
.merge.inv.buffer: One or other of "segLo" and "segHi" may be attached to a
buffer, but not both. Justification: the segment module does not support
attachment of a single seg to 2 buffers.
.merge.inv.similar: "segLo" and "segHi" must be sufficiently similar. Two
segments are sufficiently similar if they have identical values for each of the
following fields: class, sm, grey, white, nailed, rankSet. Justification: there
is no single choice of behaviour for cases where these fields are not
identical. The pool class must make its own choices about this if it wishes to
permit more flexible merging. If so, it should be a simple matter for the pool
to arrange for the segments to look sufficiently similar before calling
SegMerge.
.merge.state: The merged segment will share the same state as "segLo" and
"segHi" for those fields which are identical (see .merge.inv.similar). The
summary will be the union of the summaries of "segLo" and "segHi".
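.split-and-merge.example: A minimal usage sketch (not MPS source): split a
segment at an arena-aligned address and later merge the two halves back
together. The wrapper function poolResizeExample is hypothetical; "seg" and
"at" are assumed to satisfy the invariants above, and pool-specific
bookkeeping is omitted.

static Res poolResizeExample(Seg seg, Addr at)
{
  Seg segLo, segHi, merged;
  Res res;

  /* Split "seg" at "at" (.split.inv.align and .split.inv.buffer are
   * assumed to hold). */
  res = SegSplit(&segLo, &segHi, seg, at, FALSE);
  if (res != ResOK)
    return res;

  /* ... the pool may now manage segLo and segHi independently ... */

  /* Later, merge the two abutting segments back together
   * (.merge.inv.abut and .merge.inv.similar hold by construction). */
  res = SegMerge(&merged, segLo, segHi, FALSE);
  if (res != ResOK)
    return res;
  return ResOK;
}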
EXTENSIBILITY
Splitting and Merging
.method.split: Segment subclasses may extend the support for segment splitting
by defining their own "split" method.
Res segSplit(Seg seg, Seg segHi,
Addr base, Addr mid, Addr limit,
Bool withReservoirPermit, va_list args)
On entry, "seg" is a segment with region [base,limit), "segHi" is
uninitialized, "mid" is the address at which the segment is to be split. The
method is responsible for destructively modifying "seg" and initializing
"segHi" so that on exit "seg" is a segment with region [base,mid) and segHi is
a segment with region [mid,limit). Usually a method would only directly modify
the fields defined for the segment subclass. This might involve allocation,
which may use the reservoir if "withReservoirPermit" is TRUE.
.method.split.next: A split method should always call the next method, either
before or after any class-specific code (see design.mps.protocol
.overview.next-method).
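.method.split.example: A skeleton sketch (not MPS source) of a subclass
"split" method following .method.split, .method.split.next and
.split-merge.fail. The downcast MustBeMySeg, the helpers
mySegAllocExtraState, mySegFreeExtraState and mySegInitHiFields, and the way
the next (superclass) method is obtained via NextSplitMethod are all
hypothetical; see design.mps.protocol for the real next-method mechanism.

static Res mySegSplit(Seg seg, Seg segHi,
                      Addr base, Addr mid, Addr limit,
                      Bool withReservoirPermit, va_list args)
{
  MySeg myseg = MustBeMySeg(seg);      /* hypothetical downcast and check */
  Res res;

  /* Class-specific work that can fail happens first, so that failure
   * is detected before calling the next method (.split-merge.fail). */
  res = mySegAllocExtraState(myseg, withReservoirPermit);
  if (res != ResOK)
    return res;

  /* Call the next method (.method.split.next). */
  res = NextSplitMethod(seg, segHi, base, mid, limit,
                        withReservoirPermit, args);
  if (res != ResOK) {
    mySegFreeExtraState(myseg);        /* undo the class-specific work */
    return res;
  }

  /* Initialize the subclass fields of the new high segment. */
  mySegInitHiFields(seg, segHi, mid);
  return ResOK;
}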
.method.merge: Segment subclasses may extend the support for segment merging by
defining their own "merge" method.
Res segMerge(Seg seg, Seg segHi,
Addr base, Addr mid, Addr limit,
Bool withReservoirPermit, va_list args)
On entry, "seg" is a segment with region [base,mid), "segHi" is a segment with
region [mid,limit). The method is responsible for destructively modifying "seg"
and finishing "segHi" so that on exit "seg" is a segment with region
[base,limit) and segHi is garbage. Usually a method would only modify the
fields defined for the segment subclass. This might involve allocation, which
may use the reservoir if "withReservoirPermit" is TRUE. .method.merge.next: A
merge method should always call the next method, either before or after any
class-specific code (see design.mps.protocol.overview.next-method).
.split-merge.shield: Split and merge methods may assume that the segments they
are manipulating are not in the shield cache. .split-merge.shield.flush: The
shield cache is flushed before any split or merge methods are invoked.
.split-merge.shield.re-flush: If a split or merge method performs an operation
on a segment which might cause the segment to be cached, the method must flush
the shield cache before returning or calling another split or merge method.
.split-merge.fail: Split and merge methods might fail, in which case segments
"seg" and "segHi" must be equivalently valid and configured at exit as they
were according to the entry conditions. It's simplest if the failure can be
detected before calling the next method (e.g. by allocating any objects early
in the method). .split-merge.fail.anti: If it's not possible to detect failure
before calling the next method, the appropriate anti-method must be used (see
design.mps.protocol.guide.fail.after-next). Split methods are anti-methods for
merge methods, and vice-versa. .split-merge.fail.anti.constrain: In general,
care should be taken when writing split and merge methods to ensure that they
really are anti-methods for each other. The anti-method must not fail if the
initial method succeeded. The anti-method should reverse any side effects of
the initial method, except where it's known to be safe to avoid this (see
.split-merge.fail.summary for an example of a safe case).
.split-merge.fail.anti.no: If this isn't possible (it might not be) then the
methods won't support after-next failure. This fact should be documented, if
the methods are intended to support further specialization. Note that using
va_arg with the "args" parameter is sufficient to make it impossible to reverse
all side effects.
.split-merge.fail.summary: The segment summary might not be restored exactly
after a failed merge operation. Each segment would be left with a summary which
is the union of the original summaries (see .merge.state). This increases the
conservatism in the summaries, but is otherwise safe.
.split-merge.unsupported: Segment classes need not support segment splitting
and merging at all. The function SegClassMixInNoSplitMerge is supplied to set
the split and merge methods to methods that do not support the operation, and
which will report an error in checking varieties.

28
mps/design/sig/index.txt Normal file

@ -0,0 +1,28 @@
THE DESIGN OF THE MEMORY POOL SYSTEM SIGNATURE SYSTEM
design.mps.sig
incomplete design
richard 1995-08-25
TESTING:
.test.uniq: The unix command
sed -n '/^#define [a-zA-Z]*Sig/s/[^(]*(/(/p' *.[ch]| sort| uniq -c
will display all signatures defined in the MPS along with a count of how many
times they are defined. If any counts are greater than 1, then the same
signature value is being used for different signatures. This is undesirable
and the problem should be investigated. People not using unix may still find
the RE useful.
TEXT:
Signatures are magic numbers which are written into structures
when they are created and invalidated (by overwriting with
SigInvalid) when they are destroyed. They provide a limited form
of run-time type checking and dynamic scope checking.
Signature values should be transliterations of the corresponding words into
hex, as described in guide.hex.trans. The first three hex digits should be the
transliteration of "SIG".

792
mps/design/splay/index.txt Normal file

@ -0,0 +1,792 @@
DESIGN OF SPLAY TREES
design.mps.splay
draft doc
gavinm 1998-05-01
INTRODUCTION
.intro: This document explains the design of impl.c.splay, an implementation of
Splay Trees, including its interface and implementation.
.readership: This document is intended for any MM developer.
.source: The primary sources for this design are paper.st85(0) and
paper.sleator96(0). Also as CBS is a client, design.mps.cbs. As PoolMVFF is
an indirect client, design.mps.poolmvff(1). Also, as PoolMV2 is an
(obsolescent?) indirect client, design.mps.poolmv2.
.background: The following background documents influence the design:
guide.impl.c.adt(0).
Document History
.hist.0: Written by GavinM 1998-05-01, made draft 1998-05-27.
.hist.1: Added client properties. GavinM 1998-09-09
.hist.2: Polished for review (chiefly adding a DEFINITIONS section). drj
1999-03-10
.hist.3: Edited after review. tony 1999-03-31
OVERVIEW
.overview: Splay trees are a form of binary tree where each access brings the
accessed element (or the nearest element) to the root of the tree. The
restructuring of the tree caused by the access gives excellent amortised
performance, as the splay tree adapts its shape to usage patterns. Unused
nodes have essentially no time overhead. For a cute animation of splay trees,
see <URL:http://langevin.usc.edu/BST/SplayTree-Example.html>.
DEFINITIONS
.def.splay-tree: A "Splay Tree" is a self-adjusting binary tree as described in
paper.st85(0), paper.sleator96(0).
.def.node: A "node" is used in the typical datastructure sense to mean an
element of a tree (see also .type.splay.node).
.def.key: A "key" is a value associated with each node; the keys are totally
ordered by a client provided comparator.
.def.comparator: A "comparator" is a function that compares keys to determine
their ordering (see also .type.splay.compare.method).
.def.successor: Node N1 is the "successor" of node N2 if N1 and N2 are both in
the same tree, and the key of N1 immediately follows the key of N2 in the
ordering of all keys for the tree.
.def.left-child: Each node N contains a "left child", which is a (possibly
empty) sub-tree of nodes. The key of N is ordered after the keys of all nodes
in this sub-tree.
.def.right-child: Each node N contains a "right child", which is a (possibly
empty) sub-tree of nodes. The key of N is ordered before the keys of all nodes
in this sub-tree.
.def.neighbour: A node N which has key Kn is a "neighbour" of a key K if either
Kn is the first key in the total order which compares greater than K or if Kn
is the last key in the total order which compares less than K.
.def.first: A node is the "first" node in a set of nodes if its key compares
less than the keys of all other nodes in the set.
.def.last: A node is the "last" node in a set of nodes if its key compares
greater than the keys of all other nodes in the set.
.def.client-property: A "client property" is a value that the client may
associate with each node in addition to the key (a block size, for example).
This splay tree implementation provides support for efficiently finding the
first or last nodes with suitably large client property values. See also .prop
below.
REQUIREMENTS
.req: These requirements are drawn from those implied by design.mps.poolmv2,
design.mps.poolmvff(1), design.mps.cbs(2) and general inferred MPS requirements.
.req.order: Must maintain a set of abstract keys which is totally ordered for a
comparator.
.req.tree: The keys must be associated with nodes arranged in a Splay Tree.
.req.splay: Common operations must balance the tree by splaying it, to achieve
low amortized cost (see paper.st85(0)).
.req.add: Must be able to add new members. This is a common operation.
.req.remove: Must be able to remove members. This is a common operation.
.req.locate: Must be able to locate a member, given a key. This is a common
operation.
.req.neighbours: Must be able to locate the neighbouring members (in order) of
a non-member, given a key (see .def.neighbour). This is a common operation.
.req.iterate: Must be able to iterate over all members in order with reasonable
efficiency.
.req.protocol: Must support detection of protocol violations.
.req.debug: Must support debugging of clients.
.req.stack: Must do all non-debugging operations with stack usage bounded by a
constant size.
.req.adapt: Must adapt to regularities in usage pattern, for better performance.
.req.property: Must permit a client to associate a client property (such as a
size) with each node in the tree.
.req.property.change: Must permit a client to dynamically reassign client
properties to nodes in the tree. This is a common operation.
.req.property.find: Must support rapid finding of the first and last nodes
which have a suitably large value for their client property. This is a common
operation.
.req.root: Must be able to find the root of a splay tree (if one exists).
EXTERNAL TYPES
.type.splay.tree: SplayTree is the type of the main object at the root of the
splay tree. It is intended that the SplayTreeStruct can be embedded in another
structure (see .usage.client-tree for an example). No convenience functions
are provided for allocation or deallocation.
typedef struct SplayTreeStruct SplayTreeStruct, *SplayTree;
.type.splay.node: SplayNode is the type of a node of the splay tree.
SplayNodeStruct contains no fields to store the key associated with the node,
or the client property. Again, it is intended that the SplayNodeStruct can be
embedded in another structure, and that this is how the association will be
made (see .usage.client-node for an example). No convenience functions are
provided for allocation or deallocation.
typedef struct SplayNodeStruct SplayNodeStruct, *SplayNode;
.type.splay.compare.method: SplayCompareMethod is a pointer to a function with
the following prototype:
Compare compare(void *key, SplayNode node);
The function is required to compare the key with the key the client associates
with that splay tree node, and return the appropriate Compare value (see
.usage.compare for an example). The function compares a key with a node, rather
than a pair of keys or nodes as might seem more obvious. This is because the
details of the mapping between nodes and keys is left to the client (see
.type.splay.node), and the splaying operations compare keys with nodes (see
.impl.splay).
.type.splay.node.describe.method: SplayNodeDescribeMethod is a pointer to a
function with the following prototype:
Res nodeDescribe(SplayNode node, mps_lib_FILE *stream)
The function is required to write (via WriteF) a client-oriented representation
of the splay node. The output should be non-empty, short, and without return
characters. This is provided for debugging purposes only.
.type.splay.test.node.method: SplayTestNodeMethod is a pointer to a function
with the following prototype:
Bool testNode(SplayTree tree, SplayNode node, void *closureP, unsigned long
closureS);
The function is required to determine whether the node itself meets some client
determined property (see .prop and .usage.test.node for an example). Parameters
closureP and closureS describe the environment for the function (see
.function.splay.find.first and .function.splay.find.last).
.type.splay.test.tree.method: SplayTestTreeMethod is a pointer to a function
with the following prototype:
Bool testTree(SplayTree tree, SplayNode node, void *closureP, unsigned long
closureS);
The function is required to determine whether any of the nodes in the sub-tree
rooted at the given node meet some client determined property (see .prop and
.usage.test.tree for an example). In particular, it must be a precise (not
conservative) indication of whether there are any nodes in the sub-tree for
which the testNode method (see .type.splay.test.node.method) would return TRUE.
Parameters closureP and closureS describe the environment for the function (see
.function.splay.find.first and .function.splay.find.last).
.type.splay.update.node.method: SplayUpdateNodeMethod is a pointer to a
function with the following prototype:
void updateNode(SplayTree tree, SplayNode node, SplayNode leftChild,
SplayNode rightChild);
The function is required to update any client datastructures associated with a
node to maintain some client determined property (see .prop) given that the
children of the node have changed. If the node does not have one or both
children, then NULL will be passed as the relevant parameter. (See
.usage.callback for an example)
EXTERNAL FUNCTIONS
.function.no-thread: The interface functions are not designed to be either
thread-safe or re-entrant. Clients of the interface are responsible for
synchronization, and for ensuring that client-provided methods invoked by the
splay module (.type.splay.compare.method, .type.splay.test.node.method,
.type.splay.test.tree.method, .type.splay.update.node.method) do not call
functions of the splay module.
.function.splay.tree.check: This is a check function for the SplayTree type
(see guide.impl.c.adt.method.check & design.mps.check(0)):
Bool SplayTreeCheck(SplayTree tree);
.function.splay.node.check: This is a check function for the SplayNode type
(see guide.impl.c.adt.method.check & design.mps.check(0)):
Bool SplayNodeCheck(SplayNode node);
.function.splay.tree.init: This function initialises a SplayTree (see
guide.impl.c.adt.method.init). It requires a compare method that defines a
total ordering on nodes (see .req.order); the effect of supplying a compare
method that does not implement a total ordering is undefined. It also requires
an updateNode method, which will be used to keep client properties up to date
when the tree structure changes; the value SplayTrivUpdateNode may be used for
this method if there is no need to maintain client properties. (See
.usage.initialization for an example use).
void SplayTreeInit(SplayTree tree, SplayCompareMethod compare,
SplayUpdateNodeMethod updateNode);
.function.splay.tree.finish: This function clears the fields of a SplayTree
(see guide.impl.c.adt.method.finish). Note that it does not attempt to finish
or deallocate any associated SplayNode objects; clients wishing to destroy a
non-empty SplayTree must first explicitly descend the tree and call
SplayNodeFinish on each node from the bottom up.
void SplayTreeFinish(SplayTree tree);
.function.splay.node.init: This function initialises a SplayNode (see
guide.impl.c.adt.method.init).
void SplayNodeInit(SplayNode node);
.function.splay.node.finish: This function clears the fields of a SplayNode
(see guide.impl.c.adt.method.finish). Note that it does not attempt to finish
or deallocate any referenced SplayNode objects (see .function.splay.tree.finish).
void SplayNodeFinish(SplayNode node);
.function.splay.root: This function returns the root node of the tree, if any
(see .req.root). If the tree is empty, FALSE is returned and *nodeReturn is
not changed. Otherwise, TRUE is returned and *nodeReturn is set to the root.
Bool SplayRoot(SplayNode *nodeReturn, SplayTree tree);
.function.splay.tree.insert: This function is used to insert into a splay tree
a new node which is associated with the supplied key (see .req.add). It first
splays the tree at the key. If an attempt is made to insert a node that
compares CompareEQUAL to an existing node in the tree, then ResFAIL will be
returned and the node will not be inserted. (See .usage.insert for an example
use).
Res SplayTreeInsert(SplayTree tree, SplayNode node, void *key);
.function.splay.tree.delete: This function is used to delete from a splay tree
a node which is associated with the supplied key (see .req.remove). If the
tree does not contain the given node, or the given node does not compare
CompareEQUAL with the given key, then ResFAIL will be returned, and the node
will not be deleted. The function first splays the tree at the given key.
(See .usage.delete for an example use).
Res SplayTreeDelete(SplayTree tree, SplayNode node, void *key);
.function.splay.tree.search: This function searches the splay tree for a node
that compares CompareEQUAL to the given key (see .req.locate). It splays the
tree at the key. It returns ResFAIL if there is no such node in the tree,
otherwise *nodeReturn will be set to the node.
Res SplayTreeSearch(SplayNode *nodeReturn, SplayTree tree, void *key);
.function.splay.tree.neighbours: This function searches a splay tree for the
two nodes that are the neighbours of the given key (see .req.neighbours). It
splays the tree at the key. *leftReturn will be the neighbour which compares
less than the key if such a neighbour exists; otherwise it will be NULL.
*rightReturn will be the neighbour which compares greater than the key if such
a neighbour exists; otherwise it will be NULL. The function returns ResFAIL if
any node in the tree compares CompareEQUAL with the given key. (See
.usage.insert for an example use).
Res SplayTreeNeighbours(SplayNode *leftReturn, SplayNode *rightReturn,
SplayTree tree, void *key);
.function.splay.tree.first: This function splays the tree at the first node,
and returns that node (see .req.iterate). The supplied key should compare
CompareLESS with all nodes in the tree. It will return NULL if the tree has no
nodes.
SplayNode SplayTreeFirst(SplayTree tree, void *zeroKey);
.function.splay.tree.next: This function receives a node and key and returns
the successor node to that node (see .req.iterate). This function is intended
for use in iteration when the received node will be the current root of the
tree, but is robust against being interspersed with other splay operations
(provided the old node still exists). The supplied key must compare
CompareEQUAL to the supplied node. Note that use of this function rebalances
the tree for each node accessed. If many nodes are accessed as a result of
multiple uses, the resultant tree will be generally well balanced. But if the
tree was previously beneficially balanced for a small working set of accesses,
then this local optimization will be lost. (see .future.parent).
SplayNode SplayTreeNext(SplayTree tree, SplayNode oldNode, void *oldKey);
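.function.splay.tree.next.example: A minimal iteration sketch using
SplayTreeFirst and SplayTreeNext, in terms of the FreeBlock example from the
USAGE section below. It assumes that no block has base address 0, so that a
null key compares CompareLESS with every node; FreeTree, FreeBlock and
FreeBlockOfSplayNode are the illustrative types and accessor from that
section.

void FreeTreeIterate(FreeTree tree)
{
  SplayTree splayTree = &tree->splayTree;
  SplayNode node;

  node = SplayTreeFirst(splayTree, (void *)0);   /* zeroKey: before all bases */
  while (node != NULL) {
    FreeBlock block = FreeBlockOfSplayNode(node);
    /* ... visit block ... */
    /* the block's own base is a key comparing CompareEQUAL to it */
    node = SplayTreeNext(splayTree, node, (void *)block->base);
  }
}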
.function.splay.tree.describe: This function prints (using WriteF) to the
stream a textual representation of the given splay tree, using nodeDescribe to
print client-oriented representations of the nodes (see .req.debug).
Res SplayTreeDescribe(SplayTree tree, mps_lib_FILE *stream,
SplayNodeDescribeMethod nodeDescribe);
.function.splay.find.first: SplayFindFirst finds the first node in the tree
that satisfies some client property (as determined by the testNode and testTree
methods) (see .req.property.find). closureP and closureS are arbitrary values,
and are passed to the testNode and testTree methods which may use the values as
closure environments. If there is no satisfactory node, then FALSE is
returned, otherwise *nodeReturn is set to the node. (See .usage.delete for an
example use).
Bool SplayFindFirst(SplayNode *nodeReturn, SplayTree tree,
SplayTestNodeMethod testNode, SplayTestTreeMethod testTree, void *closureP,
unsigned long closureS);
.function.splay.find.last: SplayFindLast finds the last node in the tree that
satisfies some client property (as determined by the testNode and testTree
methods) (see .req.property.find). closureP and closureS are arbitrary values,
and are passed to the testNode and testTree methods which may use the values as
closure environments. If there is no satisfactory node, then FALSE is
returned, otherwise *nodeReturn is set to the node.
Bool SplayFindLast(SplayNode *nodeReturn, SplayTree tree,
SplayTestNodeMethod testNode, SplayTestTreeMethod testTree, void *closureP,
unsigned long closureS);
.function.splay.node.refresh: SplayNodeRefresh must be called whenever the
client property (see .prop) at a node changes (see .req.property.change). It
will call the updateNode method on the given node, and any other nodes that may
require update. The client key for the node must also be supplied; the
function splays the tree at this key. (See .usage.insert for an example use).
void SplayNodeRefresh(SplayTree tree, SplayNode node, void *key);
CLIENT-DETERMINED PROPERTIES
.prop: To support .req.property.find, this splay tree implementation provides
additional features to permit clients to cache maximum (or minimum) values of
client properties for all the nodes in a subtree. The splay tree
implementation uses the cached values as part of SplayFindFirst and
SplayFindLast via the testNode and testTree methods. The client is free to
choose how to represent the client property, and how to compute and store the
cached value.
.prop.update: The cached values depend upon the topology of the tree, which may
vary as a result of operations on the tree. The client is given the
opportunity to compute new cache values whenever necessary, via the updateNode
method (see .function.splay.tree.init). This happens whenever the tree is
restructured. The client may use the SplayNodeRefresh method to indicate that
the client attributes at a node have changed (see .req.property.change). A call
to SplayNodeRefresh splays the tree at the specified node, which may provoke
calls to the updateNode method as a result of the tree restructuring. The
updateNode method will also be called whenever a new splay node is inserted
into the tree.
.prop.example: For example, if implementing an address ordered tree of free
blocks using a splay tree, a client might choose to use the base address of
each block as the key for each node, and the size of each block as the client
property. The client can then maintain as a cached value in each node the size
of the largest block in the subtree rooted at that node. This will permit a
fast search for the first or last block of at least a given size. See
.usage.callback for an example updateNode method for such a client.
.prop.ops: The splay operations must cause client properties for nodes to be
updated in the following circumstances (see .impl.* for details):
.prop.ops.rotate: rotate left, rotate right -- We need to update the value at
the original root, and the new root, in that order.
.prop.ops.link: link left, link right -- We know that the line of right descent
from the root of the left tree and the line of left descent from the root of
the right tree will both need to be updated. This is performed at the assembly
stage. (We could update these chains every time we do a link left or link right
instead, but this would be less efficient)
.prop.ops.assemble: assemble -- This operation also invalidates the lines of
right and left descent of the left and right trees respectively which need to
be updated (see below). It also invalidates the root which must be updated
last.
.prop.ops.assemble.reverse: To correct the chains of the left and right trees
without requiring stack or high complexity, we use a judicious amount of
pointer reversal.
.prop.ops.assemble.traverse: During the assembly, after the root's children
have been transplanted, we correct the chains of the left and right trees. For
the left tree, we traverse the right child line, reversing pointers, until we
reach the node that was the last node prior to the transplantation of the
root's children. Then we update from that node back to the left tree's root,
restoring pointers. Updating the right tree is the same, mutatis mutandis.
(See .future.reverse for an alternative approach).
USAGE
.usage: Here's a simple example of a client which uses a splay tree to
implement an address ordered tree of free blocks. The significant client usages
of the splay tree interface might look as follows:-
.usage.client-tree: Tree structure to embed a SplayTree (see .type.splay.tree):
typedef struct FreeTreeStruct {
  SplayTreeStruct splayTree;   /* Embedded splay tree */
  /* no obvious client fields for this simple example */
} FreeTreeStruct;
.usage.client-node: Node structure to embed a SplayNode (see .type.splay.node):
typedef struct FreeBlockStruct {
  SplayNodeStruct splayNode;   /* embedded splay node */
  Addr base;                   /* base address of block is also the key */
  Size size;                   /* size of block is also the client property */
  Size maxSize;                /* cached value for maximum size in subtree */
} FreeBlockStruct;
.usage.callback: updateNode callback method (see
.type.splay.update.node.method):
void FreeBlockUpdateNode(SplayTree tree, SplayNode node,
                         SplayNode leftChild, SplayNode rightChild)
{
  /* Compute the maximum size of any block in this subtree. */
  /* The value to cache is the maximum of the size of this block, */
  /* the cached value for the left subtree (if any) and the cached */
  /* value of the right subtree (if any). */
  FreeBlock freeNode = FreeBlockOfSplayNode(node);
  Size maxSize = freeNode->size;
  if (leftChild != NULL) {
    FreeBlock leftNode = FreeBlockOfSplayNode(leftChild);
    if (leftNode->maxSize > maxSize)
      maxSize = leftNode->maxSize;
  }
  if (rightChild != NULL) {
    FreeBlock rightNode = FreeBlockOfSplayNode(rightChild);
    if (rightNode->maxSize > maxSize)
      maxSize = rightNode->maxSize;
  }
  freeNode->maxSize = maxSize;
}
.usage.compare: Comparison function (see .type.splay.compare.method):
Compare FreeBlockCompare(void *key, SplayNode node) {
  Addr base1, base2, limit2;
  FreeBlock freeNode = FreeBlockOfSplayNode(node);
  base1 = (Addr)key;
  base2 = freeNode->base;
  limit2 = AddrAdd(base2, freeNode->size);
  if (base1 < base2)
    return CompareLESS;
  else if (base1 >= limit2)
    return CompareGREATER;
  else
    return CompareEQUAL;
}
.usage.test.tree: Test tree function (see .type.splay.test.tree.method):
Bool FreeBlockTestTree(SplayTree tree, SplayNode node,
                       void *closureP, unsigned long closureS) {
  /* Closure environment has wanted size as value of closureS. */
  /* Look at the cached value for the node to see if any */
  /* blocks in the subtree are big enough. */
  Size size = (Size)closureS;
  FreeBlock freeNode = FreeBlockOfSplayNode(node);
  return freeNode->maxSize >= size;
}
.usage.test.node: Test node function (see .type.splay.test.node.method):
Bool FreeBlockTestNode(SplayTree tree, SplayNode node,
                       void *closureP, unsigned long closureS) {
  /* Closure environment has wanted size as value of closureS. */
  /* Look at the size of the node to see if it is big enough. */
  Size size = (Size)closureS;
  FreeBlock freeNode = FreeBlockOfSplayNode(node);
  return freeNode->size >= size;
}
.usage.initialization: Client's initialization function (see
.function.splay.tree.init):
void FreeTreeInit(FreeTree tree) {
  /* Initialize the embedded splay tree. */
  SplayTreeInit(&tree->splayTree, FreeBlockCompare, FreeBlockUpdateNode);
}
.usage.insert: Client function to add a new free block into the tree, merging
it with an existing block if possible:
void FreeTreeInsert(FreeTree tree, Addr base, Addr limit) {
  SplayTree splayTree = &tree->splayTree;
  SplayNode leftNeighbour, rightNeighbour;
  void *key = (void *)base; /* use the base of the block as the key */
  Res res;

  /* Look for any neighbouring blocks. (.function.splay.tree.neighbours) */
  res = SplayTreeNeighbours(&leftNeighbour, &rightNeighbour,
                            splayTree, key);
  AVER(res == ResOK); /* this client doesn't duplicate free blocks */
  /* Look to see if the neighbours are contiguous. */
  if (leftNeighbour != NULL &&
      FreeBlockLimitOfSplayNode(leftNeighbour) == base) {
    /* Inserted block is contiguous with left neighbour, so merge it. */
    /* The client housekeeping is left as an exercise to the reader. */
    /* This changes the size of a block, which is the client */
    /* property of the splay node. See .function.splay.node.refresh */
    SplayNodeRefresh(splayTree, leftNeighbour, key);
  } else if (rightNeighbour != NULL &&
             FreeBlockBaseOfSplayNode(rightNeighbour) == limit) {
    /* Inserted block is contiguous with right neighbour, so merge it. */
    /* The client housekeeping is left as an exercise to the reader. */
    /* This changes the size of a block, which is the client */
    /* property of the splay node. See .function.splay.node.refresh */
    SplayNodeRefresh(splayTree, rightNeighbour, key);
  } else {
    /* Not contiguous - so insert a new node */
    FreeBlock newBlock = (FreeBlock)allocate(sizeof(FreeBlockStruct));
    SplayNode splayNode = &newBlock->splayNode;
    newBlock->base = base;
    newBlock->size = AddrOffset(base, limit);
    SplayNodeInit(splayNode); /* .function.splay.node.init */
    /* .function.splay.tree.insert */
    res = SplayTreeInsert(splayTree, splayNode, key);
    AVER(res == ResOK); /* this client doesn't duplicate free blocks */
  }
}
.usage.delete: Client function to allocate the first block of a given size in
address order. For simplicity, this allocates the entire block:
Bool FreeTreeAllocate(Addr *baseReturn, Size *sizeReturn,
                      FreeTree tree, Size size) {
  SplayTree splayTree = &tree->splayTree;
  SplayNode splayNode;
  Bool found;

  /* Look for the first node of at least the given size. */
  /* closureP parameter is not used. See .function.splay.find.first. */
  found = SplayFindFirst(&splayNode, splayTree,
                         FreeBlockTestNode, FreeBlockTestTree,
                         NULL, (unsigned long)size);
  if (found) {
    FreeBlock freeNode = FreeBlockOfSplayNode(splayNode);
    void *key = (void *)freeNode->base; /* use base of block as the key */
    Res res;
    /* allocate the block */
    *baseReturn = freeNode->base;
    *sizeReturn = freeNode->size;
    /* remove the node from the splay tree - .function.splay.tree.delete */
    res = SplayTreeDelete(splayTree, splayNode, key);
    AVER(res == ResOK); /* Must be possible to delete node */
    /* Delete the block */
    deallocate(freeNode, sizeof(FreeBlockStruct));
    return TRUE;
  } else {
    /* No suitable block */
    return FALSE;
  }
}
IMPLEMENTATION
.impl: For more details of how splay trees work, see paper.st85(0). For more
details of how to implement operations on splay trees, see paper.sleator96(0).
Here we describe the operations involved.
Top-Down Splaying
.impl.top-down: The method chosen to implement the splaying operation is called
"top-down splay". This is described as "procedure top-down splay" in
paper.st85(0) - although the implementation here additionally permits attempts
to access items which are not known to be in the tree. Top-down splaying is
particularly efficient for the common case where the location of the node in a
tree is not known at the start of an operation. Tree restructuring happens as
the tree is descended, whilst looking for the node.
.impl.splay: The key to the operation of the splay tree is the internal
function SplaySplay. It searches the tree for a node with a given key and
returns whether it succeeded. In the process, it brings the found node, or an
arbitrary neighbour if not found, to the root of the tree. This
"bring-to-root" operation is performed top-down during the search, and it is
not the simplest possible bring-to-root operation, but the resulting tree is
well-balanced, and will give good amortised cost for future calls to
SplaySplay. (See paper.st85(0))
.impl.splay.how: To perform this top-down splay, the tree is broken into three
parts, a left tree, a middle tree and a right tree. We store the left tree and
right tree in the right and left children respectively of a "sides" node to
eliminate some boundary conditions. The initial condition is that the middle
tree is the entire splay tree, and the left and right trees are empty. We also
keep pointers to the last node in the left tree, and the first node in the
right tree. Note that, at all times, the three trees are each validly ordered,
and they form a partition with the ordering left, middle, right. The splay is
then performed by comparing the middle tree with the following six cases, and
performing the indicated operations, until none apply.
.impl.splay.cases: Note that paper.st85(0)(Fig. 3) describes only 3 cases: zig,
zig-zig and zig-zag. The additional cases described here are the symmetric
variants which are respectively called zag, zag-zag and zag-zig. In the
descriptions of these cases, "root" is the root of the middle tree; node->left
is the left child of node; node->right is the right child of node. The
comparison operators (<, >, ==) are defined to compare a key and a node in the
obvious way by comparing the supplied key with the node's associated key.
.impl.splay.zig: The "zig" case is where key < root, and either:
- key == root->left;
- key < root->left && root->left->left == NULL; or
- key > root->left && root->left->right == NULL.
The operation for the zig case is: link right (see .impl.link.right)
.impl.splay.zag: The "zag" case is where key > root, and either:
- key == root->right;
- key < root->right && root->right->left == NULL; or
- key > root->right && root->right->right == NULL.
The operation for the zag case is: link left (see .impl.link.left)
.impl.splay.zig.zig: The "zig-zig" case is where key < root && key < root->left
&& root->left->left != NULL. The operation for the zig-zig case is: rotate
right (see .impl.rotate.right) followed by link right (see .impl.link.right).
.impl.splay.zig.zag: The "zig-zag" case is where key < root && key > root->left
&& root->left->right != NULL. The operation for the zig-zag case is: link
right (see .impl.link.right) followed by link left (see .impl.link.left).
.impl.splay.zag.zig: The "zag-zig" case is where key > root && key <
root->right && root->right->left != NULL. The operation for the zag-zig case
is: link left (see .impl.link.left) followed by link right (see
.impl.link.right).
.impl.splay.zag.zag: The "zag-zag" case is where key > root && key >
root->right && root->right->right != NULL. The operation for the zag-zag case
is: rotate left (see .impl.rotate.left) followed by link left (see
.impl.link.left).
.impl.splay.terminal.null: A special terminal case is when root == NULL. This
can only happen at the beginning, and cannot arise from the operations above.
In this case, the splay operation must return NULL, and "not found".
.impl.splay.terminal.found: One typical terminal case is when key == root.
This case is tested for at the beginning, in which case "found" is returned
immediately. If this case happens as a result of other operations, the splay
operation is complete, the three trees are assembled (see .impl.assemble), and
"found" is returned.
.impl.splay.terminal.not-found: The other typical terminal cases are:
- key < root && root->left == NULL; and
- key > root && root->right == NULL.
In these cases, the splay operation is complete, the three trees are assembled
(see .impl.assemble), and "not found" is returned.
.impl.rotate.left: The "rotate left" operation (see paper.st85(0) Fig. 1)
rearranges the middle tree as follows (where any of sub-trees A, B and C may be
empty):
.impl.rotate.right: The "rotate right" operation (see paper.st85(0) Fig. 1)
rearranges the middle tree as follows (where any of sub-trees A, B and C may be
empty):
.impl.link.left: The "link left" operation (see paper.st85(0) Fig. 11a for
symmetric variant) rearranges the left and middle trees as follows (where any
of sub-trees A, B, L and R may be empty):
The last node of the left tree is now x.
.impl.link.right: The "link right" operation (see paper.st85(0) Fig. 11a)
rearranges the middle and right trees as follows (where any of sub-trees A, B,
L and R may be empty):
The first node of the right tree is now x.
.impl.assemble: The "assemble" operation (see paper.st85(0) Fig. 12) merges the
left and right trees with the middle tree as follows (where any of sub-trees A,
B, L and R may be empty):
Top-Level Operations
.impl.insert: SplayTreeInsert: (See paper.sleator96(0), chapter 4, function
insert). If the tree has no nodes, [how does it smell?] add the inserted node
and we're done; otherwise splay the tree around the supplied key. If the splay
successfully found a matching node, return failure. Otherwise, add the
inserted node as a new root, with the old (newly splayed, but non-matching)
root as its left or right child as appropriate, and the opposite child of the
old root as the other child of the new root.
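As a rough sketch of the above (building on the hypothetical splay() and Node
of the earlier example, not the module's actual interface):

  /* Returns 0 if a node with the same key is already present. */
  static int insert(Node *rootIO, Node node, int key)
  {
    Node root = *rootIO;
    node->left = node->right = NULL;
    node->key = key;
    if (root == NULL) {               /* empty tree: new node is the root */
      *rootIO = node;
      return 1;
    }
    root = splay(root, key);
    if (key == root->key) {           /* matching node found: failure */
      *rootIO = root;
      return 0;
    }
    if (key < root->key) {            /* old root becomes the right child */
      node->left = root->left;
      node->right = root;
      root->left = NULL;
    } else {                          /* old root becomes the left child */
      node->right = root->right;
      node->left = root;
      root->right = NULL;
    }
    *rootIO = node;
    return 1;
  }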
.impl.delete: SplayTreeDelete: (See paper.sleator96(0), chapter 4, function
delete). Splay the tree around the supplied key. Check that the newly splayed
root is the same node as given by the caller, and that it matches the key;
return failure if not. If the given node (now at the root) has fewer than two
children, replace it (as root), with the non-null child or null. Otherwise,
set the root of the tree to be the left child (arbitrarily) of the node to be
deleted, and splay around the same key. The new root will be the last node in
the sub-tree and will have a null right child; this is set to be the right
child of the node to be deleted.
.impl.search: SplayTreeSearch: Splay the node around the supplied key. If the
splay found a matching node, return it; otherwise return failure.
.impl.neighbours: SplayTreeNeighbours: Splay the tree around the supplied key.
If the splay found a matching node, return failure. Otherwise, determine
whether the (non-matching) found node is the left or right neighbour of the key
(by comparison with the key). Set the tree root to be the right or left child
of that first neighbour respectively, and again splay the tree around the
supplied key. The new root will be the second neighbour, and will have a null
left or right child respectively. Set this null child to be the first
neighbour. Return the two neighbours.
.impl.neighbours.note: Note that it would be possible to implement
SplayTreeNeighbours with only one splay, and then a normal binary tree search
for the left or right neighbour of the root. This would be a cheaper
operation, but would give poorer amortised cost if the call to
SplayTreeNeighbours typically precedes a call to SplayTreeInsert (which is
expected to be a common usage pattern - see .usage.insert). It's also possible
to implement SplayTreeNeighbours by simply keeping track of both neighbours
during a single splay. This has about the same cost as a single splay, and
hence about the same amortised cost if the call to SplayTreeNeighbours typically precedes a call to
SplayTreeInsert.
.impl.next: SplayTreeNext: Splay the tree around the supplied oldKey. During
iteration the "old node" found is probably already at the root, in which case
this will be a null operation with little cost. If this old node has no right
child, return NULL. Otherwise, split the tree into a right tree (which contains
just the right child of the old node) and a left tree (which contains the old
node, its left child and no right child). The next node is the first node in
the right tree. Find this by splaying the right tree around oldKey (which is
known to compare CompareLESS than any keys in the right tree). Rejoin the full
tree, using the right tree as the root and setting the left child of root to be
the left tree. Return the root of this tree.
TESTING
.test: There is no plan to test splay trees directly. It is believed that the
testing described in design.mps.cbs.test will be sufficient to test this
implementation.
ERROR HANDLING
.error: This module detects and reports most common classes of protocol error.
The cases it doesn't handle will result in undefined behaviour and probably
cause an AVER to fire. These are:
.error.bad-pointer: Passing an invalid pointer in place of a SplayTree or
SplayNode.
.error.bad-compare: Initialising a SplayTree with a compare function that is
not a valid compare function, or which doesn't implement a total ordering on
splay nodes.
.error.bad-describe: Passing an invalid describe method to SplayTreeDescribe.
.error.out-of-stack: Stack exhaustion under SplayTreeDescribe.
FUTURE
.future.tree: It would be possible to split the splay tree module into two: one
that implements binary trees; and one that implements splay trees on top of a
binary tree.
.future.parent: The iterator could be made more efficient (in an amortised
sense) if it didn't splay at each node. To implement this (whilst meeting
.req.stack) we really need parent pointers from the nodes. We could use the
(first-child, right-sibling/parent) trick described in paper.st85 to implement
this, at a slight cost to all other tree operations, and an increase in code
complexity. paper.st85 doesn't describe how to distinguish the first-child
between left-child and right-child, and the right-sibling/parent between
right-sibling and parent. One could either use the comparator to make these
distinctions, or steal some bits from the pointers.
.future.reverse: The assembly phase could be made more efficient if the link
left and link right operations were modified to add to the left and right trees
with pointers reversed. This would remove the need for the assembly phase to
reverse them.
@ -0,0 +1,90 @@
STACK SCANNER FOR DIGITAL UNIX / ALPHA SYSTEMS
design.mps.sso1al
draft doc
drj 1997-03-27
INTRODUCTION
.readership: Any MPS developer.
.intro: This is the design for Stack Scanner module that runs on DIGITAL UNIX /
Alpha systems (See os.o1 and arch.al). The design adheres to the general
design and interface described (probably not described actually) in
design.mps.ss.
.source.alpha: book.digital96 (Alpha Architecture Handbook) describes the Alpha
Architecture independently of any particular implementation. The instruction
mnemonics and the semantics for each instruction are specified in that document.
.source.as:
<URL:http://www.partner.digital.com/www-swdev/pages/Home/TECH/documents/Digital_
UNIX/V4.0/AA-PS31D-TET1_html/TITLE.html> (Assembly Language Programmer's Guide)
describes the assembler syntax and assembler directives. It also summarises
the calling conventions used. Chapters 1 and 6 were especially useful,
especially chapter 6.
.source.convention:
<URL:http://www.partner.digital.com/www-swdev/pages/Home/TECH/documents/Digital_
UNIX/V4.0/AA-PY8AC-TET1_html/TITLE.html> (Calling Standard for Alpha Systems)
describes the calling conventions used for Digital Alpha systems. Chapter 2 was
useful. But the whole document was not used as much as the previous 2
documents.
DEFINITIONS
.def.saved: Saved Register. A saved register is one whose value is defined to
be preserved across a procedure call according to the Calling Standard. They
are $9-$15, $26, and $30. $30 is the stack pointer.
.def.non-saved: Non-Saved Register. A non-saved register is a register that is
assumed to be modified across a procedure call according to the Calling
Standard.
.def.tos: Top of Stack. The top of stack is the youngest portion of the stack.
.def.bos: Bottom of Stack. The bottom of stack is the oldest portion of the
stack.
.def.base: Base. Of a range of addresses, the base is the lowest address in
the range.
.def.limit: Limit. Of a range of addresses, the limit is "one past" the
highest address in the range.
OVERVIEW
.overview: The registers and the stack need to be scanned. This is achieved by
storing the contents of the registers into a frame at the top of the stack and
then passing the base and limit of the stack region, including the newly
created frame, to the function TraceScanAreaTagged. TraceScanAreaTagged
performs the actual scanning and fixing.
DETAIL DESIGN
Functions
.fun.stackscan:
Res StackScan(ScanState ss, Addr *StackBot);
.fun.stackscan.asm: The function is written in assembler.
.fun.stackscan.asm.justify: This is because the machine registers need to be
examined, and it is only possible to access the machine registers using
assembler.
.fun.stackscan.entry: On entry to this procedure all the non-saved (temporary)
registers that contain live pointers must have been saved in some root (usually
the stack) by the mutator (otherwise it would lose the values). Therefore only
the saved registers need to be stored by this procedure.
.fun.stackscan.assume.saved: We assume that all the saved registers are roots.
This is conservative since some of the saved registers might not be used.
.fun.stackscan.frame: A frame is created on the top of the stack.
.fun.stackscan.frame.justify: This frame is used to store the saved registers
into so that they can be scanned.
.fun.stackscan.save: All the saved registers, apart from $30 the stack pointer,
are to be stored in the frame. .fun.stackscan.save.justify: This is so that
they can be scanned. The stack pointer itself is not scanned as the stack is
assumed to be a root (and therefore a priori alive).
.fun.stackscan.call: TraceScanAreaTagged is called with the current stack
pointer as the base and the (passed in) StackBot as the limit of the region to
be scanned. .fun.stackscan.call.justify: This function does the actual
scanning. The Stack on Alpha systems grows down so the stack pointer (which
points to the top of the stack) is lower in memory than the bottom of the stack.
.fun.stackscan.return: The return value from TraceScanAreaTagged is used as the
return value for StackScan.
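The real routine is necessarily in assembler (.fun.stackscan.asm), but the
following pseudo-C sketch may help to show the shape of the logic; the frame
layout and the exact signature of TraceScanAreaTagged are assumptions here:

  /* Pseudo-C sketch only: C cannot name the machine registers.  The
   * frame holds $9-$15 and $26 (eight registers); $30, the stack
   * pointer, is not stored (see .fun.stackscan.save). */
  Res StackScan(ScanState ss, Addr *stackBot)
  {
    Addr frame[8];
    /* ... in the assembler, each saved register is stored into the
     * frame at this point ... */
    return TraceScanAreaTagged(ss,
                               &frame[0],   /* base: top of stack */
                               stackBot);   /* limit: bottom of stack */
  }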
@ -0,0 +1,352 @@
THE DESIGN OF THE MPS TELEMETRY MECHANISM
design.mps.telemetry
incomplete design
richard 1997-07-07
INTRODUCTION:
This documents the design of the telemetry mechanism within the MPS.
.readership: This document is intended for any MPS developer.
.source: Various meetings and brainstorms, including
meeting.general.1997-03-04(0), mail.richard.1997-07-03.17-01(0),
mail.gavinm.1997-05-01.12-40(0).
Document History
.hist.0: 1997-04-11 GavinM Rewritten
.hist.1: 1997-07-07 GavinM Rewritten again after discussion in Pool Hall.
OVERVIEW:
Telemetry permits the emission of events from the MPS. These can be used to
drive a graphical tool, or to debug, or whatever. The system is flexible and
robust, but doesn't require heavy support from the client.
REQUIREMENTS:
.req.simple: It must be possible to generate code both for the MPS and any tool
without using complicated build tools.
.req.open: We must not constrain the nature of events before we are certain of
what we want them to be.
.req.multi: We must be able to send events to multiple streams.
.req.share: It must be possible to share event descriptions between the MPS and
any tool.
.req.version: It must be possible to version the set of events so that any tool
can detect whether it can understand the MPS.
.req.back: Tools should be able to understand older and newer versions of the
MPS, so far as is appropriate.
.req.type: It must be possible to transmit a rich variety of types to the tool,
including doubles, and strings.
.req.port: It must be possible to transmit and receive events between different
platforms.
.req.control: It must be possible to control whether and what events are
transmitted at least at a coarse level.
.req.examine: There should be a cheap means to examine the contents of logs.
.req.pm: The event mechanism should provide for post mortem to detect what
significant events led up to death.
.req.perf: Events should not have a significant effect on performance when
unwanted.
.req.small: Telemetry streams should be small.
.req.avail: Events should be available in all varieties, subject to performance
requirements.
.req.impl: The plinth support for telemetry should be easy to write and
flexible.
.req.robust: The telemetry protocol should be robust against some forms of
corruption, e.g. packet loss.
.req.intern: It should be possible to support string-interning.
ARCHITECTURE:
.arch: Event annotations are scattered throughout the code, but there is a
central registration of event types and properties. Events are written to a
buffer via a specialist structure, and are optionally written to the plinth.
Events can take any number of parameters of a range of types, indicated as a
format both in the annotation and the registry.
ANALYSIS:
.anal: The proposed order of development, with summary of requirements impact
is as follows:
Columns, left to right, are the requirements: simple, open, multi, share,
version, type, port, control, examine, pm, perf, small, avail, impl, robust,
intern, back.
.sol.format 0 0 0 0 0 + 0 0 0 0 0 0 0 0 0 0 0 Merged.
.sol.struct 0 0 0 0 0 + 0 0 0 0 + - 0 0 0 0 0 Merged.
.sol.string 0 0 0 0 0 + 0 0 0 0 0 0 0 0 0 + 0 Merged.
.sol.relation + 0 0 + 0 0 0 0 + 0 0 + 0 0 0 0 0 Merged.
.sol.dumper 0 0 0 0 0 0 0 0 + 0 0 0 0 0 0 0 0 Merged.
.sol.kind 0 - 0 0 0 0 0 + 0 + 0 0 0 0 0 0 0 Merged.
.sol.control 0 0 0 0 0 0 0 + 0 0 + 0 0 0 0 0 0 Merged.
.sol.variety 0 0 0 0 0 0 0 0 0 + + 0 + 0 0 0 0
[ Not yet ordered. ]
.sol.buffer 0 0 0 0 0 0 0 + 0 + + 0 0 0 0 0 0
.sol.traceback 0 0 0 0 0 0 0 0 0 + 0 0 0 0 0 0 0
.sol.client 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 + 0
.sol.head 0 0 0 0 0 0 + 0 0 0 0 0 0 0 0 0 0
.sol.version 0 0 0 0 + 0 0 0 0 0 0 0 0 0 0 0 +
.sol.exit 0 0 0 0 0 0 0 0 0 + 0 0 0 0 0 0 0
.sol.block 0 0 0 0 0 0 0 0 0 0 + - 0 0 + 0 0
.sol.code 0 0 0 0 0 0 0 0 0 0 0 + 0 0 0 0 +
.sol.msg 0 0 + 0 0 0 + 0 0 0 0 0 0 + + 0 0
.file-format: One of the objectives of this plan is to minimise the impact of
the changes to the log file format. This is to be achieved firstly by
completing all necessary support before changes are initiated, and secondly by
performing all changes at the same time.
IDEAS:
.sol.format: Event annotations indicate the types of their arguments, e.g.
EVENT_WD for a Word, and a double. (.req.type)
.sol.struct: Copy event data into a structure of the appropriate type, e.g.
EventWDStruct. (.req.type, .req.perf, but not .req.small because of padding)
.sol.string: Permit at most one string per event, at the end, and use the char
[1] hack, and specialised code; deduce the string length from the event length
and also NUL-terminate (.req.type, .req.intern)
.sol.buffer: Enter all events initially into internal buffers, and
conditionally send them to the message stream. (.req.pm, .req.control,
.req.perf)
.sol.variety: In optimized varieties, have internal events (see .sol.buffer)
for a subset of events and no external events; in normal varieties have all
internal events, and the potential for external events. (.req.avail, .req.pm,
.req.perf)
.sol.kind: Divide events by some coarse type into around 6 groups, probably
related to frequency. (.req.control, .req.pm, but not .req.open)
.sol.control: Hold flags to determine which events are emitted externally.
(.req.control, .req.perf)
.sol.dumper: Write a simple tool to dump event logs as text. (.req.examine)
.sol.msg: Redesign the plinth interface to send and receive messages, based on
any underlying IPC mechanism, e.g. append to file, TCP/IP, messages, shared
memory. (.req.robust, .req.impl, .req.port, .req.multi)
.sol.block: Buffer the events and send them as fixed size blocks, commencing
with a timestamp, and ending with padding. (.req.robust, .req.perf, but not
.req.small)
.sol.code: Commence each event with two bytes of event code, and two bytes of
length. (.req.small, .req.back)
.sol.head: Commence each event stream with a platform-independent header block
giving information about the session, version (see .sol.version), and file
format; file format will be sufficient to decode the (platform-dependent) rest
of the file. (.req.port)
.sol.exit: Provide a mechanism to flush events in the event of graceful sudden
death. (.req.pm)
.sol.version: Maintain a three part version number for the file comprising
major (incremented when the format of the entire file changes (other than
platform differences)), median (incremented when an existing event changes its
form or semantics), and minor (incremented when a new event type is added);
tools should normally fail when the median or major is unsupported.
(.req.version, .req.back)
.sol.relation: Event types will be defined in terms of a relation specifying
their name, code, optimised behaviour (see .sol.variety), kind (see .sol.kind),
and format (see .sol.format); both the MPS and tool can use this by suitable
#define hacks. (.req.simple, .req.share, .req.examine, .req.small (no format
information in messages))
.sol.traceback: Provide a mechanism to output recent events (see .sol.buffer)
as a form of backtrace when AVERs fire or from a debugger, or whatever.
(.req.pm)
.sol.client: Provide a mechanism for user events. (.req.intern)
IMPLEMENTATION:
Annotation
.annot: An event annotation is of the form:
EVENT_PAW(FooCreate, pointer, address, word);
.annot.format: Note that the format is indicated in the macro name. See
.format.
.annot.string: If there is a string in the format, it must be the last
parameter (and hence there can be only one). There is currently a maximum
string length, defined by EventMaxStringLength in impl.h.eventcom.
.annot.type: The event type should be given as the first parameter to the event
macro, as registered in impl.h.eventdef.
.annot.param: The parameters of the event should be given as the remaining
parameters of the event macro, in order as indicated in the format.
Registration
.reg: All event types should be registered in impl.h.eventdef, in the form of a
relation.
.reg.just: This use of a relation macro enables great flexibility in the use of
this file.
.reg.rel: The relation is of the form:
RELATION(FooCreate, 0x1234, TRUE, Arena, PAW)
.reg.type: The first parameter of the relation is the event type. This needs
no prefix, and should correspond to that used in the annotation.
.reg.code: The second parameter is the event code, a 16-bit value used to
represent this event type. [Not yet used. GavinM 1997-07-18]
.reg.code.temp: On an interim basis, new events also have to be registered in
impl.h.eventcom. This will no longer be required when the event file format is
revised.
.reg.always: The third parameter is a boolean value indicating whether this
event type should be implemented in all varieties. See .control.buffer.
Unless your event is on the critical path (typically per reference or per
object), you will want this to be TRUE.
.reg.kind: The fourth parameter is a kind keyword indicating what category this
event falls into. See .control. The possible values are:
Arena -- per space or arena or global
Pool -- pool-related
Trace -- per trace or scan
Seg -- per segment
Ref -- per reference or fix
Object -- per object or allocation
This list can be seen in impl.h.event.
.reg.format: The fifth parameter is the format (see .format) and should
correspond to the annotation (see .annot.format).
.reg.dup: It is permissible for the one event type to be used for more than one
annotation. There are generally two reasons for this:
- Variable control flow for successful function completion;
- Platform/Otherwise-dependent implementations of a function.
Note that all annotations for one event type must have the same format (as
implied by .reg.format).
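As an informal illustration of the "#define hacks" mentioned in .sol.relation
(all names in this sketch are made up; the real relation lives in
impl.h.eventdef and its expansions in impl.h.eventgen), a relation list can be
expanded in several ways by applying different macros to it:

  /* A hypothetical relation list, in the style of impl.h.eventdef. */
  #define EVENT_LIST(RELATION) \
    RELATION(FooCreate,  0x1234, TRUE, Arena, PAW) \
    RELATION(FooDestroy, 0x1235, TRUE, Arena, P)

  /* Expansion 1: an enum of event codes. */
  #define RELATION_ENUM(type, code, always, kind, format) Event##type = code,
  enum {
    EVENT_LIST(RELATION_ENUM)
    EventLIMIT
  };
  #undef RELATION_ENUM

  /* Expansion 2: a table of event names, e.g. for a dump tool. */
  #define RELATION_NAME(type, code, always, kind, format) #type,
  static const char *eventNames[] = {
    EVENT_LIST(RELATION_NAME)
  };
  #undef RELATION_NAME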
Format
.format: Where a format is used to indicate the type, it is a sequence of
letters from the following list:
P -- void *
A -- Addr
W -- Word
U -- unsigned int
S -- char *
D -- double
The corresponding event parameters must be assignment compatible with these
types.
.format.zero: If there are no parameters for an event, then the special format
"0" should be used.
.format.types: When an event has parameters whose type is not in the above
list, use the following guidelines: All C pointer types not representing
strings use P; Size and Count use W; *Set use U; others should be obvious.
.format.support: Every format used needs bespoke support in impl.h.eventgen.
It has not been possible to provide support for all possible formats, so such
support is added when required. .format.support.auto: There is a tool in
impl.pl.eventgen that will generate impl.h.eventgen automatically. It is used
as follows:
1. Claim the file eventgen.h.
2. Invoke eventgen.pl.
3. Check it compiles correctly in all varieties.
4. Check in eventgen.h.
Control
.control: There are two types of event control, buffer and output.
.control.buffer: Buffer control affects whether particular events are implemented
at all, and is controlled statically by variety using the always value (see
.reg.always) for the event type. This is only relevant to release varieties.
[Not yet implemented. GavinM 1997-07-18]
.control.output: Output control affects whether events written to the internal
buffer are output via the plinth. This is set on a per-kind basis (see
.reg.kind), using a control bit table stored in EventKindControl. By default,
all event kinds are on (in variety.ti). You may switch some kinds off using a
debugger.
For example, to disable Ref events using gdb (see impl.h.event for numeric
codes):
break ArenaCreate
run
delete 1
call BTRes(EventKindControl, 4)
continue
.control.just: These controls are coarse, but very cheap.
.control.external: There will be an MPS interface function to control
EventKindControl.
.control.tool: The tools will be able to control EventKindControl.
Dumper Tool
.dumper: A primitive dumper tool is available in impl.c.eventcnv. For details,
see guide.mps.telemetry.
Allocation Replayer Tool
.replayer: A tool for replaying an allocation sequence from a log is available
in impl.c.replay. For details, see design.mps.telemetry.replayer.
TEXT:
.notes:
- Set always to FALSE for all Ref and Object events;
- Fix use of BT for size in bytes, guess then check, BTInit;
- Resolve protection transgression in impl.h.eventdef;
- Make EventKind* enum members so they can be used from the debugger.
@ -0,0 +1,95 @@
TRACER
design.mps.trace
incomplete design
drj 1996-09-25
ARCHITECTURE:
.instance.limit: There will be a limit on the number of traces that can be
created at any one time. This effectively limits the number of concurrent
traces. This limitation is expressed in the symbol TRACE_MAX [currently set to
1, see request.mps.160020 "Multiple traces would not work" drj 1998-06-15].
.rate: [see mail.nickb.1997-07-31.14-37]. [Now revised? See
request.epcore.160062 and change.epcore.minnow.160062. drj 1998-06-15]
.exact.legal: Exact references should either point outside the arena (to
non-managed address space) or to a tract allocated to a pool. Exact references
that are to addresses which the arena has reserved but hasn't allocated memory
to are illegal (the exact reference couldn't possibly refer to a real object).
Depending on the future semantics of PoolDestroy we might need to adjust our
strategy here. See mail.dsm.1996-02-14.18-18 for a strategy of coping
gracefully with PoolDestroy. We check that this is the case in the fixer. It
may be sensible to make this check CRITICAL in certain configurations.
.fix.fixed.all: ss->fixedSummary is accumulated (in the fixer) for all the
pointers whether or not they are genuine references. We could accumulate fewer
pointers here; if a pointer fails the TractOfAddr test then we know it isn't a
reference, so we needn't accumulate it into the fixed summary. The design
allows this, but it breaks a useful post-condition on scanning (if the
accumulation of ss->fixedSummary was moved the accuracy of ss->fixedSummary
would vary according to the "width" of the white summary). See
mail.pekka.1998-02-04.16-48 for improvement suggestions.
ANALYSIS:
.fix.copy-fail: Fixing can always succeed, even if copying the referenced
object has failed (due to lack of memory, for example), by backing off to
treating a reference as ambiguous. Assuming that fixing an ambiguous reference
doesn't allocate memory (which is no longer true for AMC for example). See
request.dylan.170560 for a slightly more sophisticated way to proceed when you
can no longer allocate memory for copying.
IDEAS:
.flip.after: To avoid excessive barrier impact on the mutator immediately after
flip, we could scan during flip other objects which are "near" the roots, or
otherwise known to be likely to be accessed in the near future.
IMPLEMENTATION:
Speed
.fix: The fix path is critical to garbage collection speed. Abstractly fix is
applied to all the references in the non-white heap and all the references in
the copied heap. Remembered sets cut down the number of segments we have to
scan. The zone test cuts down the number of references we call fix on. The
speed of the remainder of the fix path is still critical to system
performance. Various modifications to and aspects of the system are concerned
with maintaining the speed along this path.
.fix.tractofaddr: TractOfAddr is called on every reference that passes the zone
test and is on the critical path, to determine whether the segment is white.
There is no need to examine the segment to perform this test, since whiteness
information is duplicated in tracts, specifically to optimize this test.
TractOfAddr itself is a simple class dispatch function (which dispatches to the
arena class's TractOfAddr method). Inlining the dispatch and inlining the
functions called by VMTractOfAddr makes a small but noticeable difference to the
speed of the dylan compiler.
.fix.noaver: AVERs in the code add bulk to the code (reducing I-cache efficacy)
and add branches to the path (polluting the branch predictors) resulting in a
slow down. Removing all the AVERs from the fix path improves the overall speed
of the dylan compiler by as much as 9%.
.fix.nocopy: AMCFix used to copy objects by using the format's copy method.
This involved a function call (through an indirection) and in dylan_copy a call
to dylan_skip (to recompute the length) and call to memcpy with general
parameters. Replacing this with a direct call to memcpy removes these
overheads and the call to memcpy now has aligned parameters. The call to
memcpy is inlined by the (C) compiler. This change results in a 4-5% speed-up
in the dylan compiler.
.reclaim: Because the reclaim phase of the trace (implemented by TraceReclaim)
examines every segment it is fairly time intensive. rit's profiles presented
in request.dylan.170551 show a gap between the two varieties variety.hi and
variety.wi.
.reclaim.noaver: Converting AVERs in the loops of TraceReclaim, PoolReclaim,
AMCReclaim (LOReclaim? AWLReclaim) will result in a noticeable speed
improvement [insert actual speed improvement here].
397
mps/design/type/index.txt Normal file
@ -0,0 +1,397 @@
THE DESIGN OF THE GENERAL MPS TYPES
design.mps.type
incomplete doc
richard 1996-10-23
INTRODUCTION
.intro:
See impl.h.mpmtypes.
RATIONALE
Some types are declared to resolve a point of design, such as the best type to
use for array indexing.
Some types are declared so that the intention of code is clearer. For example,
Byte is necessarily unsigned char, but it's better to say Byte in your code if
it's what you mean.
CONCRETE TYPES
Bool
.bool: The Bool type is mostly defined so that the intention of code is
clearer. In C, boolean expressions evaluate to int, so Bool is in fact an
alias for int.
.bool.value: Bool has two values, TRUE and FALSE. These are defined to be 1 and
0 respectively, for compatibility with C boolean expressions (so one may set a
Bool to the result of a C boolean expression).
.bool.use: Bool is a type which should be used when a boolean value is
intended, for example, as the result of a function. Using a boolean type in C
is a tricky thing. Non-zero values are "true" (when used as control
conditions) but are not all equal to TRUE. Use with care.
.bool.check: BoolCheck simply checks whether the argument is TRUE (1) or FALSE
(0).
.bool.check.inline: The inline macro version of BoolCheck casts the int to
unsigned and checks that it is <= 1. This is safe, well-defined, uses the
argument exactly once, and generates reasonable code.
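A minimal sketch of such a macro (the real definition is in the MPM headers and
may differ in detail):

  #define BoolCheck(b) ((unsigned)(b) <= 1)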
.bool.check.inline.smaller: In fact we can expect the "inline" version of
BoolCheck to be smaller than the equivalent function call (on Intel, for
example, a function call will be 3 instructions (total 9 bytes), the inline
code for BoolCheck will be 1 instruction (total 3 bytes) (both sequences not
including the test which is the same length in either case)).
.bool.check.inline.why: As well as being smaller (see
.bool.check.inline.smaller) it is faster. On 1998-11-16 drj compared
w3i3mv\hi\amcss.exe running with and without the macro for BoolCheck on the PC
Aaron. "With" ran in 97.7% of the time (averaged over 3 runs).
Res
.res: Res is the type of result codes. A result code indicates the success or
failure of an operation, along with the reason for failure. Like Unix error
codes, the meaning of the code depends on the call that returned it. These
codes are just broad categories with mnemonic names for various sorts of
problems.
ResOK: The operation succeeded. Return parameters may only be updated if OK is
returned, otherwise they must be left untouched.
ResFAIL: Something went wrong which doesn't fall into any of the other
categories. The exact meaning depends on the call. See documentation.
ResRESOURCE: A needed resource could not be obtained. Which resource depends
on the call. See also MEMORY, which is a special case of this.
ResMEMORY: Needed memory (committed memory, not address space) could not be
obtained.
ResLIMIT: An internal limitation was reached. For example, the maximum number
of somethings was reached. We should avoid returning this by not including
static limitations in our code, as far as possible. (See rule.impl.constrain
and rule.impl.limits.)
ResUNIMPL: The operation, or some vital part of it, is unimplemented. This
might be returned by functions which are no longer supported, or by operations
which are included for future expansion, but not yet supported.
ResIO: An I/O error occurred. Exactly what depends on the function.
ResCOMMIT_LIMIT: The arena's commit limit would have been exceeded as a result
of allocation.
ResPARAM: An invalid parameter was passed. Normally reserved for parameters
passed from the client.
.res.use: Res should be returned from any function which might fail. Any other
results of the function should be passed back in "return" parameters (pointers
to locations to fill in with the results). [This is documented elsewhere, I
think -- richard]
.res.use.spec: The most specific code should be returned.
Fun
.fun: Fun is the type of a pointer to a function about which nothing more is
known.
.fun.use: Fun should be used where it's necessary to handle a function without
calling it in a polymorphic way. For example, if you need to write a function
g which passes another function f through to a third function h, where h knows
the real type of f but g doesn't.
Word
.word: Word is an unsigned integral type which matches the size of the machine
word, i.e. the natural size of the machine registers and addresses.
.word.use: It should be used where an unsigned integer is required that might
range as large as the machine word.
.word.source: Word is derived from the macro MPS_T_WORD which is declared in
impl.h.mpstd according to the target platform.
.word.conv.c: Word is converted to mps_word_t in the MPS C Interface.
Byte
.byte: Byte is an unsigned integral type corresponding to the unit in which
most sizes are measured, and also the units of sizeof().
.byte.use: Byte should be used in preference to char or unsigned char wherever
it is necessary to deal with bytes directly.
.byte.source: Byte is just a pedagogic version of unsigned char, since char is
the unit of sizeof().
Index
.index: Index is an unsigned integral type which is large enough to hold any
array index.
.index.use: Index should be used where the maximum size of the array cannot be
statically determined. If the maximum size can be determined then the smallest
unsigned integer with a large enough range may be used instead.
Count
.count: Count is an unsigned integral type which is large enough to hold the
size of any collection of objects in the MPS.
.count.use: Count should be used for a number of objects (control or managed)
where the maximum number of objects cannot be statically determined. If the
maximum number can be statically determined then the smallest unsigned integer
with a large enough range may be used instead (although Count may be preferable
for clarity). [ Should Count be used to count things that aren't represented
by objects (e.g. a level)? I would say yes. gavinm 1998-07-21 ] [Only where
it can be determined that the maximum count is less than the number of
objects. pekka 1998-07-21]
Accumulation
.accumulation: Accumulation is an arithmetic type which is large enough to hold
accumulated totals of objects or bytes (e.g. total number of objects allocated,
total number of bytes allocated).
.accumulation.type: Currently it is double, but the reason for the interface is so
that we can more easily change it if we want to (if we decide we need more
accuracy for example).
.accumulation.use: Currently the only way to use an Accumulation is to reset it
(AccumulatorReset) and accumulate (Accumulate) amounts into it. There is no
way to read it at the moment, but that's okay, because no one seems to want to.
.accumulation.future: Probably we should have methods which return the
accumulation into an unsigned long, and also a double; these functions should
return bools to indicate whether the accumulation can fit in the requested
type. Possibly we could have functions which returned scaled accumulations
(e.g. AccumulatorScale(a, d) would divide the Accumulation a by double d and
return the double result if the result fitted into a double).
Addr
.addr: Addr is the type used for "managed addresses", that is, addresses of
objects managed by the MPS.
.addr.def: Addr is defined as struct AddrStruct *, but AddrStruct is never
defined. This means that Addr is always an incomplete type, which prevents
accidental dereferencing, arithmetic, or assignment to other pointer types.
.addr.use: Addr should be used whenever the code needs to deal with addresses.
It should not be used for the addresses of memory manager data structures
themselves, so that the memory manager remains amenable to working in a
separate address space. Be careful not to confuse Addr with void *.
.addr.ops: Limited arithmetic is allowed on addresses using AddrAdd and
AddrOffset (impl.c.mpm). Addresses may also be compared using the relational
operators ==, !=, <, <=, >, and >=. .addr.ops.mem: We need efficient operators
similar to memset, memcpy, and memcmp on Addr; these are called AddrSet,
AddrCopy, and AddrComp. When Addr is compatible with void *, these are
implemented through the mps_lib_mem* functions in the plinth (impl.h.mpm) [and
in fact, no other implementation exists at present, pekka 1998-09-07].
.addr.conv.c: Addr is converted to mps_addr_t in the MPS C Interface.
mps_addr_t is defined to be the same as void *, so using the MPS C Interface
confines the memory manager to the same address space as the client data.
Size
.size: Size is an unsigned integral type large enough to hold the size of any
object which the MPS might manage.
.size.byte: Size should hold a size calculated in bytes. Warning: This may not
be true for all existing code.
.size.use: Size should be used whenever the code needs to deal with the size of
managed memory or client objects. It should not be used for the sizes of the
memory manager's own data structures, so that the memory manager is amenable to
working in a separate address space. Be careful not to confuse it with size_t.
.size.ops: [Size operations?]
.size.conv.c: Size is converted to size_t in the MPS C Interface. This
constrains the memory manager to the same address space as the client data.
Align
.align: Align is an unsigned integral type which is used to represent the
alignment of managed addresses. All alignments are positive powers of two.
Align is large enough to hold the maximum possible alignment.
.align.use: Align should be used whenever the code needs to deal with the
alignment of a managed address.
.align.conv.c: Align is converted to mps_align_t in the MPS C Interface.
Shift
.shift: Shift is an unsigned integral type which can hold the amount by which a
Word can be shifted. It is therefore large enough to hold the word width (in
bits).
.shift.use: Shift should be used whenever a shift value (the right-hand operand
of the << or >> operators) is intended, to make the code clear. It should also
be used for structure fields which have this use.
.shift.conv.c: Shift is converted to mps_shift_t in the MPS C Interface.
Ref
.ref: Ref is a reference to a managed object (as opposed to any old managed
address). Ref should be used where a reference is intended.
[This isn't too clear -- richard]
RefSet
.refset: RefSet is a conservative approximation to a set of references. See
design.mps.refset.
Rank
.rank: Rank is an enumeration which represents the rank of a reference. The
ranks are:
RankAMBIG (0): the reference is ambiguous, i.e. must be assumed to be a
reference, and must not be updated in case it isn't;
RankEXACT (1): the reference is exact, and refers to an object;
RankFINAL (2): the reference is exact and final, so special action is required
if only final or weak references remain to the object;
RankWEAK (3): the reference is exact and weak, so should be deleted if only
weak references remain to the object.
Rank is stored with segments and roots, and passed around.
Rank is converted to mps_rank_t in the MPS C Interface.
The ordering of the ranks is important. It is the order in which the
references must be scanned in order to respect the properties of references of
the ranks. Therefore they are declared explicitly with their integer values.
[Could Rank be a short?]
[This documentation should be expanded and moved to its own document, then
referenced from the implementation more thoroughly.]
Epoch
.epoch: An Epoch is a count of the number of flips of the mutator that have
occurred. [Is it more general than that?] It is used in the implementation of
location dependencies (LDs).
Epoch is converted to mps_word_t in the MPS C Interface, as a field of mps_ld_s.
TraceId
.traceid: A TraceId is an unsigned integer which is less than TRACE_MAX. Each
running trace has a different TraceId which is used to index into tables and
bitfields used to remember the state of that trace.
TraceSet
.traceset: A TraceSet is a bitset of TraceIds, represented in the obvious way,
i.e.
member(ti, ts) <=> (2^ti & ts) != 0
TraceSets are used to represent colour in the Tracer. [Expand on this.]
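For illustration, that representation corresponds to macros along the following
lines (the names here are illustrative, not necessarily those in the
implementation):

  #define TraceSetSingle(ti)        ((TraceSet)1 << (ti))
  #define TraceSetIsMember(ts, ti)  (((ts) & TraceSetSingle(ti)) != 0)
  #define TraceSetAdd(ts, ti)       ((ts) | TraceSetSingle(ti))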
AccessSet
.access-set: An AccessSet is a bitset of Access modes, which are AccessREAD and
AccessWRITE. AccessNONE is the empty AccessSet.
Attr
.attr: Pool attributes. A bitset of pool or pool class attributes, which are:
AttrFMT: the pool contains formatted objects;
AttrSCAN: the pool contains references and must be scanned for GC;
AttrPM_NO_READ: the pool may not be read protected;
AttrPM_NO_WRITE: the pool may not be write protected;
AttrALLOC: the pool supports the PoolAlloc interface;
AttrFREE: the pool supports the PoolFree interface;
AttrBUF: the pool supports the allocation buffer interface;
AttrBUF_RESERVE: the pool supports the reserve/commit protocol on allocation
buffers;
AttrBUF_ALLOC: the pool supports the alloc protocol on allocation buffers;
AttrGC: the pool is garbage collecting, i.e. parts may be reclaimed;
AttrINCR_RB: the pool is incremental requiring a read barrier;
AttrINCR_WB: the pool is incremental requiring a write barrier.
There is an attribute field in the pool class (PoolClassStruct) which declares
the attributes of that class. These attributes are only used for consistency
checking at the moment. [no longer true that they are only used for consistency
checking -- drj 1998-05-07]
RootVar
.rootvar: The type RootVar is the type of the discriminator for the union
within RootStruct.
Serial
.serial: A Serial is a number which is assigned to a structure when it is
initialized. The serial number is taken from a field in the parent structure,
which is incremented. Thus, every instance of a structure has a unique "name"
which is a path of structures from the global root. For example:
space[3].pool[5].buffer[2]
Why? Consistency checking, debugging, and logging. Not well thought out.
Compare
.compare: Compare is the type of tri-state comparison values.
CompareLESS: Indicates that a value compares less than another value.
CompareEQUAL: Indicates that two values compare the same.
CompareGREATER: Indicates that a value compares greater than another value.
ABSTRACT TYPES
.adts: The following types are abstract data types, implemented as pointers to
structures. For example, Ring is a pointer to a RingStruct. They are
described elsewhere [where?].
Ring, Buffer, AP, Format, LD, Lock, Pool, Space, PoolClass, Trace,
ScanState, Seg, Arena, VM, Root, Thread.
POINTERS
.pointer: The type Pointer is the same as "void *", and exists to sanctify
functions such as PointerAdd.
@ -0,0 +1,90 @@
DESIGN OF THE MPS LIBRARY VERSION MECHANISM
design.mps.version-library
incomplete doc
drj 1998-08-19
INTRODUCTION
.intro: This describes the design of a mechanism to be used to determine the
version (that is, product, version, and release) of an MPS library.
READERSHIP
.readership: Any MPS developer.
SOURCE
.source: Various requirements demand such a mechanism. See
request.epcore.160021: There is no way to tell which version and release of the
MM one is using.
OVERVIEW
.overview: See design.mps.version for discussion and design of versions of
other aspects of the software. This document concentrates on a design for
determining which version of the library one has linked with. There are two
aspects to the design, allowing humans to determine the version of an MPS
library, and allowing programs to determine the version of an MPS library.
Only the former is currently designed (a method for humans to determine which
version of an MPS library is being used).
.overview.impl: The overall design is to have a distinctive string compiled
into the library binary. Various programs and tools will be able to extract
the string and display it. The string will identify the version of the MPS
being used.
ARCHITECTURE
.arch.structure: The design consists of 3 components:
.arch.string: A string embedded into any delivered library binaries (which will
encode the necessary information).
.arch.proc: A process by which the string is modified appropriately whenever
releases are made.
.arch.tool: A tool and its documentation (it is expected that standard tools
can be used). The tool will be used to extract the version string from a
delivered library or an executable linked with the library.
.arch.not-here: Only the string component (.arch.string) is directly described
here. The other components are described elsewhere. (where?)
The string will contain information to identify the following items:
.arch.string.platform: the platform being used.
.arch.string.product: the name of the product.
.arch.string.variety: the variety of the product.
.arch.string.version: the version and release of the product.
IMPLEMENTATION
.impl.file: The string itself is a declared C object in the file version.c
(impl.c.version). It consists of a concatenation of various strings which are
defined in other modules.
.impl.variety: The string containing the name of the variety is the expansion
of the macro MPS_VARIETY_STRING defined by config.h (impl.h.config).
.impl.product: The string containing the name of the product is the expansion
of the macro MPS_PROD_STRING defined by config.h (impl.h.config).
.impl.platform: The string containing the name of the platform is the expansion
of the macro MPS_PF_STRING defined by mpstd.h (impl.h.mpstd).
.impl.date: The string contains the date and time of compilation by using the
__DATE__ and __TIME__ macros defined by ISO C (ISO C clause 6.8.8).
.impl.version: The string contains the version and release of the product.
This is by the expansion of the macro MPS_RELEASE which is defined in this
module (version.c).
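As a sketch only (the variable name, the "@(#)" what-string prefix, and the
exact layout are illustrative assumptions, not a description of the actual
impl.c.version), the declaration might look like:

  /* MPS_RELEASE is defined in this module; the other macros come from
   * config.h and mpstd.h and expand to string literals. */
  #define MPS_RELEASE "release.epcore.brisling"

  char MPSVersionString[] =
    "@(#)MPS, "
    "product." MPS_PROD_STRING ", "
    MPS_RELEASE ", "
    "platform." MPS_PF_STRING ", "
    "variety." MPS_VARIETY_STRING ", "
    "compiled on " __DATE__ " " __TIME__;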
.impl.usage: To make a release, the MPS_RELEASE macro (see
impl.c.version.release) is edited to contain the release name (e.g.,
"release.epcore.brisling"), and then changed back immediately after the release
checkpoint is made.
@ -0,0 +1,23 @@
DESIGN OF MPS SOFTWARE VERSIONS
design.mps.version
incomplete doc
drj 1998-08-19
INTRODUCTION
.intro: This is the design of the support in the MPS for describing and
inspecting versions.
OVERVIEW
.overview: There are different sorts of version under consideration: Versions of
the (MPS) library used (linked with), versions of the interface used (header
files in C) when compiling the client's program, versions of the documentation
used when the client was writing the program. There are issues of programmatic
and human access to these versions.
.overview.split: The design is split accordingly. See
design.mps.version-library for the design of a system for determining the
version of the library one is using. And other non-existent documents for the
others.
95
mps/design/vm/index.txt Normal file
@ -0,0 +1,95 @@
THE DESIGN OF THE VIRTUAL MAPPING INTERFACE
design.mps.vm
incomplete design
richard 1998-05-11
.intro: This is the design of the VM interface. The VM interface provides a
simple, low-level, operating-system independent interface to address-space.
Each call to VMCreate() reserves (from the operating-system) a single
contiguous range of addresses, and returns a VMStruct thereafter used to manage
this address-space. The VM interface has separate implementations for each
platform that supports it (at least conceptually, in practice some of them may
be the same). The VM module provides a mechanism to reserve large (relative to
the amount of RAM) amounts of address space, and functions to map (back with
RAM) and unmap portions of this address space.
.motivation: The VM is used by the VM Arena Class. It provides the basic
substrate to provide sparse address maps. Sparse address maps have at least
two uses: to encode information into the address of an object which is used in
tracing (the Zone Test) to speed things up; to avoid fragmentation at the
segment level and above (since the amount of address space reserved is large
compared to the RAM, the hope is that there will also be enough address space
somewhere to fit any particular segment in).
DEFINITIONS
.def.reserve: The "reserve" operation: Exclusively reserve a portion of the
virtual address space without arranging RAM or backing store for the virtual
addresses. The intention is that no other component in the process will make
use of the reserved virtual addresses, but in practice this may entail assuming
a certain amount of cooperation. When reserving address space, the requester
simply asks for a particular size, not a particular range of virtual
addresses. Accessing (read/write/execute) reserved addresses is illegal unless
those addresses have been mapped.
.def.map: The "map" operation: Arrange that a specified portion of the virtual
address space is mapped from the swap, effectively allocating RAM and/or swap
space for a particular range of addresses. If successful, accessing the
addresses is now legal. Only reserved addresses should be mapped.
.def.unmap: The "unmap" operation: The inverse of the map operation. Arrange
that a specified portion of the virtual address space is no longer mapped,
effectively freeing up the RAM and swap space that was in use. Accessing the
addresses is now illegal. The addresses return to the reserved state.
.def.vm: "VM" stands for Virtual Memory. Various meanings: A processor
architecture's virtual space and structure; The generic idea / interface /
implementation of the MPS VM module; The C structure (struct VMStruct) used to
encapsulate the functionality of the MPS VM module; An instance of such a
structure.
.def.vm.mps: In the MPS, a "VM" is a VMStruct, providing access to the single
contiguous range of address-space that was reserved (from the operating-system)
when VMCreate was called.
INTERFACE
.if.create: Res VMCreate(VM *VMReturn, Size size)
VMCreate is responsible both for allocating a VMStruct and for reserving an
amount of virtual address space. A VM is created and a pointer to it is
returned in the return parameter VMReturn. This VM has at least size bytes of
virtual memory reserved. If there's not enough space to allocate the VM,
ResMEMORY is returned. If there's not enough address space to reserve a block
of the given size, ResRESOURCE is returned. The reserved virtual memory can be
mapped and unmapped using VMMap and VMUnmap.
.if.destroy: void VMDestroy(VM vm)
A VM is destroyed by calling VMDestroy. Any address space that was mapped
through this VM is unmapped.
[lots of interfaces missing here]
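As an informal usage sketch (VMBase, VMMap, VMUnmap and AddrAdd are assumed
here with plausible signatures; some of them are among the interfaces not
documented above):

  static Res vmExample(void)
  {
    VM vm;
    Res res;
    Addr base, limit;

    res = VMCreate(&vm, (Size)(256 * 1024 * 1024));  /* reserve 256 MB */
    if (res != ResOK)
      return res;

    base = VMBase(vm);                  /* assumed accessor for the base */
    limit = AddrAdd(base, (Size)4096);  /* one grain, say */

    res = VMMap(vm, base, limit);       /* back [base, limit) with RAM */
    if (res != ResOK) {
      VMDestroy(vm);
      return res;
    }

    /* ... the memory at [base, limit) may now be read and written ... */

    VMUnmap(vm, base, limit);           /* return it to the reserved state */
    VMDestroy(vm);                      /* unmaps anything still mapped */
    return ResOK;
  }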
NOTES
.diagram:
.testing: It is important to test that a VM implementation will work in extreme
cases. .testing.large: It must be able to reserve a large address space.
Clients will want multi-GB spaces, more than some OSs will allow. If they ask
for too much, mps_arena_create (and hence VMCreate) must fail in a predictable
way. .testing.larger: It must be possible to allocate in a large space;
sometimes committing will fail, because there's not enough space to replace the
"reserve" mapping. See request.epcore.160201 for details. .testing.lots: It
must be possible to have lots of mappings. The OS must either combine adjacent
mappings or have lots of space in the kernel tables. See request.epcore.160117
for ideas on how to test this.
14
mps/design/vman/index.txt Normal file
@ -0,0 +1,14 @@
ANSI FAKE VM
design.mps.vman
incomplete doc
drj 1996-11-07
.intro: The ANSI fake VM is an implementation of the MPS VM interface (see
design.mps.vm) using services provided by the ANSI C Library (standard.ansic.7)
(malloc and free as it happens).
.align: The VM is aligned to VMAN_ALIGN (defined in impl.h.mpmconf) by adding
VMAN_ALIGN to the requested size, mallocing a block that large, then rounding
the pointer to the block up to the alignment. vm->base is the aligned pointer,
vm->block is the pointer returned by malloc (used during VMDestroy).
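A minimal sketch of that trick, assuming a simple structure with block and base
fields (the real code is impl.c.vman and may differ):

  #include <stdlib.h>
  #include <stdint.h>

  #define VMAN_ALIGN ((size_t)4096)   /* really defined in impl.h.mpmconf */

  struct VMStruct {
    void *block;   /* pointer returned by malloc, needed for free() */
    void *base;    /* block rounded up to a VMAN_ALIGN boundary */
  };

  static int vmanReserve(struct VMStruct *vm, size_t size)
  {
    vm->block = malloc(size + VMAN_ALIGN);
    if (vm->block == NULL)
      return 0;
    vm->base = (void *)(((uintptr_t)vm->block + VMAN_ALIGN - 1)
                        & ~(uintptr_t)(VMAN_ALIGN - 1));
    return 1;
  }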
34
mps/design/vmo1/index.txt Normal file
@ -0,0 +1,34 @@
VM MODULE ON DEC UNIX
design.mps.vmo1
incomplete doc
drj 1997-03-25
INTRODUCTION
.readership: Any MPS developer.
.intro: This is the design of the VM Module for DEC UNIX (aka OSF/1 os.o1). In
general aspects (including interface) the design is as for design.mps.vm.
DETAILS
Functions
.fun.unmap:
VMUnmap
It "unmaps" a region by replacing the existing mapping with a mapping using the
vm->none_fd file descriptor (see mumble mumble, VMCreate), and protection set
to PROT_NONE (i.e. no access). .fun.unmap.justify: Replacing the mapping in this
way means that the address space is still reserved and will not be used by
calls to mmap (perhaps in other libraries) which specify MAP_VARIABLE.
.fun.unmap.offset: The offset for this mapping is the offset of the region
being unmapped in the VM; this gives the same effect as if there was one
mapping of the vm->none_fd from the base to the limit of the VM (but "behind"
all the other mappings that have been created). .fun.unmap.offset.justify: If
this is not done (if for example the offset is always specified as 0) then the
VM will cause the kernel to create a new file reference for each mapping
created with VMUnmap; eventually the kernel refuses the mmap call because it
can't create a new file reference.
97
mps/design/vmso/index.txt Normal file
@ -0,0 +1,97 @@
VM DESIGN FOR SOLARIS
design.mps.vmso
incomplete doc
drj 1998-05-08
INTRODUCTION
.intro: This is the design for the VM implementation on Solaris 2.x (see os.so
for OS details). The implementation is in MMsrc!vmso.c (impl.c.vm). The
design follows the design for and implements the contract of the generic VM
interface (design.mps.vm). To summarize: The VM module provides a mechanism to
reserve large (relative to the amount of RAM) amounts of address space, and
functions to map (back with RAM) and unmap portions of this address space.
.source: Much of the implementation (and hence the design) was inherited from
the SunOS4 implementation. Not that there's any design for that. You'll find
the mmap(2) (for the system call mmap) and the zero(7d) (for the device
/dev/zero) man pages useful as well. The generic interface and some generic
design is in design.mps.vm.
DEFINITIONS
.def: See design.mps.vm.def.* for definitions common to all VMs.
OVERVIEW
.over: The system calls mmap and munmap are used to access the underlying
functionality. They are used in slightly unusual ways, typically to overcome
baroque features or implementation details of the OS. .over.reserve: In order
to reserve address space, a mapping to a file (/etc/passwd as it happens) is
created with no protection allowed. .over.map: In order to map memory, a
mapping to /dev/zero is created. .over.destroy: When the VM is destroyed,
munmap is used to remove all the mappings previously created.
IMPLEMENTATION
.impl.create: VMCreate
.impl.create.vmstruct: Enough pages to hold the VMStruct are allocated by
creating a mapping to /dev/zero (a read/write private mapping), and
initializing the memory as a VMStruct. .impl.create.reserve: The size
parameter is rounded up to page size and this amount of address space is
reserved. The address space is reserved by creating a shared mapping to
/etc/passwd with no access allowed (prot argument is PROT_NONE, flags argument
is MAP_SHARED). .impl.create.reserve.mmap.justify: mmap gives us a flexible
way to allocate address space without interfering with any other component in
the process. Because we don't specify MAP_FIXED we are guaranteed to get a
range of addresses that are not in use. Other components must cooperate by not
attempting to create mappings specifying MAP_FIXED and an address in the range
that the MPS has reserved. .impl.create.reserve.passwd.justify: Mapping
/etc/passwd like this worked on SunOS4 (so this implementation inherited it).
Mapping /dev/zero with prot=PROT_NONE and flags=MAP_PRIVATE does not work
because Solaris gratuitously allocates swap (even though you can't use the
memory). .impl.create.reserve.improve: However, it would appear that ORing in
MAP_NORESERVE when mapping /dev/zero will reserve address space without allocating
swap, so this might be worth trying. I.e., with prot=PROT_NONE,
flags=MAP_PRIVATE|MAP_NORESERVE. However the following caveat comes from the
original implementation: "Experiments have shown that attempting to reserve
address space by mapping /dev/zero results in swap being reserved. This
appears to be a bug, so we work round it by using /etc/passwd, the only file we
can think of which is pretty much guaranteed to be around." So that might not
work after all.
.impl.map: VMMap
.impl.map.zero: A mapping to /dev/zero is created at the relevant addresses
(overriding the map to /etc/passwd that was previously in place for those
addresses). The prot argument is specified as PROT_READ|PROT_WRITE|PROT_EXEC
(so that any access is allowed), the flags argument as MAP_PRIVATE|MAP_FIXED
(MAP_PRIVATE means that the mapping is not shared with child processes (child
processes will have a mapping, but changes to the memory will not be shared).
MAP_FIXED guarantees that we get the mapping at the specified address). The
zero(7d) man page documents this as a way to create a "zero-initialized unnamed
memory object". .impl.map.error: If there's not enough swap space for the
mapping, mmap will return EAGAIN, not ENOMEM, although you might not think so
from the man page.
.impl.unmap: VMUnmap
.impl.unmap.reserve: The relevant addresses are returned to the reserved state
by creating a mapping to /etc/passwd (overriding the map /dev/zero that was
previously in place for those addresses). As for VMCreate (see
.impl.create.reserve above) the prot argument is PROT_NONE, but the flags
argument has the additional MAP_FIXED flag (so is MAP_SHARED|MAP_FIXED).
.impl.unmap.reserve.offset: The offset argument is specified to be the offset
of the addresses being unmapped from the base of the reserved VM area.
.impl.unmap.reserve.offset.justify: Not specifying the offset like this makes
Solaris create a separate mapping (in the kernel) each time Unmap is used;
eventually the call to mmap will fail. Specifying offset like this does not
cause Solaris to create any extra mappings, the existing mapping to /etc/passwd
gets reused.
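For illustration only (descriptor handling and error checking omitted; the real
code is impl.c.vmso and may differ), the three mappings described above might
be made roughly as follows:

  #include <sys/types.h>
  #include <sys/mman.h>

  /* .impl.create.reserve: no access, shared mapping of /etc/passwd. */
  static void *reserveSketch(int passwdFd, size_t size)
  {
    return mmap(NULL, size, PROT_NONE, MAP_SHARED, passwdFd, 0);
  }

  /* .impl.map.zero: fixed, private, fully accessible /dev/zero. */
  static void *mapSketch(void *addr, size_t size, int zeroFd)
  {
    return mmap(addr, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                MAP_PRIVATE | MAP_FIXED, zeroFd, 0);
  }

  /* .impl.unmap.reserve: back to the /etc/passwd mapping, at the offset
   * of the addresses within the reserved range
   * (.impl.unmap.reserve.offset). */
  static void *unmapSketch(void *addr, size_t size, int passwdFd,
                           off_t offsetInVM)
  {
    return mmap(addr, size, PROT_NONE, MAP_SHARED | MAP_FIXED,
                passwdFd, offsetInVM);
  }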
@ -0,0 +1,88 @@
THE DESIGN OF THE MPS WRITEF FUNCTION
design.mps.writef
draft doc
richard 1996-10-18
INTRODUCTION
.intro: This document describes the WriteF function, which allows formatted
output in a manner similar to ANSI C printf, but allows the MPM to operate in a
freestanding environment (see design.mps.exec-env).
.background: The documents design.mps.exec-env and design.mps.lib describe the
design of the library interface and the reason that it exists.
DESIGN
.no-printf: The dependency on printf has been removed. The MPM only
depends on fputc and fputs, via the Library Interface (design.mps.lib). This
makes it much easier to deploy the MPS in a freestanding environment. This is
achieved by implementing our own internal output routines in mpm.c.
Our output requirements are few, so the code is short. The only output
function which should be used in the rest of the MPM is WriteF, which is
similar to fprintf:
Res WriteF(mps_lib_FILE *stream, ...);
WriteF expects a format string followed by zero or more items to insert into
the output, followed by another format string, more items, etc., then a NULL
format string, e.g.
WriteF(stream,
"Hello: $A\n", address,
"Spong: $U ($S)\n", number, string,
NULL);
This makes Describe methods much easier to do, e.g.:
WriteF(stream,
"Buffer $P ($U) {\n", (WriteFP)buffer, (WriteFU)buffer->serial,
" base $A init $A alloc $A limit $A\n",
(WriteFA)buffer->base, (WriteFA)buffer->ap.init,
(WriteFA)buffer->ap.alloc, (WriteFA)buffer->ap.limit,
" Pool $P\n", (WriteFP)buffer->pool,
" Seg $P\n", (WriteFP)buffer->seg,
" rank $U\n", (WriteFU)buffer->rank,
" alignment $W\n", (WriteFW)buffer->alignment,
" grey $B\n", (WriteFB)buffer->grey,
" shieldMode $B\n", (WriteFB)buffer->shieldMode,
" p $P i $U\n", (WriteFP)buffer->p, (WriteFU)buffer->i,
"} Buffer $P ($U)\n", (WriteFP)buffer, (WriteFU)buffer->serial,
NULL);
.types: For each format $X that WriteF supports, there is a type defined in
impl.h.mpmtypes WriteFX which is the promoted version of that type. These are
provided both to ensure promotion and to avoid any confusion about what type
should be used in a cast. It is easy to check the casts against the formats to
ensure that they correspond. .types.future: It is possible that this type set
or similar may be used in future in some generalisation of varargs in the MPS.
.formats: The formats supported are as follows.
code  name       type           example rendering
$A    address    Addr           9EF60010
$P    pointer    void *         9EF60100
$F    function   void *(*)()    9EF60100 (may be platform-specific
                                length and format)
$S    string     char *         hello
$C    character  char           x
$W    word       unsigned long  00109AE0
$U    decimal    unsigned long  42
$B    binary     unsigned long  00000000000000001011011110010001
$$    dollar     -              $
Note that WriteFC is an int, because that is the default promotion of a char
(see .types).
.snazzy: We should resist the temptation to make WriteF an incredible snazzy
output engine. We only need it for Describe methods and assertion messages.
At the moment it's a very simple bit of code -- let's keep it that way.
.f: The F code is used for function pointers. They are currently printed as a
hexadecimal string of the appropriate length for the platform, and may one day
be extended to include function name lookup.