diff --git a/mps/design/arena/index.txt b/mps/design/arena/index.txt new file mode 100644 index 00000000000..21bd2ee0e94 --- /dev/null +++ b/mps/design/arena/index.txt @@ -0,0 +1,446 @@ + THE DESIGN OF THE MPS ARENA + design.mps.arena + incomplete design + pekka 1997-08-11 + +INTRODUCTION + + +.intro: This is the design of the arena structure. + +.readership: MM developers. + + +Document History + +.hist.0: Version 0 is a different document. + +.hist.1: First draft written by Pekka P. Pirinen 1997-08-11, based on +design.mps.space(0) and mail.richard.1997-04-25.11-52(0). + +.hist.2: Updated for separation of tracts and segments. tony 1999-04-16 + + +OVERVIEW + +.overview: The arena serves two purposes: A structure that is the top-level +state of the MPS, and as such contains a lot of fields which are considered +"global"; Provision of raw memory to pools. + +An arena is of a particular arena class, the class is selected when the arena +is created. Classes encapsulate both policy (such as how pools placement +preferences map into actual placement) and mechanism (such as where the memory +originates: OS VM, client provided, via malloc). Some behaviour (most in the +former "top-level datastructure" category) is implemented by generic arena +code, some by arena class code. To some extent the arena coordinates placement +policies between different pools active in the same arena, however this +functionality is likely to be replaced by something more modular and which does +a better job: The Locus Manager. + + +DEFINITIONS + +.def.tract: Pools request memory from the arena (using ArenaAlloc) as a block +comprising a contiguous sequence of units. The units are known as tracts. A +tract has a specific size (the arena alignment, often corresponds to the OS +page size) and all tracts are aligned to that size. Also used to mean the +datastructure used to manage tracts. + + +REQUIREMENTS + +[copied from design.mps.arena.vm(1) and edited slightly -- drj 1999-06-23] + +[Where do these come from? Need to identify and document the sources of +requirements so that they are traceable to client requirements. Most of these +come from the architectural design (design.mps.architecture) or the fix +function design (design.mps.fix). -- richard 1995-08-28] + +These requirements are the responsiblity of the class implementations as well +as the generic arena. However, some classes (ANSI arena, arenaan.c, in +particular) are not intended for production use so do not have to meet all the +speed and space requirements. + + +Block Management + +.req.fun.block.alloc: The Arena Manager must provide allocation of contiguous +blocks of memory. + +.req.fun.block.free: It must also provide freeing of contiguously allocated +blocks owned by a pool - whether or not the block was allocated via a single +request. + +.req.attr.block.size.min: The Arena Manager must support management of blocks +down to the size of the grain (page) provided by the virtual mapping interface +if a VM interface is being used, a comparable size otherwise. + +.req.attr.block.size.max: It must also support management of blocks up to the +maximum size allowed by the combination of operating system and architecture. +This is derived from req.dylan.attr.obj.max (at least). + +.req.attr.block.align.min: The alignment of blocks shall not be less than +MPS_PF_ALIGN (defined in "mpstd.h" included via "config.h") for the +architecture. This is so that pool classes can conveniently guarantee pool +allocated blocks are aligned to MPS_PF_ALIGN. 
(A trivial requirement) + +.req.attr.block.grain.max: The granularity of allocation shall not be more than +the grain size provided by the virtual mapping interface. + + +Address Translation + +.req.fun.trans: The Arena must provide a translation from any address to either +an indication that the address is not in any tract (if that is so) or the +following data associated with the tract containing that address: +.req.fun.trans.pool: The pool that allocated the tract. +.req.fun.trans.arbitrary: An arbitrary pointer value that the pool can +associate with the tract at any time. +.req.fun.trans.white: The tracer whiteness information. IE a bit for each +active trace that indicates whether this tract is white (contains white +objects). This is required so that the tracer resolve / preserve (aka "Fix") +protocol can run very quickly. + +.req.attr.trans.time: The translation shall take no more than @@@@ [something +not very large -- drj 1999-06-23] + +Iteration Protocol + +.req.iter: er, there's a tract iteration protocol which is presumably required +for some reason? + + +Arena Partition + +.req.fun.set: The Arena Manager must provide a method for approximating sets of +addresses. .req.fun.set.time: The determination of membership shall take no +more than ???? [something very small indeed]. (the non-obvious solution is +refsets) + + +Constraints + +.req.attr.space.overhead: req.dylan.attr.space.struct implies that the arena +must limit the space overhead. The arena is not the only part that introduces +an overhead (pool classes being the next most obvious), so multiple parts must +cooperate in order to meet the ultimate requirements. +.req.attr.time.overhead: Time overhead constraint? [how can there be a time +"overhead" on a necessary component? drj 1999-06-23] + + + +ARCHITECTURE + +Statics + +.static: There is no higher-level data structure than a arena, so in order to +support several arenas, we have to have some static data in impl.c.arena. See +impl.c.arena.static. + +.static.init: All the static data items are initialized when the first arena is +created. + +.static.serial: arenaSerial is a static Serial, containing the serial number of +the next arena to be created. The serial of any existing arena is less than +this. + +.static.ring: arenaRing is the sentinel of the ring of arenas. + +.static.ring.init: arenaRingInit is a bool showing whether the ring of arenas +has been initialized. + +.static.ring.lock: The ring of arenas has to be locked when traversing the +ring, to prevent arenas being added or removed. This is achieved by using the +(non-recursive) global lock facility, provided by the lock module. + +.static.check: The statics are checked each time any arena is checked. + + +Arena Classes + +.class: The Arena datastructure is designed to be subclassable (see +design.mps.protocol(0)). Clients can select what arena class they'd like when +instantiating one with mps_arena_create(). The arguments to mps_arena_create +are class dependent. + +.class.init: However, the generic ArenaInit is called from the class-specific +method, rather than vice versa, because the method is responsible for +allocating the memory for the arena descriptor and the arena lock in the first +place. Likewise, ArenaFinish is called from the finish method. + +.class.fields: The alignment (for tract allocations) and zoneShift (for +computing zone sizes and what zone an address is in) fields in the arena are +the responsibility of the each class, and are initialized by the init method. 
+The responsibility for maintaining the commitLimit, spareCommitted, +spareCommitLimit fields is shared between the (generic) arena and the arena +class. commitLimit (see .commit-limit below) is changed by the generic arena +code, but arena classes are responsible for ensuring the semantics. For +spareCommitted and spareCommitLimit see .spare-committed below. + +.class.abstract: The basic arena class (AbstractArenaClass) is abstract and +must not be instantiated. It provides little useful behaviour, and exists +primarily as the root of the tree of arena classes. Each concrete class must +specialize each of the class method fields, with the exception of the describe +method (which has a trivial implementation) and the extend, retract and +spareCommitExceeded methods which have non-callable methods for the benefit of +arena classes which don't implement these features. .class.abstract.null: The +abstract class does not provide dummy implementations of those methods which +must be overridden. Instead each abstract method is initialized to NULL. + + +Tracts + +.tract: The arena allocation function (ArenaAlloc) allocates a block of memory +to pools, of a size which is aligned to the arena alignment. Each alignment +unit (grain) of allocation is represented by an object called a Tract. Tracts +are the hook on which the segment module is implemented. Pools which don't use +segments may use tracts for associating their own data with each allocation +grain. + +.tract.structure: The tract structure definition looks as follows:- + +typedef struct TractStruct { /* Tract structure */ + Pool pool; /* MUST BE FIRST (design.mps.arena.tract.field.pool) */ + void *p; /* pointer for use of owning pool */ + Addr base; /* Base address of the tract */ + TraceSet white : TRACE_MAX; /* traces for which tract is white */ + unsigned int hasSeg : 1; /* does tract have a seg in p? */ +} TractStruct; + +.tract.field.pool: The pool field indicates to which pool the tract has been +allocated (.req.fun.trans.pool). Tracts are only valid when they are allocated +to pools. When tracts are not allocated to pools, arena classes are free to +reuse tract objects in undefined ways. A standard technique is for arena class +implementations to internally describe the objects as a union type of +TractStruct and some private representation, and to set the pool field to NULL +when the tract is not allocated. The pool field must come first so that the +private representation can share a common prefix with TractStruct. This permits +arena classes to determine from their private representation whether such an +object is allocated or not, without requiring an extra field. + +.tract.field.p: The p field is used by pools to associate tracts with other +data (.req.fun.trans.arbitrary). It's used by the segment module to indicate +which segment a tract belongs to. If a pool doesn't use segments it may use +the p field for its own purposes. This field has the non-specific type (void *) +so that pools can use it for any purpose. + +.tract.field.hasSeg: The hasSeg bit-field is a boolean which indicates whether +the p field is being used by the segment module. If this field is TRUE, then +the value of p is a Seg. hasSeg is typed as an unsigned int, rather than a +Bool. This ensures that there won't be sign conversion problems when converting +the bit-field value. + +.tract.field.base: The base field contains the base address of the memory +represented by the tract. 
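+
+.tract.field.pool.sketch: The union technique described in .tract.field.pool
+can be sketched as follows. This is purely illustrative (a hypothetical
+private page descriptor, not the representation used by any actual arena
+class):
+
+typedef struct IllustrativePageStruct {
+  union {
+    TractStruct tractStruct;        /* valid while allocated to a pool */
+    struct {
+      Pool pool;                    /* NULL here marks the page as free */
+      struct IllustrativePageStruct *next;  /* hypothetical free-list link */
+    } free;
+  } the;
+} IllustrativePageStruct;
+
+/* pool is the first member of both arms, so allocation status can be
+ * tested without an extra discriminant field: */
+#define PageIsAllocated(page)  ((page)->the.free.pool != NULL)
+
+Reading pool through the free arm is well defined because the two arms of
+the union share a common initial sequence.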
+ +.tract.field.white: The white bit-field indicates for which traces the tract is +white (.req.fun.trans.white). This information is also stored in the segment, +but is duplicated here for efficiency during a call to TraceFix (see +design.mps.trace.fix). + +.tract.limit: The limit of the tract's memory may be determined by adding the +arena alignment to the base address. + +.tract.iteration: Iteration over tracts is described in +design.mps.arena.tract-iter(0). + +.tract.if.tractofaddr: Function TractOfAddr finds the tract corresponding to an +address in memory. (.req.fun.trans). + +Bool TractOfAddr(Tract *tractReturn, Arena arena, Addr addr); + +If addr is an address which has been allocated to some pool, then returns TRUE, +and sets *tractReturn to the tract corresponding to that address. Otherwise, +returns false. This function is similar to TractOfBaseAddr (see +design.mps.arena.tract-iter.if.contig-base) but serves a more general purpose +and is less efficient. + +.tract.if.TRACT_OF_ADDR: TRACT_OF_ADDR is a macro version of TractOfAddr. It's +provided for efficiency during a call to TraceFix (see +design.mps.trace.fix.tractofaddr) + + +Control Pool + +.pool: Each arena has a "control pool", arena->controlPoolStruct, which is used +for allocating MPS control data structures (using ControlAlloc()). + + +Polling + +.poll: ArenaPoll is called "often" by other MM code (for instance, on buffer +fill or allocation). It is the entry point for doing tracing work. If the +polling clock exceeds a set threshold, and we're not already doing some tracing +work (i.e., insidePoll is not set), it calls TracePoll on all busy traces. +.poll.size: The actual clock is arena->fillMutatorSize. This is because +internal allocation is only significant when copy segments are being allocated, +and we don't want to have the pause times to shrink because of that. (There is +no current requirement for the trace rate to guard against running out of +memory. [clearly it really ought to though, we have a requirement to not run +out of memory (req.dylan.prot.fail-alloc, req.dylan.prot.consult), emergency +tracing should not be our only story. drj 1999-06-22]) BufferEmpty is not +taken into account, because the splinter will rarely be useable for allocation +and we are wary of the clock running backward. + +.poll.clamp: Polling is disabled when the arena is "clamped", in which case +arena->clamped is TRUE. Clamping the arena prevents background tracing work, +and further new garbage collections from starting. Clamping and releasing are +implemented by the ArenaClamp and ArenaRelease methods. + +.poll.park: The arena is "parked" by clamping it, then polling until there are +no active traces. This finishes all the active collections and prevents +further collection. Parking is implemented by the ArenaPark method. + + +Commit Limit + +.commit-limit: The arena supports a client configurable "commit limit" which is +a limit on the total amount of committed memory. The generic arena structure +contains a field to hold the value of the commit limit and the implementation +provides two functions for manipulating it (ArenaCommitLimit to read it and +ArenaSetCommitLimit to set it). Actually abiding by the contract of not +committing more memory than the commit limit is left up to the individual arena +classes. 
+ +.commit-limit.err: When allocation from the arena would otherwise succeed but +cause the MPS to use more committed memory than specified by the commit limit +ArenaAlloc should refuse the request and return ResCOMMIT_LIMIT. +.commit-limit.err.multi: In the case where an ArenaAlloc request cannot be +fulfilled for more than one reason including exceeding the commit limit then +class implementations should strive to return a result code other than +ResCOMMIT_LIMIT (ie ResCOMMIT_LIMIT should only be returned if the _only_ +reason for failing the ArenaAlloc request is that the commit limit would be +exceeded). The (client) documentation allows implementations to be ambiguous +with respect to which result code in returned in such a situation however. + + +Spare Committed (aka "hysteresis") + +.spare-committed: (See symbol.mps.c.mps_arena_spare_committed(0)) The generic +arena structure contains two fields for the spare committed memory fund: +spareCommitted records the total number of spare committed bytes; +spareCommitLimit records the limit (set by the user) on the amount of spare +committed memory. spareCommitted is modified by the arena class but its value +is used by the generic arena code. There are two uses: a getter function for +this value is provided through the MPS interface +(mps_arena_spare_commit_limit_set), and by the SetSpareCommitLimit function to +determine whether the amount of spare committed memory needs to be reduced. +spareCommitLimit is mainpulated by generic arena code, however the associated +semantics are the reponsibility of the class. It is the class's resposibility +to ensure that it doesn't use more spare committed bytes than the value in +spareCommitLimit. + +.spare-commit-limit: The function ArenaSetSpareCommitLimit sets the +spareCommitLimit field. If the limit is set to a value lower than the amount +of spare committed memory (stored in spareCommitted) then the class specific +function spareCommitExceeded is called. + + +Locks + +.lock.ring: ArenaAccess is called when we fault on a barrier. The first thing +it does is claim the non-recursive global lock to protect the arena ring (see +design.mps.lock(0)). .lock.arena: After the arena ring lock is claimed, +ArenaEnter is called on one or more arenas. This claims the lock for that +arena. When the correct arena is identified or we run out of arenas, the lock +on the ring is released. + +.lock.avoid: Deadlocking is avoided as follows: + +.lock.avoid.mps: Firstly we require the MPS not to fault (i.e., when any of +these locks are held by a thread, that thread does not fault). + +.lock.avoid.thread: Secondly, we require that in a multi-threaded system, +memory fault handlers do not suspend threads (although the faulting thread +will, of course, wait for the fault handler to finish). + +.lock.avoid.conflict: Thirdly, we avoid conflicting deadlock between the arena +and global locks by ensuring we never claim the arena lock when the recursive +global lock is already held, and we never claim the binary global lock when the +arena lock is held. + + +Location Dependencies + +.ld: Location dependencies use fields in the arena to maintain a history of +summaries of moved objects, and to keep a notion of time, so that the staleness +of location dependency can be determined. + + +Finalization + +.final: There is a pool which is optionally (and dynamically) instantiated to +implement finalization. The fields finalPool and isFinalPool are used. 
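+
+.spare-commit-limit.sketch: As an illustration of the division of labour
+described above, the generic part of ArenaSetSpareCommitLimit might look
+something like the following. This is a hedged sketch only: the exact
+signature, locking, and field access paths are assumptions based on the text
+above, not the definitive implementation.
+
+void ArenaSetSpareCommitLimit(Arena arena, Size limit)
+{
+  AVERT(Arena, arena);
+  arena->spareCommitLimit = limit;      /* generic code owns this field */
+  if (arena->spareCommitted > limit) {
+    /* The class owns the semantics: it must release spare committed
+     * memory until it is within the new limit. */
+    (*arena->class->spareCommitExceeded)(arena);
+  }
+}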
+ + +IMPLEMENTATION + + +Tract Cache + +.tract.cache: When tracts are allocated to pools (by ArenaAlloc), the first +tract of the block and it's base address are cached in arena fields lastTract +and lastTractBase. The function TractOfBaseAddr (see +design.mps.arena.tract-iter.if.block-base(0)) checks against these cached +values and only calls the class method on a cache miss. This optimizes for the +common case where a pool allocates a block and then iterates over all its +tracts (e.g. to attach them to a segment). + +.tract.uncache: When blocks of memory are freed by pools, ArenaFree checks to +see if the cached value for the most recently allocated tract (see +.tract.cache) is being freed. If so, the cache is invalid, and must be reset. +The lastTract and lastTractBase fields are set to NULL. + + +Control Pool + +.pool.init: The control pool is initialized by a call to PoolInit() during +ArenaCreate(). + +.pool.ready: All the other fields in the arena are made checkable before +calling PoolInit(), so PoolInit can call ArenaCheck(arena). The pool itself +is, of course, not checkable, so we have a field arena->poolReady, which is +false until after the return from PoolInit. ArenaCheck only checks the pool +if(poolReady). + + +Traces + +.trace: arena->trace[ti] is valid if and only if +TraceSetIsMember(arena->busyTraces, ti). + +.trace.create: Since the arena created by ArenaCreate has arena->busyTraces = +TraceSetEMPTY, none of the traces are meaningful. + +.trace.invalid: Invalid traces have signature SigInvalid, which can be checked. + + +Polling + +.poll.fields: There are three fields of a arena used for polling: +pollThreshold, insidePoll, and clamped (see above). pollThreshold is the +threshold for the next poll: it is set at the end of ArenaPoll to the current +polling time plus ARENA_POLL_MAX. + + +Location Dependencies + +.ld.epoch: arena->epoch is the "current epoch". This is the number of 'flips' +of traces in the arena since the arena was created. From the mutator's point +of view locations chanage atomically at flip. + +.ld.history: arena->history is an array of ARENA_LD_LENGTH RefSets. These are +the summaries of moved objects since the last ARENA_LD_LENGTH epochs. If e is +one of these recent epochs, arena->history[e % ARENA_LD_LENGTH] is a summary of +(the original locations of) objects moved since epoch e. + +.ld.prehistory: arena->prehistory is a RefSet summarizing the original +locations of all objects ever moved. When considering whether a really old +location dependency is stale, it is compared with this summary. + + +Roots + +.root-ring: The arena holds a member of a ring of roots in the arena. It holds +an incremental serial which is the serial of the next root. + diff --git a/mps/design/arenavm/index.txt b/mps/design/arenavm/index.txt new file mode 100644 index 00000000000..8195d4b7eca --- /dev/null +++ b/mps/design/arenavm/index.txt @@ -0,0 +1,167 @@ + VIRTUAL MEMORY ARENA + design.mps.arena.vm + incomplete doc + drj 1996-07-16 + +INTRODUCTION + +.intro: This document describes the detailed design of the Virtual Memory Arena +Class of the Memory Pool System. The VM Arena Class is just one class +available in the MPS. The generic arena part is described in design.mps.arena. + + +OVERVIEW + +.overview: VM arenas provide blocks of memory to all other parts of the MPS in +the form of "tracts" using the virtual mapping interface (design.mps.vm) to the +operating system. 
The VM Arena Class is not expected to be provided on +platforms that do not have virtual memory (like MacOS, os.s7(1)). + +.overview.gc: The VM Arena Class provides some special services on these blocks +in order to facilitate garbage collection: + +.overview.gc.zone: Allocation of blocks with specific zones. This means that +the generic fix function (design.mps.fix) can use a fast refset test to +eliminate references to addresses that are not in the condemned set. This +assumes that a pool class that uses this placement appropriately is being used +(such as the generation placement policy used by AMC, see design.mps.poolamc(1) +) and that the pool selects the condemned sets to coincide with zone stripes. + +.overview.gc.tract: A fast translation from addresses to tract. (See +design.mps.arena.req.fun.trans) + + +NOTES + +.note.refset: Some of this document simply assumes that RefSets (see the +horribly incomplete design.mps.refset) have been chosen as the solution for +design.mps.arena.req.fun.set. It's a lot simpler that way. Both to write and +understand. + + +REQUIREMENTS + + +Most of the requirements are in fact on the generic arena (See design.mps.arena +.req.*). However, many of those requirements can only be met by a suitable +arena class design. + +Requirements particular to this arena class: + +Placement + +.req.fun.place: It must be possible for pools to obtain tracts at particular +addresses. Such addresses shall be declared by the pool specifying what refset +zones the tracts should lie in and what refset zones the tracts should not lie +in. It is acceptable for the arena to not always honour the request in terms +of placement if it has run out of suitable addresses. + +Arena Partition + +.req.fun.set: See design.mps.arena.req.fun.set. The approximation to sets of +address must cooperate with the placement mechanism in the way required by +.req.fun.place (above). + + +ARCHITECTURE + +.arch.memory: The underlying memory is obtained from whatever Virtual Memory +interface (see design.mps.vm). Explain why this is used ### + + +SOLUTION IDEAS + +.idea.grain: Set the arena granularity to the grain provided by the virtual +mapping module. + +.idea.mem: Get a single large contiguous address area from the virtual mapping +interface and divide that up. + +.idea.table: Maintain a table with one entry per grain in order to provide fast +mapping (shift and add) between addresses and table entries. + +.idea.table.figure: + + +.idea.map: Store the pointers (.req.fun.trans) in the table directly for every +grain. + +.idea.zones: Partition the managed address space into zones (see idea.zones) +and provide the set approximation as a reference signature. + +.idea.first-fit: Use a simple first-fit allocation policy for tracts within +each zone (.idea.zones). Store the freelist in the table (.idea.table). + +.idea.base: Store information about each contiguous area (allocated of free) in +the table entry (.idea.table) corresponding to the base address of the area. + +.idea.shadow: Use the table (.idea.table) as a "shadow" of the operating +system's page table. Keep information such as last access, protection, etc. in +this table, since we can't get at this information otherwise. + +.idea.barrier: Use the table (.idea.table) to implement the software barrier. +Each segment can have a read and/or write barrier placed on it by each +process. (.idea.barrier.bits: Store a bit-pattern which remembers which +process protected what.) 
This will give a fast translation from a +barrier-protected address to the barrier handler via the process table. + +.idea.demand-table: For a 1Gb managed address space with a 4Kb page size, the +table will have 256K-entries. At (say) four words per entry, this is 4Mb of +table. Although this is only an 0.4%, the table shouldn't be preallocated or +initially it is an infinite overhead, and with 1Mb active, it is a 300% +overhead! The address space for the table should be reserved, but the pages +for it mapped and unmapped on demand. By storing the table in a tract, the +status of the table's pages can be determined by looking at it's own entries in +itself, and thus the translation lookup (.req.fun.trans) is slowed to two +lookups rather than one. + +.idea.pool: Make the Arena Manager a pool class. Arena intialization becomes +pool creation. Tract allocation becomes PoolAlloc. Other operations become +class-specific operations on the "arena pool". + + +DATA STRUCTURES + +.tables: There are two table data structures: a page table, and an alloc table. + +.table.page.map: Each page in the VM has a corresponding page table entry. + +.table.page.linear: The table is a linear array of PageStruct entries; there is +a simple mapping between the index in the table and the base address in the VM +(viz. base-address = arena-base + (index * page-size), one way, index = +(base-address - arena-base) / page-size, the other). + +.table.page.partial: The table is partially mapped on an "as-needed" basis. +The function unusedTablePages identifies entirely unused pages occupied by the +page table itself (ie those pages of the page table which are occupied by +PageStructs which all describe free pages). Tract allocation and freeing use +this function to map and unmap the page table with no hysteresis. (there is +restriction on the parameters you may pass to unusedTablePages) + +.table.page.tract: Each page table entry contains a tract, which is only valid +if it is allocated to a pool. If it is not allocated to a pool, the fields of +the tract are used for other purposes. (See design.mps.arena.tract.field.pool) + +.table.alloc: The alloc table is a simple bit table (implemented using the BT +module, design.mps.bt). + +.table.alloc.map: Each page in the VM has a corresponding alloc table entry. + +.table.alloc.semantics: The bit in the alloc table is set iff the corresponding +page is allocated (to a pool). + + + +NOTES + + +.fig.page: How the pages in the arena area are represented in the tables. + +.fig.count: How a count table can be used to partially map the page table, as +proposed in request.dylan.170049.sol.map. + + - arenavm diagrams + +ATTACHMENT + "arenavm diagrams" + diff --git a/mps/design/bt/index.txt b/mps/design/bt/index.txt new file mode 100644 index 00000000000..3e2a60d5228 --- /dev/null +++ b/mps/design/bt/index.txt @@ -0,0 +1,610 @@ + BIT TABLES + design.mps.bt + draft doc + drj 1997-03-04 + +INTRODUCTION + +.readership: Any MPS developer. + +.intro: This is the design of the Bit Tables module. A Bit Table is a linear +array of bits. A Bit Table of length n is indexed using an integer from 0 to +(but not including) n. Each bit in a Bit Table can hold either the value 0 +(aka FALSE) or 1 (aka TRUE). A variety of operations are provided including: +set, reset, and retrieve, individual bits; set and reset a contiguous range of +bits; search for a contiguous range of reset bits; making a "negative image" +copy of a range. 
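+
+.intro.example: As a quick orientation, here is a hedged usage sketch of
+those operations, given an arena; the exact prototypes appear under
+INTERFACE below, and the table length, ranges, and find prototype used here
+are illustrative:
+
+BT bt;
+Index base, limit;
+Res res;
+
+res = BTCreate(&bt, arena, 1024);          /* a 1024-bit table */
+if (res == ResOK) {
+  BTResRange(bt, 0, 1024);                 /* initial contents are undefined */
+  BTSetRange(bt, 10, 20);                  /* mark [10, 20) as in use */
+  if (BTFindShortResRange(&base, &limit, bt, 0, 1024, 5))
+    BTSetRange(bt, base, limit);           /* claim a free run of 5 bits */
+  BTDestroy(bt, arena, 1024);
+}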
+ + +HISTORY + +.history.0-3: The history for versions 0-3 is lost pending possible +reconstruction. + +.history.4: Prepared for review. Added full requirements section. Made +notation more consistent throughout. Documented all functions. drj 1999-04-29 + + +DEFINITIONS + +.def.set: Set. Used as a verb meaning to assign the value 1 or TRUE to a bit. +Used descriptively to denote a bit containing the value 1. Note 1 and TRUE are +synonyms in MPS C code (see design.mps.type(0).bool.value). + +.def.reset: Reset. Used as a verb meaning to assign the value 0 or FALSE to a +bit. Used descriptively to denote a bit containing the value 0. Note 0 and +FALSE are synonyms in MPS C code (see design.mps.type(0).bool.value). + +[consider using "fill/empty" or "mark/clear" instead of "set/reset", set/reset +is probably a hangover from drj's z80 hacking days -- drj 1999-04-26] + +.def.bt: Bit Table. A Bit Table is a mapping from [0,n) to {0,1} for some n +represented as a linear array of bits. .def.bt.justify: They are called bit +tables because a single bit is used to encode whether the image of a particular +integer under the map is 0 or 1. + +.def.range: Range. A contiguous sequence of bits in a Bit Table. Ranges are +typically specified as a base--limit pair where the range includes the position +specified by the base, but excludes that specified by the limit. The +mathematical interval notation for half-open intervals, [base, limit), is used. + + +REQUIREMENTS + +.req.bit: The storage for a Bit Table of n bits shall take no more than a small +constant addition to the storage required for n bits. .req.bit.why: This is so +that clients can make some predictions about how much storage their algorithms +use. A small constant is allowed over the minimal for two reasons: inevitable +implementation overheads (such as only being able to allocate storage in +multiples of 32 bits), extra storage for robustness or speed (such as signature +and length fields). +.req.create: A means to create Bit Tables. .req.create.why: Obvious. +.req.destroy: A means to destroy Bit Tables. .req.destroy.why: Obvious. +.req.ops: The following operations shall be supported: + .req.ops.get: Get. Get the value of a bit at a specified index. + .req.ops.set: Set. Set a bit at a specified index. + .req.ops.reset: Reset. Reset a bit at a specified index. + .req.ops.minimal.why: Get, Set, Reset, are the minimal operations. All +possible mappings can be created and inspected using these operations. + .req.ops.set.range: SetRange. Set a range of bits. .req.ops.set.range.why: +It's expected that clients will often want to set a range of bits; providing +this operation allows the implementation of the BT module to make the operation +efficient. + .req.ops.reset.range: ResetRange. Reset a range of bits. +.req.ops.reset.range.why: as for SetRange, see .req.ops.set.range.why. + .req.ops.test.range.set: IsSetRange. Test whether a range of bits are all +set. .req.ops.test.range.set.why: Mostly for checking. For example, often +clients will know that a range they are about to reset is currently all set, +they can use this operation to assert that fact. + .req.ops.test.range.reset: IsResetRange. Test whether a range of bits are +all reset. .req.ops.test.range.reset.why: As for IsSetRange, see +.req.ops.test.range.set.why. + .req.ops.find: Find a range (which we'll denote [i,j)) of at least L reset +bits that lies in a specified subrange of the entire Bit Table. 
Various find +operations are required according to the (additional) properties of the +required range: + .req.ops.find.short.low: FindShortResetRange. Of all candidate ranges, +find the range with least j (find the leftmost range that has at least L reset +bits and return just enough of that). .req.ops.find.short.low.why: Required by +client and VM arenas to allocate segments. The arenas implement definite +placement policies (such as lowest addressed segment first) so they need the +lowest (or highest) range that will do. It's not currently useful to allocate +segments larger than the requested size, so finding a short range is +sufficient. + .req.ops.find.short.high: FindShortResetRangeHigh. Of all candidate +ranges, find the range with greatest i (find the rightmost range that has at +least L reset bits and return just enough of that). +.req.ops.find.short.high.why: Required by arenas to implement a specific +segment placement policy (highest addressed segment first). + .req.ops.find.long.low: FindLongResetRange. Of all candidate ranges, +identify the ranges with least i and of those find the one with greatest j +(find the leftmost range that has at least L reset bits and return all of it). +.req.ops.find.long.low.why: Required by the mark and sweep Pool Classes (AMS, +AWL, LO) for allocating objects (filling a buffer). It's more efficient to +fill a buffer with as much memory as is conveniently possible. There's no +strong reason to find the lowest range but it's bound to have some beneficial +(small) cache effect and makes the algorithm more predictable. + .req.ops.find.long.high: FindLongResetRangeHigh. Provided, but not +required, see .non-req.ops.find.long.high. + .req.ops.copy: Copy a range of bits from one Bit Table to another Bit Table . +Various copy operations are required: + .req.ops.copy.simple: Copy a range of bits from one Bit Table to the same +position in another Bit Table. .req.ops.copy.why: Required to support copying +of the tables for the "low" segment during segment merging and splitting, for +pools using tables (e.g. PoolClassAMS). + .req.ops.copy.offset: Copy a range of bits from one Bit Table to an offset +position in another Bit Table. .req.ops.copy.why: Required to support copying +of the tables for the "high" segment during segment merging and splitting, for +pools which support this (currently none, as of 2000-01-17). + .req.ops.copy.invert: Copy a range of bits from one Bit Table to the same +position in another Bit Table inverting all the bits in the target copy. +.req.ops.copy.invert.why: Required by colour manipulation code in PoolClassAMS +and PoolClassLO. +.req.speed: Operations shall take no more than a few memory operations per bit +manipulated. .req.speed.why: Any slower would be gratuitous. +.req.speed.fast: The following operations shall be very fast: + +.req.speed.fast.find.short: +FindShortResRange (the operation used to meet .req.ops.find.short.low) +FindShortResRangeHigh (the operation used to meet .req.ops.find.short.high) + +.req.speed.fast.find.short.why: These two are used by the client arena +(design.mps.arena.client) and the VM arena (design.mps.arena.vm) for finding +segments in page tables. The operation will be used sufficiently often that +its speed will noticeably affect the overall speed of the MPS. They will be +called with a length equal to the number of pages in a segment. Typical values +of this length depend on the pool classes used and their configuration, but we +can expect length to be small (1 to 16) usually. 
We can expect the Bit Table +to be populated densely where it is populated at all, that is set bits will +tend to be clustered together in subranges. + +.req.speed.fast.find.long: +FindLongResRange (the operation used to meet .req.ops.find.long.low) + +.req.speed.fast.find.long.why: +Used in the allocator for PoolClassAWL (design.mps.poolawl(1)), PoolClassAMS +(design.mps.poolams(2)), PoolClassEPVM (design.mps.poolepvm(0)). Of these AWL +and EPVM have speed requirements. For AWL the length of range to be found will +be the length of a Dylan table in words. According to +mail.tony.1999-05-05.11-36(0), only objects are allocated in AWL +(though not all objects are allocated in AWL), and the mean +length of an object is 486 Words. No data for EPVM alas. + +.req.speed.fast.other.why: We might expect mark and sweep pools to make use of +Bit Tables, the MPS has general requirements to support efficient mark and +sweep pools, so that imposes general speed requirements on Bit Tables. + + +NON REQUIREMENTS + +The following are not requirements but the current design could support them +with little modification or does support them. Often they used to be +requirements, but are no longer, or were added speculatively or experimentally +but aren't currently used. + + .non-req.ops.test.range.same: RangesSame. Test whether two ranges that +occupy the same positions in different Bit Tables are the same. This used to +be required by PoolClassAMS, but is no longer. Currently (1999-05-04) the +functionality still exists. + .non-req.ops.find.long.high: FindLongResetRangeHigh. (see .req.ops.find) Of +all candidate ranges, identify the ranges with greatest j and of those find the +one with least i (find the rightmost range that has at least L reset bits and +return all of it). Provided for symmetry but only currently used by the BT +tests and cbstest.c. + + + + +BACKGROUND + +.background: Originally Bit Tables were used and implemented by PoolClassLO +(design.mps.poollo). It was decided to lift them out into a separate module +when designing the Pool to manage Dylan Weak Tables which is also a mark and +sweep pool and will make use of Bit Tables (see design.mps.poolawl). +.background.analysis: analysis.mps.bt(0) contains some of the analysis of the +design decisions that were and were not made in this document. + + +CLIENTS + +.clients: Bit Tables are used throughout the MPS but the important uses are: In +the client and VM arenas (design.mps.arena.client(0) and +design.mps.arena.vm(1)) a bit table is used to record whether each page is free +or not; several pool classes (PoolClassLO, PoolClassEPVM, PoolClassAMS) use bit +tables to record which locations are free and also to store colour. + + +OVERVIEW + +.over: Mostly, the design is as simple as possible. The significant +complications are iteration (see .iteration below) and searching (see +.fun.find-res-range below) because both of these are required to be fast. + + +INTERFACE + +.if.representation.abstract: A Bit Table is represented by the type BT. + +.if.declare: The module declares a type BT and a prototype for each of the +functions below. The type is declared in impl.h.mpmtypes, the prototypes are +declared in impl.h.mpm. Some of the functions are in fact implemented as +macros in the usual way (doc.mps.ref-man.if-conv(0).macro.std). + +.if.general.index: Many of the functions specified below take indexes. 
If +otherwise unspecified an index must be in the interval [0,n) (note, up to, but +not including, n) where n is the number of bits in the relevant Bit Table (as +passed to the BTCreate function). .if.general.range: Where a range is +specified by two indexes base and limit, base, which specifies the beginning of +the range, must be in the interval [0,n), limit, which specifies the end of the +range, must be in the interval [1,n] (note can be n), and base must be strictly +less than limit (empty ranges are not allowed). Sometimes i and j are used +instead of base and limit. + +.if.create: +Res BTCreate(BT *btReturn, Arena arena, Count n) + +Attempts to create a table of length n in the arena control pool, putting the +table in '*btReturn'. Returns ResOK if and only if the table is created OK. +The initial values of the bits in the table are undefined (so the client should +probably call BTResRange on the entire range before using the BT). Meets +.req.create. + +.if.destroy: +void BTDestroy(BT t, Arena arena, Count n); + +Destroys the table t, which must have been created with BTCreate (.if.create). +The value of argument n must be same as the value of the argument passed to +BTCreate. Meets .req.destroy. + + +.if.size: +size_t BTSize(unsigned long n); + +BTSize(n) returns the number of bytes needed for a Bit Table of n bits. It is +a checked error (an assertion will fail) for n to exceed ULONG_MAX - +MPS_WORD_WIDTH + 1. This is used by clients that allocate storage for the BT +themselves. Before BTCreate and BTDestroy were implemented that was the only +way to allocate a Bit Table, but is now deprecated. + +.if.get: +int BTGet(BT t, Index i); + +BTGet(t, i) returns the ith bit of the table t (i.e. the image of i under the +mapping). Meets .req.ops.get. + +.if.set: +void BTSet(BT t, Index i); + +BTSet(t, i) sets the ith bit of the table t (to 1). BTGet(t, i) will now +return 1. Meets .req.ops.set. + +.if.res: +void BTRes(BT t, Index i); + +BTRes(t, i) resets the ith bit of the table t (to 0). BTGet(t, i) will now +return 0. Meets .req.ops.res. + +.if.set-range: +void BTSetRange(BT t, Index base, Index limit); + +BTSetRange(t, base, limit) sets the range of bits [base, limit) in the table +t. BTGet(t, x) will now return 1 for base<=x> MPS_WORD_SHIFT. The latter +expression is used in the code. .index.word.justify: The compiler is more +likely to generate good code without the divide. .index.sub-word: ib is the +"sub-word-index" which is the index of the bit referred to by the bit-index in +the above word. ib = i % MPS_WORD_WIDTH. Since MPS_WORD_WIDTH is a +power-of-two, this is the same as ib = i & ~((Word)-1<>5) + (i&31); } + +.iteration: Many of the following functions involve iteration over ranges in a +Bit Table. This is performed on whole words rather than individual bits, +whenever possible (to improve speed). This is implemented internally by the +macros ACT_ON_RANGE & ACT_ON_RANGE_HIGH for iterating over the range forwards +and backwards respectively. These macros do not form part of the interface of +the module, but are used extensively in the implementation. The macros are +often used even when speed is not an issue because it simplifies the +implementation and makes it more uniform. The iteration macros take the +parameters (base, limit, single_action, bits_action, word_action). + + base, limit are of type Index and define the range of the iteration + single_action is the name of a macro which will be used for iterating over +bits in the table individually. 
This macro must take a single Index parameter +corresponding to the index for the bit. The macro must not use break or +continue because it will be called from within a loop from the expansion of +ACT_ON_RANGE. + bits_action is the name of a macro which will be used for iterating over +part-words. This macro must take parameters (wordIndex, base, limit) where +wordIndex is the index into the array of words, and base & limit define a range +of bits within the indexed word. + word_action is the name of a macro which will be used for iterating over +whole-words. This macro must take the parameter (wordIndex) where wordIndex is +the index of the whole-word in the array. The macro must not use break or +continue because it will be called from within a loop from the expansion of +ACT_ON_RANGE. + +.iteration.exit: The code in the single_action, bits_action, and word_action +macros is allowed to use 'return' or 'goto' to terminate the iteration early. +This is used by the test (.fun.test.*) and find (.fun.find.*) operations. + +.iteration.small: If the range is sufficiently small only the single_action +macro will be used as this is more efficient in practice. The choice of what +constitutes a small range is made entirely on the basis of experimental +performance results (and currently, 1999-04-27, a "small range" is 6 bits or +fewer. See change.mps.epcore.brisling.160181 for some justification). +Otherwise (for a bigger range) bits_action is used on the part words at either +end of the range (or the whole of the range it if it fits in a single word), +and word_action is used on the words that comprise the inner portion of the +range. + +The implementation of ACT_ON_RANGE (and ACT_ON_RANGE_HIGH) is simple enough. +It decides which macros it should invoke and invokes them. single_action and +word_action are invoked inside loops. + + +.fun.get: BTGet. +The bit-index will be converted in the usual way, see .index. The relevant +Word will be read out of the Bit Table and shifted right by the sub-Word index +(this brings the relevant bit down to the least significant bit of the Word), +the Word will then be masked with 1 producing the answer. + +.fun.set: BTSet + +.fun.res: BTRes + +In both BTSet and BTRes a mask is constructed by shifting 1 left by the +sub-word-index (see .index). For BTSet the mask is ORed into the relevant word +(thereby setting a single bit). For BTRes the mask is inverted and ANDed into +the relevant word (thereby resetting a single bit). + +.fun.set-range: BTSetRange +ACT_ON_RANGE (see .iteration above) is used with macros that set a single bit +(using BTSet), set a range of bits in a word, and set a whole word. + +.fun.res-range: BTResRange +This is implemented similarly to BTSetRange (.fun.set-range) except using BTRes +& reverse bit masking logic. + +.fun.test.range.set: BTIsSetRange +ACT_ON_RANGE (see .iteration above) is used with macros that test whether all +the relevant bits are set; if some of the relevant bits are not set then +'return FALSE' is used to terminate the iteration early and return from the +BTIsSetRange function. If the iteration completes then TRUE is returned. + +.fun.test.range.reset: BTIsResRange +As for BTIsSetRange (.fun.test.range.set above) but testing whether the bits +are reset. + +.fun.test.range.same: BTRangesSame +As for BTIsSetRange (.fun.test.range.set above) but testing whether +corresponding ranges in the two Bit Tables are the same. Note there are no +speed requirements, but ACT_ON_RANGE is used for simplicitly and uniformity. 
+ +.fun.find: The four external find functions (BTFindShortResRange, +BTFindShortResRangeHigh, BTFindLongResRange, BTFindLongResRangeHigh) simply +call through to one of the two internal functions: BTFindResRange, +BTFindResRangeHigh. BTFindResRange and BTFindResRangeHigh both have the +following prototype (with a different name obviously): + +Bool BTFindResRange(Index *baseReturn, Index *limitReturn, + BT bt, + Index searchBase, Index searchLimit, + unsigned long minLength, + unsigned long maxLength) + +There are two length parameters, one specifying the minimum length of the range +to be found, the other the maximum length. For BTFindShort* maxLength is equal +to minLength when passed; for BTFindLong* maxLength is equal to the maximum +possible range (searchLimit - searchBase). + +.fun.find-res-range: BTFindResRange +Iterate within the search boundaries, identifying candidate ranges by searching +for a reset bit. The Boyer-Moore algorithm (reference please?) is used (it's +particularly easy when there are only two symbols, 0 and 1, in the alphabet). +For each candidate range, iterate backwards over the bits from the end of the +range towards the beginning. If a set bit is found, this candidate has failed +and a new candidate range is selected. If when scanning for the set bit a +range of reset bits was found before finding the set bit, then this (small) +range of reset bits is used as the start of the next candidate. Additionally +the end of this small range of reset bits (the end of the failed candidate +range) is remembered so that we don't have to iterate over this range again. +But if no reset bits were found in the candidate range, then iterate again +(starting from the end of the failed candidate) to look for one. If during the +backwards search no set bit is found, then we have found a sufficiently large +range of reset bits; now extend the valid range as far as possible up to the +maximum length by iterating forwards up to the maximum limit looking for a set +bit. The iterations make use of the ACT_ON_RANGE & ACT_ON_RANGE_HIGH macros +using of 'goto' to effect an early termination of the iteration when a +set/reset (as appropriate) bit is found. The macro ACTION_FIND_SET_BIT is used +in the iterations, it efficiently finds the first (that is, with lowest index +or weight) set bit in a word or subword. + +.fun.find-res-range.improve: Various other performance improvements have been +suggested in the past, including some from request.epcore.170534. Here is a +list of potential improvements which all sound plausible, but which have not +led to performance improvements in practice: + +.fun.find-res-range.improve.step.partial: When the top index in a candidate +range fails, skip partial words as well as whole words, using +e.g. lookup tables. + +.fun.find-res-range.improve.lookup: When testing a candidate run, +examine multiple bits at once (e.g. 8), using lookup tables for (e.g) +index of first set bit, index of last set bit, number of reset bits, +length of maximum run of reset bits. + +.fun.find-res-range-high: BTFindResRangeHigh +Exactly the same algorithm as in BTFindResRange (see .fun.find-res-range +above), but moving over the table in the opposite direction. + +.fun.copy-simple-range: BTCopyRange. +Uses ACT_ON_RANGE (see .iteration above) with the obvious implementation. +Should be fast. + +.fun.copy-offset-range: BTCopyOffsetRange. +Uses a simple iteration loop, reading bits with BTGet and setting them with +BTSet. 
Doesn't use ACT_ON_RANGE because the two ranges will not, in general, be +similarly word-aligned. + +.fun.copy-invert-range: BTCopyInvertRange. +Uses ACT_ON_RANGE (see .iteration above) with the obvious implementation. +Should be fast - although there are no speed requirements. + + +TESTING + +.test: The following tests are available / have been used during development. + +.test.btcv: MMsrc!btcv.c. This is supposed to be a coverage test, intended to +execute all of the module's code in at least some minimal way. + +.test.cbstest: MMsrc!cbstest.c. This was written as a test of the CBS module +(design.mps.cbs(2)). It compares the functional operation of a CBS with that +of a BT so is a good functional test of either module. + +.test.mmqa.120: MMQA_test_function!210.c. This is used because it has a fair +amount of segment allocation and freeing so exercises the arena code that uses +Bit Tables. + +.test.bttest: MMsrc!bttest.c. This is an interactive test that can be used to +exercise some of the BT functionality by hand. + +.test.dylan: It is possible to modify Dylan so that it uses Bit Tables more +extensively. See change.mps.epcore.brisling.160181 TEST1 and TEST2. + diff --git a/mps/design/buffer/index.txt b/mps/design/buffer/index.txt new file mode 100644 index 00000000000..1ab5165395b --- /dev/null +++ b/mps/design/buffer/index.txt @@ -0,0 +1,699 @@ + ALLOCATION BUFFERS AND ALLOCATION POINTS + design.mps.buffer + incomplete design + richard 1996-09-02 + +INTRODUCTION + +.scope: This document describes the design of allocation buffers and allocation +points. + +.purpose: The purpose of this document is to record design decisions made +concerning allocation buffers and allocation points and justify those decisions +in terms of requirements. + +.readership: The document is intended for reading by any Memory Management +Group developer. + + +HISTORY + +.history.0-1: The history for versions 0-1 is lost pending possible +reconstruction. + +.history.2: Added class hierarchy and subclassing information. + + +SOURCE + +.source.mail: Much of the juicy stuff about buffers is only floating around in +mail discussions. You might like to try searching the archives if you can't +find what you want here. + +.source.synchronize: For a discussion of the syncronization issues: +mail.richard.1995-05-24.10-18, mail.ptw.1995-05-19.19-15, +mail.richard.1995-05-19.17-10 +[drj - I believe that the sequence for flip in ptw's message is incorrect. The +operations should be in the other order] + +.source.interface: For a description of the buffer interface in C prototypes: +mail.richard.1997-04-28.09-25(0) + +.source.qa: Discussions with QA were useful in pinning down the semantics and +understanding of some obscure but important boundary cases. See +mail.richard.tucker.1997-05-12.09-45(0) et seq (mail subject: "notes on our +allocation points discussion"). + + +REQUIREMENTS + +.req.fast: Allocation must be very fast +.req.thread-safe: Must run safely in a multi-threaded environment +.req.no-synch: Must avoid the use of thread-synchronization (.req.fast) +.req.manual: Support manual memory management +.req.exact: Support exact collectors +.req.ambig: Support ambiguous collectors +.req.count: Must record (approximately) the amount of allocation (in bytes). +Actually not a requirement any more, but once was put forward as a Dylan +requirement. Bits of the code still reflect this requirement. See +request.dylan.170554. 
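+
+.req.example: The reserve/commit protocol that satisfies .req.fast and
+.req.no-synch is visible directly in the client interface. Here is a minimal
+sketch of the client-side allocation loop, given an allocation point ap and
+an aligned size (initialize_object stands for the client's own formatting
+code):
+
+mps_addr_t p;
+mps_res_t res;
+
+do {
+  res = mps_reserve(&p, ap, size);    /* usually just increments alloc */
+  if (res != MPS_RES_OK)
+    return res;                       /* allocation failed for reason res */
+  initialize_object(p, size);         /* client formats the new object */
+} while (!mps_commit(ap, p, size));   /* may fail if the buffer was tripped */
+
+This mirrors the BufferReserve/BufferCommit sequence shown in the design
+excerpt at the end of this document.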
+ + +CLASSES + +.class.hierarchy: The Buffer datastructure is designed to be subclassable (see +design.mps.protocol). + +.class.hierarchy.buffer: The basic buffer class (BufferClass) supports basic +allocation-point buffering, and is appropriate for those manual pools which +don't use segments (.req.manual). The Buffer class doesn't support reference +ranks (i.e. the buffers have RankSetEMPTY). Clients may use BufferClass +directly, or create their own subclasses (see .subclassing). + +.class.hierarchy.segbuf: Class SegBufClass is also provided for the use of +pools which additionally need to associate buffers with segments. SegBufClass +is a subclass of BufferClass. Manual pools may find it convenient to use +SegBufClass, but it is primarily intended for automatic pools (.req.exact, +.req.ambig). An instance of SegBufClass may be attached to a region of memory +that lies within a single segment. The segment is associated with the buffer, +and may be accessed with the BufferSeg function. SegBufClass also supports +references at any rank set. Hence this class or one of its subclasses should be +used by all automatic pools (with the possible exception of leaf pools). The +rank sets of buffers and the segments they are attached to must match. Clients +may use SegBufClass directly, or create their own subclasses (see +.subclassing). + +.class.hierarchy.rankbuf: Class RankBufClass is also provided as a subclass of +SegBufClass. The only way in which this differs from its superclass is that +the rankset of a RankBufClass is set during initialization to the singleton +rank passed as an additional parameter to BufferCreate. Instances of +RankBufClass are of the same type as instances of SegBufClass, i.e., SegBuf. +Clients may use RankBufClass directly, or create their own subclasses (see +.subclassing). + +.class.create: The buffer creation functions (BufferCreate and BufferCreateV) +take a class parameter, which determines the class of buffer to be created. + +.class.choice: Pools which support buffered allocation should specify a default +class for buffers. This class will be used when a buffer is created in the +normal fashion by MPS clients (for example by a call to mps_ap_create). Pools +specify the default class by means of the bufferClass field in the pool class +object. This should be a pointer to a function of type PoolBufferClassMethod. +The normal class "Ensure" function (e.g. Ensure*Buffer*Class) has the +appropriate type. + +.subclassing: Pools may create their own subclasses of the standard buffer +classes. This is sometimes useful if the pool needs to add an extra field to +the buffer. The convenience macro DEFINE_BUFFER_CLASS may be used to define +subclasses of buffer classes. See design.mps.protocol.int.define-special. +.replay: To work with the allocation replayer (see +design.mps.telemetry.replayer), the buffer class has to emit an event for each +call to an external interface, containing all the parameters passed by the +user. If a new event type is required to carry this information, the replayer +(impl.c.eventrep) must then be extended to recreate the call. +.replay.pool-buffer: The replayer must also be updated if the association of +buffer class to pool or the buffer class hierarchy is changed. + +.class.method: Buffer classes provide the following methods (these should not +be confused with the pool class methods related to the buffer protocol, +described in .method.*): + +.class.method.init: "init" is a class-specific initialization method called +from BufferInitV. 
It receives the optional (vararg) parameters passed to +BufferInitV. Client-defined methods must call their superclass method (via a +next-method call) before performing any class-specific behaviour. .replay.init +: The init method should emit a BufferInit event (if there aren't any +extra parameters, = ""). + +.class.method.finish: "finish" is a class-specific finish method called from +BufferFinish. Client-defined methods must call their superclass method (via a +next-method call) after performing any class-specific behaviour. + +.class.method.attach: "attach" is a class-specific method called whenever a +buffer is attached to memory, via BufferAttach. Client-defined methods must +call their superclass method (via a next-method call) before performing any +class-specific behaviour. + +.class.method.detach: "detach" is a class-specific method called whenever a +buffer is detached from memory, via BufferDetach. Client-defined methods must +call their superclass method (via a next-method call) after performing any +class-specific behaviour. + +.class.method.seg: "seg" is a class-specific accessor method which returns the +segment attached to a buffer (or NULL if there isn't one). It is called from +BufferSeg. Clients should not need to define their own methods for this. + +.class.method.rankSet: "rankSet" is a class-specific accessor method which +returns the rank set of a buffer. It is called from BufferRankSet. Clients +should not need to define their own methods for this. + +.class.method.setRankSet: "setRankSet" is a class-specific setter method which +sets the rank set of a buffer. It is called from BufferSetRankSet. Clients +should not need to define their own methods for this. + +.class.method.describe: "describe" is a class-specific method called to +describe a buffer, via BufferDescribe. Client-defined methods must call their +superclass method (via a next-method call) before describing any class-specific +state. + + +NOTES + +.logging.control: Buffers have a separate control for whether they are logged +or not, this is because they are particularly high volume. This is a boolean +flags (bufferLogging) in the ArenaStruct. + +.count: Counting the allocation volume is done by maintaining two fields in the +buffer struct: .count.fields: fillSize, emptySize. .count.monotonic: both of +these fields are monotonically increasing. .count.fillsize: fillSize is an +accumulated total of the size of all the fills (as a result of calling the +PoolClass BufferFill method) that happen on the buffer. .count.emptysize: +emptySize is an accumulated total of the size of all the empties than happen on +the buffer (which are notified to the pool using the PoolClass BufferEmpty +method). .count.generic: These fields are maintained by the generic buffer +code (in BufferAttach and BufferDetach). + +.count.other: Similar count fields are maintained in the pool and the arena. +They are maintained on an internal (buffers used internally by MPS) and +external (buffers used for mutator APs) basis. The fields are also updated by +the buffer code. The fields are: in the pool, +{fill|empty}{Mutator|Internal}Size (4 fields); in the arena, +{fill|empty}{Mutator|Internal}Size allocMutatorSize (5 fields). + + .count.alloc.how: The amount of allocation in the buffer just after an empty +is (fillSize - emptySize). 
At other times this computation will include space
+that the buffer has the use of (between base and init) but which may not get
+allocated in (because the remaining space may be too large for the next reserve
+so some or all of it may get emptied). The arena field allocMutatorSize is
+incremented by the allocated size (between base and init) whenever a buffer is
+detached. Symmetrically this field is decremented by the pre-allocated size
+(between base and init) whenever a buffer is attached. The overall count is
+asymptotically correct.
+
+.count.type: All the count fields are of type double. .count.type.justify:
+This is because double is the type most likely to give us enough precision.
+Because of the lack of genuine requirements the type isn't so important. It's
+nice to have it more precise than long, which double usually is.
+
+
+
+From the whiteboard:
+
+REQ
+atomic update of words
+guarantee order of reads and writes to certain memory locations.
+
+FLIP
+limit:=0
+record init for scanner
+
+COMMIT
+init:=alloc
+if(limit = 0) ...
+
+L written only by MM
+A \ written only by client (except during synchronized MM op)
+I /
+I read by MM during flip
+
+
+States
+
+BUSY
+READY
+TRAPPED
+RESET
+[drj: there are many more states]
+
+
+Misc
+
+.misc: During buffer ops all field values can change. Might trash perfectly
+good ("valid"?) object if pool isn't careful.
+
+
+Not from the whiteboard.
+
+
+SYNCHRONIZATION
+
+Buffers provide a loose form of synchronization between the mutator and the
+collector.
+
+The crucial synchronization issues are between the operation the pool performs
+on flip and the mutator's commit operation.
+
+Commit
+
+read init
+write init
+Memory Barrier
+read limit
+
+Flip
+
+write limit
+Memory Barrier
+read init
+
+Commit consists of two parts. The first is the update to init. This is a
+declaration that the new object just before init is now correctly formatted and
+can be scanned. The second is a check to see if the buffer has been
+"tripped". The ordering of the two parts is crucial.
+
+Note that the declaration that the object is correctly formatted is independent
+of whether the buffer has been tripped or not. In particular a pool can scan
+up to the init pointer (including the newly declared object) whether or not the
+pool will cause the commit to fail. In the case where the pool scans the
+object, but then causes the commit to fail (and presumably the allocation to
+occur somewhere else), the pool will have scanned a "dead" object, but this is
+just another example of conservatism in the general sense.
+
+Note that the read of init in the Flip sequence can in fact be arbitrarily
+delayed (as long as it is read before a buffered segment is scanned).
+
+On processors with Relaxed Memory Order (such as the DEC Alpha), Memory
+Barriers will need to be placed at the points indicated.
+
+ * DESIGN
+ *
+ * design.mps.buffer.
+ *
+ * An allocation buffer is an interface to a pool which provides
+ * very fast allocation, and defers the need for synchronization in
+ * a multi-threaded environment.
+ *
+ * Pools which contain formatted objects must be synchronized so
+ * that the pool can know when an object is valid. Allocation from
+ * such pools is done in two stages: reserve and commit. The client
+ * first reserves memory, then initializes it, then commits.
+ * Committing the memory declares that it contains a valid formatted
+ * object. Under certain conditions, some pools may cause the
+ * commit operation to fail. (See the documentation for the pool.) 
+ * Failure to commit indicates that the whole allocation failed and + * must be restarted. When using a pool which introduces the + * possibility of commit failing, the allocation sequence could look + * something like this: + * + * do { + * res = BufferReserve(&p, buffer, size); + * if(res != ResOK) return res; // allocation fails, reason res + * initialize(p); // p now points at valid object + * } while(!BufferCommit(buffer, p, size)); + * + * Pools which do not contain formatted objects can use a one-step + * allocation as usual. Effectively any random rubbish counts as a + * "valid object" to such pools. + * + * An allocation buffer is an area of memory which is pre-allocated + * from a pool, plus a buffer descriptor, which contains, inter + * alia, four pointers: base, init, alloc, and limit. Base points + * to the base address of the area, limit to the last address plus + * one. Init points to the first uninitialized address in the + * buffer, and alloc points to the first unallocated address. + * + * L . - - - - - . ^ + * | | Higher addresses -' + * | junk | + * | | the "busy" state, after Reserve + * A |-----------| + * | uninit | + * I |-----------| + * | init | + * | | Lower addresses -. + * B `-----------' v + * + * L . - - - - - . ^ + * | | Higher addresses -' + * | junk | + * | | the "ready" state, after Commit + * A=I |-----------| + * | | + * | | + * | init | + * | | Lower addresses -. + * B `-----------' v + * + * Access to these pointers is restricted in order to allow + * synchronization between the pool and the client. The client may + * only write to init and alloc, but in a restricted and atomic way + * detailed below. The pool may read the contents of the buffer + * descriptor at _any_ time. During calls to the fill and trip + * methods, the pool may update any or all of the fields + * in the buffer descriptor. The pool may update the limit at _any_ + * time. + * + * Access to buffers by these methods is not synchronized. If a buffer + * is to be used by more than one thread then it is the client's + * responsibility to ensure exclusive access. It is recommended that + * a buffer be used by only a single thread. + * + * [Only one thread may use a buffer at once, unless the client + * places a mutual exclusion around the buffer access in the usual + * way. In such cases it is usually better to create one buffer for + * each thread.] + * + * Here are pseudo-code descriptions of the reserve and commit + * operations. These may be implemented in-line by the client. + * Note that the client is responsible for ensuring that the size + * (and therefore the alloc and init pointers) are aligned according + * to the buffer's alignment. + * + * Reserve(buf, size) ; size must be aligned to pool + * if buf->limit - buf->alloc >= size then + * buf->alloc +=size ; must be atomic update + * p = buf->init + * else + * res = BufferFill(&p, buf, size) ; buf contents may change + * + * Commit(buf, p, size) + * buf->init = buf->alloc ; must be atomic update + * if buf->limit == 0 then + * res = BufferTrip(buf, p, size) ; buf contents may change + * else + * res = True + * (returns True on successful commit) + * + * The pool must allocate the buffer descriptor and initialize it by + * calling BufferInit. The descriptor this creates will fall + * through to the fill method on the first allocation. In general, + * pools should not assign resources to the buffer until the first + * allocation, since the buffer may never be used. 
+ * + * The pool may update the base, init, alloc, and limit fields when + * the fallback methods are called. In addition, the pool may set + * the limit to zero at any time. The effect of this is either: + * + * 1. cause the _next_ allocation in the buffer to fall through to + * the buffer fill method, and allow the buffer to be flushed + * and relocated; + * + * 2. cause the buffer trip method to be called if the client was + * between reserve and commit. + * + * A buffer may not be relocated under other circumstances because + * there is a race between updating the descriptor and the client + * allocation sequence. + + +.method.create: + +BufferCreate + +Create an allocation buffer in a pool. + +The buffer is created in the "ready" state. + +A buffer structure is allocated from the space control pool and partially +initialized (in particularly neither the signature nor the serial field are +initialized). The pool class's bufferCreate method is then called. This +method can update (some undefined subset of) the fields of the structure; it +should return with the buffer in the "ready" state (or fail). The remainder of +the initialization then occurs. + +If and only if successful then a valid buffer is returned. + + +.method.destroy: + +BufferDestroy + +Destroy frees a buffer descriptor. The buffer must be in the "ready" state, +i.e. not between a Reserve and Commit. Allocation in the area of memory to +which the descriptor refers must cease after Destroy is called. + +Destroying an allocation buffer does not affect objects which have been +allocated, it just frees resources associated with the buffer itself. + +The pool class's bufferDestroy method is called and then the buffer structure +is uninitialized and freed. + + +.method.check: + +BufferCheck + +The check method is straightforward, the non-trivial dependencies checked are: + The ordering constraints between base, init, alloc, and limit. + The alignment constraints on base, init, alloc, and limit. + That the buffer's rank is identical to the segment's rank. + +.method.set-reset: + +/* BufferSet/Reset -- set/reset a buffer + * + * Set sets the buffer base, init, alloc, and limit fields so that + * the buffer is ready to start allocating in area of memory. The + * alloc field is a copy of the init field. + * + * Reset sets the seg, base, init, alloc, and limit fields to + * zero, so that the next reserve request will call the fill + * method. + */ + +.method.set.unbusy: BufferSet must only be applied to buffers that are not busy. +.method.reset.unbusy: BufferReset must only be applied to buffers that are not +busy. + + +.method.accessors: + +/* Buffer Information + * + * BufferIsReset returns TRUE if and only if the buffer is in the + * reset state, i.e. with base, init, alloc, and limit set to zero. + * + * BufferIsReady returns TRUE iff the buffer is not between a + * reserve and commit. The result is only reliable if the client is + * not currently using the buffer, since it may update the alloc and + * init pointers asynchronously. + * + * BufferAP returns the APStruct substructure of a buffer. + * + * BufferOfAP is a thread-safe (impl.c.mpsi.thread-safety) method of + * getting the buffer which owns an APStruct. + * + * BufferSpace is a thread-safe (impl.c.mpsi.thread-safety) method of + * getting the space which owns a buffer. + * + * BufferPool returns the pool to which a buffer is attached. + */ + +.method.ofap: + +.method.ofap.thread-safe: +BufferOfAP must be thread safe (see impl.c.mpsi.thread-safety). 
This is +achieved simply because the underlying operation involved is simply a +subtraction. + +.method.space: +.method.space.thread-safe: +BufferSpace must be thread safe (see impl.c.mpsi.thread-safety). This is +achieved simple because the underlying operation is a read of +shared-non-mutable data (see design.mps.thread-safety). + +.method.reserve: + +/* BufferReserve -- reserve memory from an allocation buffer + * + * This is a provided version of the reserve procedure described + * above. The size must be aligned according to the buffer + * alignment. Iff successful, ResOK is returned and + * *pReturn updated with a pointer to the reserved memory. + * Otherwise *pReturn is not touched. The reserved memory is not + * guaranteed to have any particular contents. The memory must be + * initialized with a valid object (according to the pool to which + * the buffer belongs) and then passed to the Commit method (see + * below). Reserve may not be applied twice to a buffer without a + * Commit in-between. In other words, Reserve/Commit pairs do not + * nest. + */ + +Res BufferReserve(Addr *pReturn, Buffer buffer, Size size) +{ + Addr next; + + AVER(pReturn != NULL); + AVERT(Buffer, buffer); + AVER(size > 0); + AVER(SizeIsAligned(size, BufferPool(buffer)->alignment)); + AVER(BufferIsReady(buffer)); + + /* Is there enough room in the unallocated portion of the buffer to */ + /* satisfy the request? If so, just increase the alloc marker and */ + /* return a pointer to the area below it. */ + + next = AddrAdd(buffer->ap.alloc, size); + if(next > buffer->ap.alloc && next <= buffer->ap.limit) + { + buffer->ap.alloc = next; + *pReturn = buffer->ap.init; + return ResOK; + } + + /* If the buffer can't accommodate the request, fall through to the */ + /* pool-specific allocation method. */ + + return BufferFill(pReturn, buffer, size); +} + +.method.fill: + +/* BufferFill -- refill an empty buffer + * + * If there is not enough space in a buffer to allocate in-line, + * BufferFill must be called to "refill" the buffer. (See the + * description of the in-line Reserve method in the leader comment.) + */ + +Res BufferFill(Addr *pReturn, Buffer buffer, Size size) +{ + Res res; + Pool pool; + + AVER(pReturn != NULL); + AVERT(Buffer, buffer); + AVER(size > 0); + AVER(SizeIsAligned(size, BufferPool(buffer)->alignment)); + AVER(BufferIsReady(buffer)); + + pool = BufferPool(buffer); + res = (*pool->class->bufferFill)(pReturn, pool, buffer, size); + + AVERT(Buffer, buffer); + + return res; +} + + +.method.commit: + +/* BufferCommit -- commit memory previously reserved + * + * Commit notifies the pool that memory which has been previously + * reserved (see above) has been initialized with a valid object + * (according to the pool to which the buffer belongs). The pointer + * p must be the same as that returned by Reserve, and the size must + * match the size passed to Reserve. + * + * Commit may not be applied twice to a buffer without a reserve + * in-between. In other words, objects must be reserved, + * initialized, then committed only once. + * + * Commit returns TRUE iff successful. If commit fails and returns + * FALSE, the client may try to allocate again by going back to the + * reserve stage, and may not use the memory at p again for any + * purpose. + * + * Some classes of pool may cause commit to fail under rare + * circumstances. 
+ */ + +Bool BufferCommit(Buffer buffer, Addr p, Size size) +{ + AVERT(Buffer, buffer); + AVER(size > 0); + AVER(SizeIsAligned(size, BufferPool(buffer)->alignment)); + /* Buffer is "busy" */ + AVER(!BufferIsReady(buffer)); + + /* See design.mps.collection.flip. + * If a flip occurs before this point, when the pool reads + * buffer->init it will point below the object, so it will be trashed + * and the commit must fail when trip is called. The pool will also + * read p (during the call to trip) which points to the invalid + * object at init. + */ + + AVER(p == buffer->ap.init); + AVER(AddrAdd(buffer->ap.init, size) == buffer->ap.alloc); + + /* Atomically update the init pointer to declare that the object */ + /* is initialized (though it may be invalid if a flip occurred). */ + + buffer->ap.init = buffer->ap.alloc; + + /* .improve.memory-barrier: Memory barrier here on the DEC Alpha + * (and other relaxed memory order architectures). */ + + /* If a flip occurs at this point, the pool will see init */ + /* above the object, which is valid, so it will be collected. */ + /* The commit must succeed when trip is called. The pointer */ + /* p will have been fixed up. */ + + /* Trip the buffer if a flip has occurred. */ + + if(buffer->ap.limit == 0) + return BufferTrip(buffer, p, size); + + /* No flip occurred, so succeed. */ + + return TRUE; +} + + +.method.trip: + +BufferTrip -- act on a tripped buffer + +The pool which owns a buffer may asynchronously set the buffer limit to zero in +order to get control over the buffer. If this occurs after a Reserve (but +before the corresponding commit), then the Commit method calls BufferTrip and +the Commit method returns with BufferTrip's return value. (See the description +of Commit.) + +.method.trip.precondition: +At the time trip is called (see Commit), the following are true: +.method.trip.precondition.limit: limit == 0 +.method.trip.precondition.init: init == alloc +.method.trip.precondition.p: p+size == alloc + + + +.method.expose-cover: + +Expose / Cover + +BufferExpose/Cover are used by collectors that want to allocate in a forwarding +buffer. Since the forwarding buffer may be Shielded the potential problem of +handling a recursive fault appears (mutator causes a page fault, collector +fixes some objects causing allocation, the allocation takes place is a +protected area of memory which cause a page fault). BufferExpose guarantees +that allocation can take place in the buffer without causing a page fault, +BufferCover removes this guarantee. + +[The following paragraph is reverse-constructed conjecture] +BufferExpose puts the buffer in an "exposed" state, BufferCover puts the buffer +in a "covered" state. BufferExpose can only be called if the buffer is +"covered". BufferCover can only be called if the buffer is "exposed". +[Is this part of the "Protection/Suspension Protocol"? (see mail with that +subject)] + + + + + + + +------------------------------- + +Here are a number of diagrams showing how buffers behave. In general, the +horizontal axis corresponds to mutator action (reserve, commit) and the +vertical axis corresponds to collector action. I'm not sure which of the +diagrams are the same as each other, and which are best or most complete when +they are different, but they all attempt to show essentially the same +information. It's very difficult to get all the details in. These diagrams were +drawn by richard, rit, gavinm, &c, &c in April 1997. 
In general, the later +diagrams are, I suspect, more correct, complete and useful than the earlier +ones. I have put them all here for the record. rit 1998-02-09 + +Buffer Diagram: +Buffer States + +Buffer States (3-column) +Buffer States (4-column) +Buffer States (gavinised) +Buffer States (interleaved) +Buffer States (richardized) + + diff --git a/mps/design/cbs/index.txt b/mps/design/cbs/index.txt new file mode 100644 index 00000000000..578a48ac538 --- /dev/null +++ b/mps/design/cbs/index.txt @@ -0,0 +1,559 @@ + DESIGN FOR COALESCING BLOCK STRUCTURE + design.mps.cbs + incomplete doc + gavinm 1998-05-01 + +INTRODUCTION + +.intro: This is the design for impl.c.cbs, which implements a data structure +for the management of non-intersecting memory ranges, with eager coalescence. + +.readership: This document is intended for any MM developer. + +.source: design.mps.poolmv2, design.mps.poolmvff. + +.overview: The "coalescing block structure" is a set of addresses (or a subset +of address space), with provision for efficient management of contiguous +ranges, including insertion and deletion, high level communication with the +client about the size of contiguous ranges, and detection of protocol +violations. + + +Document History + +.hist.0: This document was derived from the outline in design.mps.poolmv2(2). +Written by Gavin Matthews 1998-05-01. + +.hist.1: Updated by Gavin Matthews 1998-07-22 in response to approval comments +in change.epcore.anchovy.160040 There is too much fragmentation in trapping +memory. + +.hist.2: Updated by Gavin Matthews (as part of change.epcore.brisling.160158: +MVFF cannot be instantiated with 4-byte alignment) to document new alignment +restrictions. + + +DEFINITIONS + +.def.range: A (contiguous) range of addresses is a semi-open interval on +address space. + +.def.isolated: A contiguous range is isolated with respect to some property it +has, if adjacent elements do not have that property. + +.def.interesting: A block is interesting if it is of at least the minimum +interesting size specified by the client. + + +REQUIREMENTS + +.req.set: Must maintain a set of addresses. + +.req.fast: Common operations must have a low amortized cost. + +.req.add: Must be able to add address ranges to the set. + +.req.remove: Must be able to remove address ranges from the set. + +.req.size: Must report concisely to the client when isolated contiguous ranges +of at least a certain size appear and disappear. + +.req.iterate: Must support the iteration of all isolated contiguous ranges. +This will not be a common operation. + +.req.protocol: Must detect protocol violations. + +.req.debug: Must support debugging of client code. + +.req.small: Must have a small space overhead for the storage of typical subsets +of address space and not have abysmal overhead for the storage of any subset of +address space. + +.req.align: Must support an alignment (the alignment of all addresses +specifying ranges) of down to sizeof(void *) without losing memory. + + +INTERFACE + +.header: CBS is used through impl.h.cbs. + + +External Types + +.type.cbs: CBS is the main data-structure for manipulating a CBS. It is +intended that a CBSStruct be embedded in another structure. No convenience +functions are provided for the allocation or deallocation of the CBS. + typedef struct CBSStruct CBSStruct, *CBS; + +.type.cbs.block: CBSBlock is the data-structure that represents an isolated +contiguous range held by the CBS. It is returned by the new and delete methods +described below. 
+ typedef struct CBSBlockStruct CBSBlockStruct, *CBSBlock;
+
+.type.cbs.method: The following methods are provided as callbacks to advise the
+client of certain events. The implementation of these functions should not
+cause any CBS function to be called on the same CBS. In this respect, the CBS
+module is not re-entrant.
+
+.type.cbs.change.size.method: CBSChangeSizeMethod is the function pointer type,
+four instances of which are optionally registered via CBSInit.
+ typedef void (*CBSChangeSizeMethod)(CBS cbs, CBSBlock block, Size oldSize,
+Size newSize);
+These callbacks are invoked under CBSInsert, CBSDelete, or CBSSetMinSize in
+certain circumstances. Unless otherwise stated, oldSize and newSize will both
+be non-zero, and different. The accessors CBSBlockBase, CBSBlockLimit, and
+CBSBlockSize may be called from within these callbacks, except within the
+delete callback when newSize is zero. See .impl.callback for implementation
+details.
+
+.type.cbs.iterate.method: CBSIterateMethod is the function pointer type of a
+client method invoked by the CBS module for every isolated contiguous range in
+address order, when passed to the CBSIterate or CBSIterateLarge functions. The
+function returns a boolean indicating whether to continue with the iteration.
+ typedef Bool (*CBSIterateMethod)(CBS cbs, CBSBlock block, void *closureP,
+unsigned long closureS);
+
+
+External Functions
+
+.function.cbs.init: CBSInit is the function that initialises the CBS
+structure. It performs allocation in the supplied arena. Four methods are
+passed in as function pointers (see .type.* above), any of which may be NULL.
+It receives a minimum size, which is used when determining whether to call the
+optional methods. The mayUseInline boolean indicates whether the CBS may use
+the memory in the ranges as a low-memory fallback (see .impl.low-mem). The
+alignment indicates the alignment of ranges to be maintained. An initialised
+CBS contains no ranges.
+ Res CBSInit(Arena arena, CBS cbs, CBSChangeSizeMethod new,
+CBSChangeSizeMethod delete, CBSChangeSizeMethod grow, CBSChangeSizeMethod
+shrink, Size minSize, Align alignment, Bool mayUseInline);
+
+.function.cbs.init.may-use-inline: If mayUseInline is set, then alignment must
+be at least sizeof(void *). In this mode, the CBS will never fail to insert or
+delete ranges, even if memory for control structures becomes short. Note that,
+in such cases, the CBS may defer notification of new/grow events, but will
+report available blocks in CBSFindFirst and CBSFindLast. Such low memory
+conditions will be rare and transitory. See .align for more details.
+
+.function.cbs.finish: CBSFinish is the function that finishes the CBS structure
+and discards any other resources associated with the CBS.
+ void CBSFinish(CBS cbs);
+
+.function.cbs.insert: CBSInsert is the function used to add a contiguous range
+specified by [base,limit) to the CBS. If any part of the range is already in
+the CBS, then ResFAIL is returned, and the CBS is unchanged. This function may
+cause allocation; if this allocation fails, and any contingency mechanism
+fails, then ResMEMORY is returned, and the CBS is unchanged.
+ Res CBSInsert(CBS cbs, Addr base, Addr limit);
+
+.function.cbs.insert.callback: CBSInsert will invoke callbacks as follows:
+ new: when a new block is created that is interesting. oldSize == 0; newSize
+>= minSize.
+ new: when an uninteresting block coalesces to become interesting. 0 <
+oldSize < minSize <= newSize.
+ delete: when two interesting blocks are coalesced. 
grow will also be invoked +in this case on the larger of the two blocks. newSize == 0; oldSize >= minSize. + grow: when an interesting block grows in size. minSize <= oldSize < newSize. + +.function.cbs.delete: CBSDelete is the function used to remove a contiguous +range specified by [base,limit) from the CBS. If any part of the range is not +in the CBS, then ResFAIL is returned, and the CBS is unchanged. This function +may cause allocation; if this allocation fails, and any contingency mechanism +fails, then ResMEMORY is returned, and the CBS is unchanged. + Res CBSDelete(CBS cbs, Addr base, Addr limit); + +.function.cbs.delete.callback: CBSDelete will invoke callbacks as follows: + delete: when an interesting block is entirely removed. newSize == 0; oldSize +>= minSize. + delete: when an interesting block becomes uninteresting. 0 < newSize < +minSize <= oldSize. + new: when a block is split into two blocks, both of which are interesting. +shrink will also be invoked in this case on the larger of the two blocks. +oldSize == 0; newSize >= minSize. + shrink: when an interesting block shrinks in size, but remains interesting. +minSize <= newSize < oldSize. + +.function.cbs.iterate: CBSIterate is the function used to iterate all isolated +contiguous ranges in a CBS. It receives a pointer, unsigned long closure pair +to pass on to the iterator method, and an iterator method to invoke on every +range in address order. If the iterator method returns FALSE, then the +iteration is terminated. + void CBSIterate(CBS cbs, CBSIterateMethod iterate, void *closureP, unsigned +long closureS); + +.function.cbs.iterate.large: CBSIterateLarge is the function used to iterate +all isolated contiguous ranges of size greater than or equal to the client +indicated minimum size in a CBS. It receives a pointer, unsigned long closure +pair to pass on to the iterator method, and an iterator method to invoke on +every large range in address order. If the iterator method returns FALSE, then +the iteration is terminated. + void CBSIterateLarge(CBS cbs, CBSIterateMethod iterate, void *closureP, +unsigned long closureS); + +.function.cbs.set.min-size: CBSSetMinSize is the function used to change the +minimum size of interest in a CBS. This minimum size is used to determine +whether to invoke the client callbacks from CBSInsert and CBSDelete. This +function will invoke either the new or delete callback for all blocks that are +(in the semi-open interval) between the old and new values. oldSize and +newSize will be the same in these cases. + void CBSSetMinSize(CBS cbs, Size minSize); + +.function.cbs.describe: CBSDescribe is a function that prints a textual +representation of the CBS to the given stream, indicating the contiguous ranges +in order, as well as the structure of the underlying splay tree +implementation. It is provided for debugging purposes only. + Res CBSDescribe(CBS cbs, mps_lib_FILE *stream); + +.function.cbs.block.base: The CBSBlockBase function returns the base of the +range represented by the CBSBlock. This function may not be called from the +delete callback when the block is being deleted entirely. + Addr CBSBlockBase(CBSBlock block); +Note that the value of the base of a particular CBSBlock is not guaranteed to +remain constant across calls to CBSDelete and CBSInsert, regardless of whether +a callback is invoked. + +.function.cbs.block.limit: The CBSBlockLimit function returns the limit of the +range represented by the CBSBlock. 
This function may not be called from the +delete callback when the block is being deleted entirely. + Addr CBSBlockLimit(CBSBlock block); +Note that the value of the limit of a particular CBSBlock is not guaranteed to +remain constant across calls to CBSDelete and CBSInsert, regardless of whether +a callback is invoked. + +.function.cbs.block.size: The CBSBlockSize function returns the size of the +range represented by the CBSBlock. This function may not be called from the +delete callback when the block is being deleted entirely. + Size CBSBlockSize(CBSBlock block); +Note that the value of the size of a particular CBSBlock is not guaranteed to +remain constant across calls to CBSDelete and CBSInsert, regardless of whether +a callback is invoked. + +.function.cbs.block.describe: The CBSBlockDescribe function prints a textual +representation of the CBSBlock to the given stream. It is provided for +debugging purposes only. + Res CBSBlockDescribe(CBSBlock block, mps_lib_FILE *stream); + +.function.cbs.find.first: The CBSFindFirst function locates the first block (in +address order) within the CBS of at least the specified size, and returns its +range. If there are no such blocks, it returns FALSE. It optionally deletes +the top, bottom, or all of the found range, depending on the findDelete +argument (this saves a separate call to CBSDelete, and uses the knowledge of +exactly where we found the range). + Bool CBSFindFirst(Addr *baseReturn, Addr *limitReturn, CBS cbs, Size size, +CBSFindDelete findDelete); + enum { + CBSFindDeleteNONE, /* don't delete after finding */ + CBSFindDeleteLOW, /* delete precise size from low end */ + CBSFindDeleteHIGH, /* delete precise size from high end */ + CBSFindDeleteENTIRE /* delete entire range */ + }; + +.function.cbs.find.last: The CBSFindLast function locates the last block (in +address order) within the CBS of at least the specified size, and returns its +range. If there are no such blocks, it returns FALSE. Like CBSFindFirst, it +optionally deletes the range. + Bool CBSFindLast(Addr *baseReturn, Addr *limitReturn, CBS cbs, Size size, +CBSFindDelete findDelete); + +.function.cbs.find.largest: The CBSFindLargest function locates the largest +block within the CBS, and returns its range. If there are no blocks, it +returns FALSE. Like CBSFindFirst, it optionally deletes the range (specifying +CBSFindDeleteLOW or CBSFindDeleteHIGH has the same effect as +CBSFindDeleteENTIRE). + Bool CBSFindLargest(Addr *baseReturn, Addr *limitReturn, CBS cbs, +CBSFindDelete findDelete) + + +Alignment + +.align: When mayUseInline is specified to permit inline data structures and +hence avoid losing memory in low memory situations, the alignments that the CBS +supports are constrained by three requirements: + - The smallest possible range (namely one that is the alignment in size) must +be large enough to contain a single void * pointer (see +.impl.low-mem.inline.grain); + - Any larger range (namely one that is at least twice the alignment in size) +must be large enough to contain two void * pointers (see +.impl.low-mem.inline.block); + - It must be valid on all platforms to access a void * pointer stored at the +start of an aligned range. + +All alignments that meet these requirements are aligned to sizeof(void *), so +we take that as the minimum alignment. + + +IMPLEMENTATION + + +.impl: Note that this section is concerned with describing various aspects of +the implementation. It does not form part of the interface definition. 
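+
+As a concrete illustration of the interface described above (and not of the
+implementation that follows), here is a minimal sketch of how a client might
+allocate and free ranges through a CBS. It assumes a CBS that has already been
+initialised with CBSInit and populated with CBSInsert; the function names are
+invented for the example, and the Res code returned on failure is illustrative
+only.
+
+/* Allocate size bytes from the free ranges recorded in the CBS, taking
+ * the lowest-addressed fit and deleting exactly the allocated part of
+ * the found range (CBSFindDeleteLOW, see .function.cbs.find.first).
+ * size is assumed to be aligned to the CBS alignment. */
+static Res allocFromCBS(Addr *baseReturn, CBS cbs, Size size)
+{
+  Addr base, limit;   /* limit is returned but not needed here */
+
+  if(!CBSFindFirst(&base, &limit, cbs, size, CBSFindDeleteLOW))
+    return ResRESOURCE;  /* no isolated range is large enough */
+
+  /* With CBSFindDeleteLOW the CBS has already removed [base, base+size) */
+  /* from its set, so base can be handed out directly. */
+  *baseReturn = base;
+  return ResOK;
+}
+
+/* Freeing is the inverse: return the range to the CBS, which coalesces
+ * it eagerly with any neighbouring free ranges (.intro). */
+static Res freeToCBS(CBS cbs, Addr base, Size size)
+{
+  return CBSInsert(cbs, base, AddrAdd(base, size));
+}
+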
+
+
+Size Change Callback Protocol
+
+.impl.callback: The size change callback protocol concerns the mechanism for
+informing the client of the appearance and disappearance of interesting
+ranges. The intention is that each range has an identity (represented by the
+CBSBlock). When blocks are split, the larger fragment retains the identity.
+When blocks are merged, the new block has the identity of the larger fragment.
+
+.impl.callback.delete: Consider the case when the minimum size is minSize, and
+CBSDelete is called to remove a range of size middle. The two (possibly
+non-existent) neighbouring ranges have (possibly zero) sizes left and right.
+middle is part of the CBSBlock middleBlock.
+
+.impl.callback.delete.delete: The delete callback will be called in this case
+if and only if:
+ left + middle + right >= minSize && left < minSize && right < minSize
+That is, the combined range is interesting, but neither remaining fragment is.
+It will be called with the following parameters:
+ block: middleBlock
+ oldSize: left + middle + right
+ newSize: left >= right ? left : right
+
+.impl.callback.delete.new: The new callback will be called in this case if and
+only if:
+ left >= minSize && right >= minSize
+That is, both remaining fragments are interesting. It will be called with the
+following parameters:
+ block: a new block
+ oldSize: 0
+ newSize: left >= right ? right : left
+
+.impl.callback.delete.shrink: The shrink callback will be called in this case
+if and only if:
+ left + middle + right >= minSize && (left >= minSize || right >= minSize)
+That is, at least one of the remaining fragments is still interesting. It will
+be called with the following parameters:
+ block: middleBlock
+ oldSize: left + middle + right
+ newSize: left >= right ? left : right
+
+.impl.callback.insert: Consider the case when the minimum size is minSize, and
+CBSInsert is called to add a range of size middle. The two (possibly
+non-existent) neighbouring blocks are leftBlock and rightBlock, and have
+(possibly zero) sizes left and right.
+
+.impl.callback.insert.delete: The delete callback will be called in this case
+if and only if:
+ left >= minSize && right >= minSize
+That is, both neighbours were interesting. It will be called with the
+following parameters:
+ block: left >= right ? rightBlock : leftBlock
+ oldSize: left >= right ? right : left
+ newSize: 0
+
+.impl.callback.insert.new: The new callback will be called in this case if and
+only if:
+ left + middle + right >= minSize && left < minSize && right < minSize
+That is, the combined block is interesting, but neither neighbour was. It will
+be called with the following parameters:
+ block: left >= right ? leftBlock : rightBlock
+ oldSize: left >= right ? left : right
+ newSize: left + middle + right
+
+.impl.callback.insert.grow: The grow callback will be called in this case if
+and only if:
+ left + middle + right >= minSize && (left >= minSize || right >= minSize)
+That is, at least one of the neighbours was interesting. It will be called
+with the following parameters:
+ block: left >= right ? leftBlock : rightBlock
+ oldSize: left >= right ? left : right
+ newSize: left + middle + right
+
+
+Splay Tree
+
+.impl.splay: The CBS is principally implemented using a splay tree (see
+design.mps.splay). Each splay tree node is embedded in a CBSBlock that
+represents a semi-open address range. The key passed for comparison is the
+base of another range. 
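+
+For orientation, the embedding described in .impl.splay might look roughly
+like the following. This is a sketch, not the actual source: it assumes a
+SplayNodeStruct type from the splay tree module (design.mps.splay), and the
+field names are illustrative only.
+
+ typedef struct CBSBlockStruct {
+   SplayNodeStruct splayNode; /* node embedded in the CBS's splay tree */
+   Addr base;                 /* base of the isolated contiguous range */
+   Addr limit;                /* limit of the range (semi-open) */
+   Size maxSize;              /* largest block in this subtree, */
+                              /* see .impl.splay.fast-find below */
+ } CBSBlockStruct;
+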
+ +.impl.splay.fast-find: CBSFindFirst and CBSFindLast use the update/refresh +facility of splay trees to store, in each CBSBlock, an accurate summary of the +maximum block size in the tree rooted at the corresponding splay node. This +allows rapid location of the first or last suitable block, and very rapid +failure if there is no suitable block. + +.impl.find-largest: CBSFindLargest simply finds out the size of the largest +block in the CBS from the root of the tree (using SplayRoot), and does +SplayFindFirst for a block of that size. This is O(log(n)) in the size of the +free list, so it's about the best you can do without maintaining a separate +priority queue, just to do CBSFindLargest. Except when the emergency lists +(see .impl.low-mem) are in use, they are also searched. + + +Low Memory Behaviour + +.impl.low-mem: Low memory situations cause problems when the CBS tries to +allocate a new CBSBlock structure for a new isolated range as a result of +either CBSInsert or CBSDelete, and there is insufficient memory to allocation +the CBSBlock structure: + +.impl.low-mem.no-inline: If mayUseInline is FALSE, then the range is not added +to the CBS, and the call to CBSInsert or CBSDelete returns ResMEMORY. + +.impl.low-mem.inline: If mayUseInline is TRUE: + +.impl.low-mem.inline.block: If the range is large enough to contain an inline +block descriptor consisting of two pointers, then it is kept on an emergency +block list. The CBS will eagerly attempt to add this block back into the splay +tree during subsequent calls to CBSInsert and CBSDelete. The CBS will also +keep its emergency block list in address order, and will coalesce this list +eagerly. Some performance degradation will be seen when the emergency block +list is in use. Ranges on this emergency block list will not be made available +to the CBS's client via callbacks. CBSIterate* will not iterate over ranges on +this list. + +.impl.low-mem.inline.block.structure: The two pointers stored are to the next +such block (or NULL), and to the limit of the block, in that order. + +.impl.low-mem.inline.grain: Otherwise, the range must be large enough to +contain an inline grain descriptor consisting of one pointer, then it is kept +on an emergency grain list. The CBS will eagerly attempt to add this grain +back into either the splay tree or the emergency block list during subsequent +calls to CBSInsert and CBSDelete. The CBS will also keep its emergency grain +list in address order. Some performance degradation will be seen when the +emergency grain list is in use. Ranges on this emergency grain list will not +be made available to the CBS's client via callbacks. CBSIterate* will not +iterate over ranges on this list. + +.impl.low-mem.inline.grain.structure: The pointer stored is to the next such +grain, or NULL. + + +The CBS Block + +.impl.cbs.block: The block contains a base-limit pair and a splay tree node. + +.impl.cbs.block.special: The base and limit may be equal if the block is +halfway through being deleted. + +.impl.cbs.block.special.just: This conflates values and status, but is +justified because block size is very important. + + +TESTING + +.test: The following testing will be performed on this module: + +.test.cbstest: There is a stress test for this module in impl.c.cbstest. This +allocates a large block of memory and then simulates the allocation and +deallocation of ranges within this block using both a CBS and a BT. 
It makes +both valid and invalid requests, and compares the CBS response to the correct +behaviour as determined by the BT. It also iterates the ranges in the CBS, +comparing them to the BT. It also invokes the CBS describe method, but makes +no automatic test of the resulting output. It does not currently test the +callbacks. + +.test.pool: Several pools (currently MV2 and MVFF) are implemented on top of a +CBS. These pool are subject to testing in development, QA, and are/will be +heavily exercised by customers. + + +NOTES FOR FUTURE DEVELOPMENT + +.future.not-splay: The initial implementation of CBSs is based on splay trees. +It could be revised to use any other data structure that meets the requirements +(especially .req.fast). + +.future.hybrid: It would be possible to attenuate the problem of .risk.overhead +(below) by using a single word bit set to represent the membership in a +(possibly aligned) word-width of grains. This might be used for block sizes +less than a word-width of grains, converting them when they reach all free in +the bit set. Note that this would make coalescence slightly less eager, by up +to (word-width - 1). + + +RISKS + +.risk.overhead: Clients should note that the current implementation of CBSs has +a space overhead proportional to the number of isolated contiguous ranges. [ +Four words per range. ] If the CBS contains every other grain in an area, then +the overhead will be large compared to the size of that area. [ Four words per +two grains. ] See .future.hybrid for a suggestion to solve this problem. An +alternative solution is to use CBSs only for managing long ranges. +--- +The following relates to a pending re-design and does not yet relate to any +working source version. GavinM 1998-09-25 + +The CBS system provides its services by combining the services provided by +three subsidiary CBS modules: + + - CBSST -- Splay Tree: Based on out-of-line splay trees; must allocate to +insert isolated, which may therefore fail. + + - CBSBL -- Block List: Based on a singly-linked list of variable sized ranges +with inline descriptors; ranges must be at least large enough to store the +inline descriptor. + + - CBSGL -- Grain List: Based on a singly-linked list of fixed size ranges +with inline descriptors; the ranges must be the alignment of the CBS. + +The three sub-modules have a lot in common. Although their methods are not +invoked via a dispatcher, they have been given consistent interfaces, and +consistent internal appearance, to aid maintenance. + +Methods supported by sub-modules (not all sub-modules support all methods): + + - MergeRange -- Finds any ranges in the specific CBS adjacent to the supplied +one. If there are any, it extends the ranges, possibly deleting one of them. +This cannot fail, but should return FALSE if there is an intersection between +the supplied range and a range in the specific CBS. + + - InsertIsolatedRange -- Adds a range to the specific CBS that is not +adjacent to any range already in there. Depending on the specific CBS, this +may be able to fail for allocation reasons, in which case it should return +FALSE. It should AVER if the range is adjacent to or intersects with a range +already there. + + - RemoveAdjacentRanges -- Finds and removes from the specific CBS any ranges +that are adjacent to the supplied range. Should return FALSE if the supplied +range intersects with any ranges already there. + + - DeleteRange -- Finds and deletes the supplied range from the specific CBS. 
+Returns a tri-state result: + - Success -- The range was successfully deleted. This may have involved +the creation of a new range, which should be done via CBSInsertIsolatedRange. + - ProtocolError -- Either some non-trivial strict subset of the supplied +range was in the specific CBS, or a range adjacent to the supplied range was in +the specific CBS. Either of these indicates a protocol error. + - NoIntersection -- The supplied range was not found in the CBS. This may +or not be a protocol error, depending on the invocation context. + + - FindFirst -- Returns the first (in address order) range in the specific CBS +that is at least as large as the supplied size, or FALSE if there is no such +range. + + - FindFirstBefore -- As FindFirst, but only finds ranges prior to the +supplied address. + + - FindLast -- As FindFirst, but finds the last such range in address order. + + - FindLastAfter -- FindLast's equivalent of FindFirstBefore. + + - Init -- Initialise the control structure embedded in the CBS. + + - Finish -- Finish the control structure embedded in the CBS. + + - InlineDescriptorSize -- Returns the aligned size of the inline descriptor. + + - Check -- Checks the control structure embedded in the CBS. + +The CBS supplies the following utilities: + + - CBSAlignment -- Returns the alignment of the CBS. + + - CBSMayUseInline -- Returns whether the CBS may use the memory in the ranges +stored. + + - CBSInsertIsolatedRange -- Wrapper for CBS*InsertIsolatedRange. + +Internally, the CBS* sub-modules each have an internal structure CBS*Block that +represents an isolated range within the module. It supports the following +methods (for sub-module internal use): + - BlockBase -- Returns the base of the associated range; + - BlockLimit + - BlockRange + - BlockSize + diff --git a/mps/design/check/index.txt b/mps/design/check/index.txt new file mode 100644 index 00000000000..36315daf42c --- /dev/null +++ b/mps/design/check/index.txt @@ -0,0 +1,73 @@ + DESIGN OF CHECKING IN MPS + design.mps.check + incomplete design + gavinm 1996-08-05 + + +INTRODUCTION: + +This documents the design of structure checking within the MPS + + +IMPLEMENTATION: + +.level: There are three levels of checking: + .level.sig: The lowest level checks only that the structure has a valid +Signature (see design.mps.sig). + .level.shallow: Shallow checking checks all local fields (including +signature) and also checks the signatures of any parent or child structures. + .level.deep: Deep checking checks all local fields (including signatures), +the signatures of any parent structures, and does full recursive checking on +any child structures. + .level.control: control over the levels of checking is via the definition of +at most one of the macros TARGET_CHECK_SHALLOW (which if defined gives +.level.shallow), TARGET_CHECK_DEEP (which if defined gives .level.deep). If +neither macro is defined then .level.sig is used. These macros are not +intended to be manipulated directly by developers, they should use the +interface in impl.h.target. + +.order: Because deep checking (.level.deep) uses unchecked recursion, it is +important that child relationships are acyclic (.macro.down). + +.fun: Every abstract data type which is a structure pointer should have a +function Check which takes a pointer of type and returns a Bool. +It should check all fields in order, using one of the macros in .macro, or +document why not. 
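+
+As an illustration of the intended shape of such a check function (the type
+Foo, its fields, and its relative Bar are hypothetical; the CHECK macros are
+those described under .macro below, and the trailing return follows
+.fun.return):
+
+Bool FooCheck(Foo foo)
+{
+  CHECKS(Foo, foo);                 /* signature check (.macro.sig) */
+  CHECKU(Arena, foo->arena);        /* parent structure (.macro.up) */
+  CHECKD(Bar, foo->bar);            /* child structure (.macro.down) */
+  CHECKL(foo->count <= foo->limit); /* local field invariant (.macro.local) */
+  return TRUE;
+}
+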
+ +.fun.omit: The only fields which should be omitted from a check function are +those for +which there is no meaningful check (e.g. unlimited unsigned integer with no +relation to +other fields). + +.fun.return: Although the function returns a Bool, if the assert handler +returns (or there is no assert handler), then this is taken to mean "ignore and +continue", and the check function hence returns TRUE. + +.macro: Checking is implemented by invoking four macros in impl.h.assert: + .macro.sig: CHECKS(type, val) checks the signature only, and should be called +precisely on type and the received object pointer. + .macro.local: CHECKL(cond) checks a local field (depending on level (see +.level)), and should be called on each local field that is not an abstract data +type structure pointer itself (apart from the signature), with an appropriate +normally-true test condition. + .macro.up: CHECKU(type, val) checks a parent abstract data type structure +pointer, performing at most signature checks (depending on level (see +.level)). It should be called with the parent type and pointer. + .macro.down: CHECKD(type, val) checks a child abstract data type structure +pointer, possibly invoking Check (depending on level (see .level)). It +should be called with the child type and pointer. + +.full-type: CHECKS, CHECKD, CHECKU, all operate only on fully fledged types. +This means the type has to provide a function Bool TypeCheck(Type type) where +Type is substituted for the name of the type (eg, PoolCheck), and the +expression obj->sig must be a valid value of type Sig whenever obj is a valid +value of type Type. + +.type.no-sig: This tag is to be referenced in implementations whenver the form +CHECKL(ThingCheck(thing)) is used instead of CHECK{U,D}(Thing, thing) because +Thing is not a fully fledged type (.full-type). + + + + diff --git a/mps/design/collection/index.txt b/mps/design/collection/index.txt new file mode 100644 index 00000000000..2993740ba18 --- /dev/null +++ b/mps/design/collection/index.txt @@ -0,0 +1,287 @@ + THE COLLECTION FRAMEWORK + design.mps.collection + incomplete design + pekka 1998-03-20 + +INTRODUCTION + +.intro: This document describes the Collection Framework. It's a framework for +implementing garbage collection techniques and integrating them into a system +of collectors that all cooperate in recycling garbage. + + +Document History + +.hist.0: Version 0 was a different document. + +.hist.1: Version 1 was a different document. + +.hist.2: Written in January and February 1998 by Pekka P. Pirinen on the basis +of the current implementation of the MPS, analysis.async-gc, [that note on the +independence of collections] and analysis.tracer. + + +OVERVIEW + +.framework: MPS provides a framework that allows the integration of many +different types of GC strategies and provides many of the basic services that +those strategies use. .framework.cover: The framework subsumes most major GC +strategies and allows many efficient techniques, like in-line allocation or +software barriers. + +.framework.overhead: The overhead due to cooperation is low. [But not +non-existent. Can we say something useful about it?] + +.framework.benefits: The ability to combine collectors contributes +significantly to the flexibility of the system. The reduction in code +duplication contributes to reliability and integrity. The services of the +framework make it easier to write new MM strategies and collectors. + +.framework.mpm: The Collection Framework is merely a part of the structure of +the MPM. 
See design.mps.architecture and design.mps.arch [Those two documents +should be combined into one. Pekka 1998-01-15] for the big picture. Other +notable components that the MPM manages to integrate into a single framework +are manually-managed memory [another missing document here?] and finalization +services (see design.mps.finalize). + +.see-also: This document assumes basic familiarity with the ideas of pool (see +design.mps.arch.pools) and segment (see design.mps.seg.over.*). + + +COLLECTION ABSTRACTIONS + +Colours, scanning and fixing + +.state: The framework knows about the three colours of the tri-state +abstraction and free blocks. Recording the state of each object is the +responsibility of the pool, but the framework gets told about changes in the +states and keeps track of colours in each segment. Specifically, it records +whether a segment might contain white, grey and black objects wrt. each active +trace (see .tracer) [black not currently implemented -- Pekka 1998-01-04]. (A +segment might contain objects of all colours at once, or none.) This +information is approximate, because when an object changes colour, or dies, it +usually is too expensive to determine if it was the last object of its former +colour. + +.state.transitions: The possible state transitions are as follows: + +free ---alloc--> black (or grey) or white or none +none --condemn-> white +none --refine--> grey +grey ---scan---> black +white ----fix---> grey (or black) +black --revert--> grey +white --reclaim-> free +black --reclaim-> none + +.none-is-black: Outside of a trace, objects don't really have colour, but +technically, the colour is black. Objects are only allocated grey or white +during a trace, and by the time the trace has finished, they are either dead or +black, like the other surviving objects. We might then reuse the colour field +for another trace, so it's convenient to set the colour to black when +allocating outside a trace. This means that refining the foundation +(analysis.tracer.phase.condemn.refine), actually turns black segments grey, +rather than vice versa, but the principle is the same. + +.scan-fix: "Scanning" an object means applying the "fix" function to all +references in that object. Fixing is the generic name for the operation that +takes a reference to a white object and makes it non-white (usually grey, but +black is a possibility, and so is changing the reference as we do for weak +references). Typical examples of fix methods are copying the object into +to-space or setting its mark bit. + +.cooperation: The separation of scanning and fixing is what allows different GC +techniques to cooperate. The scanning is done by a method on the pool that the +scanned object resides in, and the fixing is done by a method on the pool that +the reference points to. + +.scan-all: Pools provide a method to scan all the grey objects in a segment. + + +Reference sets + +.refsets: The cost of scanning can be significantly reduced by storing +remembered sets. We have chosen a very compact and efficient implementation, +called reference sets, or refsets for short (see idea.remember +[design.mps.refset is empty! Perhaps some of this should go there. -- Pekka +1998-02-19]). This makes the cost of maintaining them low, so we maintain them +for all references out of all scannable segments. + +.refsets.approx: You might describe refsets as summaries of all references out +of an area of memory, so they are only approximations of remembered sets. 
When +a refset indicates that an interesting reference might be present in a segment, +we still have to scan the segment to find it. + +.refsets.scan: The refset information is collected during scanning. The scan +state protocol provides a way for the pool and the format scan methods to +cooperate in this, and to pass this information to the tracer module which +checks it and updates the segment (see design.mps.scan [Actually, there's very +little doc there. Pekka 1998-02-17]). + +.refsets.maintain: The MPS tries to maintain the refset information when it +moves or changes object. + +.refsets.pollution: Ambiguous references and pointers outside the arena will +introduce spurious zones into the refsets. We put up with this to keep the +scanning costs down. Consistency checks on refsets have to take this into +account. + +.refsets.write-barrier: A write-barrier are needed to keep the mutator from +invalidating the refsets when writing to a segment. We need one on any +scannable segment whose refset is not a superset of the mutator's (and that the +mutator can see). If we know what the mutator is writing and whether it's a +reference, we can just add that reference to the refset (figuring out whether +anything can be removed from the refset is too expensive). If we don't know or +if we cannot afford to keep the barrier up, the framework can union the +mutator's refset to the segment's refset. + +.refset.mutator: The mutator's refset could be computed during root scanning in +the usual way, and then kept up to date by using a read-barrier. It's not a +problem that the mutator can create new pointers out of nothing behind the +read-barrier, as they won't be real references. However, this is probably not +cost-effective, since it would cause lots of barrier hits. We'd need a +read-barrier on every scannable segment whose refset is not a subset of the +mutator's (and that the mutator can see). So instead we approximate the +mutator's refset with the universal refset. + + +THE TRACER + +.tracer: The tracer is an engine for implementing multiple garbage collection +processes. Each process (called a "trace") proceeds independently of the +others through five phases as described in analysis.tracer. The following +sections describe how the action of each phase fits into the framework. See +design.mps.trace for details [No, there's not much there, either. Possibly +some of this section should go there. Pekka 1998-02-18]). + +.combine: The tracer can also combine several traces for some actions, like +scanning a segment or a root. The methods the tracer calls to do the work get +an argument that tells them which traces they are expected to act for. [extend +this@@@@] + +.trace.begin: Traces are started by external request, usually from a client +function or an action (see design.mps.action). + +.trace.progress: The tracer gets time slices from the arena to work on a given +trace [This is just a provisional arrangement, in lieu of real progress +control. Pekka 1998-02-18]. In each slice, it selects a small amount of work +to do, based on the state of the trace, and does it, using facilities provided +by the pools. .trace.scan: A typical unit of work is to scan a single +segment. The tracer can choose to do this for multiple traces at once, +provided the segment is grey for more than one trace. + +.trace.barrier: Barrier hits might also cause a need to scan a segment (see +.hw-barriers.hit). Again, the tracer can choose to combine traces, when it +does this. 
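+
+Before the phase descriptions, the refset tests used above (.refsets) and at
+condemn time (.phase.condemn below) can be pictured with a small
+self-contained model. This is a toy sketch, not the MPS definitions: the real
+refset and zone machinery lives in the MPS sources, and the types and names
+here are invented for illustration.
+
+#include <stdint.h>
+#include <stdbool.h>
+
+typedef uintptr_t RefSet;        /* one bit per zone of the address space */
+
+/* The zone of an address is taken from its high bits via a zone shift */
+/* (cf. the arena's zoneShift); a refset is the union of the zones of */
+/* the references it summarises. */
+static RefSet zoneOf(uintptr_t addr, unsigned zoneShift)
+{
+  return (RefSet)1 << ((addr >> zoneShift) % (sizeof(RefSet) * 8));
+}
+
+/* A segment whose summary shares no zone with the white set cannot */
+/* refer to condemned objects, so it can be blackened without scanning; */
+/* otherwise it must be scanned to find out. */
+static bool mayReferToWhite(RefSet summary, RefSet whiteSet)
+{
+  return (summary & whiteSet) != 0;
+}
+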
+ +.mutator-colour: The framework keeps track of the colour of the mutator +separately for each trace. + + +The Condemn Phase + +.phase.condemn: The agent that creates the trace (see .trace.begin) determines +the condemned set and colours it white. The tracer then examines the refsets +on all scannable segments, and if it can deduce some segment cannot refer to +the white set, it's immediately coloured black, otherwise the pool is asked to +grey any objects in the segment that might need to be scanned (in copying +pools, this is typically the whole segment). + +.phase.condemn.zones: To get the maximum benefit from the refsets, we try to +arrange that the zones are a minimal superset (e.g., generations uniquely +occupy zones) and a maximal subset (there's nothing else in the zone) of the +condemned set. This needs to be arranged at allocation time (or when copying +during collection, which is much like allocation) [soon, this will be handled +by segment loci, see design.mps.locus]. + +.phase.condemn.mutator: At this point, the mutator might reference any objects, +i.e., it is grey. Allocation can be in any colour, most commonly white [more +could be said about this]. + + +The Grey Mutator Phase + +.phase.grey-mutator: Grey segments are chosen according to some sort of +progress control and scanned by the pool to make them black. Eventually, the +tracer will decide to flip or it runs out of grey segments, and proceeds to the +next phase. [Currently, this phase has not been implemented; all traces flip +immediately after condemn. Pekka 1998-02-18] + +.phase.grey-mutator.copy: At this stage, we don't want to copy condemned +objects, because we would need an additional barrier to keep the mutator's view +of the heap consistent (see analysis.async-gc.copied.pointers-and-new-copy). + +.phase.grey-mutator.ambig: This is a good time to get all ambiguous scanning +out of the way, because we usually can't do any after the flip [write a +detailed explanation of this some day] and because it doesn't cause any copying. + + +The Flip Phase + +.phase.flip: The roots (see design.mps.root) are scanned. This has to be an +atomic action as far as the mutator is concerned, so all threads are suspended +for the duration. + +.phase.flip.mutator: After this, the mutator is black: if we use a strong +barrier (analysis.async-gc.strong), this means it cannot refer to white +objects. Allocation will be in black (could be grey as well, but there's no +point to it). + + +The Black Mutator Phase + +.phase.black-mutator: Grey segments are chosen according to some sort of +progress control and scanned by the pool to make them black. Eventually, the +tracer runs out of segments that are grey for this trace, and proceeds to the +next phase. + +.phase.black-mutator.copy: At this stage white objects can be relocated, +because the mutator cannot see them (as long as a strong barrier is used, as we +must do for a copying collection, see analysis.async-gc.copied.pointers). + + +The Reclaim Phase + +.phase.reclaim: The tracer finds the remaining white segments and asks the pool +to reclaim any white objects in them. + +.phase.reclaim.barrier: Once a trace has started reclaiming objects, the others +shouldn't try to scan any objects that are white for it, because they might +have dangling pointers in them [xref doc yet to be written]. [Currently, we +reclaim atomically, but it could be incremental, or even overlapped with a new +trace on the same condemned set. 
Pekka 1997-12-31] + + +BARRIERS + +[An introduction and a discussion of general principles should go here. This +is a completely undesigned area.] + + +Hardware Barriers + +.hw-barriers: Hardware barrier services cannot, by their very nature, be +independently provided to each trace. A segment is either protected or not, +and we have to set the protection on a segment if any trace needs a hardware +barrier on it. + +.hw-barriers.supported: The framework currently supports segment-oriented +Appel-Ellis-Li barriers (analysis.async-gc.barrier.appel-ellis-li), and +write-barriers for keeping the refsets up-to-date. It would not be hard to add +Steele barriers (analysis.async-gc.barrier.steele.scalable). + +.hw-barriers.hit: When a barrier hit happens, the arena determines which +segment it was on. The segment colour info is used to determine whether it had +trace barriers on it, and if so, the appropriate barrier action is performed, +using the methods of the owning pool. If the segment was write-protected, its +refset is unioned with the refset of the mutator [in practice, RefSetUNIV]. + +.hw-barriers.hit.multiple: Fortunately, if we get a barrier hit on a segment +with multiple trace barriers on it, we can scan it for all the traces that it +had a barrier for, see .combine.@@@@ + + +Software barriers + +[@@@@Have to say something about software barriers] + diff --git a/mps/design/config/index.txt b/mps/design/config/index.txt new file mode 100644 index 00000000000..00ad6b6f34d --- /dev/null +++ b/mps/design/config/index.txt @@ -0,0 +1,433 @@ + THE DESIGN OF MPS CONFIGURATION + design.mps.config + incomplete design + richard 1997-02-19 + +INTRODUCTION + +.intro: This document describes how the MPS configuration is parameterized so +that it can target different architectures, operating systems, build +environments, varieties, and products. + +.bg: For background see [build system mail, configuration mail, +meeting.general.something] + + +Document History + +.hist.0: Initial draft created by Richard Brooksby on 1997-02-19 +based on discussions of configuration at meeting.general.1997-02-05. + +.hist.1: Various improvements and clarifications to the draft discussed between +Richard and Nick Barnes at meeting.general.1997-02-19. + + +REQUIREMENTS + +.req.arch: Allow architecture specific configurations of the MPS. + +.req.os: Allow operating system specific configurations of the MPS. + +.req.builder: Allow build environment (compiler, etc.) specific configurations +of the MPS. + +.req.prod: Allow product specific configurations of the MPS. + +.req.var: Allow configurations with different amounts of instrumentation +(assertions, metering, etc.). + +.req.impact: The configuration system should have a minimal effect on +maintainability of the implementation. + +.req.port: The system should be easy to port across operating systems. + +.req.maint: Maintenance of the configuration and build system should not +consume much developer time. + + + +DEFINITIONS + +.def.platform: A platform is a combination of an architecture (.def.arch), an +operating system (.def.os), and a builder (.def.builder). The set of supported +platforms is platform.*. + +.def.arch: An architecture is processor type with associated calling +conventions and other binary interface stuff. + +.def.os: An operating system is the interface to external resources. + +.def.builder: A builder is the tools (C compiler, etc.) used to make the target +(.def.target). 
.def.var: A variety is a combination of annotations such as assertions,
metering, etc.

.def.prod: A product is the intended product into which the MPS will fit, e.g.
ScriptWorks, Dylan, etc.

.def.target: The target is the result of the build.


OVERVIEW

- No automatically generated code. Use only the C compiler and linker.
- Simple build function (design.mps.buildsys.????).
- Avoid conditional code spaghetti in implementations.
- Dependency on a particular configuration should be minimized and localized
when developing code.


THE BUILD SYSTEM


Abstract Build Function

.build.fun: The MPS implementation assumes only a simple "build function" which
takes a set of sources, possibly in several languages, compiles them with a set
of predefined preprocessor symbols, and links the result with a set of
libraries to form the target:

    target := build(<defs>, <srcs>, <libs>)

.build.sep: Separate compilation and linkage can be seen as a memoization of
this function, and is not strictly necessary for the build.

.build.cc: A consequence of this approach is that it should always be possible
to build a complete target with a single UNIX command line calling the compiler
driver (usually "cc" or "gcc"), for example:

    cc -o main -DCONFIG_VAR_DF foo.c bar.c baz.s -lz

.build.defs: The "defs" are the set of preprocessor macros which are to be
predefined when compiling the module sources:

    CONFIG_VAR_<variety-code>
    CONFIG_PROD_<product-code>

The variety-codes are the two-letter codes that appear after "variety." in the
tag of the relevant variety document (see variety.*), converted to upper case.
Currently (1998-11-09) they are: HI, CI, TI, HE, CE, WI, WE, II.

The product-codes are currently (1998-11-09): MPS, DYLAN, EPCORE.

Exactly one CONFIG_VAR define must be present.

Exactly one CONFIG_PROD define must be present.

.build.srcs: The "srcs" are the set of sources that must be compiled in order
to build the target. The set of sources may vary depending on the
configuration. For example, different sets of sources may be required to build
different products. [This is a dependency between the makefile (or whatever)
and the module configuration in config.h.]

.build.libs: The "libs" are the set of libraries to which the compiled sources
must be linked in order to build the target. For example, when building a test
program, it might include the ANSI C library and an operating system interface
library.


File Structure

.file.dir: Each product consists of a single directory (corresponding to a HOPE
compound) containing all the sources for the whole family of targets.

.file.base: The names of sources must be unique in the first eight characters
in order to conform to FAT filesystem naming restrictions.

.file.ext: The extension may be up to three characters and directly indicates
the source language.

[Where is the set of valid extensions and languages defined?]


Modules and Naming

.mod.unique: Each module has an identifier which is unique within the MPS.

.mod.impls: Each module has one or more implementations which may be in any
language supported by the relevant build environment.

.mod.primary: The primary implementation of a module is written in
target-independent ANSI C in a source file with the same name as the module.
[This seems to be with an "an" suffix now.
GavinM 1997-08-07] .mod.secondary: The names of other +implementations should begin with the same prefix (the module id or a shortened +version of it) and be suffixed with on or more target parameter codes (defined +below). In particular, the names of assembly language sources must include the +target parameter code for the relevant architecture. + + +Build System Rationale + +.build.rat: This simple design makes it possible to build the MPS using many +different tools. Microsoft Visual C++, Metrowerks Codewarrior, and other +graphical development tools do not support much in the way of generated +sources, staged building, or other such stuff. The Visual C and Metrowerks +"project" files correspond closely to a closure of the build function +(.build.fun). The simplicity of the build function has also made it easy to +set up builds using NMAKE (DOS), MPW (Macintosh), and to get the MPS up and +running on other platforms such as FreeBSD and Linux in very little time. The +cost of maintaining the build systems on these various platforms is also +reduced to a minimum, allowing the MM Group to concentrate on primary +development. The source code is kept simple and straightforward. When looking +at MPS sources you can tell exactly what is going to be generated with very +little context. The sources are not munged beyond the standard ANSI C +preprocessor. + +.build.port: The portability requirement (.req.port) implies that the build +system must use only standard tools that will be available on all conceivable +target platforms. Experience of development environments on the Macintosh +(Metrowerks Codewarrior) and Windows NT (Visual C++) indicates that we cannot +assume much sophistication in the use of file structure by development +environments. The best that we can hope for is the ability to combine a fixed +list of source files, libraries, and predefined preprocessor symbols into a +single target. + +.build.maint: The maintainability requirement (.req.maint) implies that we +don't spend time trying to develop a set of tools to support anything more +complicated than the simple build function described above. The effort in +constructing and maintaining a portable system of this kind is considerable. +Such efforts have failed in EP. + + +IMPLEMENTATION + +[ Now in impl.h.config, may symbols out of date. GavinM 1997-08-07 ] + +.impl: The two implementation files impl.h.config and impl.h.mpstd can be seen +as preprocessor programs which "accept" build parameters and "emit" +configuration parameters (.fig.impl). The build parameters are defined either +by the builder (in the case of target detection) or by the build function (in +the case of selecting the variety and product). + +.fig.impl: + + build parameters configuration parameters + + CONFIG_VAR_DF --> config.h --> MPS_VAR_DF, ASSERT_MPM, etc. + + CONFIG_PROD_EPCORE --> config.h --> ARENA_CLIENT, PROT_NONE, +JUNKBYTE=0x39, etc. + + _WIN32 --> mpstd.h --> MPS_OS_W3, etc. + +.impl.dep: No source code, other than the directives in impl.h.config and +impl.h.mpstd, should depend on any build parameters. That is, identifers +beginning "CONFIG_" should only appear in impl.h.config. Code may depend on +configuration parameters in certain, limited ways, as defined below (.conf). + + +Target Platform Detection + +.pf: The target platform is "detected" by the preprocessor directives in +impl.h.mpstd. 
.pf.form: This file consists of sets of directives of the form:

    #elif <conjunction of builder predefinitions>
    #define MPS_PF_<platform code>
    #define MPS_OS_<operating system code>
    #define MPS_ARCH_<architecture code>
    #define MPS_BUILD_<builder code>
    #define MPS_T_WORD <word type>
    #define MPS_WORD_WIDTH <word width>
    #define MPS_WORD_SHIFT <word shift>
    #define MPS_PF_ALIGN <minimum alignment>

.pf.detect: The conjunction of builder predefinitions is a constant expression
which detects the target platform. It is a logical AND of expressions which
look for preprocessor symbols defined by the build environment to indicate the
target. These must be accompanied by a reference to the build tool
documentation from which the symbols came. For example:

    /* Visual C++ 2.0, Books Online, C/C++ Book, Preprocessor Reference, */
    /* Chapter 1: The Preprocessor, Macros, Predefined Macros. */

    #elif defined(_MSC_VER) && defined(_WIN32) && defined(_M_IX86)

.pf.codes: The declarations of the platform, operating system, architecture,
and builder codes define preprocessor macros corresponding to the target
detected (.pf.detect). For example:

    #define MPS_PF_W3I3MV
    #define MPS_OS_W3
    #define MPS_ARCH_I3
    #define MPS_BUILD_MV

.pf.word: The declaration of MPS_T_WORD defines the unsigned integral type
which corresponds, on the detected target, to the machine word. It is used to
define the MPS Word type (design.mps.type.word). [Insert backwards ref
there.] For example:

    #define MPS_T_WORD unsigned long

.pf.word-width: The declaration of MPS_WORD_WIDTH defines the number of bits in
the type defined by MPS_T_WORD (.pf.word) on the target. For example:

    #define MPS_WORD_WIDTH 32

.pf.word-shift: The declaration of MPS_WORD_SHIFT defines the log to the base 2
of MPS_WORD_WIDTH. For example:

    #define MPS_WORD_SHIFT 5

.pf.pf-align: The declaration of MPS_PF_ALIGN defines the minimum alignment
which must be used for a memory block to permit any normal processor memory
access. In other words, it is the maximum alignment required by the processor
for normal memory access. For example:

    #define MPS_PF_ALIGN 4


Target Varieties

.var: The target variety is handled by preprocessor directives in
impl.h.config.

.var.form: The file contains sets of directives of the form:

    #elif defined(CONFIG_VAR_DF)
    #define MPS_VAR_DF
    #define ASSERT_MPSI
    #define ASSERT_MPM
    etc.

.var.detect: The configured variety is one of the variety preprocessor
definitions passed to the build function (.build.defs), e.g. CONFIG_VAR_DF.
[These are decoupled so that it's possible to tell the difference between
overridden settings etc. Explain.]

.var.symbols: The directives should define whatever symbols are necessary to
control annotations. These symbols parameterize other parts of the code, such
as the declaration of assertions, etc. The symbols should all begin with the
prefix "MPS_VAR_".


Target Product

.prod: The target product is handled by preprocessor directives in
impl.h.config.

.prod.form: The file contains sets of directives of the form:

    #elif defined(CONFIG_PROD_EPCORE)
    #define PROT_NONE
    #define THREAD_NONE
    #define ARENA_CLIENT
    etc.

[Tidy this up:]
Note that anything which can be configured is configured, even if it's just
configured to "NONE", meaning nothing. This makes sure that you can't choose
something by omission. Where these symbols are used there will be a #error to
catch the unused case.
[This is a general principle which applies to other configuration stuff too.]


SOURCE CODE CONFIGURATION

.conf: This section describes how the configuration may affect the source code
of the MPS.
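To make the preceding "nothing chosen by omission" principle concrete, here is
a hedged sketch of the .var.form and .prod.form pattern; the symbols defined in
each branch are illustrative (taken loosely from the examples above), not the
real contents of impl.h.config.

    /* Variety selection: exactly one CONFIG_VAR_* must be predefined */
    /* by the build function (.build.defs). */
    #if defined(CONFIG_VAR_DF)
    #define MPS_VAR_DF
    #define ASSERT_MPSI
    #define ASSERT_MPM
    #elif defined(CONFIG_VAR_WI)
    #define MPS_VAR_WI
    /* ... symbols controlling the WI variety's annotations ... */
    #else
    #error "No CONFIG_VAR_* variety configured."
    #endif

    /* Product selection: exactly one CONFIG_PROD_* must be predefined. */
    #if defined(CONFIG_PROD_EPCORE)
    #define PROT_NONE
    #define THREAD_NONE
    #define ARENA_CLIENT
    #elif defined(CONFIG_PROD_MPS)
    /* ... symbols for the generic MPS product ... */
    #else
    #error "No CONFIG_PROD_* product configured."
    #endif

Because every branch either configures something explicitly or raises an error,
a missing or misspelt CONFIG_ symbol is caught at compile time rather than
silently selecting a default.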
+ +.conf.limit: The form of dependency allowed is carefully limited to ensure that +code remains maintainable and portable (.req.impact). + +.conf.min: The dependency of code on configuration parameters should be kept to +a minimum in order to keep the system maintainable (.req.impact). + + +Configuration Parameters + +.conf.params: The compilation of a module is parameterized by: + + MPS_ARCH_ + MPS_OS_ + MPS_BUILDER_ + MPS_PF_ + MPS_VAR_ + MPS_PROD_ + + +Abstract and Concrete Module Interfaces + +Basic principle: the caller musn't be affected by configuration of a module. +This reduces complexity and dependency of configuration. + +All callers use the same abstract interface. Caller code does not change. + +Abstract interface includes: + - method definitions (logical function prototypes which may be macro methods) + - names of types + - names of constants + - names of structures and fields which form part of the interface, and +possibly their types, depending on the protocol defined + - the protocols + +The abstract interface to a module may not be altered by a configuration +parameter. However, the concrete interface may vary. + + +Configuring Module Implementations + +For example, this isn't allowed, because there is a change in the interface. + + #if defined(PROT_FOO) + void ProtSpong(Foo foo, Bar bar); + #else + int ProtSpong(Bar bar, Foo foo); + #endif + +This example shows how: + + #ifdef PROTECTION + void ProtSync(Space space); + /* more decls. */ + #else /* PROTECTION not */ + #define ProtSync(space) NOOP + /* more decls. */ + #endif /* PROTECTION */ + +or + + #if defined(PROT_FOO) + typedef struct ProtStruct { + int foo; + } ProtStruct; + #define ProtSpong(prot) X((prot)->foo) + #elif defined(PROT_BAR) + typedef struct ProtStruct { + float bar; + } ProtStruct; + #define ProtSpong(prot) Y((prot)->bar) + #else + #error "No PROT_* configured." + #endif + +Configuration parameters may not be used to vary implementations in .c files. +For example, this sort of thing: + + int map(void *base, size_t size) + { + #if defined(MPS_OS_W3) + VirtualAlloc(foo, bar, base, size); + #elif defined(MPS_OS_SU) + mmap(base, size, frob); + #else + #error "No implementation of map." + #endif + } + +This leads to extreme code spaghetti. In effect, it's a "candy machine +interface" on source code. This kind of thing should be done by having several +implementations of the same interface in separate source files. If this leads +to duplication of code then that code should be placed in a separate, common +module. + + +PROCEDURES + +[Adding an architecture, etc.] + + +NOTES + +What about constants? + +To do: +- Renaming of some stuff. +- Introduce product selection. +- Change makefiles. +- Eliminate mpmconf.h by moving stuff to config.h. +- Update files to refer to this design document. + + diff --git a/mps/design/finalize/index.txt b/mps/design/finalize/index.txt new file mode 100644 index 00000000000..06cc5dafd54 --- /dev/null +++ b/mps/design/finalize/index.txt @@ -0,0 +1,100 @@ + FINALIZATION + design.mps.finalize + incomplete design + drj 1997-02-14 + + +OVERVIEW: + +Finalization is implemented internally using the Guardian Pool Class +(design.mps.poolmrg). Objects can be registered for finalization using an +interface function (called mps_finalize). Notification of finalization is +given to the client via the messaging interface. PoolClassMRG +(design.mps.poolmrg) implements a Message Class which implements the +finalization messages. 
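As a client's-eye sketch of how these pieces fit together (the interface
functions themselves are specified under ARCHITECTURE below):
mps_message_get, mps_message_discard, and mps_message_type_finalization are
assumed here from the general MPS message interface (design.mps.message) and
are not specified by this document.

    #include "mps.h"

    /* Register an object for finalization and later drain the arena's     */
    /* message queue.  Only mps_finalize and mps_message_finalization_ref  */
    /* are specified in this document; the message-queue calls are         */
    /* assumptions of the sketch.                                          */
    static void finalization_example(mps_arena_t arena, mps_addr_t obj)
    {
      mps_message_t message;
      mps_addr_t ref;

      (void)mps_finalize(arena, obj);  /* register obj for finalization */

      /* ... later, after some collection activity ... */

      while (mps_message_get(&message, arena,
                             mps_message_type_finalization())) {
        mps_message_finalization_ref(&ref, arena, message);
        /* client-specific clean-up of the object at ref goes here */
        mps_message_discard(arena, message);
      }
    }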
REQUIREMENTS:

.req: Currently only Dylan has requirements for finalization, see
req.dylan.fun.final.


ARCHITECTURE:

External Interface


.if.register:
mps_res_t mps_finalize(mps_arena_t arena, mps_addr_t obj);

increases the number of times that the object located at obj has been
registered for finalization by one. The object must have been allocated from
the arena (space). Any finalization messages that are created for this object
will appear on the arena's message queue. The MPS will attempt to finalize the
object that number of times.

.if.deregister:
void mps_definalize(mps_arena_t arena, mps_addr_t obj);

mps_definalize reduces the number of times that the object located at obj has
been registered for finalization by one. It is an error to definalize an
object that has not been registered for finalization.

.if.deregister.not: At the moment (1997-08-20) mps_definalize is not
implemented.

.if.get-ref:
void mps_message_finalization_ref(mps_addr_t *mps_addr_return,
                                  mps_arena_t mps_arena,
                                  mps_message_t mps_message);

mps_message_finalization_ref returns the reference to the finalized object
stored in the finalization message.


IMPLEMENTATION:

.int.over: Registering an object for finalization corresponds to allocating a
reference of rank FINAL to that object. This reference is allocated in a
guardian object in a pool of PoolClassMRG (see design.mps.poolmrg).

.int.arena.struct: The MRG pool used for managing final references is kept in
the Arena (Space), referred to as the "final pool".

.int.arena.lazy: The pool is lazily created: it will not be created until the
first object is registered for finalization.

.int.arena.flag: There is a flag in the Arena that indicates whether the final
pool has been created yet or not.

.int.finalize:

Res ArenaFinalize(Arena arena, Ref addr)

.int.finalize.create: Creates the final pool if it has not been created yet.

.int.finalize.alloc: Allocates a guardian in the final pool.

.int.finalize.write: Writes a reference to the object into the guardian
object.

.int.finalize.all: That's all.

.int.finalize.error: If either the creation of the pool or the allocation of
the object fails then the error will be reported back to the caller.

.int.finalize.error.no-unwind: This function does not need to do any unwinding
in the error cases because the creation of the pool is not something that
needs to be undone.

.int.arena-destroy.empty: ArenaDestroy empties the message queue by calling
MessageEmpty.

.int.arena-destroy.final-pool: If the final pool has been created then
ArenaDestroy destroys the final pool.

.access: mps_message_finalization_ref needs to access the finalization message
to retrieve the reference and then write it to where the client asks. This
must be done carefully, in order to avoid breaking the invariants or creating a
hidden root.

.access.invariants: We protect the invariants by using special routines
ArenaRead and ArenaPoke to read and write the reference. This works as long as
there's no write-barrier collection. [Instead of ArenaPoke, we could put in an
ArenaWrite that would be identical to ArenaPoke, except for AVERring the
invariant (or it can just AVER there are no busy traces unflipped). When we
get write-barrier collection, we could change it to do the real thing, but in
the absence of a write-barrier, it's functionally identical to ArenaPoke.
Pekka 1997-12-09]

diff --git a/mps/design/index.html b/mps/design/index.html
index 869ac9285a0..2275c220e00 100644
--- a/mps/design/index.html
+++ b/mps/design/index.html
@@ -53,6 +53,397 @@
arena/index.txt The design of the MPS arena
arenavm/index.txt Virtual memory arena
bt/index.txt Bit tables
buffer/index.txt Allocation buffers and allocation points
cbs/index.txt Design for coalescing block structure
check/index.txt Design of checking in MPS
collection/index.txt The collection framework
config/index.txt The design of MPS configuration
finalize/index.txt Finalization
interface-c/index.txt The design of the Memory Pool System interface to C
io/index.txt The design of the MPS i/o subsystem
lib/index.txt The design of the Memory Pool System library interface
lock/index.txt The design of the lock module
locus/index.txt The design for the locus manager
message/index.txt MPS to client message protocol
pool/index.txt The design of the pool and pool class mechanisms
poolamc/index.txt The design of the automatic mostly-copying memory pool class
poolams/index.txt The design of the automatic mark-and-sweep pool class
poolawl/index.txt Automatic weak linked
poollo/index.txt Leaf object pool class
poolmfs/index.txt The design of the manual fixed small memory pool class
poolmrg/index.txt Guardian poolclass
poolmv/index.txt The design of the manual variable memory pool class
poolmv2/index.txt The design of a new manual-variable memory pool class
poolmvff/index.txt Design of the manually-managed variable-size first-fit pool
prot/index.txt Generic design of the protection module
protan/index.txt ANSI implementation of protection module
protli/index.txt Linux implementation of protection module
protocol/index.txt The design for protocol inheritance in MPS
protsu/index.txt SunOS 4 implementation of protection module
pthreadext/index.txt Design of the Posix thread extensions for MPS
reservoir/index.txt The design of the low-memory reservoir
ring/index.txt The design of the ring data structure
root/index.txt The design of the root manager
scan/index.txt The design of the generic scanner
seg/index.txt The design of the MPS segment data structure
sig/index.txt The design of the Memory Pool System signature system
splay/index.txt Design of splay trees
sso1al/index.txt Stack scanner for Digital Unix / Alpha systems
telemetry/index.txt The design of the MPS telemetry mechanism
trace/index.txt Tracer
type/index.txt The design of the general MPS types
version-library/index.txt Design of the MPS library version mechanism
version/index.txt Design of MPS software versions
vm/index.txt The design of the virtual mapping interface
vman/index.txt ANSI fake VM
vmo1/index.txt VM Module on DEC Unix
vmso/index.txt VM Design for Solaris
writef/index.txt The design of the MPS writef function
diff --git a/mps/design/interface-c/index.txt b/mps/design/interface-c/index.txt new file mode 100644 index 00000000000..2fc6db8f857 --- /dev/null +++ b/mps/design/interface-c/index.txt @@ -0,0 +1,263 @@ + THE DESIGN OF THE MEMORY POOL SYSTEM INTERFACE TO C + design.mps.interface.c + incomplete doc + richard 1996-07-29 + +INTRODUCTION + + +Scope + +.scope: This document is the design for the Memory Pool System (MPS) interface +to the C Language, impl.h.mps. + + +Background + +.bg: See mail.richard.1996-07-24.10-57. + + +Document History + +.hist.0: The first draft of this document was generated in response to +review.impl.h.mps.10 which revealed the lack of a detailed design document and +also the lack of conventions for external interfaces. The aim of the draft was +to record this information, even if it isn't terribly well structured. + + +ANALYSIS + + +Goals + +.goal.c: The file impl.h.mps is the C external interface to the MPS. It is the +default interface between client code written in C and the MPS. .goal.cpp: +impl.h.mps is not specifically designed to be an interface to C++, but should +be usable from C++. + + +Requirements + +.req: The interface must provide an interface from client code written in C to +the functionality of the MPS required by the product (see req.product), Dylan +(req.dylan), and the Core RIP (req.epcore). + +mps.h may not include internal MPS header files such as "pool.h" etc. + +It is essential that the interface cope well with change, in order to avoid +restricting possible future MPS developments. This means that the interface +must be "open ended" in its definitions. This accounts for some of the +apparently tortuous methods of doing things (fmt_A, for example). The +requirement is that the MPS should be able to add new functionality, or alter +the implementation of existing functionality, without affecting existing client +code. A stronger requirement is that the MPS should be able to change without +_recompiling_ client code. This is not always possible. + +[.naming.global wqas presumable done in response to an unwritten requirement +regarding the use of the name spaces in C, perhaps something like +".req.name.iso: The interface shall not conflict in terms of naming with any +interfaces specified by ISO C and all reasonable future versions" and +".req.name.general: The interface shall use a documented and reasonably small +portion of the namespace so that clients can interoperate easily" drj +1998-10-01] + + +ARCHITECTURE + +.fig.arch: The architecture of the MPS Interface + + + +Just behind mps.h is the file mpsi.c, the "MPS interface layer" which does the +job of converting types and checking parameters before calling through to the +MPS proper, using internal MPS methods. + + +GENERAL CONVENTIONS + + +.naming: The external interface names should adhere to the documented interface +conventions; these are found in doc.mps.ref-man.if-conv(0).naming. +(paraphrased/recreated here) .naming.unixy: The external interface does not +follow the same naming conventions as the internal code. The interface is +designed to resemble a more conventional C, Unix, or Posix naming convention. +.naming.case: Identifiers are in lower case, except non-function-like macros, +which are in upper case. .naming.global: All publicised identifiers are +prefixed "mps_" or "MPS_". .naming.all: All identifiers defined by the MPS +should begin "mps_" or "MPS_" or "_mps_". .naming.type: Types are suffixed +"_t". .naming.struct: Structure types and tags are suffixed "_s". 
.naming.union: Union types and tags are suffixed "_u".

.naming.scope: The naming conventions apply to all identifiers (see ISO C
clause 6.1.2); this includes names of functions, variables, types (through
typedef), structure and union tags, enumeration members, structure and union
members, macros, macro parameters, labels.

.naming.scope.labels: Labels (for goto statements) should be rare, only in
special block macros and probably not even then.

.naming.scope.other: The naming convention would also extend to enumeration
types and parameters in function prototypes, but both of those are prohibited
from having names in an interface file.

.type.gen: The interface defines memory addresses as "void *" and sizes as
"size_t" for compatibility with standard C (in particular, with malloc etc.).
These types must be binary compatible with the internal types "Addr" and "Size"
respectively. Note that this restricts the definitions of the internal types
"Addr" and "Size" when the MPS is interfaced with C, but does not restrict the
MPS in general.

.type.opaque: Opaque types are defined as pointers to structures which are
never defined. These types are cast to the corresponding internal types in
mpsi.c.

.type.trans: Some transparent structures are defined. The client is expected
to read these, or poke about in them, under restrictions which should be
documented. The most important is probably the allocation point (mps_ap_s)
which is part of allocation buffers. The transparent structures must be binary
compatible with corresponding internal structures. For example, the fields of
mps_ap_s must correspond with APStruct internally. This is checked by mpsi.c
in mps_check().

.type.pseudo: Some pseudo-opaque structures are defined. These only exist so
that code can be inlined using macros. The client code shouldn't mess with
them. The most important case of this is the scan state (mps_ss_s) which is
accessed by the in-line scanning macros, MPS_SCAN_* and MPS_FIX*.

.type.enum: There should be no enumeration types in the interface. Note that
enum specifiers (to declare integer constants) are fine as long as no type is
declared. See guide.impl.c.misc.enum.type.

.type.fun: Whenever function types or derived function types (such as pointer
to function) are declared a prototype should be used and the parameters to the
function should not be named. This includes the case where you are declaring
the prototype for an interface function.

.type.fun.example: So use "extern mps_res_t mps_alloc(mps_addr_t *,
mps_pool_t, size_t, ...);" rather than "extern mps_res_t
mps_alloc(mps_addr_t *addr_return, mps_pool_t pool, size_t size, ...);" and
"typedef mps_addr_t (*mps_fmt_class_t)(mps_addr_t);" rather than
"typedef mps_addr_t (*mps_fmt_class_t)(mps_addr_t object);". See
guide.impl.c.misc.prototype.parameters.


Checking

.check.space: When the space needs to be recovered from a parameter it is
checked using AVER(CHECKT(Foo, foo)) before any attempt to access
FooSpace(foo). CHECKT (impl.h.assert) performs simple thread-safe checking of
foo, so it can be called outside of SpaceEnter/SpaceLeave. [perhaps this should
be a special macro. "AVER(CHECKT(" can look like the programmer made a
mistake. drj 1998-11-05]

.check.types: We use definitions of types in both our external interface and
our internal code, and we want to make sure that they are compatible. (The
external interface changes less often and hides more information.)
At first, +we were just checking their sizes, which wasn't very good, but I've come up +with some macros which check the assignment compatibility of the types too. +This is a sufficiently useful trick that I thought I'd send it round. It may +be useful in other places where types and structures need to be checked for +compatibility at compile time. + +These macros don't generate warnings on the compilers I've tried. + +This macro checks the assignment compatibility of two lvalues. The hack is +that it uses sizeof to guarantee that the assignments have no effect. + +#define check_lvalues(_e1, _e2) \ + (sizeof((_e1) = (_e2)), sizeof((_e2) = (_e1)), sizeof(_e1) == sizeof(_e2)) + +This macro checks that two types are compatible and equal in size. The hack +here is that it generates an lvalue for each type by casting zero to a pointer +to the type. + +#define check_types(_t1, _t2) check_lvalues(*((_t1 *)0), *((_t2 *)0)) + +This macro just checks that the offset and size of two fields are the same. + +#define check_fields_approx(_s1, _f1, _s2, _f2) \ + (offsetof(_s1, _f1) == offsetof(_s2, _f2) && \ + sizeof(((_s1 *)0)->_f1) == sizeof(((_s2 *)0)->_f2)) + +This macro checks the offset, size, and compatibility of fields. + +#define check_fields(_s1, _f1, _s2, _f2) \ + (check_lvalues(((_s1 *)0)->_f1, ((_s2 *)0)->_f2) && \ + check_fields_approx(_s1, _f1, _s2, _f2)) + + +Binary Compatibility Issues + +As in Enumeration types are not allowed (see mail.richard.1995-09-08.09-28). + +There are two main aspects to run-time compatibility: binary interface and +protocol. The binary interface is all the information needed to correctly use +the library, and includes external symbol linkage, calling conventions, type +representation compatibility, structure layouts, etc. The protocol is how the +library is actually used by the client code -- whether this is called before +that -- and determines the semantic correctness of the client with respect to +the library. + +The binary interface is determined completely by the header file and the +target. The header file specifies the external names and the types, and the +target platform specifies calling conventions and type representation. There +is therefore a many-to-one mapping between the header file version and the +binary interface. + +The protocol is determined by the implementation of the library. + + +Constraints + +.cons: The MPS C Interface constrains the MPS in order to provide useful memory +management services to a C or C++ program. + +.cons.addr: The interface constrains the MPS address type, Addr +(design.mps.type.addr), to being the same as C's generic pointer type, void *, +so that the MPS can manage C objects in the natural way. + +.pun.addr: We pun the type of mps_addr_t (which is void *) into Addr (an +incomplete type, see design.mps.type.addr). This happens in the call to the +scan state's fix function, for example. + +.cons.size: The interface constrains the MPS size type, Size +(design.mps.type.size), to being the same as C's size type, size_t, so that the +MPS can manage C objects in the natural way. + +.pun.size: We pun the type of size_t in mps.h into Size in the MPM, as an +argument to the format methods. We assume this works. + +.cons.word: The MPS assumes that Word (design.mps.type.word) and Addr +(design.mps.type.addr) are the same size, and the interface constrains Word to +being the same size as C's generic pointer type, void *. + + +NOTES + +The file mpstd.h is the MPS target detection header. 
It decodes +preprocessor symbols which are predefined by build environments in order to +determine the target platform, and then defines uniform symbols, such as +MPS_ARCH_I3, for use internall by the MPS. + +There is a design document for the mps interface, design.mps.interface, but +it was written before we had the idea of having a C interface layer. It is +quite relevant, though, and could be updated. We should use it during the +review. + +All exported identifiers and file names should begin with mps_ or mps so +that they don't clash with other systems. + +We should probably have a specialized set of rules and a special checklist +for this interface. + +.fmt.extend: This paragraph should be an explanation of why mps_fmt_A is so +called. The underlying reason is future extensibility. + +.thread-safety: Most calls through this interface lock the space and therefore +make the MPM single-threaded. In order to do this they must recover the space +from their parameters. Methods such as ThreadSpace() must therefore be +callable when the space is _not_ locked. These methods are tagged with the tag +of this note. + +.lock-free: Certain functions inside the MPM are thread-safe and do not need to +be serialized by using locks. They are marked with the tag of this note. + +.form: Almost all functions in this implementation simply cast their arguments +to the equivalent internal types, and cast results back to the external type, +where necessary. Only exceptions are noted in comments. + diff --git a/mps/design/io/index.txt b/mps/design/io/index.txt new file mode 100644 index 00000000000..ada6965ed86 --- /dev/null +++ b/mps/design/io/index.txt @@ -0,0 +1,381 @@ + THE DESIGN OF THE MPS I/O SUBSYSTEM + design.mps.io + incomplete design + richard 1996-08-30 + +INTRODUCTION + +.intro: This document is the design of the MPS I/O Subsystem, a part of the +plinth. + +.readership: This document is intended for MPS developers. + + +History + +.hist.1: Document created from paper notes by Richard Brooksby, 1996-08-30. + +.hist.2: Updated with mail.richard.1997-05-30.16-13 and subsequent discussion +in the Pool Hall at Longstanton. (See also mail.drj.1997-06-05.15-20.) +richard 1997-06-10 + + +Background + +.bg: This design is partly based on the design of the Internet User Datagram +Protocol (UDP). Mainly I used this to make sure I hadn't left out anything +which we might need. + + +PURPOSE + +.purpose: The purpose of the MPS I/O Subsystem is to provide a means to +measure, debug, control, and test a memory manager build using the MPS. + +.purpose.measure: Measurement consists of emitting data which can be collected +and analysed in order to improve the attributes of application program, quite +possibly by adjusting parameters of the memory manager (see overview.mps.usage). + +.purpose.control: Control means adjusting the behaviour of the MM dynamically. +For example, one might want to adjust a parameter in order to observe the +effect, then transfer that adjustment to the client application later. + +.purpose.test: Test output can be used to ensure that the memory manager is +behaving as expected in response to certain inputs. + + +REQUIREMENTS + + +General + +.req.fun.non-hosted: The MPM must be a host-independent system. + +.req.attr.host: It should be easy for the client to set up the MPM for a +particular host (such as a washing machine). + + +Functional + +.req.fun.measure: The subsystem must allow the MPS to transmit quantitative +measurement data to an external tool so that the system can be tuned. 
+ +.req.fun.debug: The subsystem must allow the MPS to transmit qualitative +information about its operation to an external tool so that the system can be +debugged. + +.req.fun.control: The subsystem must allow the MPS to receive control +information from an external tool so that the system can be adjusted while it +is running. + +.req.dc.env.no-net: The subsystem sould operate in environments where there is +no networking available. + +.req.dc.env.no-fs: The subsystem should operate in environments where there is +no filesystem available. + + +ARCHITECTURE + +.arch.diagram: + + - I/O Architecture Diagram + +.arch.int: The I/O Interface is a C function call interface by which the MPM +sends and receives "messages" to and from the hosted I/O module. + +.arch.module: The modules are part of the MPS but not part of the freestanding +core system (see design.mps.exec-env). The I/O module is responsible for +transmitting those messages to the external tools, and for receiving messages +from external tools and passing them to the MPM. + +.arch.module.example: For example, the "file implementation" might just +send/write telemetry messages into a file so that they can be received/read +later by an off-line measurement tool. + +.arch.external: The I/O Interface is part of interface to the freestanding core +system (see design.mps.exec-env). This is so that the MPS can be deployed in a +freestanding environment, with a special I/O module. For example, if the MPS +is used in a washing machine the I/O module could communicate by writing output +to the seven-segment display. + + +Example Configurations + +.example.telnet: This shows the I/O Subsystem communicating with a telnet +client over a TCP/IP connection. In this case, the I/O Subsystem is +translating the I/O Interface into an interactive text protocol so that the +user of the telnet client can talk to the MM. + + + +.example.file: This shows the I/O Subsystem dumping measurement data into a +file which is later read and analysed. In this case the I/O Subsystem is +simply writing out binary in a format which can be decoded. + + + +.example.serial: This shows the I/O Subsystem communicating with a graphical +analysis tool over a serial link. This could be useful for a developer who has +two machines in close proximity and no networking support. + +.example.local: In this example the application is talking directly to the I/O +Subsystem. This is useful when the application is a reflective development +environment (such as MLWorks) which wants to observe its own behaviour. + + + - MPS I/O Configuration Diagrams + + + +INTERFACE + +.if.msg: The I/O interface is oriented around opaque binary "messages" which +the I/O module must pass between the MPM and external tools. The I/O module +need not understand or interpret the contents of those messages. + +.if.msg.opaque: The messages are opaque in order to minimize the dependency of +the I/O module on the message internals. It should be possible for clients to +implement their own I/O modules for unusual environments. We do not want to +reveal the internal structure of our data to the clients. Nor do we want to +burden them with the details of our protocols. We'd also like their code to be +independent of ours, so that we can expand or change the protocols without +requiring them to modify their modules. + +.if.msg.dgram: Neither the MPM nor the external tools should assume that the +messages will be delivered in finite time, exactly once, or in order. 
This +will allow the I/O modules to be implemented using unreliable transport layers +such as the Internet User Datagram Protocl (UDP). It will also give the I/O +module the freedom to drop information rather than block on a congested +network, or stop the memory manager when the disk is full, or similar events +which really shouldn't cause the memory manager to stop working. The protocols +we need to implement at the high level can be design to be robust againt +lossage without much difficulty. + + +I/O Module State + +.if.state: The I/O module may have some internal state to preserve. The I/O +Interface defines a type for this state, "mps_io_t", a pointer to an incomplete +structure "mps_io_s". The I/O module is at liberty to define this structure. + + typedef struct mps_io_s *mps_io_t; + + +Message Types + +.if.type: The I/O module must be able to deliver messages of several different +types. It will probably choose to send them to different destinations based on +their type: telemetry to the measurement tool, debugging output to the +debugger, etc. + + typedef int mps_io_type_t; + enum { + MPS_IO_TYPE_TELEMETRY, + MPS_IO_TYPE_DEBUG + }; + + +Limits + +.if.message-max: The interface will define an unsigned integral constant +"MPS_IO_MESSAGE_MAX" which will be the maximum size of messages that the MPM +will pass to "mps_io_send" (.if.send) and the maximum size it will expect to +receive from "mps_io_receive". + + +Interface Set-up and Tear-down + +.if.create: The MPM will call "mps_io_create" to set up the I/O module. On +success, this function should return "MPS_RES_OK". It may also initialize +"*mps_io_r" to a "state" value which will be passed to subsequent calls through +the interface. + + extern mps_res_t mps_io_create(mps_io_t *mps_io_r); + +.if.destroy: The MPM will call "mps_io_destroy" to tear down the I/O module, +after which it guarantees that the state value "mps_io" will not be used +again. The "state" parameter is the state previously returned by +"mps_io_create" (.if.create). + + extern void mps_io_destroy(mps_io_t mps_io); + + +Message Send and Receive + +.if.send: The MPM will call "mps_io_send" when it wishes to send a message to a +destination. The "state" parameter is the state previously returned by +"mps_io_create" (.if.create). The "type" parameter is the type (.if.type) of +the message. The "message" parameter is a pointer to a buffer containing the +message, and "size" is the length of that message, in bytes. The I/O module +must make an effort to deliver the message to the destination, but is not +expected to guarantee delivery. The function should return "MPS_RES_IO" only +if a serious error occurs that should cause the MPM to return with an error to +the client application. Failure to deliver the message does not count. +[Should there be a timeout parameter? What are the timing constraints? +mps_io_send shouldn't block.] + + extern mps_res_t mps_io_send(mps_io_t state, + mps_io_type_t type, + void *message, + size_t size); + +.if.receive: The MPM will call "mps_io_receive" when it wants to see if a +message has been sent to it. The "state" parameter is the state previously +returned by "mps_io_create" (.if.create). The "buffer_o" parameter is a +pointer to a value which should be updated with a pointer to a buffer +containing the message received. The "size_o" parameter is a pointer to a +value which should be updated with the length of the message received. If +there is no message ready for receipt, the length returned should be zero. 
(Should we be able to receive truncated messages? How can this be done neatly?)

    extern mps_res_t mps_io_receive(mps_io_t state,
                                    void **buffer_o,
                                    size_t *size_o);


I/O MODULE IMPLEMENTATIONS


Routeing

The I/O module must decide where to send the various messages. A file-based
implementation could put them in different files based on their types. A
network-based implementation must decide how to address the messages. In
either case, any configuration must either be statically compiled into the
module, or else read from some external source such as a configuration file.


NOTES

The external tools should be able to reconstruct stuff from partial info. For
example, you come across a fragment of an old log containing just a few old
messages. What can you do with it?

Here's some completely untested code which might do the job for UDP.

---

#include "mpsio.h"

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

typedef struct mps_io_s {
  int sock;
  struct sockaddr_in mine;
  struct sockaddr_in telemetry;
  struct sockaddr_in debugging;
} mps_io_s;

static mps_bool_t inited = 0;
static mps_io_s state;


mps_res_t mps_io_create(mps_io_t *mps_io_o)
{
  int sock, r;

  if(inited)
    return MPS_RES_LIMIT;

  /* state.mine, state.telemetry, and state.debugging must be set up */
  /* here from some configuration source. */

  /* Make a socket through which to communicate. */
  sock = socket(AF_INET, SOCK_DGRAM, 0);
  if(sock == -1) return MPS_RES_IO;

  /* Set socket to non-blocking mode. */
  r = fcntl(sock, F_SETFL, O_NDELAY);
  if(r == -1) return MPS_RES_IO;

  /* Bind the socket to some UDP port so that we can receive messages. */
  r = bind(sock, (struct sockaddr *)&state.mine, sizeof(state.mine));
  if(r == -1) return MPS_RES_IO;

  state.sock = sock;

  inited = 1;

  *mps_io_o = &state;
  return MPS_RES_OK;
}


void mps_io_destroy(mps_io_t mps_io)
{
  assert(mps_io == &state);
  assert(inited);

  (void)close(state.sock);

  inited = 0;
}


mps_res_t mps_io_send(mps_io_t mps_io, mps_io_type_t type,
                      void *message, size_t size)
{
  struct sockaddr *toaddr;

  assert(mps_io == &state);
  assert(inited);

  switch(type) {
  case MPS_IO_TYPE_TELEMETRY:
    toaddr = (struct sockaddr *)&state.telemetry;
    break;

  case MPS_IO_TYPE_DEBUG:
    toaddr = (struct sockaddr *)&state.debugging;
    break;

  default:
    assert(0);
    return MPS_RES_UNIMPL;
  }

  (void)sendto(state.sock, message, size, 0, toaddr, sizeof(*toaddr));

  return MPS_RES_OK;
}


mps_res_t mps_io_receive(mps_io_t mps_io,
                         void **message_o, size_t *size_o)
{
  int r;
  static char buffer[MPS_IO_MESSAGE_MAX];

  assert(mps_io == &state);
  assert(inited);

  r = recvfrom(state.sock, buffer, sizeof(buffer), 0, NULL, NULL);
  if(r == -1)
    switch(errno) {
    /* Ignore interrupted system calls, and failures due to lack */
    /* of resources (they might go away). */
    case EINTR: case ENOMEM: case ENOSR:
      r = 0;
      break;

    default:
      return MPS_RES_IO;
    }

  *message_o = buffer;
  *size_o = (size_t)r;
  return MPS_RES_OK;
}


ATTACHMENTS
  "I/O Architecture Diagram"
  "I/O Configuration Diagrams"

diff --git a/mps/design/lib/index.txt b/mps/design/lib/index.txt
new file mode 100644
index 00000000000..0ba457de0b4
--- /dev/null
+++ b/mps/design/lib/index.txt
@@ -0,0 +1,90 @@
 THE DESIGN OF THE MEMORY POOL SYSTEM LIBRARY INTERFACE
 design.mps.lib
 incomplete design
 richard 1996-09-03

INTRODUCTION

.intro: This document is the design of the MPS Library Interface, a part of the
plinth.

.readership: Any MPS developer. Any clients that are prepared to read this in
order to get documentation.


GOALS

.goal: The goals of the MPS library interface are:

.goal.host: To control the dependency of the MPS on the hosted ISO C library so
that the core MPS remains freestanding (see design.mps.exec-env).

.goal.free: To allow the core MPS convenient access to ISO C functionality that
is provided on freestanding platforms (see design.mps.exec-env.std.com.free).


DESCRIPTION


Overview

.overview.access: The core MPS needs to access functionality that could be
provided by an ISO C hosted environment.

.overview.hosted: The core MPS must not make direct use of any facilities in
the hosted environment (design.mps.exec-env). However, it is sensible to make
use of them when the MPS is deployed in a hosted environment.

.overview.hosted.indirect: The core MPS does not make any direct use of hosted
ISO C library facilities. Instead, it indirects through the MPS Library
Interface, impl.h.mpslib.

.overview.free: The core MPS can make direct use of freestanding ISO C library
facilities and does not need to include any of the header files <stddef.h>,
<limits.h>, and <stdarg.h> directly.

.overview.complete: The MPS Library Interface can be considered as the complete
"interface to ISO" (in that it provides direct access to facilities that we get
in a freestanding environment and equivalents of any functionality we require
from the hosted environment).

.overview.provision.client: In a freestanding environment the client is
expected to provide functions meeting this interface to the MPS.

.overview.provision.hosted: In a hosted environment, impl.c.mpsliban may be
used. It just maps impl.h.mpslib directly onto the ISO C library equivalents.

 - MPS Library Interface Diagram


Outside the Interface

We provide impl.c.mpsliban to the client, for two reasons:

 1. he can use it to connect the MPS to the ISO C library if it exists,

 2. as an example of an implementation of the MPS Library Interface.


IMPLEMENTATION

.impl: The MPS Library Interface comprises a header file impl.h.mpslib
(mpslib.h) and some documentation.

.impl.decl: The header file defines the interface to definitions which parallel
those parts of the non-freestanding ISO headers which are used by the MPS.

.impl.include: The header file also includes the freestanding headers
<stddef.h>, <limits.h>, and <stdarg.h> (and not <float.h>, though perhaps it
should).


NOTES

.doc: User doc in doc.mps.guide.interface and doc.mps.guide.appendix-plinth.
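To give a flavour of .overview.provision.hosted, here is a hedged sketch of the
kind of mapping a hosted implementation such as impl.c.mpsliban performs; the
mps_lib_* names and prototypes are illustrative assumptions, not quoted from
impl.h.mpslib.

    #include <stdio.h>
    #include <stdlib.h>

    /* Each mps_lib_* function simply forwards to its ISO C equivalent; */
    /* a freestanding client would supply its own definitions instead.  */

    typedef struct mps_lib_stream_s mps_lib_FILE;  /* opaque to the core MPS */

    mps_lib_FILE *mps_lib_get_stdout(void)
    {
      return (mps_lib_FILE *)stdout;
    }

    int mps_lib_fputs(const char *s, mps_lib_FILE *stream)
    {
      return fputs(s, (FILE *)stream);
    }

    void mps_lib_assert_fail(const char *message)
    {
      fprintf(stderr, "MPS assertion failed: %s\n", message);
      abort();
    }

In a freestanding deployment the client would provide equivalents that do not
rely on the hosted library at all.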
+ +ATTACHMENT + "MPS Library Interface Diagram" + diff --git a/mps/design/lock/index.txt b/mps/design/lock/index.txt new file mode 100644 index 00000000000..5ab2ef966a3 --- /dev/null +++ b/mps/design/lock/index.txt @@ -0,0 +1,143 @@ + THE DESIGN OF THE LOCK MODULE + design.mps.lock + draft design + dsm 1995-11-21 + +PURPOSE + +.purpose: Support the locking needs of the thread-safe design. In particular: +- Recursive locks +- Binary locks +- Recursive "global" lock that need not be allocated or initialized by the +client +- Binary "global" lock that need not be allocated or initialized by the client + +.context: The MPS has to be able to operate in a multi-threaded environment. +The thread-safe design (design.mps.thread-safety) requires client-allocatable +binary locks, a global binary lock and a global recursive lock. An interface +to client-allocatable recursive locks is also present to support any potential +use, because of historic requirements, and because the implementation will +presumably be necessary anyway for the global recursive lock. + + +BACKGROUND + +.need: In an environment where multiple threads are accessing shared data. The +threads which access data which is shared with other threads need to cooperate +with those threads to maintain consistency. Locks provide a simple mechanism +for doing this. + +.ownership: A lock is an object which may be "owned" by a single thread at a +time. By claiming ownership of a lock before executing some piece of code a +thread can guarantee that no other thread owns the lock during execution of +that code. If some other thread holds a claim on a lock, the thread trying to +claim the lock will suspend until the lock is released by the owning thread. + +.data: A simple way of using this behaviour is to associate a lock with a +shared data structure. By claiming that lock around accesses to the data, a +consistent view of the structure can be seen by the accessing thread. More +generally any set of operations which are required to be mutally exclusive may +be performed so by using locks. + + +OVERVIEW + +.adt: There is an ADT "Lock" which points to a locking structure "LockStruct". +This structure is opaque to any client, although an interface is provided to +supply the size of the structure for any client wishing to make a new lock. +The lock is not allocated by the module as allocation itself may require +locking. LockStruct is implementation specific. + +.simple-lock: There are facilities for claiming and releasing locks. "Lock" is +used for both binary and recursive locking. + +.global-locks: "Global" locks are so called because they are used to protect +data in a global location (such as a global variable). The lock module provides +2 global locks; one recursive and one binary. There are facilities for +claiming and releasing both of these locks. These global locks have the +advantage that they need not be allocated or atomically initialized by the +client, so they may be used for locking the initialization of the allocator +itself. The binary global lock is intended to protect mutable data, possibly +in conjunction with other local locking strategies. The recursive global lock +is intended to protect static read-only data during one-off initialization. +See design.mps.thread-safety. + +.deadlock: This module does not provide any deadlock protection. Clients are +responsible for avoiding deadlock by using traditional strategies such as +ordering of locks. (See design.mps.thread-safety.deadlock.) 
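Here is a usage sketch of the claim/release discipline described in .data,
built on the interface functions listed under DETAILED DESIGN below; the
Counter structure, its malloc-based allocation, and the exact prototypes shown
are illustrative assumptions, not part of the lock module itself.

    #include <stddef.h>
    #include <stdlib.h>

    typedef struct LockStruct *Lock;     /* opaque to clients (.adt) */

    extern size_t LockSize(void);        /* assumed prototypes; see the */
    extern void LockInit(Lock lock);     /* interface list below        */
    extern void LockClaim(Lock lock);
    extern void LockRelease(Lock lock);

    typedef struct CounterStruct {
      Lock lock;             /* LockSize() bytes, allocated by the client */
      unsigned long count;   /* shared data protected by the lock */
    } CounterStruct, *Counter;

    static Counter CounterCreate(void)
    {
      Counter counter = malloc(sizeof(CounterStruct));
      if (counter == NULL) return NULL;
      counter->lock = malloc(LockSize());  /* the lock module doesn't allocate */
      if (counter->lock == NULL) { free(counter); return NULL; }
      LockInit(counter->lock);
      counter->count = 0;
      return counter;
    }

    static void CounterIncrement(Counter counter)
    {
      LockClaim(counter->lock);   /* binary claim: not already held by us */
      ++counter->count;           /* other threads see a consistent view */
      LockRelease(counter->lock);
    }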
+ +.single-thread: In the single-threaded configuration, locks are not needed and +the claim/release interfaces defined to be no-ops. + + +DETAILED DESIGN + +.interface: The interface comprises the following functions: + +- LockSize + Return the size of a LockStruct for allocation purposes. + +- LockInit / LockFinish + After initialisation the lock is not owned by any thread. This must also be +the case before finalisation. +[ref convention?] + +- LockClaim / LockRelease + LockClaim: claims ownership of a lock that was previously not held by current +thread. + LockRelease: releases ownership of a lock that is currently owned. + +- LockClaimRecursive / LockReleaseRecursive + LockClaimRecursive: remembers the previous state of the lock with respect to +the current thread and claims the lock (if not already held). + LockReleaseRecursive: restores the previous state of the lock stored by +corresponding LockClaimRecursive call. + +- LockClaimGlobal / LockReleaseGlobal + LockClaimGlobal: claims ownership of the binary global lock which was +previously not held by current thread. + LockReleaseGlobal: releases ownership of the binary global lock that is +currently owned. + +- LockClaimGlobalRecursive / LockReleaseGlobalRecursive + LockClaimGlobalRecursive: remembers the previous state of the recursive +global lock with respect to the current thread and claims the lock (if not +already held). + LockReleaseGlobalRecursive: restores the previous state of the recursive +global lock stored by corresponding LockClaimGlobalRecursive call. + +.impl.recursive: For recursive claims, the list of previous states can be +simply implemented by keeping a count of the number of claims made by the +current thread so far. In multi-threaded implementation below this is handled +by the operating system. A count is still kept and used to check correctness. + +.impl.global: The binary and recursive global locks may actually be implemented +using the same mechanism as normal locks. + +.impl.ansi: Single-Threaded Generic Implementation +- single-thread +- no need for locking +- locking structure contains count +- provides checking in debug version +- otherwise does nothing except keep count of claims + +.impl.win32: Win32 Implementation +- supports Win32's threads +- uses Critical Sections [ref?] +- locking structure contains a Critical Section +- both recursive and non-recursive calls use same Windows function +- also performs checking + +.impl.linux: LinuxThreads Implementation (possibly suitable for all PThreads +implementations) +- supports LinuxThreads threads, which are an implementation of PThreads. (see +) +- locking structure contains a mutex, initialized to check for recursive locking +- locking structure contains a count of the number of active claims +- non-recursive locking calls pthread_mutex_lock and expects success +- recursive locking calls pthread_mutex_lock and expects either success or +EDEADLK (indicating a recursive claim). +- also performs checking + + + diff --git a/mps/design/locus/index.txt b/mps/design/locus/index.txt new file mode 100644 index 00000000000..8128abe86e9 --- /dev/null +++ b/mps/design/locus/index.txt @@ -0,0 +1,435 @@ + THE DESIGN FOR THE LOCUS MANAGER + design.mps.locus + incomplete design + gavinm 1998-02-27 + +INTRODUCTION + +.intro: The locus manager coordinates between the pools and takes the burden of +having to be clever about tract/group placement away from the pools, preserving +trace differentiability and contiguity where appropriate. 
+ +.source: mail.gavinm.1998-02-05.17-52(0), mail.ptw.1998-02-05.19-53(0), +mail.pekka.1998-02-09.13-58(0), and mail.gavinm.1998-02-09.14-05(0). + + +Document History + +.hist.0: Originally written as part of change.dylan.box-turtle.170569. Much +developed since. gavinm 1998-02-27 + +.hist.1: Pekka wrote the real requirements after some discussion. pekka +1998-10-28 + +.hist.2: Pekka deleted Gavin's design and wrote a new one. pekka 1998-12-15 + + +DEFINITIONS + +.note.cohort: We use the word "cohort" in its usual sense here, but we're +particularly interested in cohorts that have properties relevant to tract +placement. It is such cohorts that the pools will try to organize using the +services of the locus manager. Typical properties would be trace +differentiability or (en masse) death-time +predictability. Typical cohorts would be instances of a non-generational pool, +or generations of a collection strategy. + +.def.trace.differentiability: Objects (and hence tracts) that are collected, +may or may not have "trace differentiability" from each other, depending on +their placement in the different zones. Objects (or pointers to them) can also +have trace differentiability (or not) from non-pointers in ambiguous +references; in practice, we will be worried about low integers, that may appear +to be in zones 0 or -1. + + +REQUIREMENTS + +.req.cohort: Tract allocations must specify the cohort they allocate in. These +kind of cohorts will be called loci, and they will have such attributes as are +implied by the other requirements. Critical. + +.req.counter.objects: As a counter-requirement, pools are expected to manage +objects. Objects the size of a tract allocation request (segment-sized) are +exceptional. Critical. .req.counter.objects.just: This means the locus +manager is not meant to solve the problems of allocating large objects, and it +isn't required to know what goes on in pools. + +.req.contiguity: Must support a high level of contiguity within cohorts when +requested. This means minimizing the number of times a cohort is made aware of +discontiguity. Essential (as we've effectively renegotiated this in SW, down +to a vague hope that certain critical cohorts are not too badly fragmented). +.req.contiguity.just: TSBA. + +.req.contiguity.specific: It should be possible to request another allocation +next to a specific tract on either side (or an extension in that direction, as +the case may be). Such a request can fail, if there's no space there. Nice. +[IWBN to have one for "next to the largest free block".] + +.req.differentiable: Must support the trace differentiability of segments that +may be condemned separately. Due to the limited number of zones, it must be +possible to place several cohorts into the same zone. Essential. + +.req.differentiable.integer: It must be possible to place collectable +allocations so that they are trace-differentiable from small integers. +Essential. + +.req.disjoint: Must support the disjointness of pages that have different VM +properties (such as mutable/immutable, read-only/read-write, and different +lifetimes). Optional. [I expect the implementation will simply work at page +or larger granularity, so the problem will not arise, but Tucker insisted on +stating this as a requirement. pekka 1998-10-28] + +.req.low-memory: The architecture of the locus manager must not prevent the +design of efficient applications that often use all available memory. +Critical. 
.req.low-memory.expl: This basically says it must be designed to +perform well in low-memory conditions, but that there can be configurations +where it doesn't do as well, as long as this is documented for the application +programmer. Note that it doesn't say all applications are efficient, only that +if you manage to design an otherwise efficient application, the locus manager +will not sink it. + +.req.address: Must conserve address space in VM arenas to a reasonable extent. +Critical. + +.req.inter-pool: Must support the association of sets of tracts in different +pools into one cohort. Nice. + +.req.ep-style: Must support the existing EP-style of allocation whereby +allocation is from one end of address space either upwards or downwards (or a +close approximation thereto with the same behavior). .req.ep-style.just: We +cannot risk disrupting a policy with well-known properties when this technology +is introduced. + +.req.attributes: There should be a way to inform the locus manager about +various attributes of cohorts that might be useful for placement: deathtime, +expected total size, [more in the future]. Optional. [It's a given that the +cohorts must then have these attributes, within the limits set in the contract +of the appropriate interface.] .req.attributes.action: The locus manager +should use the attributes to guide its placement decisions. Nice. + +.req.blacklisting: There should be a way of maintaining at least one blacklist +for pages (or some other small unit), that can not/should not be allocated to +collectable pools. [How to do blacklist breaking for ambiguous refs?] +Optional. + +.req.hysteresis: There should be a way to indicate which cohorts fluctuate in +size and by how much, to guide the arena hysteresis to hold on to suitable +pages. Optional. + + +ANALYSIS + +.anal.sw: Almost any placement policy would be an improvement on the current SW +one. + +.anal.cause-and-effect: The locus manager doesn't usually need to know _why_ +things need to be differentiable, disjoint, contiguous, etc. Abstracting the +reason away from the interface makes it more generic, more likely to have +serendipitous new uses. Attributes described by a quantity (deathtime, size, +etc.) are an exception to this, because we can't devise a common measure. + +.anal.stable: The strategy must be stable: it must avoid repeated +recomputation, especially the kind that switches between alternatives with a +short period (repeated "bites" out the same region or flip-flopping between two +regions). + +.anal.fragmentation: There's some call to avoid fragmentation in cohorts that +don't need strict contiguity, but this is not a separate requirement, since +fragmentation is a global condition, and can only be ameliorated if there's a +global strategy that clumps allocations together. .anal.deathtime: Cohorts +with good death-time clumping of their objects could use some locality of tract +allocation, because it increases the chances of creating large holes in the +address space (for other allocation to use). OTOH. many cohorts will not do +multiple frees in short succession, or at least cannot reasonably be predicted +to do so. This locality is not contiguity, nor is it low fragmentation, it's +just the requirement to place the new tracts next to the tract where the last +object was allocated in the cohort. 
Note that the placement of objects is +under the control of the pool, and the locus manager will not know it, +therefore this requirement should be pursued by requesting allocation next to a +particular tract (which we already have a requirement for). + +.anal.asymmetrical: The strategy has to be asymmetrical with respect to cohorts +growing and shrinking. The reason of this asymmetry is that it can choose +where to grow, but it cannot choose where to shrink (except in a small way by +growing with good locality). + + +INTERFACE + +.interface.locus: A cohort will typically reside on multiple tracts (and the +pools will avoid putting objects of other cohorts on them), so there should be +an interface to describe the properties of the cohort, and associate each +allocation request with the cohort. We shall call such an object, created to +represent a cohort, a locus (pl. loci). + +.interface.locus.pool: Loci will usually be created by the pool that uses it. +Some of the locus attributes will be inherited from client-specified pool +attributes [this means there will be additional pool attributes]. + +.interface.detail: This describes interface in overview; for details, see +implementation section and code, or user doc. + + +Loci + +.function.create: A function to create a locus: + Res LocusCreate(Locus *locusReturn, LocusAttrs attrs, ZoneGroup zg, +LocusAllocDesc adesc) +where adesc contains the information about the allocation sequences in the +locus, zg is used for zone differentiability, and attrs encodes the following: + .locus.contiguity: A locus can be contiguous. This means performing as +required in .req.contiguity, non-contiguous allocations can be freely placed +anywhere (but efficiency dictates that similar allocations are placed close +together and apart from others). + .locus.blacklist: Allocations in the locus will avoid blacklisted pages (for +collectable segments). + .locus.zero: Allocations in the locus are zero-filled. + [Other attributes will be added, I'm sure.] + +.interface.zone-group: The locus can be made a member of a zone group. Passing +ZoneGroupNONE means it's not a member of any group (allocations will be placed +without regard to zone, except to keep them out of stripes likely to be needed +for some group). [I propose no mechanism for managing zone groups at this +time, since it's only used internally for one purpose. pekka 2000-01-17] + +.interface.size: An allocation descriptor (LocusAllocDesc) contains various +descriptions of how the locus will develop over time (inconsistent +specifications are forbidden, of course): + .interface.size.typical-alloc: Size of a typical allocation in this locus, in +bytes. This will mainly affect the grouping of non-contiguous loci. + .interface.size.large-alloc: Typical large allocation that the manager should +try to allow for (this allows some relief from .req.counter.objects), in +bytes. This will mainly affect the size of gaps that will be allotted +adjoining this locus. + .interface.size.direction: Direction of growth: up/down/none. Only useful if +the locus is contiguous. + .interface.size.lifetime: Some measure of the lifetime of tracts (not +objects) in the cohort. [Don't know the details yet, probably only useful for +placing similar cohorts next to each other, so the details don't actually +matter. pekka 2000-01-17] + .interface.size.deathtime: Some measure of the deathtime of tracts (not +objects) in the cohort. [Ditto. pekka 2000-01-17] + +.function.init: LocusInit is like LocusCreate, but without the allocation. 
+This is the usual i/f, since most loci are embedded in a pool or something. + +.function.alloc: ArenaAlloc to take a locus arg. ArenaAllocHere is like it, +plus it takes a tract and a specification to place the new allocation +immediately above/below a given tract; if that is not possible, it returns +ResFAIL (this will make it useful for realloc functionality). + +.function.set-total: A function to tell the arena the expected number of +(non-miscible client) loci, and of zone groups: + ArenaSetTotalLoci(Arena arena, Size nLoci, Size nZoneGroups) + + +Peaks + +.function.peak.create: A function to create a peak: + mps_res_t mps_peak_create(mps_peak_t*, mps_arena_t) +A newly-created peak is open, and will not be used to guide the strategy of the +locus manager. + +.function.peak.add: A function to add a description of the state of one pool +into the peak: + mps_res_t mps_peak_describe_pool(mps_peak_t, mps_pool_t, mps_size_desc_t) +Calling this function again for the same peak and pool instance will replace +the earlier description. .function.peak.add.size: The size descriptor contains +a total size in bytes or % of arena size [@@@@is this right?]. +.function.peak.add.remove: Specifying a NULL size will remove the pool from the +peak. The client is not allowed to destroy a pool that is mentioned in any +peak; it must be first removed from the peak, or the peak must be destroyed. +This is to ensure that the client adjusts the peaks in a manner that makes +sense to the application; the locus manager can't know how to do that. + +.function.peak.close: A function to indicate that all the significant pools +have been added to the peak, and it can now be used to guide the locus manager: + mps_res_t mps_peak_close(mps_peak_t) +For any pool not described in the peak, the locus manager will take its current +size at any given moment as the best prediction of its size at the peak. +.function.peak.close.after: It is legal to add more descriptions to the peak +after closing, but this will reopen the peak, and it will have to be closed +before the locus manager will use it again. The locus manager uses the +previous closed state of the peak, while this is going on. + +.function.peak.destroy: A function to destroy a peak: + void mps_peak_destroy(mps_peak_t) + +.interface.ep-style: This satisfies .req.ep-style by allowing SW to specify +zero size for most pools (which will cause them to be place next to other loci +with the same growth direction). [Not sure this is good enough, but we'll try +it first. pekka 2000-01-17] + + +ARCHITECTURE + +Data Objects + +.arch.locus: To represent the cohorts, we have locus objects. Usually a locus +is embedded in a pool instance, but generations are separate loci. + +.arch.locus.attr: contiguity, blacklist, zg, current region, @@@@ + +.arch.locus.attr.exceptional: The client can define a typical large allocation +for the locus. Requests substantially larger than that are deemed exceptional. + +.arch.zone-group: To satisfy .req.condemn, we offer zone groups. Each locus +can be a member of a zone group, and the locus manager will attempt to place +allocations in this locus in different zones from all the other zone groups. A +zone-group is represented as @@@@. + +.arch.page-table: A page table is maintained by the arena, as usual to track +association between tracts, pools and segments, and mapping status for VM +arenas. + +.arch.region: All of the address space is divided into disjoint regions, +represented by region objects. 
These objects store their current limits, and +high and low watermarks of currently allocated tracts (we hope there's usually +a gap of empty space between regions). The limits are actually quite porous +and flexible. + +.arch.region.assoc: Each region is associated with one contiguous locus or any +number of non-contiguous loci (or none). We call the first kind of region +"contiguous". .arch.locus.assoc: Each locus remembers all regions where it has +tracts currently, excepting the badly-placed allocations (see below). It is +not our intention that any locus would have very many, or that loci that share +regions would have any reason to stop doing do. + +.arch.region.more: Various quantities used by the placement computation are +also stored in the regions and the loci. Regions are created (and destroyed) +by the placement recomputation. Regions are located in stripes (if it's a +zoned region), but they can extend into neighboring stripes if an exceptionally +large tract allocation is requested (to allow for large objects). + +.arch.chunk: Arenas may allocate more address space in additional chunks, which +may be disjoint from the existing chunks. Inter-chunk space will be +represented by dummy regions. There are also sentinel regions at both ends of +the address space. + + +Overview of Strategy + +.arch.strategy.delay: The general strategy is to delay placement decisions +until they have to be made, but no later. + +.arch.strategy.delay.until: Hence, the locus manager only makes placement +decisions when an allocation is requested (frees and other operations might set +a flag to cause the next allocation to redecide). This also allows the client +to change the peak and pool configuration in complicated ways without causing a +lot of recomputation, by doing all the changes without allocating in the middle +(unless the control pool needs more space because of the changes). + +.arch.strategy.normal: While we want the placement to be sophisticated, we do +not believe it is worth the effort to consider all the data at each +allocation. Hence, allocations are usually just placed in one of the regions +used previously (see .arch.alloc) without reconsidering the issues. + +.arch.strategy.normal.limit: However, the manager sets precautionary limits on +the regions to ensure that the placement decisions are revisited when an +irrevocable placement is about to be made. + +.arch.strategy.create: The manager doesn't create new regions until they are +needed for allocation (but it might compute where they could be placed to +accommodate a peak). + + +Allocation + +.arch.alloc: Normally, each allocation to a locus is placed in its current +region. New regions are only sought when necessary to fulfill an allocation +request or when there is reason to think the situation has changed +significantly (see .arch.significant). + +.arch.alloc.same: An allocation is first attempted next to the previous +allocation in the same locus, respecting growth direction. If that is not +possible, a good place in the current region is sought. .arch.alloc.same.hole: +ATM, for finding a good place within a region, we just use the current +algorithm, limited to the region. In future, the placement within regions will +be more clever. + +.arch.alloc.extend: If there's no adequate hole in the current region and the +request is not exceptional, the neighboring regions are examined to see if the +region could be extended at one border. 
(This will basically only be done if +the neighbor has shrunk since the last placement recomputation, because the +limit was set on sophisticated criteria, and should not be changed without +justification.) .arch.alloc.extend.here: When an allocation is requested next +to a specific tract (ArenaAllocHere), we try to extend a little harder [at +least for change_size, perhaps not for locality]. + +.arch.alloc.other: If no way can be found to allocate in the current region, +other regions used for this locus are considered in the same way, to see if +space can be found there. [Or probably look at other regions before trying to +extend anything?] + +.arch.alloc.recompute: When no region of this locus has enough space for the +request, or when otherwise required, region placement is recomputed to find a +new region for the request (which might be the same region, after extension). + +.arch.alloc.current: This region where the allocation was placed then becomes +the current region for this locus, except when the request was exceptional, or +when the region chosen was "bad" (see @@@@). + +.arch.significant: Significant changes to the parameters affecting placement +are deemed to have happened at certain client calls and when the total +allocation has changed substantially since the last recomputation. Such +conditions set a flag that causes the next allocation to recompute even if its +current region is not full [possibly second-guess the decision to recompute +after some investigation of the current state?]. + + +Deallocation + +.arch.free: Deallocation simply updates the counters in the region and the +locus. For some loci, it will make the region of the deallocation the current +region. +.arch.free.remove: If a region becomes entirely empty, it is deleted (and the +neighbors limits might be adjusted [quite tricky to get right, this]). + + +Region Placement Recomputation + +.arch.gap: When doing placement computations, we view the arena as a sequence +of alternating region cores and gaps (which can be small, even zero-sized). +Initially, we'll take the core of a region to be the area between the high and +low watermark, but in the future we might be more flexible about that. [Edge +determination is actually a worthwhile direction to explore.] + +.arch.reach: The gap between two cores could potentially end up being allocated +to either region, if they grow in that direction, or one or neither, if they +don't. The set of states that the region assignment could reach by assigning +the gaps to their neighbors is called the reach of the current configuration. + +.arch.placement.object: The object of the recomputation is to find a +configuration of regions that is not too far from the current configuration and +that keeps all the peaks inside its reach; if that is not possible, keep the +nearest ones in the reach and then minimize the total distance from the rest. + +.arch.placement.hypothetical: The configurations that are considered will +include hypothetical placements for new regions for loci that cannot fit in +their existing regions at the peak. This is necessary to avoid choosing a bad +alternative. + +.arch.placement.interesting: The computation will only consider new regions of +loci that are deemed interesting, i.e., far from their peak state. This will +reduce the computational burden and avoid jittering near a peak. 
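As an illustration of .arch.gap and .arch.reach only, the following self-contained sketch models the arena as alternating region cores and gaps and checks, greedily, whether a set of predicted peak sizes lies within the reach of the current configuration. It ignores growth direction, zones and sentinels, and all names here are hypothetical, not part of the design.

  #include <stddef.h>

  typedef struct {
    size_t core;  /* current core size, between the watermarks (.arch.gap) */
    size_t peak;  /* predicted size of the region at the peak */
  } RegionModel;

  /* gap[i] is the free space between region i and region i+1. */
  static int peakWithinReach(const RegionModel *region, const size_t *gap,
                             size_t nRegions)
  {
    size_t below = 0;  /* unclaimed part of the gap below the current region */
    size_t i;

    for (i = 0; i < nRegions; ++i) {
      size_t need = (region[i].peak > region[i].core)
                    ? region[i].peak - region[i].core : 0;
      size_t above = (i + 1 < nRegions) ? gap[i] : 0;

      if (need > below + above)
        return 0;  /* this region cannot grow to its peak: not in reach */

      /* Claim space from the gap below first (no later region can use it),
       * then from the gap above; what remains is available to region i+1. */
      below = (need > below) ? above - (need - below) : above;
    }
    return 1;
  }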
+ +[details missing] + + +IMPLEMENTATION + +[missing] + + +NOTES + +.idea.change: Even after the first segment, be prepared to change your mind, if +by the second segment a lot of new loci have been created. + +.distance: If the current state is far from a peak, there's time to reassign +regions and for free space to appear (in fact, under the steady arena +assumption, enough free space _will_ appear). + +.clear-pool: Need to have a function to deallocate all objects in a pool, so +that PoolDestroy won't have to be used for that purpose. + diff --git a/mps/design/message/index.txt b/mps/design/message/index.txt new file mode 100644 index 00000000000..0ae9407a868 --- /dev/null +++ b/mps/design/message/index.txt @@ -0,0 +1,363 @@ + MPS TO CLIENT MESSAGE PROTOCOL + design.mps.message + incomplete doc + drj 1997-02-13 + +INTRODUCTION + +.readership: Any MPS developer. + +.intro: The MCMP provides a means by which clients can receive messages from +the MPS asynchronously. Typical messages may be low memory notification (or in +general low utility), finalization notification, soft-failure notification. +There is a general assumption that it should not be disastrous for the MPS +client to ignore messages, but that it is probably in the clients best interest +to not ignore messages. The justification for this is that the MPS cannot +force the MPS client to read and act on messages, so no message should be +critical [bogus, since we cannot force clients to check error codes either - +Pekka 1997-09-17]. + +.contents: This document describes the design of the external and internal +interfaces and concludes with a sketch of an example design of an internal +client. The example is that of implementing finalization using PoolMRG. + + +REQUIREMENTS + +.req: The MPS/Client message protocol will be used for implementing +finalization (see design.mps.finalize and req.dylan.fun.final). It will also +be used for implementing the notification of various conditions (possibly +req.dylan.prot.consult is relevant here). + + +INTERFACE + + +External Interface + +.if.queue: + +Messages are presented as a single queue per arena. Various functions are +provided to inspect the queue and inspect messages in it (see below). + + +Functions + +.if.fun: + +The following functions are provided: + +.if.fun.poll: Poll. Sees whether there are any messages pending. + +mps_bool_t mps_message_poll(mps_arena_t arena); + +Returns 1 only if there is a message on the queue of arena. Returns 0 +otherwise. + +.if.fun.enable: Enable. Enables the flow of messages of a certain type. + +void mps_message_type_enable(mps_arena_t arena, mps_message_type_t type); + +Enables the specified message type. The queue of messages of a arena will +contain only messages whose types have been enabled. Initially all message +types are disabled. Effectively this function allows the client to declare to +the MPS what message types the client understands. The MPS does not generate +any messages of a type that hasn't been enabled. This allows the MPS to add +new message types (in subsequent releases of a memory manager) without +confusing the client. The client will only be receiving the messages if they +have explicitly enabled them (and the client presumably only enables message +types when they have written the code to handle them). + +.if.fun.disable: Disable. Disables the flow of messages of a certain type. + +void mps_message_type_disable(mps_arena_t arena, mps_message_type_t type); + +The antidote to mps_message_type_enable. 
Disables the specified message type. +Flushes any existing messages of that type on the queue, and stops any further +generation of messages of that type. This permits clients to dynamically +decline interest in a message type, which may help to avoid a memory leak or +bloated queue when the messages are only required temporarily. + +.if.fun.get: begins a message "transaction". + +mps_bool_t mps_message_get(mps_message_t *message_return, mps_arena_t arena, +mps_message_type_t type); + +If there is a message of the specified type on the queue then the first such +message will be removed from the queue and a handle to it will be returned to +the client in *messageReturn; in this case the function will return TRUE. +Otherwise it will return FALSE. Having obtained a handle on a message in this +way, the client can use the type-specific accessors to find out about the +message. When the client is done with the message the client should call +mps_message_discard; failure to do so will result in a resource leak. + +.if.fun.discard: ends a message "transaction". + +void mps_message_discard(mps_arena_t arena, mps_message_t message); + +Indicates to the MPS that the client is done with this message and its +resources may be reclaimed. + +.if.fun.type.any: Determines the type of a message in the queue + +mps_bool_t mps_message_queue_type(mps_message_type_t *type_return, mps_arena_t +arena); + +Returns 1 only if there is a message on the queue of arena, and in this case +updates *type_return to be the type of a message in the queue. Otherwise +returns 0. + +.if.fun.type: Determines the type of a message (that has already been got). + +mps_message_type_t mps_message_type(mps_arena_t arena, mps_message_t message) + +Return the type of the message. Only legal when inside a message transaction +(i.e. after mps_message_get and before mps_message_discard). Note that the +type will be the same as the type that the client passed in the call to +mps_message_get. + + +Types of messages + +.type: The type governs the "shape" and meaning of the message. + +.type.int: Types themselves will just be a scalar quantity, an integer. + +.type.semantics: A type indicates the semantics of the message. +.type.semantics.interpret: The semantics of a message are interpreted by the +client by calling various accessor methods on the message. .type.accessor: The +type of a message governs which accessor methods are legal to apply to the +message. + +.type.example: Some example types: + +.type.finalization: There will be a finalization type. The type is abstractly: +FinalizationMessage(Ref). + +.type.finalization.semantics: A finalization message indicates that an object +has been discovered to be finalizable (see design.mps.poolmrg.def.final.object +for a definition of finalizable). .type.finalization.ref: There is an accessor +to get the reference of the finalization message (i.e. a reference to the +object which is finalizable) called mps_message_finalization_ref. +.type.finalization.ref.scan: Note that the reference returned should be stored +in scanned memory. + + +.compatibility: + +Compatibility issues + +.compatibility.future.type-new: Notice that message of a type that the client +doesn't understand are not placed on the queue, therefore the MPS can introduce +new types of message and existing client will still function and will not leak +resources. This has been achieved by getting the client to declare the types +that the client understands (with mps_message_type_enable, .if.fun.enable). 
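As an illustration only (it is not part of this design), a client that enables and handles just the finalization type might poll the queue as sketched below; such a client keeps working unchanged when new message types are added, since types it has not enabled never reach its queue. The name mps_message_type_finalization(), the exact argument order of mps_message_finalization_ref(), and the finalize_object() handler are assumptions here.

  #include "mps.h"

  extern void finalize_object(mps_addr_t obj);  /* client's own code */

  static void handle_finalization_messages(mps_arena_t arena)
  {
    mps_message_t message;

    /* Declare interest in finalization messages (.if.fun.enable); the MPS
     * generates no messages of types that have not been enabled. */
    mps_message_type_enable(arena, mps_message_type_finalization());

    /* Begin a transaction for each pending finalization message
     * (.if.fun.get), act on it, then end the transaction (.if.fun.discard). */
    while (mps_message_get(&message, arena, mps_message_type_finalization())) {
      mps_addr_t ref;
      /* Type-specific accessor (.type.finalization.ref); keep the returned
       * reference in scanned memory. */
      mps_message_finalization_ref(&ref, arena, message);
      finalize_object(ref);
      mps_message_discard(arena, message);
    }
  }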
+ +.compatibility.future.type-extend: The information available in a message of a +given type can be extended by providing more accessor methods. Old clients +won't get any of this information but that's okay. + + +Internal Interface + + +.message.instance: Messages are instances of Message Classes. +.message.concrete: Concretely a Message is represented by a MessageStruct. A +MessageStruct has the usual signature field (see design.mps.sig). A +MessageStruct has a type field, which defines its type; a ring node, which is +used to attach the message to the queue of pending messages; and a class field, +which identifies a MessageClass object. .message.intent: The intention is that +a MessageStruct will be embedded in some richer object which contains +information relevant to that specific type of message. + +.message.type: + +typedef struct MessageStruct *Message; + +.message.struct: + +typedef struct MessageStruct { + Sig sig; + MessageType type; + MessageClass class; + RingStruct node; +} MessageStruct; + + +.class: A message class is an encapsulation of methods. It encapsulates +methods that are applicable to all types of messages (generic) and methods that +are applicable to messages only of a certain type (type-specific). +.class.concrete: Concretely a message class is represented by a +MessageClassStruct (a struct). Clients of the Message module are expected to +allocate storage for and initialise the MessageClassStruct. It is expected +that such storage will be allocated and initialised statically. + +.class.not-type: Note that message classes and message types are distinct. +.class.not-type.why: (see also mail.drj.1997-07-15.10-33(0) from which this is +derived) This allows two different implementations (ie classes) of messages +with the same meaning (ie type). This may be necessary because the (memory) +management of the messages may be different in the two implementations (which is +bogus). The case of having one class implement two types is not expected to be +so useful. .class.not-type.why.not: It's all pretty feeble justification +anyway. + +.class.methods.generic: The generic methods are: + +delete - used when the message is destroyed (by the client calling +mps_message_discard). The class implementation should finish the message (by +calling MessageFinish) and storage for the message should be reclaimed (if +applicable). + +.class.methods.specific: + +The type specific methods are: + +.class.methods.specific.finalization: + +Specific to MessageTypeFinalization + +finalizationRef - returns a reference to the finalizable object represented by +this message. + +.class.methods.specific.collectionstats: + +Specific to MessageTypeCollectionStats + +collectionStatsLiveSize - returns the number of bytes (of objects) that were +condemned but survived. + +collectionStatsCondemnedSize - returns the number of bytes condemned in the +collection. + +collectionStatsNotCondemnedSize - returns the number of bytes (of objects) +that are subject to a GC policy (ie collectable) but were not condemned in the +collection. + + +.class.type: + +typedef struct MessageClassStruct *MessageClass; + +.class.sig.double: The MessageClassStruct has a signature field at both ends. +This is so that if the MessageClassStruct changes size (by adding extra methods +for example) then any static initializers will generate errors from the +compiler (there will be a type error caused by initialising a non-sig type +field with a sig) unless the static initializers are changed as well.
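As an illustration of .class.sig.double only, a static initializer for a message class implementing the finalization type might look as follows (the field order matches .class.struct below). The signature value, the method names, and the No* stubs for the inapplicable collection-statistics methods are hypothetical and assumed to be declared elsewhere.

  #define MessageClassSig ((Sig)0x519359c1)  /* hypothetical signature value */

  static MessageClassStruct mrgMessageClassStruct = {
    MessageClassSig,              /* sig */
    "MRGFinalization",            /* name */
    mrgMessageDelete,             /* delete (generic) */
    mrgMessageFinalizationRef,    /* finalizationRef */
    MessageNoCollectionStatsLiveSize,         /* not a collection message */
    MessageNoCollectionStatsCondemnedSize,
    MessageNoCollectionStatsNotCondemnedSize,
    MessageClassSig               /* endSig: if a method is added to the
                                   * struct, this Sig initialises a method
                                   * field and the compiler complains */
  };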
+ +.class.struct: + +typedef struct MessageClassStruct { + Sig sig; /* design.mps.sig */ + const char *name; /* Human readable Class name */ + + /* generic methods */ + MessageDeleteMethod delete; /* terminates a message */ + + /* methods specific to MessageTypeFinalization */ + MessageFinalizationRefMethod finalizationRef; + + /* methods specific to MessageTypeCollectionStats */ + MessageCollectionStatsLiveSizeMethod collectionStatsLiveSize; + MessageCollectionStatsCondemnedSizeMethod collectionStatsCondemnedSize; + MessageCollectionStatsNotCondemnedSizeMethod collectionStatsNotCondemnedSize; + + Sig endSig; /* design.mps.message.class.sig.double */ +} MessageClassStruct; + + +.space.queue: The arena structure is augmented with a structure for managing +for queue of pending messages. This is a ring in the ArenaStruct. + +struct ArenaStruct +{ + ... + RingStruct messageRing; + ... +} + + +Functions + +.fun.init: +/* Initializes a message */ +void MessageInit(Arena arena, Message message, MessageClass class); + +Initializes the MessageStruct pointed to by message. The caller of this +function is expected to manage the store for the MessageStruct. + +.fun.finish: +/* Finishes a message */ +void MessageFinish(Message message); + +Finishes the MessageStruct pointed to by message. The caller of this function +is expected to manage the store for the MessageStruct. + +.fun.post: +/* Places a message on the client accessible queue */ +void MessagePost(Arena arena, Message message); + +This function places a message on the queue of a arena. .fun.post.precondition +: Prior to calling the function the node field of the message must be a +singleton. After the call to the function the message will be available for +MPS client to access. After the call to the function the message fields must +not be manipulated except from the message's class's method functions (i.e., +you mustn't poke about with the node field in particular). + +.fun.empty: +void MessageEmpty(Arena arena); + +Empties the message queue. This function has the same effect as discarding all +the messages on the queue. After calling this function there will be no +messages on the queue. .fun.empty.internal-only: This functionality is not +exposed to clients. We might want to expose this functionality to our clients +in the future. + + + + +Message Life Cycle + +.life: A message will be allocated by a client of the message module, it will +be initialised by calling MessageInit. The client will eventually post the +message on the external queue (in fact most clients will create a message and +then immediately post it). The message module may then apply any of the +methods to the message. The message module will eventually destroy the message +by applying the Delete method to it. + + +EXAMPLES + + +Finalization + +[possibly out of date, see design.mps.finalize and design.mps.poolmrg instead +-- drj 1997-08-28] + +This subsection is a sketch of how PoolMRG will use Messages for finalization +(see design.mps.poolmrg). + +PoolMRG has guardians (see design.mps.poolmrg.guardian), guardians are used to +manage final references and detect when an object is finalizable. + +The link part of a guardian will be expanded to include a MessageStruct; in +fact the link part of a guardian will be expanded so that it is exactly a +MessageStruct (or rather a structure with a single member that has the type +MessageStruct). 
+ +The MessageStruct is allocated when the final reference is created (which is +when the referred to object is registered for finalization). This avoids +allocating at the time when the message gets posted (which might be a tricky, +undesirable, or impossible, time to allocate). + +The two queues of PoolMRG (the entry queue, and the exit queue) will use the +MessageStruct ring node. Before the object (referred to by the guardian) is +finalizable the MessageStruct is not needed by the Message system (there is no +message to send yet!), so it is okay to use the Message's ring node to attach +the guardian to the entry queue (see +design.mps.poolmrg.guardian.two-part.justify). The exit queue of MRG will +simply be the external message queue. + +MRG Message class + +del - frees both the link part and the reference part of the guardian. + diff --git a/mps/design/pool/index.txt b/mps/design/pool/index.txt new file mode 100644 index 00000000000..4df9d9f17ec --- /dev/null +++ b/mps/design/pool/index.txt @@ -0,0 +1,44 @@ + THE DESIGN OF THE POOL AND POOL CLASS MECHANISMS + design.mps.pool + incomplete doc + richard 1996-07-31 + +- This document must derive the requirements for pool.c etc. from the +architecture. + + + +.def.outer-structure: The "outer structure" (of a pool) is a C object of type +PoolXXXStruct or the type struct PoolXXXStruct itself. +.def.generic-structure: The "generic structure" is a C object of type +PoolStruct (found embedded in the outer-structure) or the type struct +PoolStruct itself. + +.align: When initialised, the pool gets the default alignment (ARCH_ALIGN). + +.no: If a pool class doesn't implement a method, and doesn't expect it to be +called, it should use a non-method (PoolNo*) which will cause an assertion +failure if they are reached. + +.triv: If a pool class supports a protocol but does not require any more than a +trivial implementation, it should use a trivial method (PoolTriv*) which will +do the trivial thing. + +.outer-structure.sig: It is good practice to put the signature for the outer +structure at the end (of the structure). This is because there's already one +at the beginning (in the poolStruct) so putting it at the end gives some extra +fencepost checking. + +REQUIREMENTS + +[Placeholder] + +.req.fix: PoolFix must be fast. + + +OTHER + +Interface in mpm.h +Types in mpmst.h +See also design.mps.poolclass + diff --git a/mps/design/poolamc/index.txt b/mps/design/poolamc/index.txt new file mode 100644 index 00000000000..c74d69a2e0e --- /dev/null +++ b/mps/design/poolamc/index.txt @@ -0,0 +1,446 @@ + THE DESIGN OF THE AUTOMATIC MOSTLY-COPYING MEMORY POOL CLASS + design.mps.poolamc + incomplete design + richard 1995-08-25 + +INTRODUCTION + +.intro: This is the design of the AMC Pool Class. AMC stands for Automatic +Mostly-Copying. This design is highly fragmentory and some may even be +sufficiently old to be misleading. + +.readership: The intended readership is any MPS developer. + + +OVERVIEW + +.overview: This class is intended to be the main pool class used by Harlequin +Dylan. It provides garbage collection of objects (hence "automatic"). It uses +generational copying algorithms, but with some facility for handling small +numbers of ambiguous references. Ambiguous references prevent the pool from +copying objects (hence "mostly copying"). It provides incremental collection. + +[ lot of this design is awesomely old -- drj 1998-02-04] + + +DEFINITIONS + +.def.grain: Grain. 
A quantity of memory which is both aligned to the pool's +alignment and equal to the pool's alignment in size. IE the smallest amount of +memory worth talking about. + + +DESIGN + +Segments + +.seg.class: AMC allocates segments of class AMCSegClass, which is a subclass of +GCSegClass. Instances contain a segTypeP field, which is of type int*. +.seg.gen: AMC organizes the segments it manages into generations. .seg.gen.map: Every +segment is in exactly one generation. .seg.gen.ind: The segment's segTypeP +field indicates which generation the segment is in (an AMCGenStruct, see +blah below). .seg.typep: The segTypeP field actually points to either the type +field of a generation or to the type field of a nail board. +.seg.typep.distinguish: The type field (which can be accessed in either case) +determines whether the segTypeP field is pointing to a generation or to a nail +board. .seg.gen.get: The map from segment to generation is implemented by +AMCSegGen, which deals with all this. + + +Fixing and Nailing + +[.fix.nail.* are placeholders for design rather than design, really -- drj +1998-02-04] +.fix.nail: + +.nailboard: AMC uses a nail board structure for recording ambiguous references +to segments. A nail board is a bit table with one bit per grain in the +segment. .nailboard.create: Nail boards are allocated dynamically whenever a +segment becomes newly ambiguously referenced. .nailboard.destroy: They are +deallocated during reclaim. Ambiguous fixes simply set the appropriate bit in +this table. This table is used by subsequent scans and reclaims in order to +work out what objects were marked. + +.nailboard.emergency: During emergency tracing two things relating to nail +boards happen that do not happen normally: .nailboard.emergency.nonew: Nail boards +aren't allocated when we have new ambiguous references to segments +(.nailboard.emergency.nonew.justify: We could try and allocate a nail board, +but we're in emergency mode and so short of memory that it's unlikely to succeed, and +there would be additional code for yet another error path which complicates +things); .nailboard.emergency.exact: nail boards are used to record exact +references in order to avoid copying the objects. +.nailboard.hyper-conservative: Not creating new nail boards (.nailboard.emergency.nonew above) +means that when we have a new reference to a segment during emergency tracing +then we nail the entire segment and preserve everything in place. + +.fix.nail.states: + +Partition the segment states into 4 sets: + white segment and not nailed (and has no nail board) + white segment and nailed and has no nail board + white segment and nailed and has nail board + the rest + +.fix.nail.why: A segment is recorded as being nailed when either there is an +ambiguous reference to it, or there is an exact reference to it and the object +couldn't be copied off the segment (because there wasn't enough memory to +allocate the copy). In either of these cases reclaim cannot simply destroy the +segment (usually the segment will not be destroyed because it will have live +objects on it, though see .nailboard.limitations.middle below). If the segment +is nailed then we might be using a nail board to mark objects on the segment. +However, we cannot guarantee that being nailed implies a nail board, because we +might not be able to allocate the nail board. Hence all these states actually +occur in practice. + +.fix.nail.distinguish: The nailed bits in the segment descriptor (SegStruct) +are used to record whether a segment is nailed or not.
The segTypeP field of +the segment either points to (the "type" field of) an AMCGen or to an +AMCNailBoard; the type field can be used to determine which of these is the +case (see .seg.typep above). + +.nailboard.limitations.single: Just having a single nail board per segment +prevents traces from improving on the findings of each other: a later trace +could find that a nailed object is no longer nailed or even dead. Until the +nail board is discarded, that is. .nailboard.limitations.middle: An ambiguous +reference into the middle of an object will cause the segment to survive, even +if there are no surviving objects on it. .nailboard.limitations.reclaim: +AMCReclaimNailed could cover each block of reclaimed objects between two nailed +objects with a single padding object, speeding up further scans. + + +Emergency Tracing + +.emergency.fix: AMCFixEmergency is at the core of AMC's emergency tracing +policy (unsurprisingly). AMCFixEmergency chooses exactly one of three options: +a) use the existing nail board structure to record the fix, b) preserve and +nail the segment in its entirety, c) snap out an exact (or higher rank) pointer to +a broken heart to the broken heart's forwarding pointer. If the rank of the +reference is AMBIG then it either does a) or b) depending on whether there is an +existing nail board or not. Otherwise (the rank is exact or higher) if there +is a broken heart it is used to snap out the pointer. Otherwise it is as for an +AMBIG ref (we either do a) or b)). + +.emergency.scan: This is basically as before; the only complication is that +when scanning a nailed segment we may need to do multiple passes, as +FixEmergency may introduce new marks into the nail board. + + +Buffers + +.buffer.class: AMC uses buffers of class AMCBufClass (a subclass of SegBufClass). +.buffer.gen: Each buffer allocates into exactly one generation; +AMCBuf buffers contain a gen field which points to the generation that the +buffer allocates into. .buffer.fill.gen: AMCBufferFill uses the generation +(obtained from the gen field) to initialise the segment's segTypeP field, which +is how segments get allocated in that generation. + +.buffer.condemn: We condemn buffered segments, but not the contents of the +buffers themselves, because we can't reclaim uncommitted buffers (see +design.mps.buffer for details). If the segment has a forwarding buffer on it, +we detach it [why? @@@@ forwarding buffers are detached because they used to +cause objects on the same segment to not get condemned, hence caused retention +of garbage. Now that we condemn the non-buffered portion of buffered segments +this is probably unnecessary -- drj 1998-06-01 But it's probably more +efficient than keeping the buffer on the segment, because then the other stuff +gets nailed -- pekka 1998-07-10]. If the segment has a mutator buffer on it, +we nail the buffer. If the buffer cannot be nailed, we give up condemning, +since nailing the whole segment would make it survive anyway. The scan methods +skip over buffers and fix methods don't do anything to things that have already +been nailed, so the buffer is effectively black. + + +AMCStruct + +.struct: AMCStruct is the pool class AMC instance structure. .struct.pool: +Like other pool class instances, it contains a PoolStruct containing the +generic pool fields. + +.struct.format: The "format" field points to a Format structure describing the +object format of objects allocated in the pool.
The field is initialized by +AMCInit from a parameter, and thereafter it is not changed until the pool is +destroyed. [actually the format field is in the generic PoolStruct these +days. drj 1998-09-21] + +[lots more fields here] + + + +Generations + +.gen: Generations partition the segments that a pool manages (see .seg.gen.map +above). .gen.collect: Generations are more or less the units of condemnation +in AMC, and also the granularity for forwarding (when copying objects during a +collection): all the objects which are copied out of a generation use the same +forwarding buffer for allocating the new copies, and a forwarding buffer +results in allocation in exactly one generation. + +.gen.rep: Generations are represented using an AMCGenStruct structure. + +.gen.create: All the generations are created when the pool is created (during +AMCInitComm). + +.gen.manage.ring: An AMC's generations are kept on a ring attached to the +AMCStruct (the genRing field). .gen.manage.array: They are also kept in an +array which is allocated when the pool is created and attached to the AMCStruct +(the gens field holds the number of generations, the gen field points to an +array of AMCGen). [it seems to me that we could probably get rid of the ring +-- drj 1998-09-22] + +.gen.number: There are AMCTopGen + 2 generations in total: "normal" +generations numbered from 0 to AMCTopGen inclusive, and an extra "ramp" +generation (see .gen.ramp below). + +.gen.forward: Each generation has an associated forwarding buffer (stored in +the "forward" field of AMCGen). This is the buffer that is used to forward +objects out of this generation. When a generation is created in AMCGenCreate, +its forwarding buffer has a NULL p field, indicating that the forwarding buffer +has no generation to allocate in. The collector will assert out (in +AMCBufferFill, where it checks that buffer->p is an AMCGen) if you try to +forward an object out of such a generation. .gen.forward.setup: All the +generations' forwarding buffers are associated with generations when the pool +is created (just after the generations are created in AMCInitComm). + + +Ramps + +.ramp: Ramps implement the begin/end mps_alloc_pattern_ramp interface. + +.gen.ramp: To implement ramping (request.dylan.170423), AMC uses a special +"ramping mode", where promotions are redirected. .gen.ramp.before: While +ramping, objects promoted from a designated (AMCRampGenFollows) generation are +forwarded into a special "ramp generation", instead of their usual generation. +.gen.ramp.itself: The ramp generation is promoted into itself during ramping +mode; after this mode ends, it is promoted into the generation after +AMCRampGenFollows (even if ramping mode is immediately re-entered, but only +once in that case). + +.ramp.mode: Ramping is controlled using the rampMode field of the pool. There +are five modes: + enum { outsideRamp, beginRamp, ramping, finishRamp, collectingRamp }; +[These would perhaps be better if they all started with Ramp* or AMCRamp* drj +1998-08-07] +.ramp.count: The pool just counts the number of aps (allocation points) that +have begun ramp mode (and not ended). .ramp.begin: Basically, when the count +goes up from zero, the +pool enters into beginRamp mode; however, that doesn't happen if it is already +in finishRamp mode, thereby ensuring at least one (decision to start a) +collection when leaving ramp mode even if a new ramp starts immediately (but +see .ramp.collect below).
When a new GC begins in beginRamp mode, and a +segment in generation AMCRampGenFollows is condemned, AMCWhiten switches the +generations to forward in the ramping way (.gen.ramp); the pool enters ramping +mode. (This assumes that each generation is condemned together with all lower +generations.) .ramp.end: After the ramp count goes back to zero, the pool +enters finishRamp mode, or outsideRamp directly, if there's no ramp generation +(this means we never collected generation AMCRampGenFollows, and hence never +switched the promotion). When a new GC begins in finishRamp mode (this GC +will always collect the ramp generation, because we jig the benefits to ensure +that), and a segment in generation AMCRampGenFollows is condemned, AMCWhiten +switches the generations to forward in the usual way (.gen.ramp); the pool +enters collectingRamp mode. .ramp.collect: The purpose of collectingRamp mode +is to ensure the pool will switch back into ramping if the ramp count goes +immediately back up, but not before having collected the ramp generation once. +So this mode tells AMCReclaim to check the ramp count, and change the mode to +beginRamp or outsideRamp. + +.ramp.collect-all: There are two flavours of ramp collections: the normal one +that collects the ramp generation and the younger ones, and the collect-all +flavour that does a full GC (this is a hack for producing certain Dylan +statistics). The collection will be of collect-all flavour, if any of the +RampBegins during the corresponding rank asked for that. Ramp beginnings and +collections are asynchronous, so we need two fields to implement this +behaviour: collectAll to indicate whether the ramp collection that is about to +start should be collect-all, and collectAllNext to keep track of whether the +current ramp has any requests for it. + + +Headers + +.header: AMC supports a fixed-size header on objects, with the client pointers +pointing after the header, rather than the base of the memory block. See +format documentation for details of the interface. .header.client: The code +mostly deals in client pointers, only computing the base and limit of a block +when these are needed (such as when an object is copied). In several places, +the code gets a block of some sort, a segment or a buffer, and creates a client +pointer by adding the header length (pool->format->headerLength). .header.fix: +There are two versions of the fix method, due to its criticality, with +(AMCHeaderFix) and without (AMCFix) headers. The correct one is selected in +AMCInitComm, and placed in the pool's fix field. This is the main reason why +fix methods dispatch through the instance, rather than the class like all other +methods. + + +OLD AND AGING NOTES BELOW HERE: + + +AMCFinish + +.finish: + +.finish.forward: + 103 /* If the pool is being destroyed it is OK to destroy */ + 104 /* the forwarding buffers, as the condemned set is about */ + 105 /* to disappear. */ + + +AMCBufferEmpty + +.flush: Removes the connexion between a buffer and a group, so that the group +is no longer buffered, and the buffer is reset and will cause a refill when +next used. + +.flush.pad: The group is padded out with a dummy object so that it appears full. + +.flush.expose: The buffer needs exposing before writing the padding object onto +it. If the buffer is being used for forwarding it might already be exposed, in +this case the segment attached to it must be covered when it leaves the +buffer. See .fill.expose. 
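As an illustration only of the sequence described by .flush.pad and .flush.expose above and .flush.cover below, ignoring the already-exposed forwarding-buffer case: the function name, the shape of the shield calls, and the use of the format's pad method here are approximations, not the actual AMCBufferEmpty code.

  static void amcBufferEmptyPadSketch(Pool pool, Seg seg, Addr init, Addr limit)
  {
    Arena arena = PoolArena(pool);
    Size size = AddrOffset(init, limit);

    if (size > 0) {
      ShieldExpose(arena, seg);          /* raw access to write the padding */
      (*pool->format->pad)(init, size);  /* dummy object fills [init, limit) */
      ShieldCover(arena, seg);           /* restore protection */
    }
  }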
+ +.flush.cover: The buffer needs covering whether it was being used for +forwarding or not. See .flush.expose. + + +AMCBufferFill + +.fill: + 185 * Reserve was called on an allocation buffer which was reset, + 186 * or there wasn't enough room left in the buffer. Allocate a group + 187 * for the new object and attach it to the buffer. + 188 * +.fill.expose: + 189 * .fill.expose: If the buffer is being used for forwarding it may + 190 * be exposed, in which case the group attached to it should be + 191 * exposed. See .flush.cover. + + +AMCBufferTrip + +.trip: + 239 * A flip occurred between a reserve and commit on a buffer, and + 240 * the buffer was "tripped" (limit set to zero). The object wasn't + 241 * scanned, and must therefore be assumed to be invalid, so the + 242 * reservation must be rolled back. This function detaches the + 243 * buffer from the group completely. The next allocation in the + 244 * buffer will cause a refill, and reach AMCFill. + + +AMCBufferFinish + +.buffer-finish: + 264 * Called from BufferDestroy, this function detaches the buffer + 265 * from the group it's attached to, if any. + + +AMCFix + +.fix: + 281 * fix an ambiguous reference to the pool + 282 * + 283 * Ambiguous references lock down an entire segment by removing it + 284 * from old-space and also marking it grey for future scanning. + 285 * + 286 * fix an exact, final, or weak reference to the pool + 287 * + 288 * These cases are merged because the action for an already + 289 * forwarded object is the same in each case. After that + 290 * situation is checked for, the code diverges. + 291 * + 292 * Weak references are either snapped out or replaced with + 293 * ss->weakSplat as appropriate. + 294 * + 295 * Exact and final references cause the referenced object to be copied t +o + 296 * new-space and the old copy to be forwarded (broken-heart installed) + 297 * so that future references are fixed up to point at the new copy. + 298 * + 299 * .fix.exact.expose: In order to allocate the new copy the + 300 * forwarding buffer must be exposed. This might be done more + 301 * efficiently outside the entire scan, since it's likely to happen + 302 * a lot. + 303 * + 304 * .fix.exact.grey: The new copy must be at least as grey as the old +one, + 305 * as it may have been grey for some other collection. + + +AMCGrey + +.grey: + 453 * Turns everything in the pool which is not condemned for a trace + 454 * grey. + + +AMCSegScan + +.seg-scan: + 485 * .seg-scan.blacken: One a group is scanned it is turned black, i.e. + 486 * the ti is removed from the grey TraceSet. However, if the + 487 * forwarding buffer is still pointing at the group it could + 488 * make it grey again when something is fixed, and cause the + 489 * group to be scanned again. We can't tolerate this at present, + 490 * the the buffer is flushed. The solution might be to scan buffers + 491 * explicitly. + +.seg-scan.loop: + 505 /* While the group remains buffered, scan to the limit of */ + 506 /* initialized objects in the buffer. Either it will be reached, */ + 507 /* or more objects will appear until the segment fills up and the */ + 508 /* buffer moves away. */ + +.seg-scan.finish: + 520 /* If the group is unbuffered, or becomes so during scanning */ + 521 /* (e.g. if the forwarding buffer gets flushed) then scan to */ + 522 /* the limit of the segment. */ + +.seg-scan.lower: + 540 /* The segment is no longer grey for this collection, so */ + 541 /* it no longer needs to be shielded. 
*/ + + +AMCScan + +.scan: + 556 * Searches for a group which is grey for the trace and scans it. + 557 * If there aren't any, it sets the finished flag to true. + + +AMCReclaim + +.reclaim: + 603 * After a trace, destroy any groups which are still condemned for the + 604 * trace, because they must be dead. + 605 * + 606 * .reclaim.grey: Note that this might delete things which are grey + 607 * for other collections. This is OK, because we have conclusively + 608 * proved that they are dead -- the other collection must have + 609 * assumed they were alive. There might be a problem with the + 610 * accounting of grey groups, however. + 611 * + 612 * .reclaim.buf: If a condemned group still has a buffer attached, we + 613 * can't destroy it, even though we know that there are no live objects + 614 * there. Even the object the mutator is allocating is dead, because + 615 * the buffer is tripped. + + +AMCAccess + +.access: + 648 * This is effectively the read-barrier fault handler. + 649 * + 650 * .access.buffer: If the page accessed had and still has the + 651 * forwarding buffer attached, then trip it. The group will now + 652 * be black, and the mutator needs to access it. The forwarding + 653 * buffer will be moved onto a fresh grey page. + 654 * + 655 * .access.error: @@@@ There really ought to be some error recovery. + 656 * + 657 * .access.multi: @@@@ It shouldn't be necessary to scan more than + 658 * once. Instead, should use a multiple-fix thingy. This would + 659 * require the ScanState to carry a _set_ of traces rather than + 660 * just one. + + +OLD NOTES + + +Group Scanning + + diff --git a/mps/design/poolams/index.txt b/mps/design/poolams/index.txt new file mode 100644 index 00000000000..2a22eb77913 --- /dev/null +++ b/mps/design/poolams/index.txt @@ -0,0 +1,383 @@ + THE DESIGN OF THE AUTOMATIC MARK-AND-SWEEP POOL CLASS + design.mps.poolams + draft design + nickb 1997-08-14 + + +INTRODUCTION: + +This is the design of the AMS pool class. + +.readership: MM developers. + +.source: design.mps.buffer, design.mps.trace, design.mps.scan, +design.mps.action and design.mps.class-interface [none of these were actually +used -- pekka 1998-04-21]. No requirements doc [we need a req.mps that +captures the commonalities between the products -- pekka 1998-01-27]. + + +Document History + +.hist.0: Nick Barnes wrote down some notes on the implementation 1997-08-14. +Pekka P. Pirinen wrote the first draft design 1998-01-27. + +.hist.1: Pekka edited on the basis of review.design.mps.poolams.0, and +redesigned the colour representation (results mostly in +analysis.non-moving-colour(0)). + +.hist.2: Described subclassing and allocation policy. pekka 1999-01-04 + + + +OVERVIEW: + +This document describes the design of the AMS pool class. The AMS pool is a +proof-of-concept design for a mark-sweep pool in the MPS. It's not meant to be +efficient, but it could serve as a model for an implementation of a more +advanced pool (such as EPVM). + + +REQUIREMENTS: + +.req.mark-sweep: The pool must use a mark-and-sweep GC algorithm. + +.req.colour: The colour representation should be as efficient as possible. + +.req.incremental: The pool must support incremental GC. + +.req.ambiguous: The pool must support ambiguous references to objects in it +(but ambiguous references into the middle of an object do not preserve the +object). + +.req.format: The pool must be formatted, for generality. + +.req.correct: The design and the implementation should be simple enough to be +seen to be correct. 
+
+.req.simple: Features not related to mark-and-sweep GC should initially be
+implemented as simply as possible, in order to save development effort.
+
+.not-req.grey: We haven't figured out how buffers ought to work with a grey
+mutator, so we use .req.correct to allow us to design a pool that doesn't work
+in that phase. This is acceptable as long as we haven't actually implemented
+grey mutator collection.
+
+
+ARCHITECTURE:
+
+Subclassing
+
+.subclass: Since we expect to have many mark-and-sweep pools, we build in some
+protocol for subclasses to modify various aspects of the behaviour. Notably
+there's a subclassable segment class, and a protocol for performing iteration.
+
+
+Allocation
+
+.align: We divide the segments into grains, each the size of the format
+alignment. .alloc-bit-table: We keep track of allocated grains using a bit
+table. This allows a simple implementation of allocation and freeing using the
+bit table operators, satisfying .req.simple, and can simplify the GC routines.
+Eventually, this should use some sophisticated allocation technique suitable
+for non-moving automatic pools.
+
+.buffer: We use buffered allocation, satisfying .req.incremental. The AMC
+buffer technique is reused, even though it is not suitable for non-moving
+pools; .req.simple allows us to do that for now.
+
+.extend: If there's no space in any existing segment, a new segment is
+allocated. The actual class is allowed to decide the size of the new segment.
+
+.no-alloc: Do not support PoolAlloc, because we can't support one-phase
+allocation for a scannable pool (unless we disallow incremental collection).
+For exact details, see design.mps.buffer.
+
+.no-free: Do not support PoolFree, because automatic pools don't need explicit
+free and having it encourages clients to use it (and therefore to have dangling
+pointers, double frees, &c.).
+
+
+Colours
+
+.colour: Objects in a segment which is _not_ condemned (for some trace) take
+their colour (for this trace) from the segment. .colour.object: Since we need
+to implement a non-copying GC, we keep track of the colour of each object in a
+condemned segment separately. For this, we use bit tables with a bit for each
+grain. This format is fast to access, has better locality than mark bits in
+the objects themselves, and allows cheap interoperation with the allocation bit
+table. As to the details, we follow analysis.non-moving-colour(0), with
+the analysis.non-moving-colour.free.black option [why?]. .colour.alloc-table:
+We choose to keep a separate allocation table, for generality.
+
+.ambiguous.middle: We will allow ambiguous references into the middle of an
+object (as required by .req.ambiguous), using the trick in
+analysis.non-moving-colour.interior.ambiguous-only to speed up scanning.
+.interior-pointer: Note that non-ambiguous interior pointers are outlawed.
+
+.colour.alloc: Objects are allocated black. This is the most efficient
+alternative for traces in the black mutator phase, and .not-req.grey means
+that's sufficient. [Some day, we need to think about allocating grey or white
+during the grey mutator phase.]
+
+
+Scanning
+
+.scan.segment: The tracer protocol requires (for segment barrier hits) that
+there is a method for scanning a segment and turning all grey objects on it
+black. This cannot be achieved with a single sequential sweep over the
+segment, since objects that the sweep has already passed may become grey as
+later objects are scanned. 
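+
+[The following is a self-contained sketch added for illustration; it is not
+MPS source. It models the repeated scan that this implies (and that
+.marked.scan below specifies): the array seg, the marksChanged flag and the
+functions greyObject() and scanObject() are inventions of this note, with
+greyObject() standing in for the greying done by AMSFix behind the scan
+cursor.]
+
+  /* Model of scanning a segment until no grey objects remain. */
+
+  #include <stdbool.h>
+  #include <stdio.h>
+
+  enum colour { WHITE, GREY, BLACK };
+
+  #define NOBJ 8
+
+  static enum colour seg[NOBJ];  /* colour of each "object" in the segment */
+  static bool marksChanged;      /* set whenever an object is made grey */
+
+  static void greyObject(int i)
+  {
+    if (seg[i] == WHITE) {
+      seg[i] = GREY;
+      marksChanged = true;       /* cf. .marked.fix */
+    }
+  }
+
+  /* Scan one object: blacken it and grey the object it "references",
+     which may lie behind the cursor of the current pass. */
+  static void scanObject(int i)
+  {
+    seg[i] = BLACK;
+    greyObject((i + 5) % NOBJ);  /* arbitrary reference pattern */
+  }
+
+  /* Blacken every grey object: sweep repeatedly until a whole pass
+     completes without anything new having been greyed. */
+  static void scanSegment(void)
+  {
+    do {
+      marksChanged = false;      /* reset before each pass */
+      for (int i = 0; i < NOBJ; ++i)
+        if (seg[i] == GREY)
+          scanObject(i);
+    } while (marksChanged);      /* another pass if anything was greyed */
+  }
+
+  int main(void)
+  {
+    seg[0] = GREY;               /* as if a barrier hit greyed object 0 */
+    scanSegment();
+    for (int i = 0; i < NOBJ; ++i)
+      printf("object %d: %s\n", i, seg[i] == BLACK ? "black" : "white");
+    return 0;
+  }
+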
.scan.graph: For a non-moving GC, it is more +efficient to trace along the reference graph than segment by segment [it would +also allow passing type information from fix to scan]. Currently, the tracer +doesn't offer this option when it's polling for work. + +.scan.stack: Tracing along the reference graph cannot be done by recursive +descent, because we can't guarantee that the stack won't overflow. We can, +however, maintain an explicit stack of things to trace, and fall back on +iterative methods (.scan.iter) when it overflows and can't be extended. + +.scan.iter: As discussed in .scan.segment, when scanning a segment, we need to +ensure that there are no grey objects in the segment when the scan method +returns. We can do this by iterating a sequential scan over the segment until +nothing is grey (see .marked.scan for details). .scan.iter.only: Some +iterative method is needed as a fallback for the more advanced methods, and as +this is the simplest way of implementing the current tracer protocol, we will +start by implementing it as the only scanning method. + +.scan.buffer: We do not scan between ScanLimit and Limit of a buffer (see +.iteration.buffer), as usual [design.mps.buffer should explain why this works, +but doesn't. Pekka 1998-02-11]. + +.fix.to-black: When fixing a reference to a white object, if the segment does +not refer to the white set, the object cannot refer to the white set, and can +therefore be marked as black immediately (rather than grey). + + + +ANALYSIS: + +[This section intentionally left blank.] + + +IDEAS: + +[This section intentionally left blank.] + + +IMPLEMENTATION: + +Colour + +.colour.determine: Following the plan in .colour, if SegWhite(seg) includes the +trace, the colour of an object is given by the bit tables. Otherwise if +SegGrey(seg) includes the trace, all the objects are grey. Otherwise all the +objects are black. + +.colour.bits: As we only have searches for runs of zero bits, we use two bit +tables, the non-grey and non-white tables, but this is hidden beneath a layer +of macros talking about grey and white in positive terms. + +.colour.single: We have only implemented a single set of mark and scan tables, +so we can only condemn a segment for one trace at a time. This is checked for +in condemnation. If we want to do overlapping white sets, each trace needs its +own set of tables. + +.colour.check: The grey&white state is illegal, and free objects must be not +grey and not white as explained in analysis.non-moving-colour.free.black. + + +Iteration + +.iteration: Scan, reclaim and other operations need to iterate over all objects +in a segment. We abstract this into a single iteration function, even though +we no longer use it for reclaiming and rarely for scanning. + +.iteration.buffer: Iteration skips directly from ScanLimit to Limit of a +buffer. This is because this area may contain partially-initialized and +uninitialized data, which cannot be processed. [ScanLimit is used for reasons +which are not documented in design.mps.buffer.] Since the iteration skips the +buffer, callers need to take the appropriate action, if any, on it. + + +Scanning Algorithm + +.marked: Each segment has a 'marksChanged' flag, indicating whether anything in +it has been made grey since the last scan iteration (.scan.iter) started. This +flag only concerns the colour of objects with respect to the trace for which +the segment is condemned, as this is the only trace for which objects in the +segment are being made grey by fixing. 
Note that this flag doesn't imply that +there are grey objects in the segment, because the grey objects might have been +subsequently scanned and blackened. + +.marked.fix: The marksChanged flag is set TRUE by AMSFix when an object is made +grey. + +.marked.scan: AMSScan must blacken all grey objects on the segment, so it must +iterate over the segment until all grey objects have been seen. Scanning an +object in the segment might grey another one (.marked.fix), so the scanner +iterates until this flag is FALSE, setting it to FALSE before each scan. It is +safe to scan the segment even if it contains nothing grey. + +.marked.scan.fail: If the format scanner returns failure (see +protocol.mps.scanning [is that the best reference?]), we abort the scan in the +middle of a segment. So in this case the marksChanged flag is set back to +TRUE, because we may not have blackened all grey objects. + +.marked.unused: The marksChanged flag is meaningless unless the segment is +condemned. We make it FALSE in these circumstances. + +.marked.condemn: Condemnation makes all objects in a segment either black or +white, leaving nothing grey, so it doesn't need to set the marksChanged flag +which must already be FALSE. + +.marked.reclaim: When a segment is reclaimed, it can contain nothing marked as +grey, so the marksChanged flag must already be FALSE. + +.marked.blacken: When the tracer decides not to scan, but to call PoolBlacken, +we know that any greyness can be removed. AMSBlacken does this and resets the +marksChanged flag, if it finds that the segment has been condemned. + +.marked.clever: AMS could be clever about not setting the marksChanged flag, if +the fixed object is ahead of the current scan pointer. It could also keep low- +and high-water marks of grey objects, but we don't need to implement these +improvements at first. + + +Allocation + +.buffer-init: We take one init arg to set the Rank on the buffer, just to see +how it's done. + +.no-bit: As an optimization, we won't use the alloc bit table until the first +reclaim on the segment. Before that, we just keep a high-water mark. + +.fill: AMSBufferFill takes the simplest approach: it iterates over the segments +in the pool, looking for one which can be used to refill the buffer. +.fill.colour: The objects allocated from the new buffer must be black for all +traces (.colour.alloc), so putting it on a black segment (meaning one where +neither SegWhite(seg) nor SegGrey(seg) include the trace, see +.colour.determine) is obviously OK. White segments (where SegWhite(seg) +includes the trace) are also fine, as we can use the colour tables to make it +black (we don't actually have to adjust the tables, since free grains have the +same colour table encoding as black, see .colour.object). At first glance, it +seems we can't put it on a segment that is grey for some trace (one where where +SegWhite(seg) doesn't include the trace, but SegGrey(seg) does), because the +new objects would become grey as the buffer's ScanLimit advanced. We could +switch the segment over to using colour tables, but this becomes very hairy +when multiple traces are happening, so in that case, we'd be better off either +not attaching to grey segments or allowing grey allocation, wasteful as it is +[@@@@ decide which]. + +.fill.slow: AMSBufferFill gets progressively slower as more segments fill up, +as it laboriously checks whether the buffer can be refilled from each segment, +by inspecting the allocation bitmap. 
This is helped a bit by keeping count of +free grains in each segment, but it still spends a lot of time iterating over +all the full segments checking the free size. Obviously, this can be much +improved (we could keep track of the largest free block in the segment and in +the pool, or we could keep the segments in some more efficient structure, or we +could have a real free list structure). + +.fill.extend: If there's no space in any existing segment, the segSize method +is called to decide the size of the new segment to allocate. If that fails, +the code tries to allocate a segment that's just large enough to satisfy the +request. + +.empty: AMSBufferEmpty makes the unused space free, since there's no reason not +to. We don't have to adjust the colour tables, since free grains have the same +colour table encoding as black, see .colour.object. + +.reclaim.empty.buffer: Segments which after reclaim only contain a buffer could +be destroyed by trapping the buffer, but there's no point to this. + + +Initialization + +.init: The initialization method AMSInit() takes one additional argument: the +format of objects allocated in the pool. The pool alignment is set equal to +the format alignment (see design.mps.align). + +.init.internal: Subclasses call AMSInitInternal() to avoid the problems of +sharing va_list and emitting a superfluous PoolInitAMS event. + + +Condemnation + +.action: We use PoolCollectAct to condemn the whole pool (except the buffers) +at once. + +.condemn.buffer: Buffers are not condemned, instead they are coloured black, to +make sure that the objects allocated will be black, following .colour.alloc +(or, if you wish, because buffers are ignored like free space, so need the same +encoding). + +.benefit.guess: The benefit computation is pulled out of a hat; any real pool +class will need a real benefit computation. It will return a positive value +when the allocated size of the pool is over one megabyte and more than twice +what it was when the last segment in this pool was reclaimed (we call this +lastReclaimedSize). + +.benefit.repeat: We reset lastReclaimedSize when starting a trace in order to +avoid repeat condemnation (i.e., the next AMSBenefit returning 1.0 for the same +reason as the last). In the future we need to do better here. + + +Segment Merging and Splitting + +.split-merge: We provide methods for splitting and merging AMS segments. The +pool implementation doesn't cause segments to be split or merged - but a +subclass might want to do this (see .stress.split-merge). The methods serve as +an example of how to implement this facility. + +.split-merge.constrain: There are some additional constraints on what segments +may be split or merged: + +.split-merge.constrain.align: Segments may only be split or merged at an +address which is aligned to the pool alignment as well as to the arena +alignment. .split-merge.constrain.align.justify: This constraint is implied by +the design of allocation and colour tables, which cannot represent segments +starting at unaligned addresses. The constraint only arises if the pool +alignment is larger than the arena alignment. There's no requirement to split +segments at unaligned addresses. + +.split-merge.constrain.empty: The higher segment must be empty. I.e. the higher +segment passed to SegMerge must be empty, and the higher segment returned by +SegSplit must be empty. .split-merge.constrain.empty.justify: This constraint +makes the code significantly simpler. 
There's no requirement for a more complex +solution at the moment (as the purpose is primarily pedagogic). + +.split-merge.fail: The split and merge methods are not proper anti-methods for +each other (see design.mps.seg.split-merge.fail.anti.no). Methods will not +reverse the side-effects of their counterparts if the allocation of the colour +and allocation bit tables should fail. Client methods which over-ride split and +merge should not be written in such a way that they might detect failure after +calling the next method, unless they have reason to know that the bit table +allocations will not fail. + + + + + +TESTING: + +.stress: There's a stress test, MMsrc!amsss.c, that does 800 KB of allocation, +enough for about three GCs. It uses a modified Dylan format, and checks for +corruption by the GC. Both ambiguous and exact roots are tested. + +.stress.split-merge: There's also a stress test for segment splitting and +merging, MMsrc!segsmss.c. This is similar to amsss.c - but it defines a +subclass of AMS, and causes segments to be split and merged. Both buffered and +non-buffered segments are split / merged. + + +TEXT: + +.addr-index.slow: Translating from an address to and from a grain index in a +segment uses macros such as AMSAddrIndex and AMSIndexAddr. These are slow +because they call SegBase on every translation. + +.grey-mutator: To enforce the restriction set in .not-req.grey we check that +all the traces are flipped in AMSScan. It would be good to check in AMSFix as +well, but we can't do that, because it's called during the flip, and we can't +tell the difference between the flip and the grey mutator phases with the +current tracer interface. + diff --git a/mps/design/poolawl/index.txt b/mps/design/poolawl/index.txt new file mode 100644 index 00000000000..b7feacd065b --- /dev/null +++ b/mps/design/poolawl/index.txt @@ -0,0 +1,464 @@ + AUTOMATIC WEAK LINKED + design.mps.poolawl + incomplete doc + drj 1997-03-11 + +INTRODUCTION + +.readership: Any MPS developer + +.intro: The AWL (Automatic Weak Linked) pool is used to manage Dylan Weak +Tables (see req.dylan.fun.weak). Currently the design is specialised for Dylan +Weak Tables, but it could be generalised in the future. + + + +REQUIREMENTS + +See req.dylan.fun.weak. + +See meeting.dylan.1997-02-27(0) where many of the requirements for this pool +were first sorted out. + +Must satisfy request.dylan.170123. + +.req.obj-format: Only objects of a certain format need be supported. This +format is a subset of the Dylan Object Format. The pool uses the first slot in +the fixed part of an object to store an association. See +mail.drj.1997-03-11.12-05 + + +DEFINITIONS + +.def.grain: alignment grain, grain. A grain is a range of addresses where both +the base and the limit of the range are aligned and the size of range is equal +to the (same) alignment. In this context the alignment is the pool's alignment +(pool->alignment). The grain is the unit of allocation, marking, scanning, etc. + + +OVERVIEW + +.overview: +.overview.ms: The pool is mark and sweep. .overview.ms.justify: Mark-sweep +pools are slightly easier to write (than moving pools), and there are no +requirements (yet) that this pool be high performance or moving or anything +like that. .overview.alloc: It is possible to allocate weak or exact objects +using the normal reserve/commit AP protocol. .overview.alloc.justify: +Allocation of both weak and exact objects is required to implement Dylan Weak +Tables. Objects are formatted; the pool uses format A. 
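+
+[Illustration only, not part of the original document: a sketch of the
+client-side reserve/commit protocol referred to in .overview.alloc. It
+assumes an allocation point "ap" has already been created on an AWL pool
+with the desired rank (exact or weak); the two-word layout and the name
+alloc_pair are invented here and are not the Dylan object format that
+.req.obj-format above actually requires.]
+
+  #include "mps.h"
+
+  /* Allocate a two-word object via reserve/commit.  The object must be
+     fully initialized before mps_commit; if a flip intervenes, commit
+     fails and the reserve/initialize step is retried. */
+  static mps_res_t alloc_pair(mps_addr_t *out, mps_ap_t ap,
+                              mps_addr_t word0, mps_addr_t word1)
+  {
+    size_t size = 2 * sizeof(mps_addr_t);
+    mps_addr_t p;
+    do {
+      mps_res_t res = mps_reserve(&p, ap, size);
+      if (res != MPS_RES_OK)
+        return res;                    /* e.g. out of memory */
+      ((mps_addr_t *)p)[0] = word0;    /* initialize before committing */
+      ((mps_addr_t *)p)[1] = word1;
+    } while (!mps_commit(ap, p, size));
+    *out = p;
+    return MPS_RES_OK;
+  }
+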
.overview.scan: The +pool handles the scanning of weak objects specially so that when a weak +reference is deleted the corresponding reference in an associated object is +deleted. The associated object is determined by using information stored in +the object itself (see .req.obj-format). + + +INTERFACE + +.if.init: The init method takes one extra parameter in the vararg list. This +parameter should have type Format and be a format object that describes the +format of the objects to be allocated in this pool. The format should support +scan and skip methods. There is an additional restriction on the layout of +objects, see .req.obj-format. + +.if.buffer: The BufferInit method takes one extra parameter in the vararg +list. This parameter should be either RankEXACT or RankWEAK. It determines +the rank of the objects allocated using that buffer. + + +DATASTRUCTURES + +.sig: This signature for this pool will be 0x519bla3l (SIGPooLAWL) + +.poolstruct: The class specific pool structure is +struct AWLStruct { + PoolStruct poolStruct; + Format format; + Shift alignShift; + ActionStruct actionStruct; + double lastCollected; + Serial gen; + Sig sig; +} +.poolstruct.format: The format field is used to refer to the object format. +The object format is passed to the pool during pool creation. +.poolstruct.alignshift: The alignShift field is the SizeLog2 of the pool's +alignment. It is computed and initialised when a pool is created. It is used +to compute the number of alignment grains in a segment which is the number of +bits need in the segment's mark and alloc bit table (see .awlseg.bt, +.awlseg.mark, and .awlseg.alloc below). @@ clarify +.poolstruct.actionStruct: Contains an Action which is used to participate in +the collection benefit protocol. See .fun.benefit AWLBenefit below for a +description of the algorithm used for determining when to collect. +.poolstruct.lastCollected: Records the time (using the mutator total allocation +clock, ie that returned by ArenaMutatorAllocSize) of the most recent call to +either AWLInit or AWLTraceBegin for this pool. So this is the time of the +beginning of the last collection of this pool. Actually this isn't true +because the pool can be collected without AWLTraceBegin being called (I think) +as it will get collected by being in the same zone as another pool/generation +that is being collected (which it does arrange to be, see the use of the gen +field in .poolstruct.gen below and .fun.awlsegcreate.where below). +.poolstruct.gen: This part of the mechanism by which the pool arranges to be in +a particular zone and arranges to be collected simulataneously with other +cohorts in the system. gen is the generation that is used in expressing a +generation preference when allocating a segment. The intention is that this +pool will get collected simulataneously with any other segments that are also +allocated using this generation preference (when using the VM arena, generation +preferences get mapped more or less to zones, each generation to a unique set +of zones in the ideal case). Whilst AWL is not generational it is expected +that this mechanism will arrange for it to be collected simultaneously with +some particular generation of AMC. +.poolstruct.gen.1: At the moment the gen field is set for all AWL pools to be 1. + +.awlseg: The pool defines a segment class AWLSegClass, which is a subclass of +GCSegClass (see design.mps.seg.over.hierarchy.gcseg). 
All segments allocated by +the pool are instances of this class, and are of type AWLSeg, for which the +structure is +struct AWLSegStruct { + GCSegStruct gcSegStruct; + BT mark; + BT scanned; + BT alloc; + Count grains; + Count free; + Count singleAccesses; + AWLStatSegStruct stats; + Sig sig; +} + +.awlseg.bt: The mark, alloc, and scanned fields are bit-tables (see +design.mps.bt). Each bit in the table corresponds to a a single alignment +grain in the pool. +.awlseg.mark: The mark bit table is used to record mark bits during a trace. +Condemn (see .fun.condemn below) sets all the bits of this table to zero. Fix +will read and set bits in this table. Currently there is only one mark bit +table. This means that the pool can only be condemned for one trace. +.awlseg.mark.justify: This is simple, and can be improved later when we want to +run more than one trace. +.awlseg.scanned: The scanned bit-table is used to note which objects have been +scanned. Scanning (see .fun.scan below) a segment will find objects that are +marked but not scanned, scan each object found and set the corresponding bits +in the scanned table. +.awlseg.alloc: The alloc bit table is used to record which portions of a +segment have been allocated. Ranges of bits in this table are set when a +buffer is attached to the segment. When a buffer is flushed (ie AWLBufferEmpty +is called) from the segment, the bits corresponding to the unused portion at +the end of the buffer are reset. +.awlseg.alloc.invariant: A bit is set in the alloc table <=> (the corresponding +address is currently being buffered || the corresponding address lies within +the range of an allocated object). +.awlseg.grains: The grains field is the number of grains that fit in the +segment. Strictly speaking this is not necessary as it can be computed from +SegSize and awl's alignment, however, precalculating it and storing it in the +segment makes the code simpler by avoiding lots of repeated calculations. +.awlseg.free: A conservative estimate of the number of free grains in the +segment. It is always guaranteed to be >= the number of free grains in the +segment, hence can be used during allocation to quickly pass over a segment. +Maintained by blah and blah. @@@@ Unfinished obviously. + + +FUNCTIONS + + +@@ How will pool collect? It needs an action structure. + +External + +.fun.init: + +Res AWLInit(Pool pool, va_list arg); + +AWLStruct has four fields, each one needs initializing. + +.fun.init.poolstruct: The poolStruct field has already been initialized by +generic code (impl.c.pool). +.fun.init.format: The format will be copied from the argument list, checked, +and written into this field. +.fun.init.alignshift: The alignShift will be computed from the pool alignment +and written into this field. +.fun.init.sig: The sig field will be initialized with the signature for this +pool. + +.fun.finish: + +Res AWLFinish(Pool pool); + +Iterates over all segments in the pool and destroys each segment (by calling +SegFree). +Overwrites the sig field in the AWLStruct. Finishing the generic pool +structure is done by the generic pool code (impl.c.pool). + +.fun.alloc: + +PoolNoAlloc will be used, as this class does not implement alloc. + +.fun.free: + +PoolNoFree will be used, as this class does not implement free. + +.fun.fill: + +Res AWLBufferFill(Seg *segReturn, Addr *baseReturn, Pool pool, Buffer buffer, +Size size); + +This zips round all the the segments applying AWLSegAlloc to each segment that +has the same rank as the buffer. 
AWLSegAlloc attempts to find a free range; if
+it finds a range then it may be bigger than the actual request, in which case
+the remainder can be used to "fill" the rest of the buffer. If no free range
+can be found in an existing segment then a new segment will be created (which
+is at least large enough). The range of buffered addresses is marked as
+allocated in the segment's alloc table.
+
+.fun.empty:
+
+void AWLBufferEmpty(Pool pool, Buffer buffer);
+
+Locates the free portion of the buffer, that is, the memory between the init
+and the limit of the buffer, and records these locations as being free in the
+relevant alloc table. The segment that the buffer is pointing at (which
+contains the alloc table that needs to be dinked with) is available via
+BufferSeg.
+
+.fun.benefit: The benefit returned is the total amount of mutator allocation
+minus the lastRememberedSize minus 10 Megabytes, so the pool becomes an
+increasingly good candidate for collection at a constant (mutator allocation)
+rate, crossing the 0 line when there has been 10Mb of allocation since the
+(beginning of the) last collection. So it gets collected approximately every
+10Mb of allocation. Note that it will also get collected by virtue of being in
+the same zone as some AMC generation (assuming there are instantiated AMC
+pools), see .poolstruct.gen above.
+
+.fun.condemn:
+
+Res AWLCondemn(Pool pool, Trace trace, Seg seg);
+
+The current design only permits each segment to be condemned for one trace (see
+.awlseg.mark). This function checks that the segment is not condemned for any
+trace (seg->white == TraceSetEMPTY). The segment's mark bit-table is reset,
+and the whiteness of the seg (seg->white) has the current trace added to it.
+
+
+.fun.grey:
+
+void AWLGrey(Pool pool, Trace trace, Seg seg);
+
+if the segment is not condemned for this trace the segment's mark table is set
+to all 1s and the segment is recorded as being grey.
+
+.fun.scan:
+
+Res AWLScan(ScanState ss, Pool pool, Seg seg);
+
+.fun.scan.overview: The scanner performs a number of passes over the segment,
+scanning each marked and unscanned (grey) object that it finds.
+.fun.scan.overview.finish: It keeps performing passes over the segment until
+it is finished. .fun.scan.overview.finish.condition: A condition for finishing
+is that no new marks got placed on objects in this segment during the pass.
+.fun.scan.overview.finish.approximation: We use an even stronger condition for
+finishing that assumes that scanning any object may introduce marks onto this
+segment. It is finished when a pass results in scanning no objects (ie all
+objects were either unmarked or both marked and scanned).
+
+.fun.scan.overview.finished-flag: There is a flag called 'finished' which keeps
+track of whether we should finish or not. We only ever finish at the end of a
+pass. At the beginning of a pass the flag is set. During a pass if any
+objects are scanned then the finished flag is reset. At the end of a pass if
+the finished flag is still set then we are finished. No more passes take place
+and the function returns.
+
+.fun.scan.pass: A pass consists of a setup phase and a repeated phase.
+
+.fun.scan.pass.buffer: The following assumes that in the general case the
+segment is buffered; if the segment is not buffered then the actions that
+mention buffers are not taken (they are unimportant if the segment is not
+buffered).
+
+.fun.scan.pass.p: The pass uses a cursor called 'p' to progress over the
+segment. During a pass p will increase from the base address of the segment to
+the limit address of the segment. When p reaches the limit address of the
+segment, the pass is complete.
+
+.fun.scan.pass.setup: p initially points to the base address of the segment.
+
+.fun.scan.pass.repeat: The following comprises the repeated phase. The
+repeated phase is repeated until the pass completion condition is true (ie p
+has reached the limit of the segment, see .fun.scan.pass.p above and
+.fun.scan.pass.repeat.complete below).
+
+.fun.scan.pass.repeat.complete: if p is equal to the segment's limit then we
+are done. We proceed to check whether any further passes need to be performed
+(see .fun.scan.pass.more below).
+
+.fun.scan.pass.repeat.free: if !alloc(p) (the grain is free) then increment p
+and return to the beginning of the loop.
+
+.fun.scan.pass.repeat.buffer: if p is equal to the buffer's ScanLimit (see
+BufferScanLimit), then set p equal to the buffer's Limit (use BufferLimit) and
+return to the beginning of the loop.
+
+.fun.scan.pass.repeat.object-end: The end of the object is located using the
+format->skip method.
+
+.fun.scan.pass.repeat.object: if (mark(p) && !scanned(p)) then the object
+pointed at is marked but not scanned, which means we must scan it, otherwise we
+must skip it. .fun.scan.pass.repeat.object.dependent: To scan the object we
+first have to determine if the object has a dependent object (see
+.req.obj-format). .fun.scan.pass.repeat.object.dependent.expose: If it has a
+dependent object then we must expose the segment that the dependent object is
+on (only if the dependent object actually points to MPS managed memory) prior
+to scanning and cover the segment subsequent to scanning.
+.fun.scan.pass.repeat.object.dependent.summary: The summary of the dependent
+segment must be set to RefSetUNIV to reflect the fact that we are allowing it
+to be written to (and we don't know what gets written to the segment).
+.fun.scan.pass.repeat.object.scan: The object is then scanned by calling the
+format's scan method with base and limit set to the beginning and end of the
+object (.fun.scan.scan.improve.single: A scan1 format method would make it
+slightly simpler here). Then the finished flag is cleared and the bit in the
+segment's scanned table is set.
+
+.fun.scan.pass.repeat.advance: p is advanced past the object and we return to
+the beginning of the loop.
+
+.fun.scan.pass.more: At the end of a pass the finished flag is examined.
+.fun.scan.pass.more.not: If the finished flag is set then we are done (see
+.fun.scan.overview.finished-flag above) and AWLScan returns.
+.fun.scan.pass.more.so: Otherwise (the finished flag is reset) we perform
+another pass (see .fun.scan.pass above).
+
+
+.fun.fix:
+
+Res AWLFix(Pool pool, ScanState ss, Seg seg, Ref *refIO);
+
+ss->wasMarked is set to TRUE (clear compliance with
+design.mps.fix.protocol.was-marked.conservative).
+
+If the rank (ss->rank) is RankAMBIG then fix returns immediately unless the
+reference is aligned to the pool alignment.
+
+If the rank (ss->rank) is RankAMBIG then fix returns immediately unless the
+referenced grain is allocated.
+
+The bit in the marked table corresponding to the referenced grain will be
+read. If it is already marked then fix returns. Otherwise (the grain is
+unmarked), ss->wasMarked is set to FALSE; the remaining actions depend on
+whether the rank (ss->rank) is Weak or not. If the rank is weak then the
+reference is adjusted to 0 (see design.mps.weakness) and fix returns. If the
+rank is something else then the mark bit corresponding to the referenced grain
+is set, and the segment is greyed using TraceSegGreyen.
+
+Fix returns.
+
+
+.fun.reclaim:
+
+void AWLReclaim(Pool pool, Trace trace, Seg seg);
+
+This iterates over all allocated objects in the segment and frees objects that
+are not marked.
+When this iteration is complete the marked array is completely reset.
+
+p points to base of segment
+
+while(p < SegLimit(seg)) {
+ if(!alloc(p)) { ++p; continue; }
+ q = skip(p) (ie q points to just past the object pointed at by p)
+ if(!marked(p)) free(p, q); free(p, q) consists of resetting the bits in the
+alloc table from p to q-1 inclusive.
+ p = q
+}
+
+Reset the entire marked array using BTResRange.
+
+.fun.reclaim.improve.pad: Consider filling free ranges with padding objects.
+Now reclaim doesn't need to check that the objects are allocated before
+skipping them. There may be a corresponding change for scan as well.
+
+
+.fun.describe:
+
+Res AWLDescribe(Pool pool, mps_lib_FILE *stream);
+
+
+Internal:
+
+.fun.awlsegcreate:
+
+Res AWLSegCreate(AWLSeg *awlsegReturn, Size size);
+
+Creates a segment of class AWLSegClass of size at least size.
+.fun.awlsegcreate.size.round: size is rounded up to an ArenaAlign before
+requesting the segment. .fun.awlsegcreate.size.round.justify: The arena
+requires that all segment sizes are aligned to the ArenaAlign.
+.fun.awlsegcreate.where: The segment is allocated using a generation
+preference, using the generation number stored in the AWLStruct (the gen
+field), see .poolstruct.gen above.
+
+.fun.awlseginit:
+
+Res awlSegInit(Seg seg, Pool pool, Addr base, Size size,
+ Bool reservoirPermit, va_list args)
+
+Init method for AWLSegClass, called for SegAlloc whenever an AWLSeg is created
+(see .fun.awlsegcreate above). .fun.awlseginit.tables: The segment's mark,
+scanned and alloc tables (see .awlseg.bt above) are allocated and initialised.
+The segment's grains field is computed and stored.
+
+.fun.awlsegfinish:
+
+void awlSegFinish(Seg seg);
+
+Finish method for AWLSegClass, called from SegFree. Will free the segment's
+tables (see .awlseg.bt).
+
+.fun.awlsegalloc:
+
+Bool AWLSegAlloc(Addr *baseReturn, Addr *limitReturn, AWLSeg awlseg, AWL awl,
+Size size);
+
+Will search for a free block in the segment that is at least size bytes long.
+The base address of the block is returned in *baseReturn, the limit of the
+entire free block (which must be at least as large as size and may be bigger)
+is returned in *limitReturn. The requested size is converted to a number of
+grains, BTFindResRange is called to find a run of this length in the alloc
+bit-table (.awlseg.alloc). The return results (if it is successful) from
+BTFindResRange are in terms of grains; they are converted back to addresses
+before returning the relevant values from this function.
+
+.fun.dependent-object:
+
+Bool AWLDependentObject(Addr *objReturn, Addr parent);
+
+This function abstracts the association between an object and its linked
+dependent (see .req.obj-format). It currently assumes that objects are Dylan
+Object formatted according to design.dylan.container (see
+analysis.mps.poolawl.dependent.abstract for suggested improvements). An object
+has a dependent object iff the 2nd word of the object (ie (((Word *)parent)[1])
+) is non-NULL. The dependent object is the object referenced by the 2nd word
+and must be a valid object.
+This function assumes objects are in Dylan Object Format (see
+design.dylan.container). 
It will check that the first word looks like a dylan +wrapper pointer. It will check that the wrapper indicates that the wrapper has +a reasonable format (namely at least one fixed field). +If the second word is NULL it will return FALSE. +If the second word is non-NULL then the contents of it will be assigned to +*objReturn, and it will return TRUE. + + +TEST + +must create dylan objects. +must create dylan vectors with at least one fixed field. +must allocate weak thingies. +must allocate exact tables. +must link tables together. +must populate tables with junk. +some junk must die. + +Use an LO pool and an AWL pool. +3 buffers. One buffer for the LO pool, one exact buffer for the AWL pool, one +weak buffer for the AWL pool. + +Initial test will allocate one object from each buffer and then destroy all +buffers and pools and exit + + diff --git a/mps/design/poollo/index.txt b/mps/design/poollo/index.txt new file mode 100644 index 00000000000..12253d3460e --- /dev/null +++ b/mps/design/poollo/index.txt @@ -0,0 +1,200 @@ + LEAF OBJECT POOL CLASS + design.mps.poollo + incomplete doc + drj 1997-03-07 + +INTRODUCTION + +.readership: Any MPS developer. + +.intro: The Leaf Object Pool Class (LO for short) is a pool class developed for +DylanWorks. It is designed to manage objects that have no references (leaf +objects) such as strings, bit tables, etc. It is a garbage collected pool (in +that objects allocated in the pool are automatically reclaimed when they are +discovered to be unreachable. + +[Need to sort out issue of alignment. Currently lo grabs alignment from +format, almost certainly "ought" to use the greater of the format alignment and +the MPS_ALIGN value -- @@ drj 1997-07-02] + +DEFINITIONS + +.def.leaf: A "leaf" object is an object that contains no references, or an +object all of whose references refer to roots. That is, any references that +the object has must refer to a priori alive objects that are guaranteed not to +move, hence the references do not need fixing. + +.def.grain: A grain (of some alignment) is a contiguous aligned area of memory +of the smallest size possible (which is the same size as the alignment). + + +REQUIREMENTS + +.req.source: See req.dylan.fun.obj.alloc and req.dylan.prot.ffi.access. + +.req.leaf: The pool must manage formatted leaf objects (see .def.leaf above for +a defintion). This is intended to encompass Dylan and C leaf objects. Dylan +leaf objects have a reference to their wrapper, but are still leaf objects (in +the sense of .def.leaf) because the wrapper will be a root. + +.req.nofault: The memory caontaining objects managed by the pool must not be +protected. The client must be allowed to access these objects without using +the MPS trampoline (the exception mechanism, q.v.). + + +OVERVIEW + +.overview: +.overview.ms: The LO Pool is a non-moving mark-and-sweep collector. +.overview.ms.justify: mark-and-sweep pools are simpler than moving pools. +.overview.alloc: Objects are allocated in the pool using the reserve commit +protocol on allocation points. .overview.format: The pool is formatted. The +format of the objects in the pool is specified at instantiation time, using an +format object derived from a format A variant (using variant A is overkill, see +.if.init below) (see design.mps.format for excuse about calling the variant +'variant A'). + + +INTERFACE + +.if.init: +.if.init.args: The init method for this class takes one extra parameter in the +vararg parameter list. 
.if.init.format: The extra parameter should be an +object of type Format and should describe the format of the objects that are to +be allocated in the pool. .if.init.format.use: The pool uses the skip and +alignment slots of the format. The skip method is used to determine the length +of objects (during reclaim). The alignment field is used to determine the +granularity at which memory should be managed. .if.init.format.a: Currently +only format variant A is supported though clearley that is overkill as only +skip and alignment are used. + + +DATASTRUCTURES + +.sig: The signature for the LO Pool Class is 0x51970b07 (SIGLOPOoL). + +.poolstruct: The class specific pool structure is: +typedef struct LOStruct { + PoolStruct poolStruct; /* generic pool structure */ + Format format; /* format for allocated objects */ + Shift alignShift; + Sig sig; /* impl.h.misc.sig */ +} LOStruct; + +.poolstruct.format: This is the format of the objects that are allocated in the +pool. + +.poolstruct.alignShift: This is shift used in alignment computations. It is +SizeLog2(pool->alignment). It can be used on the right of a shift operator (<< +or >>) to convert between a number of bytes and a number of grains. + +.loseg: Every segment is an instance of segment class LOSegClass, a subclass of +GCSegClass, and is an object of type LOSegStruct. + +.loseg.purpose: The purpose of the LOSeg structure is to associate the bit +tables used for recording allocation and mark information with the segment. + +.loseg.decl: The declaration of the structure is as follows: + +typedef struct LOSegStruct { + GCSegStruct gcSegStruct; /* superclass fields must come first */ + LO lo; /* owning LO */ + BT mark; /* mark bit table */ + BT alloc; /* alloc bit table */ + Count free; /* number of free grains */ + Sig sig; /* impl.h.misc.sig */ +} LOSegStruct; + +.loseg.sig: The signature for a loseg is 0x519705E9 (SIGLOSEG). + +.loseg.lo: The lo field points to the LO structure that owns this segment. + +.loseg.bit: Bit Tables (see design.mps.bt) are used to record allocation and +mark information. This is relatively straightforward, but might be inefficient +in terms of space in some circumstances. + +.loseg.mark: This is a Bit Table that is used to mark objects during a trace. +Each grain in the segment is associated with 1 bit in this table. When LOFix +(see .fun.fix below) is called the address is converted to a grain within the +segment and the corresponding bit in this table is set. + +.loseg.alloc: This is a Bit Table that is used to record which addresses are +allocated. Addresses that are allocated and are not buffered have their +corresponding bit in this table set. If a bit in this table is reset then +either the address is free or is being buffered. + +.loseg.diagram: The following diagram is now obsolete. It's also not very +interesting - but I've left the sources in case anyone ever gets around to +updating it. tony 1999-12-16 + + + + +FUNCTIONS + + +External + +.fun.init: + +.fun.destroy: + +.fun.buffer-fill: + +[explain way in which buffers interact with the alloc table and how it could be +improved] + +.fun.buffer-empty: + +.fun.condemn: + +.fun.fix: + +static Res LOFix(Pool pool, ScanState ss, Seg seg, Ref *refIO) + +[sketch] +Fix treats references of most ranks much the same. There is one mark table +that records all marks. A reference of rank RankAMBIG is first checked to see +if it is aligned to the pool alignment and discarded if not. 
The reference is +converted to a grain number within the segment (by subtracting the segments' +base from the refrence and then dividing by the grain size). The bit (the one +corresponding to the grain number) is set in the mark table. Exception, for a +weak reference (rank is RankWEAK) the mark table is checked and the reference +is fixed to 0 if this address has not been marked otherwise nothing happens. +Note that there is no check that the reference refers to a valid object +boundary (which wouldn't be a valid check in the case of ambiguous references +anyway). + +.fun.reclaim: + +static void LOReclaim(Pool pool, Trace trace, Seg seg) + +LOReclaim derives the loseg from the seg, and calls loSegReclaim (see +.fun.segreclaim below). + + +Internal + +.fun.segreclaim: + +static void loSegReclaim(LOSeg loseg, Trace trace) + +[sketch] +for all the contiguous allocated regions in the segment it locates the +boundaries of all the objects in that region by repeatedly skipping (calling +format->skip) from the beginning of the region (the beginning of the region is +guaranteed to coincide with the beginning of an object). For each object it +examines the bit in the mark bit table that corresponds to the beginning of the +object. If that bit is set then the object has been marked as a result of a +previous call to LOFix, the object is preserved by doing nothing. If that bit +is not set then the object has not been marked and should be reclaimed; the +object is reclaimed by resetting the appropriate range of bits in the segment's +free bit table. + +[special things happen for buffered segments] + +[explain how the marked variable is used to free segments] + +ATTACHMENT + "LOGROUP.CWK" + diff --git a/mps/design/poolmfs/index.txt b/mps/design/poolmfs/index.txt new file mode 100644 index 00000000000..fbc3836eae0 --- /dev/null +++ b/mps/design/poolmfs/index.txt @@ -0,0 +1,16 @@ + THE DESIGN OF THE MANUAL FIXED SMALL MEMORY POOL CLASS + design.mps.poolmfs + incomplete design + richard 1996-11-07 + + +OVERVIEW: + +MFS stands for "Manual Fixed Small". The MFS Pool Class manages objects that +are of a fixed size. It is intended to only manage small objects efficiently. +Storage is recycled manually by the client programmer. + +A particular instance of an MFS Pool can manage objects only of a single size, +but different instances can manage objects of different sizes. The size of +object that an instance can manage is declared when the instance is created. + diff --git a/mps/design/poolmrg/index.txt b/mps/design/poolmrg/index.txt new file mode 100644 index 00000000000..eb0116a44b4 --- /dev/null +++ b/mps/design/poolmrg/index.txt @@ -0,0 +1,515 @@ + GUARDIAN POOLCLASS + design.mps.poolmrg + incomplete doc + drj 1997-02-03 + +INTRODUCTION + +.readership: Any MPS developer. + +.intro: This is the design of the Guardian PoolClass. The Guardian PoolClass +is part of the MPS. The Guardian PoolClass is internal to the MPS (has no +client interface) and is used to implement finalization. + +.source: Some of the techniques in paper.dbe93 ("Guardians in a +Generation-Based Garbage Collector") were used in this design. Some analysis +of this design (including various improvements and some more in-depth +justification) is in analysis.mps.poolmrg. That document should be understood +before changing this document. + +It is also helpful to look at design.mps.finalize and design.mps.message. + + +GOALS + +.goal.final: The Guardian Pool should support all requirements pertaining to +finalization. 
+ + +REQUIREMENTS + +.req: We have only one requirement pertaining to finalization: + +req.dylan.fun.finalization: Support the Dylan language-level implementation of +finalized objects: objects are registered, and are finalized in random order +when they would otherwise have died. Cycles are broken at random places. +There is no guarantee of promptness. + +.req.general: However, finalization is a very common piece of functionality +that is provided by (sophisticated) memory managers, so we can expect other +clients to request this sort of functionality. + +.anti-req: Is it required that the Guardian Pool return unused segments to the +arena? (PoolMFS does not do this) (PoolMRG will not do this in its initial +implementation) + + +TERMINOLOGY + +.def.mrg: MRG: The Pool Class's identifier will be MRG. This stands for +"Manual Rank Guardian". The pool is manually managed and implements guardians +for references of a particular rank (currently just final). + +.def.final.ref: final reference: A reference of rank final (see +design.mps.type.rank). + +.def.final.object: finalizable object: An object is finalizable with respect to +a final reference if, since the creation of that reference, there was a point +in time when no references to the object of lower (that is, stronger) rank were +reachable from a root. + +.def.final.object.note: Note that this means an object can be finalizable even +if it is now reachable from the root via exact references. + +.def.finalize: finalize: To finalize an object is to notify the client that the +object is finalizable. The client is presumed to be interested in this +information (typically it will apply some method to the object). + +.def.guardian: guardian: An object allocated in the Guardian Pool. A guardian +contains exactly one final reference, and some fields for the pool's internal +use. Guardians are used to implement a finalization mechanism. + + +OVERVIEW + +.over: The Guardian Pool Class is a PoolClass in the MPS. It is intended to +provide the functionality of "finalization". + +.over.internal: The Guardian PoolClass is internal to the MPM, it is not +intended to have a client interface. Clients are expected to access the +functionality provided by this pool (finalization) using a separate MPS +finalization interface (design.mps.finalize). + +.over.one-size: The Guardian Pool manages objects of a single certain size, +each object contains a single reference of rank final. + +.over.one-size.justify: This is all that is necessary to meet our requirements +for finalization. Whenever an object is registered for finalization, it is +sufficient to create a single reference of rank final to it. + +.over.queue: A pool maintains a queue of live guardian objects, called (for +historical reasons) the "entry" queue. .over.queue.free: The pool also +maintains a queue of free guardian objects called the "free" queue. +.over.queue.exit.not: There used to be an "exit" queue, but this is now +historical and there shouldn't be any current references to it. + +.over.alloc: When guardians are allocated, they are placed on the entry queue. +Guardians on the entry queue refer to objects that have not yet been shown to +be finalizable (either the object has references of lower rank than final to +it, or the MPS has not yet got round to determining that the object is +finalizable). .over.message.create: When a guardian is discovered to refer to +a finalizable object it is removed from the entry queue and becomes a message +on the space's queue of messages. 
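+
+[Illustration only, not MPS source: a standalone model of the entry-queue /
+message transition described in .over.alloc and .over.message.create. The
+type and function names here (Guardian, guardianRegister, guardianDiscover)
+are invented for this note; the real pool uses Rings and the MPS message
+module rather than the singly-linked lists modelled below.]
+
+  #include <stdio.h>
+  #include <stdlib.h>
+
+  typedef struct Guardian {
+    void *ref;                   /* the single final reference */
+    struct Guardian *next;       /* queue link (the pool itself uses Rings) */
+  } Guardian;
+
+  static Guardian *entryQueue;   /* guardians not yet shown finalizable */
+  static Guardian *messageQueue; /* guardians posted as messages */
+
+  /* Registering an object creates a guardian referring to it and puts the
+     guardian on the entry queue (cf. .over.alloc). */
+  static void guardianRegister(void *obj)
+  {
+    Guardian *g = malloc(sizeof *g);
+    if (g == NULL) abort();
+    g->ref = obj;
+    g->next = entryQueue;
+    entryQueue = g;
+  }
+
+  /* When a referent is proved finalizable its guardian leaves the entry
+     queue and is posted as a message (cf. .over.message.create).  The
+     "proof" here is just a predicate supplied by the caller. */
+  static void guardianDiscover(int (*finalizable)(void *obj))
+  {
+    Guardian **link = &entryQueue;
+    while (*link != NULL) {
+      Guardian *g = *link;
+      if (finalizable(g->ref)) {
+        *link = g->next;           /* unlink from the entry queue */
+        g->next = messageQueue;    /* ... and post it as a message */
+        messageQueue = g;
+      } else {
+        link = &g->next;
+      }
+    }
+  }
+
+  static int isZero(void *obj) { return *(int *)obj == 0; }
+
+  int main(void)
+  {
+    int live = 1, dead = 0;
+    guardianRegister(&live);
+    guardianRegister(&dead);
+    guardianDiscover(isZero);      /* "dead" moves to the message queue */
+    printf("%d message(s) pending\n", messageQueue != NULL ? 1 : 0);
+    return 0;
+  }
+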
.over.message.deliver: When the MPS client +receives the message the message system arranges for the message to be +destroyed and the pool reclaims the storage associated with the +guardian/message. + +.over.scan: When the pool is scanned at rank final each reference will be +fixed. If the reference is to an object that was in old arena (before the +fix), then the object must now be finalizable. In this case the containing +guardian will be removed from the entry queue and posted as a message. + +.over.scan.justify: The scanning process is a crucial step necessary for +implementing finalization. It is the means by which the MPS detects that +objects are finalizable. + +.over.message: PoolClassMRG implements a MessageClass (see +design.mps.message). All the messages are of one MessageType. This type is +MessageTypeFinalization. Messages are created when objects are discovered to +be finalizable and destroyed when the MPS client has received the message. + +.over.message.justify: Messages provide a means for the MPS to communicate with +its client. Notification of finalization is just such a communication. +Messages allow the MPS to inform the client of finalization events when it is +convenient for the MPS to do so (i.e. not in PageFault context). + +.over.manual: Objects in the Guardian Pool are manually managed. +.over.manual.alloc: They are allocated (by ArenaFinalize when objects are +registered for finalization. .over.manual.free: They are freed when the +associated message is destroyed. +.over.manual.justify: The lifetime of a guardian object is very easy to +determine so manual memory management is appropriate. + + +PROTOCOLS + + +Object Registration + +.protocol.register: There is a protocol by which objects can be registered for +finalization. This protocol is handled by the arena module on behalf of +finalization. see design.mps.finalize.int.finalize. + + +Finalizer Execution + +.protocol.finalizer: If an object is proven to be finalizable then a message to +this effect will eventually be posted. A client can receive the message, +determine what to do about it, and do it. Typically this would involve calling +the finalization method for the object, and deleting the message. Once the +message is deleted, the object may become recyclable. + + +Setup / Destroy + +.protocol.life: An instance of PoolClassMRG is needed in order to support +finalization, it is called the "final" pool and is attached to the arena (see +design.mps.finalize.int.arena.struct). .protocol.life.birth: The final pool is +created lazily by ArenaFinalize. .protocol.life.death: The final pool is +destroyed during ArenaDestroy. + + +DATA STRUCTURES + +.guardian: + +The guardian + +.guardian.over: A guardian is an object used to manage the references and other +datastructures that are used by the pool in order to keep track of which +objects are registered for finalization, which ones have been finalized, and so +on. .guardian.state: A guardian can be in one of four states: +.guardian.state.enum: The states are Free, Prefinal, Final, PostFinal (referred +to as MRGGuardianFree, etc. in the implementation). .guardian.state.free: The +guardian is free, meaning that it is on the free list for the pool and +available for allocation. .guardian.state.prefinal: The guardian is allocated, +and refers to an object that has not yet been discovered to be finalizable. 
+.guardian.state.final: The guardian is allocated, and refers to an object that +has been shown to be finalizable; this state corresponds to the existence of a +message. .guardian.state.postfinal: This state is only used briefly and is +entirely internal to the pool; the guardian enters this state just after the +associated message has been destroyed (which happens when the client receives +the message) and will be freed immediately (whereupon it will enter the Free +state). This state is used for checking only (so that MRGFree can check that +only guardians in this state are being freed). + +.guardian.life-cycle: Guardians go through the following state life-cycle: + +Free -> Prefinal -> Final -> Postfinal -> Free. + +.guardian.two-part: A guardian is a structure consisting abstractly of a link +part and a reference part. Concretely, the link part will be a LinkPartStruct, +and the reference part will be a Word. The link part is used by the pool, the +reference part forms the object visible to clients of the pool. The reference +part is the reference of Rank FINAL that refers to objects registered for +finalization and is how the MPS detects finalizable objects. +.guardian.two-part.union: The LinkPartStruct is a discriminated union of a +RingStruct and a MessageStruct. The RingStruct is used when the guardian is +either Free or Prefinal. The MessageStruct is used when the guardian is +Final. Neither part of the union is used when the guardian is in the Postfinal +state. +.guardian.two-part.justify: This may seem a little profligate with space, but +this is okay as we are not required to make finalization extremely space +efficient. + +.guardian.parts.separate: The two parts will be stored in separate segments. +.guardian.parts.separate.justify: This is so that the data structures the pool +uses to manage the objects can be separated from the objects themselves. This +avoids the pool having to manipulate data structures that are on shielded +segments (analysis.mps.poolmrg.hazard.shield). + +.guardian.assoc: The nth (from the beginning of the segment) ref part in one +segment will correspond with the nth link part in another segment. The +association between the two segments will be managed by the additional fields +in pool-specific segment subclasses (see .mrgseg). .guardian.ref: Guardians +that are either Prefinal or Final are live and have valid references (possibly +NULL) in their ref parts. Guardians that are free are dead and always have +NULL in their ref parts (see .free.overwrite and .scan.free) +.guardian.ref.free: When freeing an object, it is a pointer to the reference +part that will be passed (internally in the pool). + +.guardian.init: Guardians are initialized when the pool is grown +(.alloc.grow). The initial state has the ref part NULL and the link part is +attached to the free ring. Freeing an object returns a guardian to its initial +state. + +.poolstruct: +The Pool structure, MRGStruct will have + +.poolstruct.entry: the head of the entry queue. + +.poolstruct.exit: the head of the exit queue. + +.poolstruct.free: a free list. + +.poolstruct.rings: The entry queue, the exit queue, and the free list will all +use Rings. Each Ring will be maintained using the link part of the guardian. +.poolstruct.rings.justify: This is because Rings are convenient to use and are +well tested. It is possible to implement all three lists using a singly linked +list, but the saving is certainly not worth making at this stage. 
+ +.poolstruct.refring: a ring of "ref" segments in use for links or messages (see +.mrgseg.ref.mrgring below). + +.poolstruct.extend: a precalculated extendby field (see .init.extend). This +value is used to determine how large a segment should be requested from the +Arena for the reference part segment when the pool needs to grow (see +.alloc.grow.size). .poolstruct.extend.justify: Calculating a reasonable value +for this once and remembering it simplifies the allocation (.alloc.grow). + +.poolstruct.init: poolstructs are initialized once for each pool instance by +MRGInit (.init). The initial state has all the rings initialized to singleton +rings, and the extendBy field initialized to some value (see .init.extend). + +.mrgseg: + +The pool defines two segment subclasses: MRGRefSegClass and MRGLinkSegClass. +Segments of the former class will be used to store the ref parts of guardians, +segments of the latter will be used to store the link parts of guardians (see +.guardian.two-part). Segments are always allocated in pairs, with one of each +class (by function MRGSegPairCreate). Each segment contains a link to its pair. + +.mrgseg.ref: MRGRefSegClass is a subclass of GCSegClass. Instances are of type +MRGRefSeg, and contain: + +.mrgseg.ref.mrgring: a field for the ring of ref part segments in the pool. + +.mrgseg.ref.linkseg: a pointer to the paired link segment. + +.mrgseg.ref.grey: a set describing the greyness of the segment for each trace. + +.mrgseg.ref.init: A segment is created and initialized once every time the pool +is grown (.alloc.grow). The initial state has the segment ring node +initialized and attached to the pool's segment ring, the linkseg field points +to the relevant link segment, the grey field is initialized such that the +segment is not grey for all traces. + +.mrgseg.link: MRGLinkSegClass is a subclass of SegClass. Instances are of type +MRGLinkSeg, and contain: + +.mrgseg.link.refseg: a pointer to the paired ref segment. This may be NULL +during initialization, while the pairing is being established. + +.mrgseg.link.init: The initial state has the linkseg field pointing to the +relevant ref segment. + + +FUNCTIONS + +.check: MRGCheck + + Will check the signatures, the class, and each field of the MRGStruct. Each +field is checked as being appropriate for its type. .check.justify: There are +no non-trivial invariants that can be easily checked. + +.alloc: [these apply to MRGRegister now. - Pekka 1997-09-19] + +.alloc.grow: If the free list is empty then two new segments will be allocated +and the free list filled up from them (note that the reference fields of the +new guardians will need to be overwritten with NULL, see .free.overwrite) +.alloc.grow.size: The size of the reference part segment will be the pool's +extendBy (.poolstruct.extend) value. The link part segment will be whatever +size is necessary to accommodate N link parts, where N is the number of +reference parts that fit in the reference part segment. + +.alloc.error: If any of the requests for more resource (there are two; one for +each of two segments) fail then the successful requests will be retracted and +the result code from the failing request will be returned. + +.alloc.pop: MRGAlloc will pop a ring node off the free list, and add it to the +entry queue. + +.free: MRGFree + + MRGFree will remove the guardian from the message queue and add it to the +free list. .free.push: The guardian will simply be added to the front of the +free list (i.e. 
+that). .free.inadequate: No attempt will be made to return unused free
+segments to the Arena (although see analysis.mps.poolmrg.improve.free.* for
+suggestions).
+
+.free.overwrite:
+ MRGFree also writes over the reference with NULL. .free.overwrite.justify:
+This is so that when the segment is subsequently scanned (.scan.free), the
+reference that used to be in the object is not accidentally fixed.
+
+.init: MRGInit
+
+ Has to initialize the two queues, the free ring, the ref ring, and the
+extendBy field. .init.extend: The extendBy field is initialized to one
+ArenaAlign() (usually a page). .init.extend.justify: This is adequate as the
+pool is not expected to grow very quickly.
+
+.finish: MRGFinish
+
+ Iterates over all the segments, returning them to the Arena.
+
+.scan: MRGScan
+
+.scan.trivial: Scan will do nothing (i.e. return immediately) if the tracing
+rank is anything other than final. [This optimization is missing.
+impl.c.trace.scan.conservative is not a problem because there are no faults on
+these segs, because there are no references into them. But that's why
+TraceScan can't do it. - Pekka 1997-09-19] .scan.trivial.justify: If the
+rank is lower than final then scanning is detrimental: it will only delay
+finalization. If the rank is higher than final there is nothing to do, as the
+pool only contains final references.
+
+.scan.guardians: Scan will iterate over all guardians in the segment. Every
+guardian's reference will be fixed (.scan.free: note that guardians that are on
+the free list have NULL in their reference part). .scan.wasold: If the object
+referred to had not been fixed previously (i.e. was unmarked) then the object
+is not referenced by a reference of a lower rank (than FINAL) and hence is
+finalizable. .scan.finalize: The guardian will be finalized. This entails
+moving the guardian from state Prefinal to Final; it is removed from the entry
+queue and initialized as a message and posted on the arena's message queue.
+.scan.finalize.idempotent: In fact this will only happen if the guardian has
+not already been finalized (which is determined by examining the state of the
+guardian).
+
+.scan.unordered: Because scanning occurs a segment at a time, the order in
+which objects are finalized is "random" (it cannot be predicted by considering
+only the references between objects registered for finalization). See
+analysis.mps.poolmrg.improve.semantics for how this can be improved.
+.scan.unordered.justify: Unordered finalization is all that is required.
+
+(see analysis.mps.poolmrg.improve.scan.nomove for a suggested improvement that
+avoids redundant unlinking and relinking).
+
+.describe: MRGDescribe
+
+ Will print out the usual blurb.
+ Will iterate along each of the entry and exit queues and print out the
+guardians in each. The location of the guardian and the value of the reference
+in it will be printed out.
+
+.functions.unused: BufferInit, BufferFill, BufferEmpty, BufferFinish,
+TraceBegin, Condemn, Fix, Reclaim, TraceEnd, Benefit.
+
+ All of these will be unused.
+
+.functions.trivial: The Grey method of the pool class will be PoolTrivGrey, as
+this pool has no further bookkeeping to perform for grey segments.
+
+
+TRANSGRESSIONS
+
+.trans.no-finish: The MRG pool does not trouble itself to tidy up its internal
+rings properly when being destroyed.
+
+.trans.free-seg: No attempt is made to release free segments to the arena. A
+suggested strategy for this is as follows:
+ - Add a count of free guardians to each segment, and maintain it in
+appropriate places.
+ - Add a free segment ring to the pool.
+ - In MRGRefSegScan, if the segment is entirely free, don't scan it, but
+instead detach its links from the free ring, and move the segment to the free
+segment ring.
+ - At some appropriate point (such as the end of MRGAlloc), destroy free
+segments.
+ - In MRGAlloc, if there are no free guardians, check the free segment ring
+before creating a new pair of segments.
+Note that this algorithm would give some slight measure of segment hysteresis.
+It is not the place of the pool to support general segment hysteresis.
+
+
+FUTURE
+
+.future.array: In future, for speed or simplicity, this pool could be rewritten
+to use an array. See mail.gavinm.1997-09-04.13-08(0).
+
+
+TESTS
+
+.test: [This section is utterly out of date. -- Pekka 1997-09-19] The test
+impl.c.finalcv is similar to the weakness test (see design.mps.weakness,
+impl.c.weakcv [???]).
+
+
+Functionality
+
+ This is the functionality to be tested:
+
+ .fun.alloc: Can allocate objects.
+
+ .fun.free: Can free objects that were allocated.
+
+ .prot.write: Can write a reference into an allocated object.
+
+ .prot.read: Can read the reference from an allocated object.
+
+ .promise.faithful: A reference stored in an allocated object will continue to
+refer to the same object.
+
+ .promise.live: A reference stored in an allocated object will preserve the
+object referred to.
+
+ .promise.unreachable: Any objects referred to in finalization messages are
+not (at the time of reading the message) reachable via a chain of ambiguous or
+exact references. (We will not be able to test this at first as there is no
+messaging interface.)
+
+ .promise.try: The Pool will make a "good faith" effort to finalize objects
+that are not reachable via a chain of ambiguous or exact references.
+
+
+Attributes
+
+ The following attributes will be tested:
+
+.attr.none: There are no attribute requirements.
+
+
+Implementation
+
+[New test]
+The new test will simply allocate a number of objects in the AMC pool and
+finalize each one, throwing away the reference to the objects. Churn.
+
+ .test.mpm: The test will use the MPM interface (impl.h.mpm).
+.test.mpm.justify: This is because it is not intended to provide an MPS
+interface to this pool directly, and the MPS interface to finalization has not
+been written yet (impl.h.mps). .test.mpm.change: Later on it may use the MPS
+interface, in which case, where the following text refers to allocating objects
+in the MRG pool it will need adjusting.
+
+ .test.two-pools: The test will use two pools, an AMC pool, and an MRG pool.
+
+ .test.alloc: A number of objects will be allocated in the MRG pool.
+.test.free: They will then be freed. This will test .fun.alloc and .fun.free,
+although not very much.
+
+ .test.rw.a: An object, 'A', will be allocated in the AMC pool; a reference to
+it will be kept in a root. .test.rw.alloc: A number of objects will be
+allocated in the MRG pool. .test.rw.write: A reference to A will be written
+into each object. .test.rw.read: The reference in each object will be read and
+checked to see if it refers to A. .test.rw.free: All the objects will be
+freed. .test.rw.drop: The reference to A will be dropped. This will test
+.prot.write and .prot.read.
+
+ .test.promise.fl.alloc: A number of objects will be allocated in the AMC
+pool. .test.promise.fl.tag: Each object will be tagged uniquely.
+.test.promise.fl.refer: a reference to it will be stored in an object allocated
+in the MRG pool. .test.promise.fl.churn: A large amount of garbage will be
+allocated in the AMC pool. Regularly, whilst this garbage is being allocated,
+a check will be performed that all the objects allocated in the MRG pool refer
+to valid objects and that they still refer to the same objects. All objects
+from the MRG pool will then be freed (thus dropping all references to the AMC
+objects). This will test .promise.faithful and .promise.live.
+
+ .test.promise.ut.not: The following part of the test has not been implemented.
+This is because the messaging system has not yet been implemented.
+ .test.promise.ut.alloc: A number of objects will be allocated in the AMC
+pool. .test.promise.ut.refer: Each object will be referred to by a root and
+also referred to by an object allocated in the MRG pool. .test.promise.ut.drop:
+References to a random selection of the objects from the AMC pool will be
+deleted from the root. .test.promise.ut.churn: A large amount of garbage will
+be allocated in the AMC pool. .test.promise.ut.message: The message interface
+will be used to receive finalization messages. .test.promise.ut.final.check:
+For each finalization message received it will check that the object referenced
+in the message is not referred to in the root. .test.promise.ut.nofinal.check:
+After some amount of garbage has been allocated it will check to see if any
+objects are not in the root and haven't been finalized. This will test
+.promise.unreachable and .promise.try.
+
+
+NOTES
+
+
+.access.inadequate: PoolAccess will scan segments at Rank Exact. Really they
+should be scanned at whatever the minimum rank of all grey segments is (the
+trace rank phase), but there is no way to find this out. As a consequence
+we will sometimes scan pages at Rank Exact when the pages could have been
+scanned at Rank Final. This means that finalization of some objects may
+sometimes get delayed.
+
diff --git a/mps/design/poolmv/index.txt b/mps/design/poolmv/index.txt
new file mode 100644
index 00000000000..a8ed9b1b75a
--- /dev/null
+++ b/mps/design/poolmv/index.txt
@@ -0,0 +1,12 @@
+ THE DESIGN OF THE MANUAL VARIABLE MEMORY POOL CLASS
+ design.mps.poolmv
+ incomplete design
+ richard 1995-08-25
+
+
+IMPLEMENTATION:
+
+.lost: It is possible for MV to "lose" memory when freeing an object. This
+happens when an extra block descriptor is needed (ie the interior of a block is
+being freed) and the call to allocate the block fails.
+
diff --git a/mps/design/poolmv2/index.txt b/mps/design/poolmv2/index.txt
new file mode 100644
index 00000000000..7d4acdc89af
--- /dev/null
+++ b/mps/design/poolmv2/index.txt
@@ -0,0 +1,760 @@
+ THE DESIGN OF A NEW MANUAL-VARIABLE MEMORY POOL CLASS
+ design.mps.poolmv2
+ draft design
+ P T Withington 1998-02-13
+
+
+INTRODUCTION:
+
+This is a second-generation design for a pool that manually manages
+variable-sized objects. It is intended as a replacement for poolmv (except in
+its control pool role) and poolepdl, and it is intended to satisfy the
+requirements of the Dylan "misc" pool and the product malloc/new drop-in
+replacement.
+
+[This form should include these fields, rather than me having to create them
+"by hand"]
+
+.readership: MM developers
+
+.source: req.dylan(6), req.epcore(16), req.product(2)
+
+.background: design.mps.poolmv(0), design.mps.poolepdl(0),
+design.product.soft.drop(0), paper.wil95(1), paper.vo96(0), paper.grun92(1),
+paper.beck82(0), mail.ptw.1998-02-25.22-18(0)
+
+.hist.-1: Initial email discussion mail.ptw.1998-02-04.21-27(0), ff.
+.hist.0: Draft created 1998-02-13 by P. T. Withington from email RFC
+mail.ptw.1998-02-12.03-36, ff.
+.hist.1: Revised 1998-04-01 in response to email RFC
+mail.ptw.1998-03-23.20-43(0), ff.
+.hist.2: Revised 1998-04-15 in response to email RFC
+mail.ptw.1998-04-13.21-40(0), ff.
+.hist.3: Erroneously incremented version number
+.hist.4: Revised 1998-05-06 in response to review
+review.design.mps.poolmv2.2(0)
+
+
+DEFINITIONS
+
+.def.alignment: Alignment is a constraint on an object's address, typically to
+be a power of 2 (see also glossary.alignment).
+
+.def.bit-map: A bitmap is a boolean-valued vector (see also glossary.bitmap).
+
+.def.block: A block is a contiguous extent of memory. In this document, block
+is used to mean a contiguous extent of memory managed by the pool for the pool
+client, typically a subset of a segment (compare with .def.segment).
+
+.def.cartesian-tree: A cartesian tree is a binary tree ordered by two keys
+(paper.stephenson83(0)).
+
+.def.crossing-map: A mechanism that supports finding the start of an object
+from any address within the object, typically only required on untagged
+architectures (see also glossary.crossing.map).
+
+.def.footer: A block of descriptive information describing and immediately
+following another block of memory (see also .def.header).
+
+.def.fragmentation: Fragmented memory is memory reserved to the program but not
+usable by the program because of the arrangement of memory already in use (see
+also glossary.fragmentation).
+
+.def.header: A block of descriptive information describing and immediately
+preceding another block of memory (see also glossary.in-band.header).
+
+.def.in-band: From "in band signalling", when descriptive information about a
+data structure is stored in the data structure itself (see also
+glossary.in-band.header).
+
+.def.out-of-band: When descriptive information about a data structure is stored
+separately from the structure itself (see also glossary.out-of-band.header).
+
+.def.refcount: A refcount is a count of the number of users of an object (see
+also glossary.reference.count).
+
+.def.segment: A segment is a contiguous extent of memory. In this document,
+segment is used to mean a contiguous extent of memory managed by the MPS arena
+(design.mps.arena(1)) and subdivided by the pool to provide blocks (see
+.def.block) to its clients.
+
+.def.splay-tree: A splay tree is a self-adjusting binary tree (paper.st85(0),
+paper.sleator96(0)).
+
+.def.splinter: A splinter is a fragment of memory that is too small to be
+useful (see also glossary.splinter).
+
+.def.subblock: A subblock is a contiguous extent of memory. In this document,
+subblock is used to mean a contiguous extent of memory managed by the client
+for its own use, typically a subset of a block (compare with .def.block).
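+
+To illustrate .def.header, .def.footer, and .def.in-band, a block described
+in-band might be laid out as in the following C sketch. This is an invented
+example, not a description of any existing MPS or pool structure, and all of
+the names in it are hypothetical:
+
+  #include <stddef.h>
+
+  /* In-band description: a header immediately precedes the client's
+     memory and a footer immediately follows it. */
+  typedef struct ExampleHeaderStruct {
+    size_t size;         /* size of the client-usable memory */
+    unsigned allocated;  /* non-zero while the block is allocated */
+  } ExampleHeaderStruct;
+
+  typedef struct ExampleFooterStruct {
+    size_t size;         /* duplicate of the header size, for checking */
+  } ExampleFooterStruct;
+
+  /* Given the client's pointer, the header and footer are recovered by
+     address arithmetic alone; that is the point of in-band description. */
+  static ExampleHeaderStruct *exampleHeaderOf(void *client)
+  {
+    return (ExampleHeaderStruct *)client - 1;
+  }
+
+  static ExampleFooterStruct *exampleFooterOf(void *client)
+  {
+    ExampleHeaderStruct *header = exampleHeaderOf(client);
+    return (ExampleFooterStruct *)((char *)client + header->size);
+  }
+
+An out-of-band description (.def.out-of-band) would instead record the same
+information in a separate table or tree keyed by the block's address.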
+
+
+ABBREVIATIONS
+
+.abbr.abq: ABQ = Available Block Queue
+
+.abbr.ap: AP = Allocation Point
+
+.abbr.cbs: CBS = Coalescing Block Structure
+
+.abbr.mps: MPS = Memory Pool System
+
+.abbr.mv: MV = Manual-Variable
+
+.abbr.ps: PS = PostScript
+
+
+OVERVIEW:
+
+mv2 is intended to satisfy the requirements of the clients that need
+manual-variable pools, improving on the performance of the existing
+manual-variable pool implementations, and reducing the duplication of code that
+currently exists. The expected clients of mv2 are: Dylan (currently for its
+misc pool), EP (particularly the dl pool, but all pools other than the PS
+object pool), and Product (initially the malloc/new pool, but also other manual
+pool classes).
+
+
+REQUIREMENTS:
+
+.req.cat: Requirements are categorized per guide.req(2).
+
+.req.risk: req.epcore(16) is known to be obsolete, but the revised document has
+not yet been accepted.
+
+
+Critical Requirements
+
+.req.fun.man-var: The pool class must support manual allocation and freeing of
+variable-sized blocks (source: req.dylan.fun.misc.alloc,
+req.epcore.fun.{dl,gen,tmp,stat,cache,trap}.{alloc,free},
+req.product.fun.{malloc,new,man.man}).
+
+.non-req.fun.gc: There is no requirement that the pool class support formatted
+objects, scanning, or collection of objects; but these should not be
+arbitrarily precluded.
+
+.req.fun.align: The pool class must support aligned allocations to
+client-specified alignments. An individual instance need only support a single
+alignment; multiple instances may be used to support more than one alignment
+(source: req.epcore.attr.align).
+
+.req.fun.reallocate: The pool class must support resizing of allocated blocks
+(source: req.epcore.fun.dl.promise.free, req.product.dc.env.{ansi-c,cpp}).
+
+.non-req.fun.reallocate.in-place: There is no requirement that blocks be
+resized in place (where possible); but it seems like a good idea.
+
+.req.fun.thread: Each instance of the pool class must support multiple threads
+of allocation (source: req.epcore.fun.dl.multi,
+req.product.dc.env.{ansi-c,cpp}).
+
+.req.attr.performance: The pool class must meet or exceed performance of
+"competitive" allocators (source: req.epcore.attr.{run-time,tp},
+req.product.attr.{mkt.eval, perform}). [Dylan does not seem to have any
+requirement that storage be allocated with a particular response time or
+throughput, just so long as we don't block for too long. Clearly there is a
+missing requirement.]
+
+.req.attr.performance.time: By inference, the time overhead must be
+competitive.
+
+.req.attr.performance.space: By inference, the space overhead must be
+competitive.
+
+.req.attr.reliability: The pool class must have "rock-solid reliability"
+(source: req.dylan.attr.rel.mtbf, req.epcore.attr.rel, req.product.attr.rel).
+
+.req.fun.range: The pool class must be able to manage blocks ranging in size
+from 1 byte to all of addressable memory
+(req.epcore.attr.{dl,gen,tmp,stat,cache,trap}.obj.{min,max}). The range
+requirement may be satisfied by multiple instances each managing a particular
+client-specified subrange of sizes. [Dylan has requirements
+req.dylan.attr.{capacity,obj.max}, but no requirement that such objects reside
+in a manual pool.]
+
+.req.fun.debug: The pool class must support debugging erroneous usage by client
+programs (source: req.epcore.fun.{dc.variety, debug.support},
+req.product.attr.{mkt.eval,perform}). Debugging is permitted to incur
+additional overhead.
+ +.req.fun.debug.boundaries: The pool class must support checking for accesses +outside the boundaries of live objects. + +.req.fun.debug.log: The pool class must support logging of all allocations and +deallocations. + +.req.fun.debug.enumerate: The pool class must support examining all allocated +objects. + +.req.fun.debug.free: The pool class must support detecting incorrect, +overlapping, and double frees. + +.req.fun.tolerant: The pool class must support tolerance of erroneous usage +(source req.product.attr.use.level.1). + + +Essential Requirements + +.req.fun.profile: The pool class should support memory usage profiling (source: +req.product.attr.{mkt.eval, perform}). + +.req.attr.flex: The pool class should be flexible so that it can be tuned to +specific allocation and freeing patterns (source: +req.product.attr.flex,req.epcore.attr.{dl,cache,trap}.typ). The flexibility +requirement may be satisfied by multiple instances each optimizing a specific +pattern. + +.req.attr.adapt: The pool class should be adaptive so that it can accommodate +changing allocation and freeing patterns (source: +req.epcore.fun.{tmp,stat}.policy, req.product.attr.{mkt.eval,perform}). + + +Nice Requirements + +.req.fun.suballocate: The pool class may support freeing of any aligned, +contiguous subset of an allocated block (source req.epcore.fun.dl.free.any, +req.product.attr.{mkt.eval,perform}). + + + + +ARCHITECTURE: + +.arch.overview: The pool has several layers: client allocation is by Allocation +Points (APs). .arch.overview.ap: APs acquire storage from the pool +available-block queue (ABQ). .arch.overview.abq: The ABQ holds blocks of a +minimum configurable size: "reuse size". .arch.overview.storage: The ABQ +acquires storage from the arena or from the coalescing-block structure (CBS). +.arch.overview.storage.contiguous: The arena storage is requested to be +contiguous to maximize opportunities for coalescing (Loci will be used when +available). .arch.overview.cbs: The CBS holds blocks freed by the client until, +through coalescing, they have reached the reuse size, at which point they are +made available on the ABQ. + +.arch.ap: The pool will use allocation points as the allocation interface to +the client. .arch.ap.two-phase: Allocation points will request blocks from the +pool and suballocate those blocks (using the existing AP, compare and +increment, 2-phase mechanism) to satisfy client requests. .arch.ap.fill: The +pool will have a configurable "fill size" that will be the preferred size block +used to fill the allocation point. .arch.ap.fill.size: The fill size should be +chosen to amortize the cost of refill over a number of typical reserve/commit +operations, but not so large as to exceed the typical object population of the +pool. .arch.ap.no-fit: When an allocation does not fit in the remaining space +of the allocation point, there may be a remaining fragment. +.arch.ap.no-fit.sawdust: If the fragment is below a configurable threshold +(minimum size), it will be left unused (but returned to the CBS so it will be +reclaimed when adjacent objects are freed); .arch.ap.no-fit.splinter: +otherwise, the remaining fragment will be (effectively) returned to the head of +the available-block queue, so that it will be used as soon as possible (i.e., +by objects of similar birthdate). 
.arch.ap.no-fit.oversize: If the requested
+allocation exceeds the fill size, it is treated exceptionally (this may
+indicate the client has either misconfigured or misused the pool and should
+either change the pool configuration or create a separate pool for these
+exceptional objects for best performance). .arch.ap.no-fit.oversize.policy:
+Oversize blocks are assumed to have exceptional lifetimes, hence are allocated
+to one side and do not participate in the normal storage recycling of the pool.
+.arch.ap.refill.overhead: If reuse size is small, or becomes small due to
+.arch.adapt, all allocations will effectively be treated exceptionally (the AP
+will trip and an oldest-fit block will be chosen on each allocation). This
+mode will be within a constant factor in overhead of an unbuffered pool.
+
+.arch.abq: The available block queue holds blocks that have coalesced
+sufficiently to reach reuse size. .arch.abq.reuse.size: A multiple of the
+quantum of virtual memory is used as the reuse size (.anal.policy.size).
+.arch.abq.fifo: It is a FIFO queue (recently coalesced blocks go to the tail of
+the queue; blocks are taken from the head of the queue for reuse).
+.arch.abq.delay-reuse: By thus delaying reuse, coalescing opportunities are
+greater. .arch.abq.high-water: It has a configurable high water mark, which
+when reached will cause blocks at the head of the queue to be returned to the
+arena, rather than reused. .arch.abq.return: When the MPS supports it, the pool
+will be able to return free blocks from the ABQ to the arena on demand.
+.arch.abq.return.segment: .arch.abq.return can be guaranteed to be able to
+return a segment by setting reuse size to twice the size of the segments the
+pool requests from the arena.
+
+.arch.cbs: The coalescing block structure holds blocks that have been freed by
+the client. .arch.cbs.optimize: The data structure is optimized for coalescing.
+.arch.cbs.abq: When a block reaches reuse size, it is added to the ABQ.
+.arch.cbs.data-structure: The data structures are organized so that a block can
+be on both the CBS and ABQ simultaneously to permit additional coalescing, up
+until the time the block is removed from the ABQ and assigned to an AP.
+
+.arch.fragmentation.internal: Internal fragmentation results from objects not
+being able to cross segment boundaries. The pool will request large segments
+from the arena to minimize this internal fragmentation.
+
+.arch.modular: The architecture will be modular, to allow building variations
+on the pool by assembling different parts. .arch.modular.example: For example,
+it should be possible to build pools with any of the freelist mechanisms, with
+in-band or out-of-band storage (where applicable), that do or do not support
+derived object descriptions, etc.
+
+.arch.modular.initial: The initial architecture will use
+.sol.mech.free-list.splay-tree for the CBS, .sol.mech.storage.out-of-band,
+.sol.mech.desc.derived, and .sol.mech.allocate.buffer.
+
+.arch.segregate: The architecture will support segregated allocation through
+the use of multiple allocation points. The client will choose the appropriate
+allocation point either at run time, or when possible, at compile time.
+
+.arch.segregate.initial: The initial architecture will segregate allocations
+into two classes: large and small. This will be implemented by creating two
+pools with different parameters.
+
+.arch.segregate.initial.choice: The initial architecture will provide glue code
+to choose which pool to allocate from at run time. If possible this glue code
+will be written in a way that a good compiler can optimize the selection of
+pool at compile time. Eventually this glue code should be subsumed by the
+client or generated automatically by a tool.
+
+.arch.debug: Debugging features such as tags, fenceposts, types, and creators
+will be implemented in a layer above the pool and APs. A generic pool
+debugging interface will be developed to support debugging in this outer layer.
+
+.arch.debug.initial: The initial architecture will have counters for
+objects/bytes allocated/freed and support for detecting overlapping frees.
+
+.arch.dependency.loci: The architecture depends on the arena being able to
+efficiently provide segments of varying sizes without excessive fragmentation.
+The locus mechanism should satisfy this dependency. (See .anal.strategy.risk)
+
+.arch.dependency.mfs: The architecture's internal data structures depend on
+efficient manual management of small, fixed-sized objects (2 different sizes).
+The MFS pool should satisfy this dependency.
+
+.arch.contingency: Since the strategy we propose is new, it may not work.
+.arch.contingency.pathological: In particular, pathological allocation patterns
+could result in fragmentation such that no blocks recycle from the CBS to ABQ.
+.arch.contingency.fallback: As a fallback, there will be a pool creation
+parameter for a high water mark for the CBS.
+.arch.contingency.fragmentation-limit: When the free space in the CBS as a
+percentage of all the memory managed by the pool (a measure of fragmentation)
+reaches that high water mark, the CBS will be searched oldest-fit before
+requesting additional segments from the arena. .arch.contingency.alternative:
+We also plan to implement .sol.mech.free-list.cartesian-tree as an alternative
+CBS, which would permit more efficient searching of the CBS.
+
+.arch.parameters: The architecture supports several parameters so that multiple
+pools may be instantiated and tuned to support different object cohorts. The
+important parameters are: reuse size, minimum size, fill size, ABQ high water
+mark, CBS fragmentation limit (see .arch.contingency.fragmentation-limit).
+.arch.parameters.client-visible: The client-visible parameters of the pool are
+the minimum object size, the mean object size, the maximum object size, the
+reserve depth and fragmentation limit. The minimum object size determines when
+a splinter is kept on the head of the ABQ (.arch.ap.no-fit.splinter). The
+maximum object size determines the fill size (.arch.ap.fill.size) and hence
+when a block is allocated exceptionally (.arch.ap.no-fit.oversize). The mean
+object size is the most likely object size. The reserve depth is a measure of
+the hysteresis of the object population. The mean object size, reserve depth,
+and maximum object size are used to determine the size of the ABQ
+(.arch.abq.high-water). The fragmentation limit is used to determine when
+contingency mode is used to satisfy an allocation request (.arch.contingency).
+
+.arch.adapt: We believe that an important adaptation to explore is tying the
+reuse size inversely to the fragmentation (as measured in
+.arch.contingency.fragmentation-limit). .arch.adapt.reuse: By setting reuse
+size low when fragmentation is high, smaller blocks will be available for
+reuse, so fragmentation should diminish. .arch.adapt.overhead: This will result
+in higher overhead as the AP will need to be refilled more often, so reuse size
+should be raised again as fragmentation diminishes. .arch.adapt.oldest-fit: In
+the limit, if reuse size goes to zero, the pool will implement an "oldest-fit"
+policy: the oldest free block of sufficient size will be used for each
+allocation.
+
+.arch.adapt.risk: This adaptation is an experimental policy and should not be
+delivered to clients until thoroughly tested.
+
+
+ANALYSIS:
+
+.anal.discard: We have discarded many traditional solutions based on experience
+and analysis in paper.wil95(1). In particular, we have discarded managing the
+free list as a linear list arranged by address or size, and basing policy on
+searching such a linear list in a particular direction, from a particular
+starting point, using fit and/or immediacy as criteria. We believe that none
+of these solutions is derived from considering the root of the problem to be
+solved (as described in .strategy), although their behavior as analyzed by
+Wilson gives several insights.
+
+.anal.strategy: For any program to run in the minimum required memory (with
+minimal overhead -- we discard solutions such as compression for now),
+fragmentation must be eliminated. To eliminate fragmentation, simply place
+blocks in memory so that they die "in order" and can be immediately coalesced.
+This ideal is not achievable, but we believe we can find object attributes that
+correlate with deathtime and exploit them to approximate the ideal. Initially
+we believe birth time and type (as approximated by size) will be useful
+attributes to explore.
+
+.anal.strategy.perform: To meet .req.attr.performance, the implementation of
+.sol.strategy must be competitive in both time and space.
+
+.anal.strategy.risk: The current MPS segment substrate can cause internal
+fragmentation which an individual pool can do nothing about. We expect that
+request.epcore.170193.sugg.loci will be implemented to remove this risk.
+
+.anal.policy: Deferred coalescing, when taken to the extreme, will not minimize
+the memory consumption of a program, as no memory would ever be reused. Eager
+reuse appears to lead to more fragmentation, whereas delayed reuse appears to
+reduce fragmentation (paper.wil95(1)). The systems studied by Wilson did not
+directly address deferring reuse. Our proposed policy is to reuse blocks when
+they reach a (configurable) size. We believe that this policy, along with the
+policy of segregating allocations by death time, will greatly reduce
+fragmentation. .anal.policy.risk: This policy could lead to pathological
+behavior if allocations cannot be successfully segregated.
+
+.anal.policy.allocate.segregate: This policy has some similarities to
+CustomAlloc (paper.grun92(1)). CustomAlloc segregates objects by size classes,
+and then within those classes chooses a different allocator depending on
+whether that size class has a stable or unstable population. Classes with
+stable population recycle storage within the class, whereas classes with
+unstable populations return their storage to the general allocation pool for
+possible reuse by another class. CustomAlloc, however, requires profiling the
+application and tuning the allocator according to those profiles. Although we
+intend to support such tuning, we do not want to require it.
+
+.anal.policy.reallocate: For reallocation, .req.fun.suballocate can be used to
+free the remainder if a block is made smaller. Doing so will cause the freed
+block to obey .sol.policy.allocate [i.e., the freed block will not be treated
+specially; it will be subject to the normal policy on reuse]. Copying can be
+used if a block is made larger.
+paper.vo96(0) reports success in over-allocating a block the first time it is
+resized larger, presumably because blocks that are resized once tend to be
+resized again and over-allocating may avoid a subsequent copy. If each object
+that will be reallocated can be given its own allocation point until its final
+reallocation, the allocation point can be used to hold released or spare
+storage.
+
+.anal.policy.size: We believe that this choice of reuse size
+(.arch.abq.reuse.size) will take advantage of the underlying virtual memory
+system's ability to compact the physical memory footprint of the program by
+discarding free fragments that align with the virtual memory quantum. (In a VM
+system one can approximate compaction by sparse mapping. If every other page
+of a segment is unused, the unused pages can be unmapped, freeing up physical
+memory that can be mapped to a new contiguous vm range.)
+
+.anal.mech.freelist: The literature (paper.grun92(1), paper.vo96(0)) indicates
+that .sol.mech.free-list.cartesian-tree provides a space-efficient
+implementation at some cost in speed. .sol.mech.free-list.splay-tree is faster
+but less space-efficient. .sol.mech.free-list.bit-map is unstudied. Many of
+the faster allocators maintain caches of free blocks by size to speed
+allocation of "popular" sizes. We intend to initially explore not doing so, as
+we believe that policy ultimately leads to fragmentation by mixing objects of
+varying death times. Instead we intend to use a free list mechanism to support
+fast coalescing, deferring reuse of blocks until a minimum size has been
+reached.
+
+.anal.mech.allocate.optimize-small: Wilson (paper.wil95(1)) notes that small
+blocks typically have short lifetimes and that overall performance is improved
+if you optimize the management of small blocks, e.g.,
+.sol.mech.allocate.lookup-table for all small blocks. We believe that
+.sol.mech.allocate.buffer does exactly that.
+
+.anal.mech.allocate.optimize-new: Wilson (paper.wil95(1)) reports some benefit
+from "preserving wilderness", that is, when a block of memory must be requested
+from the system to satisfy an allocation, only the minimum amount of that block
+is used; the remainder is preserved (effectively by putting it at the tail of
+the free list). This mechanism may or may not implement .sol.policy.allocate.
+We believe a better mechanism is to choose to preserve or not, based on
+.sol.policy.allocate.
+
+
+IDEAS:
+
+.sol: Many solution ideas for manual management of variable-sized memory blocks
+are enumerated by paper.wil95(1). Here we list the most promising, and some of
+our own.
+
+
+Strategy
+
+.sol.strategy: To run a program in the minimal required memory, with minimal
+overhead, utilize memory efficiently. Memory becomes unusable when fragmented.
+The strategy is to minimize fragmentation. So place blocks where they won't
+cause fragmentation later.
+
+.sol.strategy.death: objects that will die together (in time) should be
+allocated together (in space); thus they will coalesce, reducing fragmentation.
+
+.sol.strategy.death.birth: assume objects allocated near each other in time
+will have similar deathtimes (paper.beck82(0))
+.sol.strategy.death.type: assume objects of different type may have different
+deathtimes, even if born together
+.sol.strategy.death.predict: find and use program features to predict deathtimes
+
+.sol.strategy.reallocate: reallocation implies rebirth, or at least a change in
+lifetime
+
+.sol.strategy.debug: as much of the debugging functionality as possible should
+be implemented as a generally available MPS utility; the pool will provide
+support for debugging that would be expensive or impossible to allocate outside
+the pool
+
+
+Policy
+
+[Policy is an implementable decision procedure, hopefully approximating the
+strategy.]
+
+.sol.policy.reuse: defer reusing blocks, to encourage coalescing
+.sol.policy.split: when a block is split to satisfy an allocation, use the
+remainder as soon as possible
+.sol.policy.size: prevent .sol.policy.reuse from consuming all of memory by
+choosing a (coalesced) block for reuse when it reaches a minimum size
+.sol.policy.size.fixed: use the quantum of virtual memory (e.g., one page) as
+minimum size
+.sol.policy.size.tune: allow tuning minimum size
+.sol.policy.size.adapt: adaptively change minimum size
+.sol.policy.allocate: allocate objects with similar birthdate and lifetime
+together
+.sol.policy.allocate.segregate: segregate allocations by type
+.sol.policy.allocate.segregate.size: use size as a substitute for type
+.sol.policy.allocate.segregate.tune: permit tuning of segregation
+.sol.policy.allocate.segregate.adapt: adaptively segregate allocations
+
+.sol.policy.reallocate: implement reallocation in a central mechanism outside
+of the pool; create a generic pool interface in support of same.
+
+.sol.policy.debug: implement a pool debugging interface
+.sol.policy.debug.counters: implement debugging counters in the pool that are
+queried with a generic interface
+.sol.policy.debug.verify: implement debugging error returns on overlapping frees
+
+
+Mechanism
+
+[Mechanisms are algorithms or data structures used to implement policy.]
+
+.sol.mech.free-list: mechanisms that can be used to describe the free list
+
+.sol.mech.free-list.cartesian-tree: Using address and size as keys supports
+fast coalescing of adjacent blocks and fast searching for optimal-sized blocks.
+Unfortunately, because the shape of the tree is constrained by the second key,
+it can become unbalanced. This data structure is used in the SunOS 4.1 malloc
+(paper.grun92(1)).
+.sol.mech.free-list.splay-tree: The amortized cost of a splay tree is
+competitive with balanced binary trees in the worst case, but can be
+significantly better for regular patterns of access because recently-accessed
+keys are moved to the root of the tree and hence can be re-accessed quickly.
+This data structure is used in the System Vr4 malloc (paper.vo96(0)). (For a
+complete analysis of the splay tree algorithm time bounds see paper.st85(0).)
+.sol.mech.free-list.bit-map: Using address as an index and fixed-sized blocks,
+the booleans can represent whether a block is free or not. Adjacent blocks can
+be used to construct larger blocks. Efficient algorithms for searching for runs
+in a vector are known. This data structure is used in many file system disk
+block managers.
+.sol.mech.free-list.refcount: A count of the number of allocated but not freed
+subblocks of a block can be used to determine when a block is available for
+reuse.
This is an extremely compact data structure, but does not support +subblock reuse. +.sol.mech.free-list.hybrid: Bitmaps appear suited particularly to managing +small, contiguous blocks. The tree structures appear suited particularly to +managing varying-sized, discontiguous blocks. A refcount can be very efficient +if objects can be placed accurately according to death time. A hybrid mechanism +may offer better performance for a wider range of situations. + +.sol.mech.storage: methods that can be used to store the free list description + +.sol.mech.storage.in-band: The tree data structures are amenable to being +stored in the free blocks themselves, minimizing the space overhead of +management. To do so imposes a minimum size on free blocks and reduces the +locality of the data structure. +.sol.mech.storage.out-of-band: The bit-map data structure must be stored +separately. + +.sol.mech.desc: for an allocated block to be freed, its base and bound must be +known + +.sol.mech.desc.derived: Most clients can supply the base of the block. Some +clients can supply the bound. +.sol.mech.desc.in-band: When the bound cannot be supplied, it can be stored as +an in-band "header". If neither the base nor bound can be supplied (e.g., the +client may only have an interior pointer to the block), a header and footer may +be required. +.sol.mech.desc.out-of-band: In un-tagged architectures, it may be necessary to +store the header and footer out-of-band to distinguish them from client data. +Out-of-band storage can improve locality and reliability. Any of the free-list +structures can also be used to describe allocated blocks out-of-band. +.sol.mech.desc.crossing-map: An alternative for untagged architectures is to +store a "crossing map" which records an encoding of the start of objects and +then store the descriptive information in-band. + +.sol.mech.allocate: mechanisms that can be used to allocate blocks (these +typically sit on top of a more general free-list manager) + +.sol.mech.allocate.lookup-table: Use a table of popular sizes to cache free +blocks of those sizes. +.sol.mech.allocate.buffer: Allocate from contiguous blocks using compare and +increment. +.sol.mech.allocate.optimize-small: Use a combination of techniques to ensure +the time spent managing a block is small relative to the block's lifetime; +assume small blocks typically have short lifetimes. +.sol.mech.allocate.optimize-new: When "virgin" memory is acquired from the +operating system to satisfy a request, try to preserve it (i.e., use only what +is necessary) +.sol.mech.allocate.segregate.size: use size as a substitute for type + +.sol.mech.reallocate: use .req.fun.suballocate to return unused memory when a +block shrinks, but differentiate this from an erroneous overlapping free by +using separate interfaces. + + + + +IMPLEMENTATION: + +The implementation consists of the following separable modules: + + +Coalescing Block Structure + +.impl.c.cbs: The initial implementation will use .sol.mech.free-list.splay-tree +and sol.mech.storage.out-of-band. For locality, this storage should be managed +as a linked free list of splay nodes suballocated from blocks acquired from a +pool shared by all CBS's. Must support creation and destruction of an empty +tree. Must support search, insert and delete by key of type Addr. Must support +finding left and right neighbors of a failed search for a key. Must support +iterating over the elements of the tree with reasonable efficiency. 
Must
+support storing and retrieving a value of type Size associated with the key.
+Standard checking and description should be provided. See design.mps.splay(0)
+and design.mps.cbs(0).
+
+
+Available Block Queue
+
+.impl.c.abq: The initial implementation will be a queue of fixed size
+(determined at pool creation time from the high water mark). Must support
+creation and destruction of an empty queue. Must support insertion at the head
+or tail of the queue (failing if full), peeking at the head of the queue, and
+removal of the head (failing if empty) or any element of the queue (found by a
+search). Standard checking and description should be provided.
+
+
+Pool Implementation
+
+.impl.c: The initial implementation will use the above modules to implement a
+buffered pool. Must support creation and destruction of the pool. Creation
+takes parameters: minimum size, mean size, maximum size, reserve depth and
+fragmentation limit. Minimum, mean, and maximum size are used to calculate the
+internal fill and reuse sizes. Reserve depth and mean size are used to
+calculate the ABQ high water mark. Fragmentation limit is used to set the CBS
+contingency mode. Must support buffer initialization, filling and emptying.
+Must support freeing. Standard checking and description should be provided.
+[Eventually, it should support scanning, so it can be used with collected
+pools, but no manual pool currently does.]
+
+.impl.c.future: The implementation should not preclude "buffered free"
+(mail.ptw.1997-12-05.19-07(0), ff.) being added in the future.
+
+.impl.c.parameters: The pool parameters are calculated as follows from the
+input parameters: minimum, mean, and maximum size are taken directly from the
+parameters. .impl.c.parameter.fill-size: The fill size is set to the maximum
+size times the reciprocal of the fragmentation limit, aligned to the arena
+alignment. .impl.c.parameter.reuse-size: The reuse size is set to twice the
+fill size (see .arch.abq.return.segment, .impl.c.free.merge.segment).
+.impl.c.parameter.abq-limit: The ABQ high-water limit is set to the reserve
+depth times the mean size (that is, the queue should hold as many reuse blocks
+as it would take to cover the population hysteresis if the population
+consisted solely of mean-sized blocks, see .arch.abq.high-water).
+.impl.c.parameter.avail-limit: The CBS high-water limit is implemented by
+comparing the available free space to an "available limit". The available
+limit is updated each time a segment is allocated from or returned to the arena
+by setting it to the total size of the pool times the fragmentation limit
+divided by 100 (see .arch.contingency.fallback).
+
+.impl.c.ap.fill: An AP fill request will be handled as follows:
+o If the request is larger than fill size, attempt to request a segment from
+the arena sufficient to satisfy the request
+o Use any previously returned splinter (from .impl.c.ap.empty), if large enough
+o Attempt to retrieve a free block from the head of the ABQ (removing it from
+ABQ and CBS if found).
+o If above fragmentation limit, attempt to find a block on the CBS, using +oldest-fit search +o Attempt to request a segment of fill size from the arena +o Attempt to find a block on the CBS, using oldest-fit search +o Otherwise, fail + +.impl.c.ap.empty: An AP empty request will be handled as follows: +o If remaining free is less than min size, return it to the CBS +o If the remaining free is larger than any previous splinter, return that +splinter to the CBS and save this one for use by a subsequent fill +o Otherwise return the remaining block to the CBS + +.impl.c.free: When blocks are returned to the CBS a search is made for adjacent +blocks that can be merged. If not, the block is simply inserted in the CBS. If +a merge occurs between two blocks on the ABQ, the ABQ must be adjusted to +reflect the merge. .impl.c.free.exception: Exceptional blocks are returned +directly to the arena. + +.impl.c.free.merge: If a merge occurs and the merged block is larger than reuse +size: +o If the ABQ is full, remove the block at the head of the ABQ from the ABQ and +CBS and return it to the arena(*) +o Insert the newly merged block at the tail of the ABQ, leaving it on the CBS +for further merging + +.impl.c.free.merge.segment: (*) Merged blocks may not align with arena +segments. If necessary, return the interior segments of a block to the arena +and return the splinters to the CBS. .impl.c.free.merge.segment.reuse: If the +reuse size (the size at which blocks recycle from the CBS to the ABQ) is at +least twice the fill size (the size of segments the pool allocates from the +arena), we can guarantee that there will always be a returnable segment in +every ABQ block. .impl.c.free.merge.segment.overflow: If the reuse size is set +smaller (see .arch.adapt), there may not be a returnable segment in an ABQ +block, in which case the ABQ has "overflowed". Whenever this occurs, the ABQ +will be refilled by searching the CBS for dropped reusable blocks when needed. + +.impl.c.free.merge.segment.risk: The current segment structure does not really +support what we would like to do. Loci should do better: support reserving +contiguous address space and mapping/unmapping any portion of that address +space. + +.impl.c.free.merge.alternative: Alternatively, if the MPS segment substrate +permitted mapping/unmapping of pages, the pool could use very large segments +and map/unmap pages as needed. + + +AP Dispatch + +.impl.c.multiap: The initial implementation will be a glue layer that selects +among several AP's for allocation according to the predicted deathtime (as +approximated by size) of the requested allocation. Each AP will be filled from +a pool instance tuned to the range of object sizes expected to be allocated +from that AP. [For bonus points provide an interface that creates a batch of +pools and AP's according to some set of expected object sizes. Eventually +expand to understand object lifetimes and general lifetime prediction keys.] + +impl.c.multiap.sample-code: This glue code is not properly part of the pool or +MPS interface. It is a layer on top of the MPS interface, intended as sample +code for unsophisticated clients. Sophisticated clients will likely want to +choose among multiple AP's more directly. + + + + +TESTING: + +.test.component: Components .impl.c.splay, .impl.c.cbs, and .impl.c.abq will be +subjected to individual component tests to verify their functionality. 
+
+.test.regression: All tests applied to poolmv (design.mps.poolmv(0)) and
+poolepdl (design.mps.poolepdl(0)) will be applied to poolmv2 to ensure that mv2
+is at least as functional as the pools it is replacing.
+
+.test.qa: Once poolmv2 is integrated into the MPS, the standard MPS QA tests
+will be applied to poolmv2 prior to each release.
+
+.test.customer: Customer acceptance tests will be performed on a per-customer
+basis before release to that customer (cf. proc.release.epcore(2).test).
+
+
+TEXT:
+
+Possible tweaks (from mail.pekka.1998-04-15.13-10(0)):
+
+1. Try to coalesce splinters returned from AP's with the front (or any) block
+on the ABQ.
+2. Sort ABQ in some other way to minimize splitting/splinters. E.g., proximity
+to recently allocated blocks.
+
diff --git a/mps/design/poolmvff/index.txt b/mps/design/poolmvff/index.txt
new file mode 100644
index 00000000000..39191bf3489
--- /dev/null
+++ b/mps/design/poolmvff/index.txt
@@ -0,0 +1,106 @@
+ DESIGN OF THE MANUALLY-MANAGED VARIABLE-SIZE FIRST-FIT POOL
+ design.mps.poolmvff
+ incomplete doc
+ gavinm 1998-09-09
+
+INTRODUCTION
+
+.intro: The pool was created in response to a belief that EPDL/EPDR's first
+fit policy is beneficial for some classes of client behaviour, but the
+performance of a linear free list was unacceptable. This pool implements a
+first (or last) fit policy for variable-sized manually-managed objects, with
+control over first/last, segment preference high/low, and slot fit low/high.
+
+
+Document History
+
+.hist.0: GavinM wrote a list of methods and functions plus some notes
+1998-09-09.
+
+.hist.1: Added overview, removed bogus ArenaEnter design, and described
+buffered allocation. pekka 1999-01-06
+
+.hist.2: Modified for the "Sunset On Segments" redesign of segments. Buffered
+allocation is no longer limited to segment boundaries.
+
+OVERVIEW
+
+.over: This pool implements certain variants of the address-ordered first-fit
+policy. The implementation allows allocation across segment boundaries.
+.over.buffer: Buffered allocation is also supported, but in that case, the
+buffer-filling policy is worst-fit. Buffered and unbuffered allocation can be
+used at the same time, but in that case, the first AP must be created before
+any allocations. .over.buffer.class: The pool uses the simplest buffer class,
+BufferClass. This is appropriate since these buffers don't attach to segments,
+and hence don't constrain buffered regions to lie within segment boundaries.
+.over.segments: The pool uses the simplest segment class (SegClass). There's no
+need for anything more complex.
+
+
+METHODS
+
+.method: The MVFF pool supports the following methods:
+
+.method.init: Res MVFFInit(Pool pool, va_list arg)
+ This takes six vararg parameters:
+ - extendBy -- the segment size;
+ - avgSize -- the average object size;
+ - alignment -- the alignment of allocations and frees (must be at least
+sizeof(void*));
+ - slotHigh -- whether to allocate objects at the end of free blocks found, as
+opposed to at the start (for unbuffered allocation);
+ - arenaHigh -- whether to express SegPrefHIGH to the arena, as opposed to
+SegPrefLOW;
+ - firstFit -- whether to use the suitable block of lowest address, as opposed
+to the highest (for unbuffered allocation).
+.method.init.epdl: To simulate the EPDL pool, specify extendBy, avgSize, and
+maxSize as normal, and use slotHigh=FALSE, arenaHigh=FALSE, firstFit=TRUE.
+.method.init.epdr: To simulate the EPDR pool, specify extendBy, avgSize, and
+maxSize as normal, and use slotHigh=TRUE, arenaHigh=TRUE, firstFit=TRUE.
+.method.init.other: The performance characteristics of other combinations are
+unknown.
+
+.method.finish: The PoolFinish method.
+
+.method.alloc: Alloc and Free methods are supported, implementing the policy
+set by the pool parameters (see .method.init).
+
+.method.describe: The usual describe method.
+
+.method.buffer: The buffer methods implement a worst-fit fill strategy.
+
+
+EXTERNAL FUNCTIONS
+
+.function: MVFF supports the following external functions:
+
+.function.free-size: size_t mps_mvff_free_size(mps_pool_t pool)
+ This function returns the total size of free space in segments allocated to
+the MVFF pool instance.
+
+.function.size: size_t mps_mvff_size(mps_pool_t pool)
+ This function returns the total memory used by pool segments, whether free or
+allocated.
+
+.function.class: mps_class_t mps_class_mvff(void)
+ This function returns the class object for the pool class, to be used in pool
+creation.
+
+
+IMPLEMENTATION
+
+.impl.free-list: The pool stores its free list in a CBS (see design.mps.cbs).
+It uses the CBS's mayUseInline facility to avoid running out of memory to store
+the free list. This is the reason for the alignment restriction above.
+
+
+DETAILS
+
+.design.seg-size: When adding a segment, we use extendBy as the segment size
+unless the object won't fit, in which case we use the object size (in both
+cases we align up).
+
+.design.seg-fail: If allocating a segment fails, we try again with a segment
+size just large enough for the object we're allocating. This is in response to
+request.mps.170186.
+
diff --git a/mps/design/prot/index.txt b/mps/design/prot/index.txt
new file mode 100644
index 00000000000..272fe47b3c6
--- /dev/null
+++ b/mps/design/prot/index.txt
@@ -0,0 +1,85 @@
+ GENERIC DESIGN OF THE PROTECTION MODULE
+ design.mps.prot
+ incomplete doc
+ drj 1997-04-02
+
+INTRODUCTION
+
+.readership: Any MPS developer.
+
+.intro: This is the generic design of the Protection Module. The protection
+module provides protection services to other parts of the MPS. It is expected
+that different operating systems will have different implementations of this
+module.
+
+
+INTERFACE
+
+.if.setup:
+
+void ProtSetup(void);
+
+ProtSetup will be called exactly once (per process). It will be called as part
+of the initialization of the first space that is created. It should arrange
+for the setup and initialization of any datastructures or services that are
+necessary in order to implement the protection module. (On UNIX it is expected
+that it will install a signal handler; on Windows it will do nothing.)
+
+.if.set:
+
+void ProtSet(Addr base, Addr limit, AccessSet mode)
+
+ProtSet should set the protection of the memory between base and limit,
+including base, but not including limit (ie the half-open interval
+[base,limit)) to that specified by mode.
+The mode parameter should have the AccessWrite bit set if write accesses to the
+page are to be forbidden, and should have the AccessRead bit set if read
+accesses to the page are to be forbidden. A request to forbid read accesses
+(ie AccessRead is set) may also forbid write accesses, but read accesses will
+not be forbidden unless AccessRead is set.
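+
+For illustration only, a caller that wanted to trap writes to the half-open
+range [base, limit) and later remove that protection might use the interface
+as in the following sketch. The spellings AccessSetEMPTY and AccessWRITE are
+assumed here as stand-ins for "no bits set" and "the AccessWrite bit", and
+base and limit are assumed to come from the caller:
+
+  /* Forbid write accesses to [base, limit). */
+  ProtSet(base, limit, AccessWRITE);
+
+  /* ... write faults on the range are now delivered to the MPS ... */
+
+  /* Allow all accesses again (no access bits set). */
+  ProtSet(base, limit, AccessSetEMPTY);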
+
+.if.tramp:
+
+void ProtTramp(void **resultReturn, void *(*f)(void *, size_t), void *p, size_t
+s);
+
+.if.sync:
+
+void ProtSync(Space space);
+
+ProtSync is called to ensure that the actual protection of each segment (as
+determined by the OS) is in accordance with the segment's pm field.
+
+.if.context-type:
+
+typedef struct MutatorFaultContextStruct *MutatorFaultContext;
+
+This abstract type is implemented by the protection module (impl.c.prot*). It
+represents the continuation of the mutator which is restored after a mutator
+fault has been handled. The functions ProtCanStepInstruction (.if.canstep
+below) and ProtStepInstruction (.if.step below) inspect and manipulate the
+context.
+
+.if.canstep:
+
+Bool ProtCanStepInstruction(MutatorFaultContext context);
+
+Examines the context to determine whether the protection module can single-step
+the instruction which is causing the fault. Should return TRUE if and only if
+the instruction can be single-stepped (ie ProtStepInstruction can be called).
+
+.if.step:
+
+Res ProtStepInstruction(MutatorFaultContext context);
+
+Single-steps the instruction which is causing the fault. This function should
+only be called if ProtCanStepInstruction applied to the context returned TRUE.
+It should return ResUNIMPL if the instruction cannot be single-stepped. It
+should return ResOK if the instruction is single-stepped.
+
+The mutator context will be updated by the emulation/execution of the
+instruction such that resuming the mutator will not cause the instruction which
+was causing the fault to be executed.
+
+
diff --git a/mps/design/protan/index.txt b/mps/design/protan/index.txt
new file mode 100644
index 00000000000..0c5c05028c5
--- /dev/null
+++ b/mps/design/protan/index.txt
@@ -0,0 +1,67 @@
+ ANSI IMPLEMENTATION OF PROTECTION MODULE
+ design.mps.protan
+ incomplete doc
+ drj 1997-03-19
+
+INTRODUCTION
+
+.readership: Any MPS developer
+
+.intro: This is the design for the ANSI implementation of the Protection Module.
+
+
+REQUIREMENTS
+
+.req.test: This module is required for testing, particularly on platforms
+where no real implementation of the protection module exists.
+
+.req.rapid-port: This module is required for rapid porting. It should enable a
+developer to port a minimally useful configuration of the MPS to new platforms
+very quickly.
+
+
+OVERVIEW
+
+.overview: Most of the functions in the module do nothing. The exception is
+ProtSync, which traverses all the segments in the arena and simulates an access
+to each segment that has any protection on it. This means that this module
+depends on certain fields in the segment structure.
+
+.overview.noos: No operating system specific (or even ANSI hosted specific)
+code is in this module. It can therefore be used on any platform, particularly
+where no real implementation of the module exists. It satisfies .req.test and
+.req.rapid-port in this way.
+
+
+FUNCTIONS
+
+.fun.protsetup:
+
+ProtSetup
+
+Does nothing as there is nothing to do (under UNIX we might expect the
+Protection Module to install one or more signal handlers at this point, but
+that is not appropriate for the ANSI implementation). Of course, we can't
+have an empty function body, so there is a NOOP here.
+
+.fun.sync:
+
+ProtSync
+
+.fun.sync.what:
+ProtSync is called to ensure that the actual protection of each segment (as
+determined by the OS) is in accordance with the segment's pm field. In the
In the +ANSI implementation we have no way of changing the protection of a segment, so +instead we generate faults on all protected segments in the assumption that +that will remove the protection on segments. + +.fun.sync.how: +Continually loops over all the segments until it finds that all segments have +no protection. .sync.seg: If it finds a segment that is protected then +PoolAccess is called on that segment's pool and with that segment. The call to +PoolAccess is wrapped with a ShieldEnter and ShieldLeave thereby giving the +pool the illusion that the fault was generated outside the MM. This depends on +being able to determine the protection of a segment (using the pm field), on +being able to call ShieldEnter and ShieldLeave, and on being able to call +PoolAccess. + diff --git a/mps/design/protli/index.txt b/mps/design/protli/index.txt new file mode 100644 index 00000000000..e24d7c11d43 --- /dev/null +++ b/mps/design/protli/index.txt @@ -0,0 +1,155 @@ + LINUX IMPLEMENTATION OF PROTECTION MODULE + design.mps.protli + incomplete doc + tony 2000-02-03 + +INTRODUCTION + +.readership: Any MPS developer + +.intro: This is the design of the Linux implementation of the protection +module. It makes use of various services provided by Linux. It is intended to +work with LinuxThreads. + + +REQUIREMENTS + +.req.general: Required to implement the general protection interface defined in +design.mps.prot.if.*. + + +MISC + +.improve.sigvec: Note 1 of ProtSetup notes that we can't honour the sigvec(2) +entries of the next handler in the chain. What if when we want to pass on the +signal instead of calling the handler we call sigvec with the old entry and use +kill to send the signal to ourselves and then restore our handler using sigvec +again. [need more detail and analysis here]. + + +DATASTRUCTURES + +.data.signext: This is static. Because that is the only communications channel +available to signal handlers. [write a little more here] + + +FUNCTIONS + +.fun.setup: ProtSetup installs a signal handler for the signal SIGSEGV to catch +and handle protection faults (this handler is the function sigHandle, see +.fun.sighandle). The previous handler is recorded (in the variable sigNext, see +.data.signext) so that it can be reached from sigHandle if it fails to handle +the fault. .fun.setup.problem: The problem with this approach is that we can't +honour the wishes of the sigvec(2) entry for the previous handler (in terms of +masks in particular). + +.fun.set: ProtSet uses mprotect to adjust the protection for pages. +void ProtSet(Addr base, Addr limit, AccessSet mode) + +.fun.set.convert: The requested protection (which is expressed in the mode +parameter, see design.mps.prot.if.set) is translated into an OS protection. If +read accesses are to be forbidden then all accesses are forbidden, this is done +by setting the protection of the page to PROT_NONE. If write accesses are to +be forbidden (and not read accesses) then write accesses are forbidden and read +accesses are allowed, this is done by setting the protection of the page to +PROT_READ|PROT_EXEC. Otherwise (all access are okay), the protection is set to +PROT_READ|PROT_WRITE|PROT_EXEC. + +.fun.set.assume.mprotect: We assume that the call to mprotect always succeeds. +.fun.set.assume.mprotect: This is because we should always call the function +with valid arguments (aligned, references to mapped pages, and with an access +that is compatible with the access of the underlying object). 
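+
+A minimal sketch of the translation described in .fun.set.convert, under the
+assumption that the AccessSet bits are named AccessREAD and AccessWRITE and
+that AddrOffset and NOTREACHED are available as elsewhere in the MPS sources;
+the real code lives in the protli implementation:
+
+  int flags = PROT_READ | PROT_WRITE | PROT_EXEC;
+  if ((mode & AccessREAD) != 0)
+    flags = PROT_NONE;                   /* forbid all accesses */
+  else if ((mode & AccessWRITE) != 0)
+    flags = PROT_READ | PROT_EXEC;       /* forbid writes, allow reads */
+  /* .fun.set.assume.mprotect: the call is assumed not to fail */
+  if (mprotect((void *)base, (size_t)AddrOffset(base, limit), flags) != 0)
+    NOTREACHED;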
+ +.fun.sync: ProtSync does nothing in this implementation as ProtSet sets the +protection without any delay. +void ProtSync(Space space); + +.fun.tramp: The protection trampoline is trivial under Linux, as there is +nothing that needs to be done in the dynamic context of the mutator in order +to catch faults. (Contrast this with Win32 Structured Exception Handling.) +void ProtTramp(void **resultReturn, void *(*f)(void *, size_t), void *p, size_t +s); + + +THREADS + +.threads: The design must operate in a multi-threaded environment (with +LinuxThreads) and cooperate with the Linux support for locks (see +design.mps.lock) and the thread suspension mechanism (see design.mps.pthreadext +). + +.threads.suspend: The SIGSEGV signal handler does not mask out any signals, so +a thread may be suspended while the handler is active, as required by the +design (see design.mps.pthreadext.req.suspend.protection). The signal handlers +simply nest at top of stack. + +.threads.async: POSIX (and hence Linux) imposes some restrictions on signal +handler functions (see design.mps.pthreadext.anal.signal.safety). Basically the +rules say the behaviour of almost all POSIX functions inside a signal handler +is undefined, except for a handful of functions which are known to be +"async-signal safe". However, if it's known that the signal didn't happen +inside a POSIX function, then it is safe to call arbitrary POSIX functions +inside a handler. + +.threads.async.protection: If the signal handler is invoked because of an MPS +access, then we know the access must have been caused by client code (because +the client is not allowed to permit access to protectable memory to arbitrary +foreign code [need a reference for this]). In these circumstances, it's OK to +call arbitrary POSIX functions inside the handler. + +.threads.async.other: If the signal handler is invoked for some other reason +(i.e. one we are not prepared to handle) then there is less we can say about +what might have caused the SEGV. In general it is not safe to call arbitrary +POSIX functions inside the handler in this case. + +.threads.async.choice: The signal handler calls ArenaAccess to determine +whether the SEGV was the result of an MPS access. ArenaAccess will claim +various MPS locks (i.e. the arena ring lock and some arena locks). The code +calls no other POSIX functions in the case where the SEGV is not an MPS access. +The locks are implemented as mutexes and are claimed by calling +pthread_mutex_lock, which is not defined to be async-signal safe. +.threads.async.choice.ok: However, despite the fact that PThreads documentation +doesn't define the behaviour of pthread_mutex_lock in these circumstances, we +expect the LinuxThreads implementation will be well-behaved unless the SEGV +occurs while while in the process of locking or unlocking one of the MPS locks +(see .threads.async.linux-mutex). But we can assume that a SEGV will not happen +then (because we use the locks correctly, and generally must assume that they +work). Hence we conclude that it is OK to call ArenaAccess directly from the +signal handler. + +.threads.async.linux-mutex: A study of the LinuxThreads source code reveals +that mutex lock and unlock functions are implemented as a spinlock (using a +locked compare-and-exchange instruction) with a backup suspension mechanism +using sigsuspend. On locking, the spinlock code performs a loop which examines +the state of the lock, and then atomically tests that the state is unchanged +while attempting to modify it. 
This part of the code is reentrant (and hence +async-signal safe). Eventually, when locking, the spinlock code may need to +block, in which case it calls sigsuspend waiting for the manager thread to +unblock it. The unlocking code is similar, except that this code may need to +release another thread, in which case it calls kill. sigsuspend and kill are +both defined to be async-signal safe by POSIX. In summary, the mutex locking +functions use primitives which are entirely async-signal safe. They perform +side-effects which modify the fields of the lock structure only. This code may +be safely invoked inside a signal handler unless the interrupted function is in +the process of manipulating the fields of that lock structure. + +.threads.async.improve: In future it would be preferable to not have to assume +reentrant mutex locking and unlocking functions. By making the assumption we +also assume that the implementaion of mutexes in LinuxThreads will not be +completely re-designed in future (which is not wise for the long term). An +alternative approach would be necessary anyway when supporting another platform +which doesn't offer reentrant locks (if such a platform does exist). +.threads.async.improve.how: We could avoid the assumption if we had a means of +testing whether an address lies within an arena chunk without the need to claim +any locks. Such a test might actually be possible. For example, arenas could +update a global datastructure describing the ranges of all chunks, using atomic +updates rather than locks; the handler code would be allowed to read this +without locking. However, this is somewhat tricky; a particular consideration +is that it's not clear when it's safe to deallocate stale portions of the +datastructure. + +.threads.sig-stack: We do not handle signals on a separate signal stack. +Separate signal stacks apparantly don't work properly with Pthreads. + + + diff --git a/mps/design/protocol/index.txt b/mps/design/protocol/index.txt new file mode 100644 index 00000000000..9a39735333d --- /dev/null +++ b/mps/design/protocol/index.txt @@ -0,0 +1,441 @@ + THE DESIGN FOR PROTOCOL INHERITANCE IN MPS + design.mps.protocol + incomplete doc + tony 1998-10-12 + +INTRODUCTION + +.intro: This document explains the design of the support for class inheritance +in MPS. It is not yet complete. It describes support for single inheritance +of classes. Future extensions will describe multiple inheritance and the +relationship between instances and classes. + +.readership: This document is intended for any MM developer. + +.hist.0: Written by Tony 1998-10-12 + + +PURPOSE + +.purpose.code-maintain: The purpose of the protocol inheritance design is to +ensure that the MPS code base can make use of the benefits of OO class +inheritance to maximize code reuse, minimize code maintainance and minimize the +use of "boiler plate" code. + +.purpose.related: For related discussion, see mail.tony.1998-08-28.16-26(0), +mail.tony.1998-09-01.11-38(0), mail.tony.1998-10-06.11-03(0) & other messages +in the same threads. + + +REQUIREMENTS + +.req.implicit: The object system should provide a means for classes to inherit +the methods of their direct superclasses implicitly for all functions in the +protocol without having to write any explicit code for each inherited function. + +.req.override: There must additionally be a way for classes to override the +methods of their superclasses. 
+ +.req.next-method: As a result of .req.implicit, classes cannot make static +assumptions about methods used by direct superclasses. The object system must +provide a means for classes to extend (not just replace) the behaviour of +protocol functions, such as a mechanism for invoking the "next-method". + +.req.ideal.extend: The object system must provide a standard way for classes to +implement the protocol supported by they superclass and additionally add new +methods of their own which can be specialized by subclasses. + +.req.ideal.multiple-inheritance: The object system should support multiple +inheritance such that sub-protocols can be "mixed in" with several classes +which do not themselves support identical protocols. + + +OVERVIEW + +.overview.root: We start with the root of all conformant class hierarchies, +which is called "ProtocolClass". ProtocolClass is an "abstract" class (i.e. it +has no direct instances, but it is intended to have subclasses). To use Dylan +terminology, instances of its subclasses are "general" instances of +ProtocolClass. They look as follows:- + + Instance Object Class Object + + -------------------- -------------------- + | sig | |-------->| sig | + -------------------- | -------------------- + | class |----| | superclass | + -------------------- -------------------- + | ... | | coerceInst | + -------------------- -------------------- + | ... | | coerceClass | + -------------------- -------------------- + | | | ... | + +.overview.inherit: Classes inherit the protocols supported by their +superclasses. By default they have the same methods as the class(es) from +which they inherit. .overview.inherit.specialize: Classes may specialize the +behaviour of their superclass. They do this by by overriding methods or other +fields in the class object. + +.overview.extend: Classes may extend the protocols supported by their +superclasses by adding new fields for methods or other data. + +.overview.sig.inherit: Classes will contain (possibly several) signatures. +Classes must not specialize (i.e. override) the signature(s) they inherit from +their superclass(es). + +.overview.sig.extend: If a class definition extends a protocol, it is normal +policy for the class definition to include a new signature as the last field in +the class object. + +.overview.coerce-class: Each class contains a coerceClass field. This contains +a method which can find the part of the class object which implements the +protocols of a supplied superclass argument (if, indeed, the argument IS a +superclass). This function may be used for testing subclass/superclass +relationships, and it also provides support for multiple inheritance. + +.overview.coerce-inst: Each class contains a coerceInst field. This contains a +method which can find the part of an instance object which contains the +instance slots of a supplied superclass argument (if, indeed, the argument IS a +superclass). This function may be used for testing whether an object is an +instance of a given class, and it also provides support for multiple +inheritance. + +.overview.superclass: Each class contains a superclass field. This enables +classes to call "next-method", as well as enabling the coercion functions. + +.overview.next-method: A specialized method in a class can make use of an +overridden method from a superclass by accessing the method from the +appropriate field in the superclass object and calling it. 
The superclass may be accessed indirectly from the class's "Ensure" function
+when it is statically known (see .overview.access). This permits "next-method"
+calls, and is fully scalable in that it allows arbitrary length method chains.
+The SUPERCLASS macro helps with this (see .int.static-superclass).
+
+.overview.next-method.naive: In some cases it is necessary to write a method
+which is designed to specialize an inherited method, needs to call the
+next-method, and yet the implementation doesn't have static knowledge of the
+superclass. This might happen because the specialized method is designed to be
+reusable by many class definitions. The specialized method can usually locate
+the class object from one of the parameters passed to the method. It can then
+access the superclass through the "superclass" field of the class, and hence
+call the next method. This technique has some limitations and doesn't support
+longer method chains. It is also dependent on none of the class definitions
+which use the method having any subclasses.
+
+.overview.access: Classes must be initialized by calls to functions, since it
+is these function calls which copy properties from superclasses. Each class
+must provide an "Ensure" function, which returns the canonical copy of the
+class. The canonical copy may reside in static storage, but no MPS code may
+refer to that static storage by name.
+
+.overview.naming: There are some strict naming conventions which must be
+followed when defining and using classes. The use is obligatory because it is
+assumed by the macros which support the definition and inheritance mechanism.
+For every class SomeClass, we insist upon the following naming conventions:-
+
+  SomeClassStruct  - names the type of the structure for the protocol class.
+                     This might be a typedef which aliases the type to the
+                     type of the superclass, but if the class has extended
+                     the protocols of the superclass then it will be a type
+                     which contains the new class fields.
+
+  SomeClass        - names the type *SomeClassStruct.
+                     This might be a typedef which aliases the type to the
+                     type of the superclass, but if the class has extended
+                     the protocols of the superclass then it will be a type
+                     which contains the new class fields.
+
+  EnsureSomeClass  - names the function that returns the initialized
+                     class object.
+
+
+
+INTERFACE
+
+Class Definition
+
+.int.define-class: Class definition is performed by the macro
+DEFINE_CLASS(className, var). A call to the macro must be followed by a body
+of initialization code in braces {}. The parameter className is used to name
+the class being defined. The parameter var is used to name a local variable of
+type className, which is defined by the macro; it refers to the canonical
+storage for the class being defined. This variable may be used in the
+initialization code. (The macro doesn't just pick a name implicitly because of
+the danger of a name clash with other names used by the programmer.) A call to
+DEFINE_CLASS(SomeClass, var) does the following:
+Defines the EnsureSomeClass function.
+Defines some static storage for the canonical class object.
+Defines some other things to ensure the class gets initialized exactly once.
+
+.int.define-alias-class: A convenience macro DEFINE_ALIAS_CLASS is provided
+which both performs the class definition and defines the types SomeClass and
+SomeClassStruct as aliases for some other class types. This is particularly
+useful for classes which simply inherit, and don't extend protocols.
The macro +call DEFINE_ALIAS_CLASS(className, superName, var) is exactly equivalent to the +following: + typedef superName className; + typedef superNameStruct classNameStruct; + DEFINE_CLASS(className, var) + +.int.define-special: If classes are particularly likely to be subclassed +without extension, the class implementor may choose to provide a convenience +macro which expands into DEFINE_ALIAS_CLASS with an appropriate name for the +superclass. For example, there might be a macro for defining pool classes such +that the macro call DEFINE_POOL_CLASS(className, var) is exactly equivalent to +the macro call DEFINE_ALIAS_CLASS(className, PoolClass, var). It may also be +convenient to define a static superclass accessor macro at the same time (see +.int.static-superclass.special). + + +Single Inheritance + +.int.inheritance: Class inheritance details must be provided in the class +initialization code (see .int.define-class). Inheritance is performed by the +macro INHERIT_CLASS(thisClassCoerced, parentClassName). A call to this macro +will make the class being defined a direct subclass of ParentClassName by +ensuring that all the fields of the parent class are copied into thisClass, and +setting the superclass field of thisClass to be the parent class object. The +parameter thisClassCoerced must be of type parentClassName. If the class +definition defines an alias class (see .int.define-alias-class), then the +variable named as the second parameter to DEFINE_CLASS will be appropriate to +pass to INHERIT_CLASS. + + +Specialization + +.int.specialize: Class specialization details must be given explicitly in the +class initialization code (see .int.define-class). This must happen AFTER the +inheritance details are given (see .int.inheritance). + + +Extension + +.int.extend: To extend the protocol when defining a new class, a new type must +be defined for the class structure. This must embed the structure for the +primarily inherited class as the first field of the structure. Class extension +details must be given explicitly in the class initialization code (see +.int.define-class). This must happen AFTER the inheritance details are given +(see .int.inheritance). + + +Introspection + +.introspect.c-lang: The design includes a number of introspection functions for +dynamically examining class relationships. These functions are polymorphic and +accept arbitrary subclasses of ProtocolClass. C doesn't support such +polymorphism. So although these have the semantics of functions (and could be +implemented as functions in another language with compatible calling +conventions) they are actually implemented as macros. The macros are named as +method-style macros despite the fact that this arguably contravenes +guide.impl.c.macro.method. The justification for this is that this design is +intended to promote the use of polymorphism, and it breaks the abstraction for +the users to need to be aware of what can and can't be expressed directly in C +function syntax. These functions all end in "Poly" to identify them as +polymorphic functions. + +.int.superclass: ProtocolClassSuperclassPoly(class) is an introspection +function which returns the direct superclass of class object class. + +.int.static-superclass: SUPERCLASS(className) is an introspection macro which +returns the direct superclass given a class name, which must (obviously) be +statically known. The macro expands into a call to the ensure function for the +class name, so this must be in scope (which may require a forward +declaration). 
The macro is useful for next-method calls (see .overview.next-method). The
+superclass is returned with type ProtocolClass so it may be necessary to cast
+it to the type for the appropriate subclass.
+
+.int.static-superclass.special: Implementors of classes which are designed to
+be subclassed without extension may choose to provide a convenience macro
+which expands into SUPERCLASS along with a type cast. For example, there might
+be a macro for finding pool superclasses such that the macro call
+POOL_SUPERCLASS(className) is exactly equivalent to
+(PoolClass)SUPERCLASS(className). It's convenient to define these macros
+alongside the convenience class definition macro (see .int.define-special).
+
+.int.class: ClassOfPoly(inst) is an introspection function which returns the
+class of which inst is a direct instance.
+
+.int.subclass: IsSubclassPoly(sub, super) is an introspection function which
+returns a boolean indicating whether sub is a subclass of super. I.e., it is a
+predicate for testing subclass relationships.
+
+
+Multiple inheritance
+
+.int.mult-inherit: Multiple inheritance involves an extension of the protocol
+(see .int.extend) and also multiple uses of the single inheritance mechanism
+(see .int.inheritance). It also requires specialized methods for coerceClass
+and coerceInst to be written (see .overview.coerce-class &
+.overview.coerce-inst). Documentation on support for multiple inheritance is
+under construction. This facility is not currently used. The basic idea is
+described in mail.tony.1998-10-06.11-03(0).
+
+
+Protocol guidelines
+
+.guide.fail: When designing an extensible function which might fail, the
+design must permit the correct implementation of the failure-case code.
+Typically, a failure might occur in any method in the chain. Each method is
+responsible for correctly propagating failure information supplied by
+superclass methods and for managing its own failures.
+
+.guide.fail.before-next: Dealing with a failure which is detected before any
+next-method call is made is similar to a fail case in any non-extensible
+function. See .example.fail below.
+
+.guide.fail.during-next: Dealing with a failure returned from a next-method
+call is also similar to a fail case in any non-extensible function. See
+.example.fail below.
+
+.guide.fail.after-next: Dealing with a failure which is detected after the
+next methods have been successfully invoked is more complex. If this scenario
+is possible, the design must include an "anti-function", and each class must
+ensure that it provides a method for the anti-function which will clean up any
+resources which are claimed after a successful invocation of the main method
+for that class. Typically the anti-function would exist anyway for clients of
+the protocol (e.g. "finish" is an anti-function for "init"). The effect of the
+next-method call can then be cleaned up by calling the anti-method for the
+superclass. See .example.fail below.
+
+
+Example
+
+.example.inheritance: The following example class definition shows both
+inheritance and specialization. It shows the definition of the class
+EPDRPoolClass, which inherits from EPDLPoolClass and has specialized values of
+the name, init & alloc fields. The type EPDLPoolClass is an alias (typedef)
+for PoolClass.
+
+typedef EPDLPoolClass EPDRPoolClass;
+typedef EPDLPoolClassStruct EPDRPoolClassStruct;
+
+DEFINE_CLASS(EPDRPoolClass, this)
+{
+  INHERIT_CLASS(this, EPDLPoolClass);
+  this->name = "EPDR";
+  this->init = EPDRInit;
+  this->alloc = EPDRAlloc;
+}
+
+
+.example.extension: The following (hypothetical) example class definition
+shows inheritance, specialization and also extension. It shows the definition
+of the class EPDLDebugPoolClass, which inherits from EPDLPoolClass, but also
+implements a method for checking properties of the pool.
+
+typedef struct EPDLDebugPoolClassStruct {
+  EPDLPoolClassStruct epdl;
+  DebugPoolCheckMethod check;
+  Sig sig;
+} EPDLDebugPoolClassStruct;
+
+typedef EPDLDebugPoolClassStruct *EPDLDebugPoolClass;
+
+DEFINE_CLASS(EPDLDebugPoolClass, this)
+{
+  EPDLPoolClass epdl = &this->epdl;
+  INHERIT_CLASS(epdl, EPDLPoolClass);
+  epdl->name = "EPDLDBG";
+  this->check = EPDLDebugCheck;
+  this->sig = EPDLDebugSig;
+}
+
+.example.fail: The following example shows the implementation of failure-case
+code for an "init" method, making use of the "finish" anti-method:-
+
+static Res mySegInit(Seg seg, Pool pool, Addr base, Size size,
+                     Bool reservoirPermit, va_list args)
+{
+  SegClass super;
+  MYSeg myseg;
+  OBJ1 obj1;
+  Res res;
+  Arena arena;
+
+  AVERT(Seg, seg);
+  myseg = SegMYSeg(seg);
+  AVERT(Pool, pool);
+  arena = PoolArena(pool);
+
+  /* Ensure the pool is ready for the segment */
+  res = myNoteSeg(pool, seg);
+  if(res != ResOK)
+    goto failNoteSeg;
+
+  /* Initialize the superclass fields first via next-method call */
+  super = (SegClass)SUPERCLASS(MYSegClass);
+  res = super->init(seg, pool, base, size, reservoirPermit, args);
+  if(res != ResOK)
+    goto failNextMethods;
+
+  /* Create an object after the next-method call */
+  res = ControlAlloc(&obj1, arena, sizeof(OBJ1Struct), reservoirPermit);
+  if(res != ResOK)
+    goto failObj1;
+
+  myseg->obj1 = obj1;
+  return ResOK;
+
+failObj1:
+  /* call the anti-method for the superclass */
+  super->finish(seg);
+failNextMethods:
+  /* reverse the effect of myNoteSeg */
+  myUnnoteSeg(pool, seg);
+failNoteSeg:
+  return res;
+}
+
+
+IMPLEMENTATION
+
+.impl.derived-names: The DEFINE_CLASS macro derives some additional names from
+the class name as part of its implementation. These should not appear in the
+source code - but it may be useful to know about this for debugging purposes.
+For each class definition for class SomeClass, the macro defines the following:
+
+extern SomeClass EnsureSomeClass(void);   /* The class accessor function.
+See .overview.naming */
+static Bool protocolSomeClassGuardian;    /* A boolean which indicates whether
+the class has been initialized yet */
+static void protocolEnsureSomeClass(SomeClass);  /* A function called by
+EnsureSomeClass. All the class initialization code is actually in this
+function */
+static SomeClassStruct protocolSomeClassStruct;  /* Static storage for the
+canonical class object */
+
+
+.impl.init-once: Class objects only behave according to their definition after
+they have been initialized, and class protocols may not be used before
+initialization has happened. The only code which is allowed to see a class
+object in a partially initialized state is the initialization code itself --
+and this must take care not to pass the object to any other code which might
+assume it is initialized. Once a class has been initialized, the class might
+have a client.
The class must not be initialized again when this has happened, +because the state is not necessarily consistent in the middle of an +initialization function. The initialization state for each class is stored in +a boolean "guardian" variable whose name is derived from the class name (see +.impl.derived-names). This ensures the initialization hapens only once. The +path through the EnsureSomeClass function should be very fast for the common +case when this variable is TRUE, and the class has already been initialized, as +the canonical static storage can simply be returned in that case. However, +when the value of the guardian is FALSE, the class is not initialized. In this +case, a call to EnsureSomeClass must first execute the initialization code and +then set the guardian to TRUE. However, this must happen atomically (see +.impl.init-lock). + +.impl.init-lock: There would be the possibility of a race condition if +EnsureSomeClass were called concurrently on separate threads before SomeClass +has been initialized. The class must not be initialized more than once, so the +sequence test-guard, init-class, set-guard must be run as a critical region. +It's not sufficient to use the arena lock to protect the critical region, +because the class object might be shared between multiple arenas. The +DEFINE_CLASS macro uses a global recursive lock instead. The lock is only +claimed after an initial unlocked access of the guard variable shows that the +class in not initialized. This avoids any locking overhead for the common case +where the class is already initialized. This lock is provided by the lock +module -- see design.mps.lock(0). + + diff --git a/mps/design/protsu/index.txt b/mps/design/protsu/index.txt new file mode 100644 index 00000000000..48192d49173 --- /dev/null +++ b/mps/design/protsu/index.txt @@ -0,0 +1,105 @@ + SUNOS 4 IMPLEMENTATION OF PROTECTION MODULE + design.mps.protsu + incomplete doc + drj 1997-03-20 + +INTRODUCTION + +.readership: Any MPS developer + +.intro: This is the design of the SunOS 4 implementation of the protection +module. It is intended to be used only in SunOS 4 (os.su). It makes use of +various services provided by SunOS 4. + +[largely unwritten] + +REQUIREMENTS + +.req.general: Required to implement the general protection interface defined in +design.mps.prot.if.*. + + +OVERVIEW + +[uses mprotect] + +MISC + +.improve.sig-stack: Currently we do not handle signals on a separate signal +stack. If we handled signals on our own stack then we could guarantee not to +run out of stack while we were handling the signal. This would be useful (it +may even be required). We would have to use sigvec(2) rather than signal(3) +(set the SV_ONSTACK flag and use sigstack(2)). This has drawbacks as the +signal stack is not grown automatically, so we would have to to frig the stacks +back if we wanted to pass on the signal to some other handler as that handler +may require arbitrary amounts of stack. + +.improve.sigvec: Note 1 of ProtSetup notes that we can't honour the sigvec(2) +entries of the next handler in the chain. What if when we want to pass on the +signal instead of calling the handler we call sigvec with the old entry and use +kill to send the signal to ourselves and then restore our handler using sigvec +again. ramble ramble. [need more detail and analysis here]. + +assume mprotect never fails and why. [We also need a policy here] + +DATASTRUCTURES + +.data.signext: This is static. Because that is the only communications channel +available to signal handlers. 
[write a little more here] + + +FUNCTIONS + +.fun.setup: + +ProtSetup + +The setup involves installing a signal handler for the signal SIGSEGV to catch +and handle protection faults (this handler is the function sigHandle, see +.fun.sighandle). The previous handler is recorded (in the variable sigNext, see +.data.signext) so that it can be reached from sigHandle if it fails to handle +the fault. + +The problem with this approach is that we can't honor the wishes of the +sigvec(2) entry for the previous handler (in terms of masks in particular). + +Obviously it would be okay to always chain the previous signal handler onto +sigNext, however in the case where the previous handler is the one we've just +installed (ie, sigHandle) then it is not necessary to chain the handler, so we +don't. + +.fun.set: + +void ProtSet(Addr base, Addr limit, AccessSet mode) + +.fun.set.convert: The requested protection (which is expressed in the mode +parameter, see design.mps.prot.if.set) is translated into an OS protection. If +read accesses are to be forbidden then all accesses are forbidden, this is done +by setting the protection of the page to PROT_NONE. If write access are to be +forbidden (and not read accesses) then write accesses are forbidden and read +accesses are allowed, this is done by setting the protection of the page to +PROT_READ|PROT_EXEC. Otherwise (all access are okay), the protection is set to +PROT_READ|PROT_WRITE|PROT_EXEC. + +.fun.set.assume.mprotect: We assume that the call to mprotect always succeeds. +.fun.set.assume.mprotect: This is because we should always call the function +with valid arguments (aligned, references to mapped pages, and with an access +that is compatible with the access of the underlying object). + +.fun.sync: + +void ProtSync(Space space); + +This does nothing in this implementation as ProtSet sets the protection without +any delay. + +.fun.tramp: + +void ProtTramp(void **resultReturn, void *(*f)(void *, size_t), void *p, size_t +s); + +The protection trampoline is trivial under SunOS, as there is nothing that +needs to be done in the dynamic context of the mutator in order to catch +faults. (Contrast this with Win32 Structured Exception Handling.) + + diff --git a/mps/design/pthreadext/index.txt b/mps/design/pthreadext/index.txt new file mode 100644 index 00000000000..7649c7743a4 --- /dev/null +++ b/mps/design/pthreadext/index.txt @@ -0,0 +1,274 @@ + DESIGN OF THE POSIX THREAD EXTENSIONS FOR MPS + design.mps.pthreadext + draft doc + tony 2000-02-01 + +INTRODUCTION + +.readership: Any MPS developer. + +.intro: This is the design of the Pthreads extension module, which provides +some low-level threads support for use by MPS (notably suspend and resume). + + +DEFINITIONS + +.pthreads: The term "Pthreads" means an implementation of the POSIX +1003.1c-1995 thread standard. (Or the Single UNIX Specification, Version 2, aka +USV2 or UNIX98.) + +.context: The "context" of a thread is a (platform-specific) OS-defined +structure which describes the current state of the registers for that thread. + + +REQUIREMENTS + +.req.suspend: A means to suspend threads, so that they don't make any progress. +.req.suspend.why: Needed by the thread manager so that other threads registered +with an arena can be suspended (see design.mps.thread-manager). Not directly +provided by Pthreads. + +.req.resume: A means to resume suspended threads, so that they are able to make +progress again. .req.resume.why: Needed by the thread manager. Not directly +provided by Pthreads. 
+ +.req.suspend.multiple: Allow a thread to be suspended on behalf of one arena +when it has already been suspended on behalf of one or more other arenas. +.req.suspend.multiple.why: The thread manager contains no design for +cooperation between arenas to prevent this. + +.req.resume.multiple: Allow requests to resume a thread on behalf of each arena +which had previously suspended the thread. The thread must only be resumed +when requests from all such arenas have been received. +.req.resume.multiple.why: A thread manager for an arena must not permit a +thread to make progress before it explicitly resumes the thread. + +.req.suspend.context: Must be able to access the context for a thread when it +is suspended. + +.req.suspend.protection: Must be able to suspend a thread which is currently +handling a protection fault (i.e., an arena access). Such a thread might even +own an arena lock. + +.req.legal: Required to use Pthreads / POSIX APIs in a legal manner. + + + +ANALYSIS + + +.anal.suspend: Thread suspension is inherently asynchronous. MPS needs to be +able to suspend another thread without prior knowledge of the code that thread +is running. (I.e., we can't rely on cooperation between threads.) The only +asynchronous communication available on POSIX is via signals - so the suspend +and resume mechanism must ultimately be built from signals. + +.anal.signal.safety: POSIX imposes some restrictions on what a signal handler +function might do when invoked asynchronously (see +, and +search for the string "reentrant"). In summary, a small number of POSIX +functions are defined to be "async-signal safe", which means they may be +invoked without restriction in signal handlers. All other POSIX functions are +considered to be unsafe. Behaviour is undefined if an unsafe function is +interrupted by a signal and the signal handler then proceeds to call another +unsafe function. See mail.tony.1999-08-24.15-40(0)and followups for some +further analysis. + +.anal.signal.safety.implication: Since we can't assume that we won't attempt to +suspend a thread while it is running an unsafe function, we must limit the use +of POSIX functions in the suspend signal handler to those which are designed to +be "async-signal safe". One of the few such functions related to +synchronization is sem_post. + +.anal.signal.example: An example of how to suspend threads in POSIX was posted +to newsgroup comp.programming.threads in August 1999. The code that was posted +was written by David Butenhof, and may be found here: + +Some further discussion about the code in the newsgroup is recorded here: + + +.anal.signal.linux-hack: In the current implementation of Linux Pthreads, it +would be possible to implement suspend/resume using SIGSTOP and SIGCONT. This +is, however, nonportable and will probably stop working on Linux at some point. + +.anal.component: There is no known way to meet the requirements above in a way +which cooperates with another component in the system which also provides its +own mechanism to suspend and resume threads. The best bet for achieving this +is to provide the functionality in shared low-level component which may be used +by MPS and other clients. This will require some discussion with other +potential clients and/or standards bodies. .anal.component: NB., such +cooperation is actually a requirement for Dylan (req.dylan.dc.env.self), though +this is not a problem, since all the Dylan components share the MPS mechanism. 
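+
+A minimal sketch, under assumed names, of the async-signal-safe
+synchronization that this analysis points at (the implementation section below
+describes the real protocol): the suspend handler may call sem_post because
+POSIX defines it to be async-signal safe, and it then blocks in sigsuspend
+until the resume signal arrives.
+
+  #include <semaphore.h>
+  #include <signal.h>
+
+  static sem_t suspendSem;          /* see .impl.static.semaphore below */
+  static sigset_t resumeOnlyMask;   /* blocks everything except the resume signal */
+
+  static void suspendHandler(int sig)    /* runs in the victim thread */
+  {
+    (void)sig;
+    /* ... record the victim's context for the controlling thread ... */
+    sem_post(&suspendSem);               /* async-signal safe per POSIX */
+    sigsuspend(&resumeOnlyMask);         /* wait for the resume signal */
+  }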
+ + + +INTERFACE + +.if.pthreadext.abstract: A thread is represented by the abstract type +PThreadext. A PThreadext object corresponds directly with a PThread (of type +pthread_t). There may be more than one PThreadext object for the same PThread. + +.if.pthreadext.structure: The structure definition of PThreadext +(PThreadextStruct) is exposed by the interface so that it may be embedded in a +client datastructure (e.g. ThreadStruct). This means that all storage +management can be left to the client (which is important because there might be +multiple arenas involved). Clients may not access the fields of a +PThreadextStruct directly. + +.if.init: Initializes a PThreadext object for a thread with the given id: +void PThreadextInit(PThreadext pthreadext, pthread_t id) + +.if.check: Checks a PThreadext object for consistency: +Bool PThreadextCheck(PThreadext pthreadext) +Note that this function takes the mutex, so it must not be called with the +mutex held (doing so will probably deadlock the thread). + +.if.suspend: Suspends a PThreadext object (puts it into a suspended state). +Meets .req.suspend.*. The object must not already be in a suspended state. If +the function returns ResOK, the context of the thread is returned in +contextReturn, and the corresponding PThread will not make any progress until +it is resumed: +Res PThreadextSuspend(PThreadext pthreadext, struct sigcontext **contextReturn) + +.if.resume: Resumes a PThreadext object. Meets .req.resume.*. The object must +already be in a suspended state. Puts the object into a non-suspended state. +Permits the corresponding PThread to make progress again, (although that might +not happen immediately if there is another suspended PThreadext object +corresponding to the same thread): +Res PThreadextResume(PThreadext pthreadext) + +.if.finish: Finishes a PThreadext object: +void PThreadextFinish(PThreadext pthreadext) + + +IMPLEMENTATION + + +.impl.pthreadext: The structure definition for a PThreadext object is: +typedef struct PThreadextStruct { + Sig sig; /* design.mps.sig */ + pthread_t id; /* Thread ID */ + struct sigcontext *suspendedScp; /* sigcontext if suspended */ + RingStruct threadRing; /* ring of suspended threads */ + RingStruct idRing; /* duplicate suspensions for id */ +} PThreadextStruct; + +.impl.field.id: The id field shows which PThread the object corresponds to. + +.impl.field.scp: The suspendedScp field contains the context when in a +suspended state. Otherwise it is NULL. + +.impl.field.threadring: The threadRing field is used to chain the object onto +the suspend ring when it is in the suspended state (see .impl.suspend-ring). +When not in a suspended state, this ring is single. + +.impl.field.idring: The idRing field is used to group the object with other +objects corresponding to the same PThread (same id field) when they are in the +suspended state. When not in a suspended state, or when this is the only +PThreadext object with this id in the suspended state, this ring is single. + +.impl.global.suspend-ring: The module maintains a global suspend-ring - a ring +of PThreadext objects which are in a suspended state. This is primarily so +that it's possible to determine whether a thread is curently suspended anyway +because of another PThreadext object, when a suspend attempt is made. + +.impl.global.victim: The module maintains a global variable which is used to +indicate which PThreadext is the current victim during suspend operations. 
This +is used to communicate information between the controlling thread and the +thread being suspended (the victim). The variable has value NULL at other times. + +.impl.static.mutex: We use a lock (mutex) around the suspend and resume +operations. This protects the state data (suspend-ring etc. see impl.global.*). +Since only one thread can be suspended at a time, there's no possibility of two +arenas suspending each other by concurrently suspending each other's threads. + +.impl.static.semaphore: We use a semaphore to synchronize between the +controlling and victim threads during the suspend operation. See .impl.suspend +and .impl.suspend-handler). + +.impl.static.init: The static data and global variables of the module are +initialized on the first call to PThreadextSuspend, using pthread_once to avoid +concurrency problems. We also enable the signal handlers at the same time (see +.impl.suspend-handler and .impl.resume-handler). + +.impl.suspend: PThreadextSuspend first ensures the module is initialized (see +.impl.static.init). After this, it claims the mutex (see .impl.static.mutex). +It then checks to see whether thread of the target PThreadext object has +already been suspended on behalf of another PThreadext object. It does this by +iterating over the suspend ring. + +.impl.suspend.already-suspended: If another object with the same id is found on +the suspend ring, then the thread is already suspended. The context of the +target object is updated from the other object, and the other object is linked +into the idRing of the target. + +.impl.suspend.not-suspended: If the thread is not already suspended, then we +forcibly suspend it using a technique similar to Butenhof's (see +.anal.signal.example): First we set the victim variable (see +.impl.global.victim) to indicate the target object. Then we send the signal +PTHREADEXT_SIGSUSPEND to the thread (see .impl.signals), and wait on the +semaphore for it to indicate that it has received the signal and updated the +victim variable with the context. If either of these operations fail (e.g. +because of thread termination) we unlock the mutex and return ResFAIL. + +.impl.suspend.update: Once we have ensured that the thread is definitely +suspended, we add the target PThreadext object to the suspend ring, unlock the +mutex, and return the context to the caller. + +.impl.suspend-handler: The suspend signal handler is invoked in the target +thread during a suspend operation, when a PTHREADEXT_SIGSUSPEND signal is sent +by the controlling thread (see .impl.suspend.not-suspended). The handler +determines the context (received as a parameter, although this may be +platform-specific) and stores this in the victim object (see +.impl.global.victim). The handler then masks out all signals except the one +that will be received on a resume operation (PTHREADEXT_SIGRESUME) and +synchronizes with the controlling thread by posting the semaphore. Finally the +handler suspends until the resume signal is received (using sigsuspend). + +.impl.resume: PThreadextResume first claims the mutex (see .impl.static.mutex). +It then checks to see whether thread of the target PThreadext object has also +been suspended on behalf of another PThreadext object (in which case the id +ring of the target object will not be single). + +.impl.resume.also-suspended: If the thread is also suspended on behalf of +another PThreadext, then the target object is removed from the id ring. 
+ +.impl.resume.not-also: If the thread is not also suspended on behalf of another +PThreadext, then the thread is resumed using the technique proposed by Butenhof +(see .anal.signal.example). I.e. we send it the signal PTHREADEXT_SIGRESUME +(see .impl.signals) and expect it to wake up. If this operation fails (e.g. +because of thread termination) we unlock the mutex and return ResFAIL. + +.impl.resume.update: Once the target thread is in the appropriate state, we +remove the target PThreadext object from the suspend ring, set its context to +NULL and unlock the mutex. + +.impl.resume-handler: The resume signal handler is invoked in the target thread +during a resume operation, when a PTHREADEXT_SIGRESUME signal is sent by the +controlling thread (see .impl.resume.not-also). The resume signal handler +simply returns. This is sufficient to unblock the suspend handler, which will +have been blocking the thread at the time of the signal. The Pthreads +implementation ensures that the signal mask is restored to the value it had +before the signal handler was invoked. + +.impl.finish: PThreadextFinish supports the finishing of objects in the +suspended state, and removes them from the suspend ring and id ring as +necessary. It must claim the mutex for the removal operation (to ensure +atomicity of the operation). Finishing of suspended objects is supported so +that clients can dispose of resources if a resume operation fails (which +probably means that the PThread has terminated). + +.impl.signals: The choice of which signals to use for suspend and restore +operations may need to be platform-specific. Some signals are likely to be +generated and/or handled by other parts of the application and so should not be +used (e.g. SIGSEGV). Some implementations of PThreads use some signals for +themselves, so they may not be used; e.g. LinuxThreads uses SIGUSR1 and SIGUSR2 +for its own purposes. The design abstractly names the signals +PTHREADEXT_SIGSUSPEND and PTHREAD_SIGRESUME, so that they may be easily mapped +to appropriate real signal values. Candidate choices are SIGXFSZ and SIGPWR. + + +ATTACHMENTS + "posix.txt" + "susp.c" + diff --git a/mps/design/reservoir/index.txt b/mps/design/reservoir/index.txt new file mode 100644 index 00000000000..b1dfb823aa1 --- /dev/null +++ b/mps/design/reservoir/index.txt @@ -0,0 +1,95 @@ + THE DESIGN OF THE LOW-MEMORY RESERVOIR + design.mps.reservoir + incomplete design + tony 1998-07-30 + + +INTRODUCTION: + +The low-memory reservoir provides client support for implementing handlers for +low-memory situations which allocate. The reservoir is implemented inside the +arena as a pool of unallocatable segments. + + +OVERVIEW: + +This is just a placeholder at the moment. + + +ARCHITECTURE: + +.adt: The reservoir interface looks (almost) like an abstract data type of type +Reservoir. It's not quite abstract because the arena embeds the structure of +the reservoir (of type ReservoirStruct) into its own structure, for simplicity +of initialization. + +.align: The reservoir is implemented as a pool of available tracts, along with +a size and limit which must always be aligned to the arena alignment. The size +corresponds to the amount of memory currently maintained in the reservoir. The +limit is the maximum amount that it is desired to maintain. + +.wastage: When the reservoir limit is set by the client, the actual limit +should be increased by an arena alignment amount for every active mutator +buffer. 
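+
+A hypothetical worked example of the allowance described in .wastage: with an
+arena alignment of 4096 bytes and 3 active mutator buffers, a client-requested
+limit of 65536 bytes would be maintained internally as
+65536 + 3 * 4096 = 77824 bytes.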
+
+.really-empty: When the reservoir limit is set to 0, assume that the client
+really doesn't have a need for a reservoir at all. In this case, the client
+won't even want an allowance to be made for wastage in active buffers.
+
+
+IMPLEMENTATION:
+
+.interface: The following functions comprise the interface to the reservoir
+module:
+
+
+.interface.check: ReservoirCheck checks the reservoir for consistency:
+extern Bool ReservoirCheck(Reservoir reservoir);
+
+.interface.init: ReservoirInit initializes the reservoir and its associated
+pool, setting the size and limit to 0:
+extern Res ReservoirInit(Reservoir reservoir, Arena arena);
+
+.interface.finish: ReservoirFinish de-initializes the reservoir and its
+associated pool:
+extern void ReservoirFinish(Reservoir reservoir);
+
+.interface.limit: ReservoirLimit returns the limit of the reservoir:
+extern Size ReservoirLimit(Reservoir reservoir);
+
+.interface.set-limit: ReservoirSetLimit sets the limit of the reservoir,
+making an allowance for wastage in mutator buffers:
+extern void ReservoirSetLimit(Reservoir reservoir, Size size);
+
+.interface.available: ReservoirAvailable returns the available size of the
+reservoir:
+extern Size ReservoirAvailable(Reservoir reservoir);
+
+.interface.ensure-full: ReservoirEnsureFull attempts to fill the reservoir
+with memory from the arena, until it is full:
+extern Res ReservoirEnsureFull(Reservoir reservoir);
+
+.interface.deposit: ReservoirDeposit attempts to fill the reservoir with
+memory in the supplied range, until it is full. This is called by the arena
+from ArenaFree if the reservoir is not known to be full. Any memory which is
+not added to the reservoir (because the reservoir is full) is freed via the
+arena class's free method.
+extern void ReservoirDeposit(Reservoir reservoir, Addr base, Size size);
+
+.interface.withdraw: ReservoirWithdraw attempts to allocate memory of the
+specified size from the reservoir to the specified pool. If no suitable memory
+can be found it returns ResMEMORY.
+extern Res ReservoirWithdraw(Addr *baseReturn, Tract *baseTractReturn,
+                             Reservoir reservoir, Size size, Pool pool);
+
+.interface.withdraw.align: Currently, ReservoirWithdraw can only withdraw
+memory in chunks of the size of the arena alignment. This is because the
+reservoir doesn't attempt to coalesce adjacent memory blocks. This deficiency
+should be fixed in the future.
+
+.pool: The memory managed by the reservoir is owned by the reservoir pool.
+This memory is never sub-allocated. Each tract belonging to the pool is linked
+onto a list. The head of the list is in the Reservoir object. Links are stored
+in the TractP fields of each tract object.
+
diff --git a/mps/design/ring/index.txt b/mps/design/ring/index.txt
new file mode 100644
index 00000000000..67832dfbce5
--- /dev/null
+++ b/mps/design/ring/index.txt
@@ -0,0 +1,154 @@
+ THE DESIGN OF THE RING DATA STRUCTURE
+ design.mps.ring
+ incomplete doc
+ richard 1996-09-26
+
+INTRODUCTION
+
+.source: Rings are derived from the earlier use of Deques. See
+design.mps.deque.
+
+
+DESCRIPTION
+
+.def.ring: Rings are circular doubly-linked lists of ring "nodes". The nodes
+are fields of structures which are the "elements" of the ring.
+ +Ring node structures (RingStruct) are in-lined in the structures on the ring, +like this: + + typedef struct FooStruct *Foo; /* the element type */ + typedef struct FooStruct { /* the element structure */ + int baz, bim; + RingStruct ring; /* the ring node */ + float bip, bop; + } FooStruct; + +This arrangement means that they do not need to be managed separately. This is +especially useful in avoiding re-entrancy and bootstrapping problems in the +memory manager. Rings also provide flexible insertion and deletion because the +entire ring can be found from any node. + +In the MPS, rings are used to connect a "parent" structure (such as a Space) to +a number of "child" structures (such as Pools), as shown in .fig.ring (note the +slight abuse of naming convention (in that barRing is not called +barRingStruct)). + +.fig.ring: A ring of Bar objects owned by a Foo object. + + + +.fig.empty: An empty ring of Bar objects owned by a Foo object. + + +.def.singleton: A "singleton" ring is a ring containing one node, whose +previous and next nodes are itself (see .fig.single). + +.fig.single: A singleton Bar object not on any ring. + + +.fig.elt: How RING_ELT gets a parent pointer from a node pointer. + + + - Ring Diagrams + +INIT / FINISH + +.init: Rings are initialized with the RingInit function. They are initialized +to be a singleton ring (.def.singleton). + +.finish: Rings are finished with the RingFinish funtion. A ring must be a +singleton ring before it can be finished (it is an error to attempt to finish a +non-singleton ring). + + +ITERATION + +.for: A macro is used for iterating over the elements in a ring. This macro is +called RING_FOR. RING_FOR takes three arguments, the first is an iteration +variable: "node", the second is the "parent" element in the ring: "ring", the +third is a variable used by the iterator for working state (it holds a pointer +to the next node): "next". All arguments must be of type Ring. The "node" and +"next" variables must be declared and in scope already. All elements except +for the "parent" element are iterated over. The macro expands to a for +statement. During execution of the loop, the "node" variable (the first +argument to the macro) will be the value of successive elements in the Ring (at +the beginning of the statement in the body of the loop). .for.error: It is an +error (possibly unchecked) for the "node" and "next" variables to be modified +except implicitly by using this iterator. .for.safe: It is safe to delete the +current node during the iteration. + +.for.ex: + +An example: + +Ring node, nextNode; +RING_FOR(node, &foo->barRing, nextNode) { + Bar bar = RING_ELT(Bar, FooRing, node); + frob(bar); +} + +.for.ex.elt: Notice the idiomatic use of RING_ELT which is almost universal +when using RING_FOR. + + +SUBCLASS + +.elt: RING_ELT is a macro that converts a pointer to a ring structure to a +pointer to the enclosing parent structure. +RING_ELT has three arguments which are, in order: +type, the type of a pointer to the enclosing structure, +field, the name of the ring structure field within it, +ring, the ring node. +The result is a pointer to the enclosing structure. + +[ Why does RING_ELT not use PARENT or even offsetof? Apparently it's so that +it can cope with arrays of rings. GavinM 1997-04-15] + + +APPEND / REMOVE + +.append: RingAppend appends a singleton ring to a ring (such that the newly +added element will be last in the iteration sequence). 
+ +.insert: RingInsert adds a singleton rung to a ring (such that the newly added +element will be first in the +iteration sequence). + +.remove: RingRemove removes an element from a ring, the newly removed element +becomes a singleton ring. It is an error for the element to already be a +singleton. + +.improve.join: it would be possible to add a RingJoin operation. This is not +done as it is not required. + + +NAMING + +.naming: By convention, when one structure Foo contains one ring of Bar +structures, the field is Foo is usually known as barRing, and the field in Bar +is known as fooRing. If the Foo structure contains more than one ring of Bar +structures, then they will have names such as spongRing and frobRing. + + +DEQUES + +This section documents where rings differ significantly from deques. + +.head: Deques used a distinguished head structure for the head of the ring. +Rings still have a separate head structure, but it is not distinguished by type. + + +DEFECTS + +This section documents known defects with the current design. + +.app_for.misuse: It is easy to pass RingAppend and RING_FOR the arguments in +the wrong order as all the arguments have the same type. see .head. + +.check.improve: There is no method for performing a full integrity check. This +could be added. + +ATTACHMENT + "Ring Diagrams" + diff --git a/mps/design/root/index.txt b/mps/design/root/index.txt new file mode 100644 index 00000000000..a97db19b609 --- /dev/null +++ b/mps/design/root/index.txt @@ -0,0 +1,65 @@ + THE DESIGN OF THE ROOT MANAGER + design.mps.root + incomplete design + richard 1995-08-25 + +INTRODUCTION + +.intro: + +.readership: + + +BASICS + +.root.def: The root node of the object graph is the node which defines whether +objects are accessible, and the place from which the mutator acts to change the +graph. In the MPS, a root is an object which describes part of the root node. +The root node is the total of all the roots attached to the space. [Note that +this combines two definitions of root: the accessibility is what defines a root +for tracing (see analysis.tracer.root.* and the mutator action for barriers +(see analysis.async-gc.root). pekka 1998-03-20] + +.root.repr: Functionally, roots are defined by their scanning functions. Roots +_could_ be represented as function closures, i.e. a pointer to a C function and +some auxillary fields. The most general variant of roots is just that. +However, for reasons of efficiency, some special variants are separated out. + + +DETAILS + + +Creation + +.create: A root becomes "active" as soon as it is created. + +.create.col: The root inherits its colour from the mutator, since it can only +contain references copied there by the mutator from somewhere else. If the +mutator is grey for a trace when a root is created then that root will be used +to determine accessibility for that trace. More specifically, the root will be +scanned when that trace flips. + + +Destruction + +.destroy: It's OK to destroy a root at any time, except perhaps concurrently +with scanning it, but that's prevented by the arena lock. If a root is +destroyed the references in it become invalid and unusable. + + +Invariants + +.inv.white: Roots are never white for any trace, because they cannot be +condemned. + +.inv.rank: Roots always have a single rank. A root without ranks would be a +root without references, which would be pointless. The tracer doesn't support +multiple ranks in a single colour. 
+


Scanning

.method: Root scanning methods are provided by the client so that the MPS can
locate and scan the root set. See protocol.mps.root for details. [There are
some more notes about root methods in meeting.qa.1996-10-16.]

diff --git a/mps/design/scan/index.txt b/mps/design/scan/index.txt new file mode 100644 index 00000000000..bf81a79fd6b --- /dev/null +++ b/mps/design/scan/index.txt @@ -0,0 +1,76 @@
+ THE DESIGN OF THE GENERIC SCANNER
+ design.mps.scan
+ incomplete design
+ richard 1995-08-25
+
+SUMMARIES
+
+Scanned Summary

.summary.subset: The summary of references seen by scan (ss.unfixedSummary) is
a subset of the summary previously computed (SegSummary).

There are two reasons that it is not an equality relation:

1. If the segment has had objects forwarded onto it then its summary will get
unioned with the summary of the segment that the object was forwarded from.
This may increase the summary. The forwarded object of course may have a
smaller summary (if such a thing were to be computed) and so subsequent
scanning of the segment may reduce the summary. (The forwarding process may
erroneously introduce zones into the destination's summary).

2. A write barrier hit will set the summary to RefSetUNIV.

The reason that ss.unfixedSummary is always a subset of the previous summary is
due to an "optimization" which has not been made in TraceFix. See
impl.c.trace.fix.fixed.all.


Partial Scans

.clever-summary: With enough cleverness, it's possible to have partial scans of
condemned segments contribute to the segment summary. [We had a system which
nearly worked -- see MMsrc(MMdevel_poolams at 1997/08/14 13:02:55 BST), but it
did not handle the situation in which a segment was not under the write barrier
when it was condemned.]

.clever-summary.acc: Each time we partially scan a segment, we accumulate the
post-scan summary of the scanned objects into a field in the group, called
'summarySoFar'. The post-scan summary is (summary \ white) U fixed.

.clever-summary.acc.condemn: The cumulative summary is only meaningful while
the segment is condemned. Otherwise it is set to RefSetEMPTY (a value which we
can check).

.clever-summary.acc.reclaim: Then when we reclaim the segment, we set the
segment summary to the cumulative summary, as it is a post-scan summary of all
the scanned objects.

.clever-summary.acc.other-trace: If the segment is scanned by another trace
while it is condemned, the cumulative summary must be set to the post-scan
summary of this scan (otherwise it becomes out-of-date).

.clever-summary.scan: The scan summary is expected to be a summary of all
scanned references in the segment. We don't know this accurately until we've
scanned everything in the segment. So we add in the segment summary each time.

.clever-summary.scan.fix: TraceScan also expects the scan state fixed summary
to include the post-scan summary of all references which were white. Since we
don't scan all white references, we need to add in an approximation to the
summary of all white references which we didn't scan. This is the intersection
of the segment summary and the white summary.

.clever-summary.wb: If the cumulative summary is smaller than the mutator's
summary, a write-barrier is needed to prevent the mutator from invalidating it.
+ This means that sometimes we'd have to put the segment under the write-barrier +at condemn [this is not an operation currently available to pool class +implementations pekka 1998-02-26], which might not be very efficient. + +.clever-summary.method.wb: We need a new pool class method, called when the +write barrier is hit (or possibly any barrier hit). The generic method will do +the usual TraceAccess work, the trivial method will do nothing. + +.clever-summary.acc.wb: When the write barrier is hit, we need to correct the +cumulative summary to the mutator summary. This is approximated by setting the +summary to RefSetUNIV. + diff --git a/mps/design/seg/index.txt b/mps/design/seg/index.txt new file mode 100644 index 00000000000..e8f2b81c3c4 --- /dev/null +++ b/mps/design/seg/index.txt @@ -0,0 +1,273 @@ + THE DESIGN OF THE MPS SEGMENT DATA STRUCTURE + design.mps.seg + incomplete design + drj 1997-04-03 + +INTRODUCTION + +.intro: This document describes the MPS Segment data structure. + + +Document History + +.hist.2: The initial draft (replacing various notes in revisions 0 and 1) was +drafted by Richard Brooksby on 1997-04-03 as part of editing +MMsrc!seg.c(MMdevel_action2.1). + +.hist.3: Rewritten to separate segments and tracts, following +mail.tony.1998-11-02.10-26. tony 1999-04-16 + +OVERVIEW + +.over.segments: Segments are the basic units of tracing and shielding. The MPM +also uses them as units of scanning and colour, although pool classes may +subdivide segments and be able to maintain colour on a finer grain (down to the +object level, for example). + +.over.objects: The mutator's objects are stored in segments. Segments are +contiguous blocks of memory managed by some pool. .segments.pool: The +arrangement of objects within a segment is determined by the class of the pool +which owns the segment. The pool is associated with the segment indirectly via +the first tract of the segment. + +.over.memory: The relationship between segments and areas of memory is +maintained by the segment module. Pools acquire tracts from the arena, and +release them back to the arena when they don't need them any longer. The +segment module can associate contiguous tracts owned by the same pool with a +segment. The segment module provides the methods SegBase, SegLimit, and SegSize +which map a segment onto the addresses of the memory block it represents. + +.over.hierarchy: The Segment datastructure is designed to be subclassable (see +design.mps.protocol). The basic segment class (Seg) supports colour and +protection for use by the tracer, as well as support for a pool ring, and all +generic segment functions. Clients may use Seg directly, but will most probably +want to use a subclass with additional properties. + +.over.hierarchy.gcseg: The segment module provides GCSeg - a subclass of Seg +which has full support for GC including buffering and the ability to be linked +onto the grey ring. 
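
As an informal illustration of .over.memory, a pool class might walk the block
of memory represented by one of its segments as in the following sketch.
SegBase, SegLimit and SegSize are the methods described above; the grainSize
parameter, the visiting comment, and the loose use of Addr comparison and
AddrAdd arithmetic are assumptions of the sketch, not part of this design.

  static void walkSegGrains(Seg seg, Size grainSize)
  {
    Addr base = SegBase(seg);
    Addr limit = SegLimit(seg);
    Addr addr;

    AVER(AddrOffset(base, limit) == SegSize(seg));
    for (addr = base; addr < limit; addr = AddrAdd(addr, grainSize)) {
      /* ... examine the grain (or object) at addr ... */
    }
  }
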
+

DATA STRUCTURE


typedef struct SegStruct {      /* segment structure */
  Sig sig;                      /* impl.h.misc.sig */
  SegClass class;               /* segment class structure */
  Tract firstTract;             /* first tract of segment */
  RingStruct poolRing;          /* link in list of segs in pool */
  Addr limit;                   /* limit of segment */
  unsigned depth : SHIELD_DEPTH_WIDTH; /* see impl.c.shield.def.depth */
  AccessSet pm : AccessMAX;     /* protection mode, impl.c.shield */
  AccessSet sm : AccessMAX;     /* shield mode, impl.c.shield */
  TraceSet grey : TRACE_MAX;    /* traces for which seg is grey */
  TraceSet white : TRACE_MAX;   /* traces for which seg is white */
  TraceSet nailed : TRACE_MAX;  /* traces for which seg has nailed objects */
  RankSet rankSet : RankMAX;    /* ranks of references in this seg */
} SegStruct;


typedef struct GCSegStruct {    /* GC segment structure */
  SegStruct segStruct;          /* superclass fields must come first */
  RingStruct greyRing;          /* link in list of grey segs */
  RefSet summary;               /* summary of references out of seg */
  Buffer buffer;                /* non-NULL if seg is buffered */
  Sig sig;                      /* design.mps.sig */
} GCSegStruct;


.field.rankSet: The "rankSet" field represents the set of ranks of the
references in the segment. It is initialized to empty by SegInit.
.field.rankSet.single: The Tracer only permits one rank per segment [ref?] so
this field is either empty or a singleton. .field.rankSet.empty: An empty
rankSet indicates that there are no references. If there are no references in
the segment then it cannot contain black or grey references.
.field.rankSet.start: If references are stored in the segment then it must be
updated, along with the summary (.field.summary.start).

.field.depth: The "depth" field is used by the Shield (impl.c.shield) to manage
protection of the segment. It is initialized to zero by SegInit.

.field.sm: The "sm" field is used by the Shield (impl.c.shield) to manage
protection of the segment. It is initialized to AccessSetEMPTY by SegInit.

.field.pm: The "pm" field is used by the Shield (impl.c.shield) to manage
protection of the segment. It is initialized to AccessSetEMPTY by SegInit.
The field is used by both the shield and the ANSI fake protection
(impl.c.protan).

.field.black: The "black" field is the set of traces for which there may be
black objects (i.e. objects containing references, but no references to white
objects) in the segment. More precisely, if there is a black object for a
trace in the segment then that trace will appear in the "black" field. It is
initialized to TraceSetEMPTY by SegInit.

.field.grey: The "grey" field is the set of traces for which there may be grey
objects (i.e. containing references to white objects) in the segment. More
precisely, if there is a reference to a white object for a trace in the segment
then that trace will appear in the "grey" field. It is initialized to
TraceSetEMPTY by SegInit.

.field.white: The "white" field is the set of traces for which there may be
white objects in the segment. More precisely, if there is a white object for a
trace in the segment then that trace will appear in the "white" field. It is
initialized to TraceSetEMPTY by SegInit.

.field.summary: The "summary" field is an approximation to the set of all
references in the segment. If there is a reference R in the segment, then
RefSetIsMember(summary, R) is TRUE. The summary is initialized to RefSetEMPTY
by SegInit.
.field.summary.start: If references are stored in the segment then +it must be updated, along with rankSet (.field.rankSet.start). + +.field.buffer: The "buffer" field is either NULL, or points to the descriptor +structure of the buffer which is currently allocating in the segment. The +field is initialized to NULL by SegInit. .field.buffer.owner: This buffer must +belong to the same pool as the segment, because only that pool has the right to +attach it. + +INTERFACE + +Splitting and Merging + +.split-and-merge: There is support for splitting and merging segments, to give +pools the flexibility to rearrange their tracts among segments as they see fit. + +.split: Segments may be split with the function SegSplit + +Res SegSplit(Seg *segLoReturn, Seg *segHiReturn, Seg seg, Addr at, + Bool withReservoirPermit, ...); + +If successful, segment "seg" is split at address "at", yielding two segments +which are returned in segLoReturn and segHiReturn for the low and high segments +respectively. The base of the low segment is the old base of "seg". The limit +of the low segment is "at". The base of the high segment is "at". This limit of +the high segment is the old limit of "seg". "seg" is effectively destroyed +during this operation (actually, it might be reused as one of the returned +segments). Segment subclasses may make use of the optional arguments; the +built-in classes do not. + +.split.invariants: The client must ensure some invariants are met before +calling SegSplit: + +.split.inv.align: "at" must be appropriately aligned to the arena alignment, +and lie between the base and limit of "seg". Justification: the split segments +cannot be represented if this is not so. + +.split.inv.buffer: If "seg" is attached to a buffer, the buffered region must +not include address "at". Justification: the segment module is not in a +position to know how (or whether) a pool might wish to split a buffer. This +permits the buffer to remain attached to just one of the returned segments. + +.split.state: Except as noted above, the segments returned have the same +properties as "seg". I.e. their colour, summary, rankset, nailedness etc. are +set to the values of "seg". + +.merge: Segments may be merged with the function SegMerge + +Res SegMerge(Seg *mergedSegReturn, Seg segLo, Seg segHi, + Bool withReservoirPermit, ...); + +If successful, segments "segLo" and "segHi" are merged together, yielding a +segment which is returned in mergedSegReturn. "segLo" and "segHi" are +effectively destroyed during this operation (actually, one of them might be +reused as the merged segment). Segment subclasses may make use of the optional +arguments; the built-in classes do not. + +.merge.invariants: The client must ensure some invariants are met before +calling SegMerge: + +.merge.inv.abut: The limit of "segLo" must be the same as the base of "segHi". +Justification: the merged segment cannot be represented if this is not so. + +.merge.inv.buffer: One or other of "segLo" and "segHi" may attached to a +buffer, but not both. Justification: the segment module does not support +attachment of a single seg to 2 buffers. + +.merge.inv.similar: "segLo" and "segHi" must be sufficiently similar. Two +segments are sufficiently similar if they have identical values for each of the +following fields: class, sm, grey, white, nailed, rankSet. Justification: there +is no single choice of behaviour for cases where these fields are not +identical. 
The pool class must make it's own choices about this if it wishes to +permit more flexible merging. If so, it should be a simple matter for the pool +to arrange for the segments to look sufficiently similar before calling +SegMerge. + +.merge.state: The merged segment will share the same state as "segLo" and +"segHi" for those fields which are identical (see .merge.inv.similar). The +summary will be the union of the summaries of "segLo" and "segHi". + + +EXTENSIBILITY + +Splitting and Merging + +.method.split: Segment subclasses may extend the support for segment splitting +by defining their own "split" method. + +Res segSplit(Seg seg, Seg segHi, + Addr base, Addr mid, Addr limit, + Bool withReservoirPermit, va_list args) + +On entry, "seg" is a segment with region [base,limit), "segHi" is +uninitialized, "mid" is the address at which the segment is to be split. The +method is responsible for destructively modifying "seg" and initializing +"segHi" so that on exit "seg" is a segment with region [base,mid) and segHi is +a segment with region [mid,limit). Usually a method would only directly modify +the fields defined for the segment subclass. This might involve allocation, +which may use the reservoir if "withReservoirPermit" is TRUE. +.method.split.next: A split method should always call the next method, either +before or after any class-specific code (see design.mps.protocol +.overview.next-method). + +.method.merge: Segment subclasses may extend the support for segment merging by +defining their own "merge" method. + +Res segMerge(Seg seg, Seg segHi, + Addr base, Addr mid, Addr limit, + Bool withReservoirPermit, va_list args) + +On entry, "seg" is a segment with region [base,mid), "segHi" is a segment with +region [mid,limit), The method is responsible for destructively modifying "seg" +and finishing "segHi" so that on exit "seg" is a segment with region +[base,limit) and segHi is garbage. Usually a method would only modify the +fields defined for the segment subclass. This might involve allocation, which +may use the reservoir if "withReservoirPermit" is TRUE. .method.merge.next: A +merge method should always call the next method, either before or after any +class-specific code (see design.mps.protocol.overview.next-method). + +.split-merge.shield: Split and merge methods may assume that the segments they +are manipulating are not in the shield cache. .split-merge.shield.flush: The +shield cache is flushed before any split or merge methods are invoked. +.split-merge.shield.re-flush: If a split or merge method performs an operation +on a segment which might cause the segment to be cached, the method must flush +the shield cache before returning or calling another split or merge method. + +.split-merge.fail: Split and merge methods might fail, in which case segments +"seg" and "segHi" must be equivalently valid and configured at exit as they +were according to the entry conditions. It's simplest if the failure can be +detected before calling the next method (e.g. by allocating any objects early +in the method). .split-merge.fail.anti: If it's not possible to detect failure +before calling the next method, the appropriate anti-method must be used (see +design.mps.protocol.guide.fail.after-next). Split methods are anti-methods for +merge methods, and vice-versa. .split-merge.fail.anti.constrain: In general, +care should be taken when writing split and merge methods to ensure that they +really are anti-methods for each other. The anti-method must not fail if the +initial method succeeded. 
The anti-method should reverse any side effects of +the initial method, except where it's known to be safe to avoid this (see +.split-merge.fail.summary for an example of a safe case). +.split-merge.fail.anti.no: If this isn't possible (it might not be) then the +methods won't support after-next failure. This fact should be documented, if +the methods are intended to support further specialization. Note that using +va_arg with the "args" parameter is sufficient to make it impossible to reverse +all side effects. + +.split-merge.fail.summary: The segment summary might not be restored exactly +after a failed merge operation. Each segment would be left with a summary which +is the union of the original summaries (see .merge.state). This increases the +conservatism in the summaries, but is otherwise safe. + +.split-merge.unsupported: Segment classes need not support segment merging at +all. The function SegClassMixInNoSplitMerge is supplied to set the split and +merge methods to unsupporting methods that will report an error in checking +varieties. + + diff --git a/mps/design/sig/index.txt b/mps/design/sig/index.txt new file mode 100644 index 00000000000..75979074a10 --- /dev/null +++ b/mps/design/sig/index.txt @@ -0,0 +1,28 @@ + THE DESIGN OF THE MEMORY POOL SYSTEM SIGNATURE SYSTEM + design.mps.sig + incomplete design + richard 1995-08-25 + + +TESTING: + +.test.uniq: The unix command +sed -n '/^#define [a-zA-Z]*Sig/s/[^(]*(/(/p' *.[ch]| sort| uniq -c +will display all signatures defined in the mps along with a count of how many +times they are defined. If any counts are greater than 1, then the same +signature value is being used for different signatures. This is undesirable +and the problem should be investigated. People not using unix may still find +the RE useful. + + +TEXT: + +Signatures are magic numbers which are written into structures +when they are created and invalidated (by overwriting with +SigInvalid) when they are destroyed. They provide a limited form +of run-time type checking and dynamic scope checking. + +Signature values should be transliterations of the corresponding words into +hex, as guide.hex.trans. The first three hex digits should be the +transliteration of "SIG". + diff --git a/mps/design/splay/index.txt b/mps/design/splay/index.txt new file mode 100644 index 00000000000..9e3971b1d51 --- /dev/null +++ b/mps/design/splay/index.txt @@ -0,0 +1,792 @@ + DESIGN OF SPLAY TREES + design.mps.splay + draft doc + gavinm 1998-05-01 + +INTRODUCTION + +.intro: This document explains the design of impl.c.splay, an implementation of +Splay Trees, including its interface and implementation. + +.readership: This document is intended for any MM developer. + +.source: The primary sources for this design are paper.st85(0) and +paper.sleator96(0). Also as CBS is a client, design.mps.cbs. As PoolMVFF is +an indirect client, design.mps.poolmvff(1). Also, as PoolMV2 is an +(obsolescent?) indirect client, design.mps.poolmv2. + +.background: The following background documents influence the design: +guide.impl.c.adt(0). + + +Document History + +.hist.0: Written by GavinM 1998-05-01, made draft 1998-05-27. + +.hist.1: Added client properties. GavinM 1998-09-09 + +.hist.2: Polished for review (chiefly adding a DEFINITIONS section). drj +1999-03-10 + +.hist.3: Edited after review. tony 1999-03-31 + + +OVERVIEW + +.overview: Splay trees are a form of binary tree where each access brings the +accessed element (or the nearest element) to the root of the tree. 
The +restructuring of the tree caused by the access gives excellent amortised +performance, as the splay tree adapts its shape to usage patterns. Unused +nodes have essentially no time overhead. For a cute animation of splay trees, +see . + + +DEFINITIONS + +.def.splay-tree: A "Splay Tree" is a self-adjusting binary tree as described in +paper.st85(0), paper.sleator96(0). + +.def.node: A "node" is used in the typical datastructure sense to mean an +element of a tree (see also .type.splay.node). + +.def.key: A "key" is a value associated with each node; the keys are totally +ordered by a client provided comparator. + +.def.comparator: A "comparator" is a function that compares keys to determine +their ordering (see also .type.splay.compare.method). + +.def.successor: Node N1 is the "successor" of node N2 if N1 and N2 are both in +the same tree, and the key of N1 immediately follows the key of N2 in the +ordering of all keys for the tree. + +.def.left-child: Each node N contains a "left child", which is a (possibly +empty) sub-tree of nodes. The key of N is ordered after the keys of all nodes +in this sub-tree. + +.def.right-child: Each node N contains a "right child", which is a (possibly +empty) sub-tree of nodes. The key of N is ordered before the keys of all nodes +in this sub-tree. + +.def.neighbour: A node N which has key Kn is a "neighbour" of a key K if either +Kn is the first key in the total order which compares greater than K or if Kn +is the last key in the total order which compares less than K. + +.def.first: A node is the "first" node in a set of nodes if its key compares +less than the keys of all other nodes in the set. + +.def.last: A node is the "last" node in a set of nodes if its key compares +greater than the keys of all other nodes in the set. + +.def.client-property: A "client property" is a value that the client may +associate with each node in addition to the key (a block size, for example). +This splay tree implementation provides support for efficiently finding the +first or last nodes with suitably large client property values. See also .prop +below. + + +REQUIREMENTS + +.req: These requirements are drawn from those implied by design.mps.poolmv2, +design.mps.poolmvff(1), design.mps.cbs(2) and general inferred MPS requirements. + +.req.order: Must maintain a set of abstract keys which is totally ordered for a +comparator. + +.req.tree: The keys must be associated with nodes arranged in a Splay Tree. + +.req.splay: Common operations must balance the tree by splaying it, to achieve +low amortized cost (see paper.st85(0)). + +.req.add: Must be able to add new members. This is a common operation. + +.req.remove: Must be able to remove members. This is a common operation. + +.req.locate: Must be able to locate a member, given a key. This is a common +operation. + +.req.neighbours: Must be able to locate the neighbouring members (in order) of +a non-member, given a key (see .def.neighbour). This is a common operation. + +.req.iterate: Must be able to iterate over all members in order with reasonable +efficiency. + +.req.protocol: Must support detection of protocol violations. + +.req.debug: Must support debugging of clients. + +.req.stack: Must do all non-debugging operations with stack usage bounded by a +constant size. + +.req.adapt: Must adapt to regularities in usage pattern, for better performance. + +.req.property: Must permit a client to associate a client property (such as a +size) with each node in the tree. 
+ +.req.property.change: Must permit a client to dynamically reassign client +properties to nodes in the tree. This is a common operation. + +.req.property.find: Must support rapid finding of the first and last nodes +which have a suitably large value for their client property. This is a common +operation. + +.req.root: Must be able to find the root of a splay tree (if one exists). + + +EXTERNAL TYPES + +.type.splay.tree: SplayTree is the type of the main object at the root of the +splay tree. It is intended that the SplayTreeStruct can be embedded in another +structure (see .usage.client-tree for an example). No convenience functions +are provided for allocation or deallocation. + typedef struct SplayTreeStruct SplayTreeStruct, *SplayTree; + +.type.splay.node: SplayNode is the type of a node of the splay tree. +SplayNodeStruct contains no fields to store the key associated with the node, +or the client property. Again, it is intended that the SplayNodeStruct can be +embedded in another structure, and that this is how the association will be +made (see .usage.client-node for an example). No convenience functions are +provided for allocation or deallocation. + typedef struct SplayNodeStruct SplayNodeStruct, *SplayNode; + +.type.splay.compare.method: SplayCompareMethod is a pointer to a function with +the following prototype: + Compare compare(void *key, SplayNode node); +The function is required to compare the key with the key the client associates +with that splay tree node, and return the appropriate Compare value (see +.usage.compare for an example). The function compares a key with a node, rather +than a pair of keys or nodes as might seem more obvious. This is because the +details of the mapping between nodes and keys is left to the client (see +.type.splay.node), and the splaying operations compare keys with nodes (see +.impl.splay). + +.type.splay.node.describe.method: SplayNodeDescribeMethod is a pointer to a +function with the following prototype: + Res nodeDescribe(SplayNode node, mps_lib_FILE *stream) +The function is required to write (via WriteF) a client-oriented representation +of the splay node. The output should be non-empty, short, and without return +characters. This is provided for debugging purposes only. + +.type.splay.test.node.method: SplayTestNodeMethod is a pointer to a function +with the following prototype: + Bool testNode(SplayTree tree, SplayNode node, void *closureP, unsigned long +closureS); +The function is required to determine whether the node itself meets some client +determined property (see .prop and .usage.test.node for an example). Parameters +closureP and closureS describe the environment for the function (see +.function.splay.find.first and .function.splay.find.last). + +.type.splay.test.tree.method: SplayTestTreeMethod is a pointer to a function +with the following prototype: + Bool testTree(SplayTree tree, SplayNode node, void *closureP, unsigned long +closureS); +The function is required to determine whether any of the nodes in the sub-tree +rooted at the given node meet some client determined property (see .prop and +.usage.test.tree for an example). In particular, it must be a precise (not +conservative) indication of whether there are any nodes in the sub-tree for +which the testNode method (see .type.splay.test.node.method) would return TRUE. +Parameters closureP and closureS describe the environment for the function (see +.function.splay.find.first and .function.splay.find.last). 
+ +.type.splay.update.node.method: SplayUpdateNodeMethod is a pointer to a +function with the following prototype: + void updateNode(SplayTree tree, SplayNode node, SplayNode leftChild, +SplayNode rightChild); +The function is required to update any client datastructures associated with a +node to maintain some client determined property (see .prop) given that the +children of the node have changed. If the node does not have one or both +children, then NULL will be passed as the relevant parameter. (See +.usage.callback for an example) + + +EXTERNAL FUNCTIONS + +.function.no-thread: The interface functions are not designed to be either +thread-safe or re-entrant. Clients of the interface are responsible for +synchronization, and for ensuring that client-provided methods invoked by the +splay module (.type.splay.compare.method, .type.splay.test.node.method, +.type.splay.test.tree.method, .type.splay.update.node.method) do not call +functions of the splay module. + +.function.splay.tree.check: This is a check function for the SplayTree type +(see guide.impl.c.adt.method.check & design.mps.check(0)): + Bool SplayTreeCheck(SplayTree tree); + +.function.splay.node.check: This is a check function for the SplayNode type +(see guide.impl.c.adt.method.check & design.mps.check(0)): + Bool SplayNodeCheck(SplayNode node); + +.function.splay.tree.init: This function initialises a SplayTree (see +guide.impl.c.adt.method.init). It requires a compare method that defines a +total ordering on nodes (see .req.order); the effect of supplying a compare +method that does not implement a total ordering is undefined. It also requires +an updateNode method, which will be used to keep client properties up to date +when the tree structure changes; the value SplayTrivUpdateNode may be used for +this method if there is no need to maintain client properties. (See +.usage.initialization for an example use). + void SplayTreeInit(SplayTree tree, SplayCompareMethod compare, +SplayUpdateNodeMethod updateNode); + +.function.splay.tree.finish: This function clears the fields of a SplayTree +(see guide.impl.c.adt.method.finish). Note that it does not attempt to finish +or deallocate any associated SplayNode objects; clients wishing to destroy a +non-empty SplayTree must first explicitly descend the tree and call +SplayNodeFinish on each node from the bottom up. + void SplayTreeFinish(SplayTree tree); + +.function.splay.node.init: This function initialises a SplayNode (see +guide.impl.c.adt.method.init). + void SplayNodeInit(SplayNode node); + +.function.splay.node.finish: This function clears the fields of a SplayNode +(see guide.impl.c.adt.method.finish). Note that it does not attempt to finish +or deallocate any referenced SplayNode objects (see.function.splay.tree.finish). + void SplayNodeFinish(SplayNode node); + +.function.splay.root: This function returns the root node of the tree, if any +(see .req.root). If the tree is empty, FALSE is returned and *nodeReturn is +not changed. Otherwise, TRUE is returned and *nodeReturn is set to the root. + Bool SplayRoot(SplayNode *nodeReturn, SplayTree tree); + +.function.splay.tree.insert: This function is used to insert into a splay tree +a new node which is associated with the supplied key (see .req.add). It first +splays the tree at the key. If an attempt is made to insert a node that +compares CompareEQUAL to an existing node in the tree, then ResFAIL will be +returned and the node will not be inserted. (See .usage.insert for an example +use). 
+ Res SplayTreeInsert(SplayTree tree, SplayNode node, void *key); + +.function.splay.tree.delete: This function is used to delete from a splay tree +a node which is associated with the supplied key (see .req.remove). If the +tree does not contain the given node, or the given node does not compare +CompareEQUAL with the given key, then ResFAIL will be returned, and the node +will not be deleted. The function first splays the tree at the given key. +(See .usage.delete for an example use). + Res SplayTreeDelete(SplayTree tree, SplayNode node, void *key); + +.function.splay.tree.search: This function searches the splay tree for a node +that compares CompareEQUAL to the given key (see .req.locate). It splays the +tree at the key. It returns ResFAIL if there is no such node in the tree, +otherwise *nodeReturn will be set to the node. + Res SplayTreeSearch(SplayNode *nodeReturn, SplayTree tree, void *key); + +.function.splay.tree.neighbours: This function searches a splay tree for the +two nodes that are the neighbours of the given key (see .req.neighbours). It +splays the tree at the key. *leftReturn will be the neighbour which compares +less than the key if such a neighbour exists; otherwise it will be NULL. +*rightReturn will be the neighbour which compares greater than the key if such +a neighbour exists; otherwise it will be NULL. The function returns ResFAIL if +any node in the tree compares CompareEQUAL with the given key. (See +.usage.insert for an example use). + Res SplayTreeNeighbours(SplayNode *leftReturn, SplayNode *rightReturn, +SplayTree tree, void *key); + +.function.splay.tree.first: This function splays the tree at the first node, +and returns that node (see .req.iterate). The supplied key should compare +CompareLESS with all nodes in the tree. It will return NULL if the tree has no +nodes. + SplayNode SplayTreeFirst(SplayTree tree, void *zeroKey); + +.function.splay.tree.next: This function receives a node and key and returns +the successor node to that node (see .req.iterate). This function is intended +for use in iteration when the received node will be the current root of the +tree, but is robust against being interspersed with other splay operations +(provided the old node still exists). The supplied key must compare +CompareEQUAL to the supplied node. Note that use of this function rebalances +the tree for each node accessed. If many nodes are accessed as a result of +multiple uses, the resultant tree will be generally well balanced. But if the +tree was previously beneficially balanced for a small working set of accesses, +then this local optimization will be lost. (see .future.parent). + SplayNode SplayTreeNext(SplayTree tree, SplayNode oldNode, void *oldKey); + +.function.splay.tree.describe: This function prints (using WriteF) to the +stream a textual representation of the given splay tree, using nodeDescribe to +print client-oriented representations of the nodes (see .req.debug). + Res SplayTreeDescribe(SplayTree tree, mps_lib_FILE *stream, +SplayNodeDescribeMethod nodeDescribe); + +.function.splay.find.first: SplayFindFirst finds the first node in the tree +that satisfies some client property (as determined by the testNode and testTree +methods) (see .req.property.find). closureP and closureS are arbitrary values, +and are passed to the testNode and testTree methods which may use the values as +closure environments. If there is no satisfactory node, then FALSE is +returned, otherwise *nodeReturn is set to the node. (See .usage.delete for an +example use). 
+
 Bool SplayFindFirst(SplayNode *nodeReturn, SplayTree tree,
SplayTestNodeMethod testNode, SplayTestTreeMethod testTree, void *closureP,
unsigned long closureS);

.function.splay.find.last: SplayFindLast finds the last node in the tree that
satisfies some client property (as determined by the testNode and testTree
methods) (see .req.property.find). closureP and closureS are arbitrary values,
and are passed to the testNode and testTree methods which may use the values as
closure environments. If there is no satisfactory node, then FALSE is
returned, otherwise *nodeReturn is set to the node.
 Bool SplayFindLast(SplayNode *nodeReturn, SplayTree tree,
SplayTestNodeMethod testNode, SplayTestTreeMethod testTree, void *closureP,
unsigned long closureS);

.function.splay.node.refresh: SplayNodeRefresh must be called whenever the
client property (see .prop) at a node changes (see .req.property.change). It
will call the updateNode method on the given node, and any other nodes that may
require update. The client key for the node must also be supplied; the
function splays the tree at this key. (See .usage.insert for an example use).
 void SplayNodeRefresh(SplayTree tree, SplayNode node, void *key);


CLIENT-DETERMINED PROPERTIES

.prop: To support .req.property.find, this splay tree implementation provides
additional features to permit clients to cache maximum (or minimum) values of
client properties for all the nodes in a subtree. The splay tree
implementation uses the cached values as part of SplayFindFirst and
SplayFindLast via the testNode and testTree methods. The client is free to
choose how to represent the client property, and how to compute and store the
cached value.

.prop.update: The cached values depend upon the topology of the tree, which may
vary as a result of operations on the tree. The client is given the
opportunity to compute new cache values whenever necessary, via the updateNode
method (see .function.splay.tree.init). This happens whenever the tree is
restructured. The client may use the SplayNodeRefresh method to indicate that
the client attributes at a node have changed (see .req.property.change). A call
to SplayNodeRefresh splays the tree at the specified node, which may provoke
calls to the updateNode method as a result of the tree restructuring. The
updateNode method will also be called whenever a new splay node is inserted
into the tree.

.prop.example: For example, if implementing an address ordered tree of free
blocks using a splay tree, a client might choose to use the base address of
each block as the key for each node, and the size of each block as the client
property. The client can then maintain as a cached value in each node the size
of the largest block in the subtree rooted at that node. This will permit a
fast search for the first or last block of at least a given size. See
.usage.callback for an example updateNode method for such a client.

.prop.ops: The splay operations must cause client properties for nodes to be
updated in the following circumstances (see .impl.* for details):

.prop.ops.rotate: rotate left, rotate right -- We need to update the value at
the original root, and the new root, in that order.

.prop.ops.link: link left, link right -- We know that the line of right descent
from the root of the left tree and the line of left descent from the root of
the right tree will both need to be updated. This is performed at the assembly
stage.
(We could update these chains every time we do a link left or link right
instead, but this would be less efficient)

.prop.ops.assemble: assemble -- This operation also invalidates the lines of
right and left descent of the left and right trees respectively which need to
be updated (see below). It also invalidates the root which must be updated
last.

.prop.ops.assemble.reverse: To correct the chains of the left and right trees
without requiring stack or high complexity, we use a judicious amount of
pointer reversal.

.prop.ops.assemble.traverse: During the assembly, after the root's children
have been transplanted, we correct the chains of the left and right trees. For
the left tree, we traverse the right child line, reversing pointers, until we
reach the node that was the last node prior to the transplantation of the
root's children. Then we update from that node back to the left tree's root,
restoring pointers. Updating the right tree is the same, mutatis mutandis.
(See .future.reverse for an alternative approach).


USAGE

.usage: Here's a simple example of a client which uses a splay tree to
implement an address ordered tree of free blocks. The significant client usages
of the splay tree interface might look as follows:-

.usage.client-tree: Tree structure to embed a SplayTree (see .type.splay.tree):
typedef struct FreeTreeStruct *FreeTree;  /* pointer type used in the examples below */
typedef struct FreeTreeStruct {
  SplayTreeStruct splayTree;  /* Embedded splay tree */
  /* no obvious client fields for this simple example */
} FreeTreeStruct;

.usage.client-node: Node structure to embed a SplayNode (see .type.splay.node):
typedef struct FreeBlockStruct *FreeBlock;  /* pointer type used in the examples below */
typedef struct FreeBlockStruct {
  SplayNodeStruct splayNode;  /* embedded splay node */
  Addr base;                  /* base address of block is also the key */
  Size size;                  /* size of block is also the client property */
  Size maxSize;               /* cached value for maximum size in subtree */
} FreeBlockStruct;

.usage.callback: updateNode callback method (see
.type.splay.update.node.method):
void FreeBlockUpdateNode(SplayTree tree, SplayNode node,
                         SplayNode leftChild, SplayNode rightChild)
{
  /* Compute the maximum size of any block in this subtree. */
  /* The value to cache is the maximum of the size of this block, */
  /* the cached value for the left subtree (if any) and the cached */
  /* value of the right subtree (if any) */

  FreeBlock freeNode = FreeBlockOfSplayNode(node);

  Size maxSize = freeNode->size;

  if (leftChild != NULL) {
    FreeBlock leftNode = FreeBlockOfSplayNode(leftChild);
    if (leftNode->maxSize > maxSize)
      maxSize = leftNode->maxSize;
  }

  if (rightChild != NULL) {
    FreeBlock rightNode = FreeBlockOfSplayNode(rightChild);
    if (rightNode->maxSize > maxSize)
      maxSize = rightNode->maxSize;
  }

  freeNode->maxSize = maxSize;
}

.usage.compare: Comparison function (see .type.splay.compare.method):
Compare FreeBlockCompare(void *key, SplayNode node) {
  Addr base1, base2, limit2;
  FreeBlock freeNode = FreeBlockOfSplayNode(node);

  base1 = (Addr)key;
  base2 = freeNode->base;
  limit2 = AddrAdd(base2, freeNode->size);

  if (base1 < base2)
    return CompareLESS;
  else if (base1 >= limit2)
    return CompareGREATER;
  else
    return CompareEQUAL;
}

.usage.test.tree: Test tree function (see .type.splay.test.tree.method):
Bool FreeBlockTestTree(SplayTree tree, SplayNode node,
                       void *closureP, unsigned long closureS) {
  /* Closure environment has wanted size as value of closureS. */
  /* Look at the cached value for the node to see if any */
  /* blocks in the subtree are big enough. */

  Size size = (Size)closureS;
  FreeBlock freeNode = FreeBlockOfSplayNode(node);
  return freeNode->maxSize >= size;
}

.usage.test.node: Test node function (see .type.splay.test.node.method):
Bool FreeBlockTestNode(SplayTree tree, SplayNode node,
                       void *closureP, unsigned long closureS) {
  /* Closure environment has wanted size as value of closureS. */
  /* Look at the size of the node to see if it is big enough. */

  Size size = (Size)closureS;
  FreeBlock freeNode = FreeBlockOfSplayNode(node);
  return freeNode->size >= size;
}

.usage.initialization: Client's initialization function (see
.function.splay.tree.init):
void FreeTreeInit(FreeTree tree) {
  /* Initialize the embedded splay tree. */
  SplayTreeInit(&tree->splayTree, FreeBlockCompare, FreeBlockUpdateNode);
}

.usage.insert: Client function to add a new free block into the tree, merging
it with an existing block if possible:
void FreeTreeInsert(FreeTree tree, Addr base, Addr limit) {
  SplayTree splayTree = &tree->splayTree;
  SplayNode leftNeighbour, rightNeighbour;
  void *key = (void *)base;  /* use the base of the block as the key */
  Res res;

  /* Look for any neighbouring blocks. (.function.splay.tree.neighbours) */
  res = SplayTreeNeighbours(&leftNeighbour, &rightNeighbour,
                            splayTree, key);
  AVER(res == ResOK);  /* this client doesn't duplicate free blocks */

  /* Look to see if the neighbours are contiguous. */

  if (leftNeighbour != NULL &&
      FreeBlockLimitOfSplayNode(leftNeighbour) == base) {
    /* Inserted block is contiguous with left neighbour, so merge it. */
    /* The client housekeeping is left as an exercise to the reader. */
    /* This changes the size of a block, which is the client */
    /* property of the splay node. See .function.splay.node.refresh */
    SplayNodeRefresh(splayTree, leftNeighbour, key);

  } else if (rightNeighbour != NULL &&
             FreeBlockBaseOfSplayNode(rightNeighbour) == limit) {
    /* Inserted block is contiguous with right neighbour, so merge it. */
    /* The client housekeeping is left as an exercise to the reader. */
    /* This changes the size of a block, which is the client */
    /* property of the splay node. See .function.splay.node.refresh */
    SplayNodeRefresh(splayTree, rightNeighbour, key);

  } else {
    /* Not contiguous - so insert a new node */
    FreeBlock newBlock = (FreeBlock)allocate(sizeof(FreeBlockStruct));
    SplayNode splayNode = &newBlock->splayNode;

    newBlock->base = base;
    newBlock->size = AddrOffset(base, limit);
    SplayNodeInit(splayNode);  /* .function.splay.node.init */
    /* .function.splay.tree.insert */
    res = SplayTreeInsert(splayTree, splayNode, key);
    AVER(res == ResOK);  /* this client doesn't duplicate free blocks */
  }
}

.usage.delete: Client function to allocate the first block of a given size in
address order. For simplicity, this allocates the entire block:
Bool FreeTreeAllocate(Addr *baseReturn, Size *sizeReturn,
                      FreeTree tree, Size size) {
  SplayTree splayTree = &tree->splayTree;
  SplayNode splayNode;
  Bool found;

  /* look for the first node of at least the given size. */
  /* closureP parameter is not used. See .function.splay.find.first. */
  found = SplayFindFirst(&splayNode, splayTree,
                         FreeBlockTestNode, FreeBlockTestTree,
                         NULL, (unsigned long)size);

  if (found) {
    FreeBlock freeNode = FreeBlockOfSplayNode(splayNode);
    void *key = (void *)freeNode->base;  /* use base of block as the key */
    Res res;

    /* allocate the block */
    *baseReturn = freeNode->base;
    *sizeReturn = freeNode->size;

    /* remove the node from the splay tree - .function.splay.tree.delete */
    res = SplayTreeDelete(splayTree, splayNode, key);
    AVER(res == ResOK);  /* Must be possible to delete node */

    /* Delete the block */
    deallocate(freeNode, sizeof(FreeBlockStruct));

    return TRUE;

  } else {
    /* No suitable block */
    return FALSE;
  }
}


IMPLEMENTATION

.impl: For more details of how splay trees work, see paper.st85(0). For more
details of how to implement operations on splay trees, see paper.sleator96(0).
Here we describe the operations involved.


Top-Down Splaying

.impl.top-down: The method chosen to implement the splaying operation is called
"top-down splay". This is described as "procedure top-down splay" in
paper.st85(0) - although the implementation here additionally permits attempts
to access items which are not known to be in the tree. Top-down splaying is
particularly efficient for the common case where the location of the node in a
tree is not known at the start of an operation. Tree restructuring happens as
the tree is descended, whilst looking for the node.

.impl.splay: The key to the operation of the splay tree is the internal
function SplaySplay. It searches the tree for a node with a given key and
returns whether it succeeded. In the process, it brings the found node, or an
arbitrary neighbour if not found, to the root of the tree. This
"bring-to-root" operation is performed top-down during the search, and it is
not the simplest possible bring-to-root operation, but the resulting tree is
well-balanced, and will give good amortised cost for future calls to
SplaySplay. (See paper.st85(0))

.impl.splay.how: To perform this top-down splay, the tree is broken into three
parts, a left tree, a middle tree and a right tree. We store the left tree and
right tree in the right and left children respectively of a "sides" node to
eliminate some boundary conditions. The initial condition is that the middle
tree is the entire splay tree, and the left and right trees are empty. We also
keep pointers to the last node in the left tree, and the first node in the
right tree. Note that, at all times, the three trees are each validly ordered,
and they form a partition with the ordering left, middle, right. The splay is
then performed by comparing the middle tree with the following six cases, and
performing the indicated operations, until none apply.

.impl.splay.cases: Note that paper.st85(0)(Fig. 3) describes only 3 cases: zig,
zig-zig and zig-zag. The additional cases described here are the symmetric
variants which are respectively called zag, zag-zag and zag-zig. In the
descriptions of these cases, "root" is the root of the middle tree; node->left
is the left child of node; node->right is the right child of node. The
comparison operators (<, >, ==) are defined to compare a key and a node in the
obvious way by comparing the supplied key with the node's associated key.

.impl.splay.zig: The "zig" case is where key < root, and either:
 - key == root->left;
 - key < root->left && root->left->left == NULL; or
 - key > root->left && root->left->right == NULL.
+ +The operation for the zig case is: link right (see .impl.link.right) + +.impl.splay.zag: The "zag" case is where key > root, and either: + - key == root->right; + - key < root->right && root->right->left == NULL; or + - key > root->right && root->right->right == NULL. + +The operation for the zag case is: link left (see .impl.link.left) + +.impl.splay.zig.zig: The "zig-zig" case is where key < root && key < root->left +&& root->left->left != NULL. The operation for the zig-zig case is: rotate +right (see .impl.rotate.right) followed by link right (see .impl.link.right). + +.impl.splay.zig.zag: The "zig-zag" case is where key < root && key > root->left +&& root->left->right != NULL. The operation for the zig-zag case is: link +right (see .impl.link.right) followed by link left (see .impl.link.left). + +.impl.splay.zag.zig: The "zag-zig" case is where key > root && key < +root->right && root->right->left != NULL. The operation for the zag-zig case +is: link left (see .impl.link.left) followed by link right (see +.impl.link.right). + +.impl.splay.zag.zag: The "zag-zag" case is where key > root && key > +root->right && root->right->right != NULL. The operation for the zag-zag case +is: rotate left (see .impl.rotate.left) followed by link left (see +.impl.link.left). + +.impl.splay.terminal.null: A special terminal case is when root == NULL. This +can only happen at the beginning, and cannot arise from the operations above. +In this case, the splay operation must return NULL, and "not found". + +.impl.splay.terminal.found: One typical terminal case is when key == root. +This case is tested for at the beginning, in which case "found" is returned +immediately. If this case happens as a result of other operations, the splay +operation is complete, the three trees are assembled (see .impl.assemble), and +"found" is returned. + +.impl.splay.terminal.not-found: The other typical terminal cases are: + - key < root && root->left == NULL; and + - key > root && root->right == NULL. +In these cases, the splay operation is complete, the three trees are assembled +(see .impl.assemble), and "not found" is returned. + +.impl.rotate.left: The "rotate left" operation (see paper.st85(0) Fig. 1) +rearranges the middle tree as follows (where any of sub-trees A, B and C may be +empty): +.impl.rotate.right: The "rotate right" operation (see paper.st85(0) Fig. 1) +rearranges the middle tree as follows (where any of sub-trees A, B and C may be +empty): +.impl.link.left: The "link left" operation (see paper.st85(0) Fig. 11a for +symmetric variant) rearranges the left and middle trees as follows (where any +of sub-trees A, B, L and R may be empty): + + +The last node of the left tree is now x. + +.impl.link.right: The "link right" operation (see paper.st85(0) Fig. 11a) +rearranges the middle and right trees as follows (where any of sub-trees A, B, +L and R may be empty): +The first node of the right tree is now x. + + +.impl.assemble: The "assemble" operation (see paper.st85(0) Fig. 12) merges the +left and right trees with the middle tree as follows (where any of sub-trees A, +B, L and R may be empty): + + + +Top-Level Operations + +.impl.insert: SplayTreeInsert: (See paper.sleator96(0), chapter 4, function +insert). If the tree has no nodes, [how does it smell?] add the inserted node +and we're done; otherwise splay the tree around the supplied key. If the splay +successfully found a matching node, return failure. 
Otherwise, add the +inserted node as a new root, with the old (newly splayed, but non-matching) +root as its left or right child as appropriate, and the opposite child of the +old root as the other child of the new root. + +.impl.delete: SplayTreeDelete: (See paper.sleator96(0), chapter 4, function +delete). Splay the tree around the supplied key. Check that the newly splayed +root is the same node as given by the caller, and that it matches the key; +return failure if not. If the given node (now at the root) has fewer than two +children, replace it (as root), with the non-null child or null. Otherwise, +set the root of the tree to be the left child (arbitrarily) of the node to be +deleted, and splay around the same key. The new root will be the last node in +the sub-tree and will have a null right child; this is set to be the right +child of the node to be deleted. + +.impl.search: SplayTreeSearch: Splay the node around the supplied key. If the +splay found a matching node, return it; otherwise return failure. + +.impl.neighbours: SplayTreeNeighbours: Splay the tree around the supplied key. +If the splay found a matching node, return failure. Otherwise, determine +whether the (non-matching) found node is the left or right neighbour of the key +(by comparison with the key). Set the tree root to be the right or left child +of that first neighbour respectively, and again splay the tree around the +supplied key. The new root will be the second neighbour, and will have a null +left or right child respectively. Set this null child to be the first +neighbour. Return the two neighbours. + +.impl.neighbours.note: Note that it would be possible to implement +SplayTreeNeighbours with only one splay, and then a normal binary tree search +for the left or right neighbour of the root. This would be a cheaper +operation, but would give poorer amortised cost if the call to +SplayTreeNeighbours typically precedes a call to SplayTreeInsert (which is +expected to be a common usage pattern - see .usage.insert). It's also possible +to implement SplayTreeNeighbours by simply keeping track of both neighbours +during a single splay. This has about the same cost as a single splay, and +hence about the same amortised cost if the call to SplayTreeNeighbours typically precedes a call to +SplayTreeInsert. + +.impl.next: SplayTreeNext: Splay the tree around the supplied oldKey. During +iteration the "old node" found is probably already at the root, in which case +this will be a null operation with little cost. If this old node has no right +child, return NULL. Otherwise, split the tree into a right tree (which contains +just the right child of the old node) and a left tree (which contains the old +node, its left child and no right child). The next node is the first node in +the right tree. Find this by splaying the right tree around oldKey (which is +known to compare CompareLESS than any keys in the right tree). Rejoin the full +tree, using the right tree as the root and setting the left child of root to be +the left tree. Return the root of this tree. + +TESTING + +.test: There is no plan to test splay trees directly. It is believed that the +testing described in design.mps.cbs.test will be sufficient to test this +implementation. + + +ERROR HANDLING + +.error: This module detects and reports most common classes of protocol error. +The cases it doesn't handle will result in undefined behaviour and probably +cause an AVER to fire. 
These are: + +.error.bad-pointer: Passing an invalid pointer in place of a SplayTree or +SplayNode. + +.error.bad-compare: Initialising a SplayTree with a compare function that is +not a valid compare function, or which doesn't implement a total ordering on +splay nodes. + +.error.bad-describe: Passing an invalid describe method to SplayTreeDescribe. + +.error.out-of-stack: Stack exhaustion under SplayTreeDescribe. + + +FUTURE + +.future.tree: It would be possible to split the splay tree module into two: one +that implements binary trees; and one that implements splay trees on top of a +binary tree. + +.future.parent: The iterator could be made more efficient (in an amortized +sense) if it didn't splay at each node. To implement this (whilst meeting +.req.stack) we really need parent pointers from the nodes. We could use the +(first-child, right-sibling/parent) trick described in paper.st85 to implement +this, at a slight cost to all other tree operations, and an increase in code +complexity. paper.st85 doesn't describe how to distinguish the first-child +between left-child and right-child, and the right-sibling/parent between +right-sibling and parent. One could either use the comparator to make these +distinctions, or steal some bits from the pointers. + +.future.reverse: The assembly phase could be made more efficient if the link +left and link right operations were modified to add to the left and right trees +with pointers reversed. This would remove the need for the assembly phase to +reverse them. + diff --git a/mps/design/sso1al/index.txt b/mps/design/sso1al/index.txt new file mode 100644 index 00000000000..ff3d14246aa --- /dev/null +++ b/mps/design/sso1al/index.txt @@ -0,0 +1,90 @@ + STACK SCANNER FOR DIGITAL UNIX / ALPHA SYSTEMS + design.mps.sso1al + draft doc + drj 1997-03-27 + +INTRODUCTION + +.readership: Any MPS developer. + +.intro: This is the design for Stack Scanner module that runs on DIGITAL UNIX / +Alpha systems (See os.o1 and arch.al). The design adheres to the general +design and interface described (probably not described actually) in +design.mps.ss. + +.source.alpha: book.digital96 (Alpha Architecture Handbook) describes the Alpha +Architecture independently of any particular implementation. The instruction +mnemonics and the semantics for each instruction are specified in that document. +.source.as: + (Assembly Language Programmer's Guide) +describes the assembler syntax and assembler directives. It also summarises +the calling conventions used. Chapters 1 and 6 were especially useful, +especially chapter 6. +.source.convention: + (Calling Standard for Alpha Systems) +describes the calling convetions used for Digital Alpha systems. Chapter 2 was +useful. But the whole document was not used as much as the previous 2 +documents. + + +DEFINITIONS + +.def.saved: Saved Register. A saved register is one whose value is defined to +be preserved across a procedure call according to the Calling Standard. They +are $9-$15, $26, and $30. $30 is the stack pointer. +.def.non-saved: Non-Saved Register. A non-save register is a register that is +assumed to be modified across a procedure call according to the Calling +Standard. +.def.tos: Top of Stack. The top of stack is the youngest portion of the stack. +.def.bos: Bottom of Stack. The bottom of stack is the oldest portion of the +stack. +.def.base: Base. Of a range of addresses, the base is the lowest address in +the range. +.def.limit: Limit. Of a range of addresses, the limit is "one past" the +highest address in the range. 
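
The overview and detailed design below describe the scanner in terms of these
definitions. As an informal C-level sketch only (the register save cannot be
expressed in portable C, and saveSavedRegisters and SAVED_REGISTER_COUNT are
hypothetical names used here for illustration), the structure is roughly:

  #define SAVED_REGISTER_COUNT 8  /* $9-$15 and $26 (.def.saved, excluding $30) */

  Res StackScan(ScanState ss, Addr *StackBot)
  {
    Addr frame[SAVED_REGISTER_COUNT];  /* frame at the top of the stack */

    saveSavedRegisters(frame);         /* hypothetical: done in assembler */

    /* The stack grows down, so the frame (near the top of stack) is the */
    /* base of the region and StackBot (the bottom of stack) is the limit. */
    return TraceScanAreaTagged(ss, frame, StackBot);
  }
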
+ + +OVERVIEW + +.overview: The registers and the stack need to be scanned. This is achieved by +storing the contents of the registers into a frame at the top of the stack and +then passing the base and limit of the stack region, including the newly +created frame, to the function TraceScanAreaTagged. TraceScanAreaTagged +performs the actual scanning and fixing. + + +DETAIL DESIGN + +Functions + +.fun.stackscan: + +Res StackScan(ScanState ss, Addr *StackBot); + +.fun.stackscan.asm: The function is written in assembler. +.fun.stackscan.asm.justify: This is because the machine registers need to be +examined, and it is only possible to access the machine registers using +assembler. +.fun.stackscan.entry: On entry to this procedure all the non-saved (temporary) +registers that contain live pointers must have been saved in some root (usually +the stack) by the mutator (otherwise it would lose the values). Therefore only +the saved registers need to be stored by this procedure. +.fun.stackscan.assume.saved: We assume that all the saved registers are roots. +This is conservative since some of the saved registers might not be used. +.fun.stackscan.frame: A frame is be created on the top of the stack. +.fun.stackscan.frame.justify: This frame is used to store the saved registers +into so that they can be scanned. +.fun.stackscan.save: All the saved registers, apart from $30 the stack pointer, +are to be stored in the frame. .fun.stackscan.save.justify: This is so that +they can be scanned. The stack pointer itself is not scanned as the stack is +assumed to be a root (and therefore a priori alive). +.fun.stackscan.call: TraceScanAreaTagged is called with the current stack +pointer as the base and the (passed in) StackBot as the limit of the region to +be scanned. .fun.stackscan.call.justify: This function does the actual +scanning. The Stack on Alpha systems grows down so the stack pointer (which +points to the top of the stack) is lower in memory than the bottom of the stack. +.fun.stackscan.return: The return value from TraceScanAreaTagged is used as the +return value for StackScan. + diff --git a/mps/design/telemetry/index.txt b/mps/design/telemetry/index.txt new file mode 100644 index 00000000000..3ad929fde3a --- /dev/null +++ b/mps/design/telemetry/index.txt @@ -0,0 +1,352 @@ + THE DESIGN OF THE MPS TELEMETRY MECHANISM + design.mps.telemetry + incomplete design + richard 1997-07-07 + + +INTRODUCTION: + +This documents the design of the telemetry mechanism within the MPS. + +.readership: This document is intended for any MPS developer. + +.source: Various meetings and brainstorms, including +meeting.general.1997-03-04(0), mail.richard.1997-07-03.17-01(0), +mail.gavinm.1997-05-01.12-40(0). + + +Document History + +.hist.0: 1997-04-11 GavinM Rewritten + +.hist.1: 1997-07-07 GavinM Rewritten again after discussion in Pool Hall. + + +OVERVIEW: + +Telemetry permits the emission of events from the MPS. These can be used to +drive a graphical tool, or to debug, or whatever. The system is flexible and +robust, but doesn't require heavy support from the client. + + +REQUIREMENTS: + +.req.simple: It must be possible to generate code both for the MPS and any tool +without using complicated build tools. + +.req.open: We must not constrain the nature of events before we are certain of +what we want them to be. + +.req.multi: We must be able to send events to multiple streams. + +.req.share: It must be possible to share event descriptions between the MPS and +any tool. 
+ +.req.version: It must be possible to version the set of events so that any tool +can detect whether it can understand the MPS. + +.req.back: Tools should be able to understand older and newer version of the +MPS, so far as is appropriate. + +.req.type: It must be possible to transmit a rich variety of types to the tool, +including doubles, and strings. + +.req.port: It must be possible to transmit and receive events between different +platforms. + +.req.control: It must be possible to control whether and what events are +transmitted at least at a coarse level. + +.req.examine: There should be a cheap means to examine the contents of logs. + +.req.pm: The event mechanism should provide for post mortem to detect what +significant events led up to death. + +.req.perf: Events should not have a significant effect on performance when +unwanted. + +.req.small: Telemetry streams should be small. + +.req.avail: Events should be available in all varieties, subject to performance +requirements. + +.req.impl: The plinth support for telemetry should be easy to write and +flexible. + +.req.robust: The telemetry protocol should be robust against some forms of +corruption, e.g. packet loss. + +.req.intern: It should be possible to support string-interning. + + +ARCHITECTURE: + +.arch: Event annotations are scattered throughout the code, but there is a +central registration of event types and properties. Events are written to a +buffer via a specialist structure, and are optionally written to the plinth. +Events can take any number of parameters of a range of types, indicated as a +format both in the annotation and the the registry. + + +ANALYSIS: + +.anal: The proposed order of development, with summary of requirements impact +is as follows: + + v c e + s e o x r i + i m s r n a s a o n + m o u h s t p t m p m v i b t b + p p l a i y o r i e a a m u e a + l e t r o p r o n p r l i p s r c + e n i e n e t l e m f l l l t n k + +.sol.format 0 0 0 0 0 + 0 0 0 0 0 0 0 0 0 0 0 Merged. +.sol.struct 0 0 0 0 0 + 0 0 0 0 + - 0 0 0 0 0 Merged. +.sol.string 0 0 0 0 0 + 0 0 0 0 0 0 0 0 0 + 0 Merged. +.sol.relation + 0 0 + 0 0 0 0 + 0 0 + 0 0 0 0 0 Merged. +.sol.dumper 0 0 0 0 0 0 0 0 + 0 0 0 0 0 0 0 0 Merged. +.sol.kind 0 - 0 0 0 0 0 + 0 + 0 0 0 0 0 0 0 Merged. +.sol.control 0 0 0 0 0 0 0 + 0 0 + 0 0 0 0 0 0 Merged. + +.sol.variety 0 0 0 0 0 0 0 0 0 + + 0 + 0 0 0 0 + +[ Not yet ordered. ] + +.sol.buffer 0 0 0 0 0 0 0 + 0 + + 0 0 0 0 0 0 +.sol.traceback 0 0 0 0 0 0 0 0 0 + 0 0 0 0 0 0 0 +.sol.client 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 + 0 +.sol.head 0 0 0 0 0 0 + 0 0 0 0 0 0 0 0 0 0 +.sol.version 0 0 0 0 + 0 0 0 0 0 0 0 0 0 0 0 + +.sol.exit 0 0 0 0 0 0 0 0 0 + 0 0 0 0 0 0 0 +.sol.block 0 0 0 0 0 0 0 0 0 0 + - 0 0 + 0 0 +.sol.code 0 0 0 0 0 0 0 0 0 0 0 + 0 0 0 0 + +.sol.msg 0 0 + 0 0 0 + 0 0 0 0 0 0 + + 0 0 + +.file-format: One of the objectives of this plan is to minimise the impact of +the changes to the log file format. This is to be achieved firstly by +completing all necessary support before changes are initiated, and secondly by +performing all changes at the same time. + + +IDEAS: + +.sol.format: Event annotations indicate the types of their arguments, e.g. +EVENT_WD for a Word, and a double. (.req.type) + +.sol.struct: Copy event data into a structure of the appropriate type, e.g. +EventWDStruct. 
(.req.type, .req.perf, but not .req.small because of padding) + +.sol.string: Permit at most one string per event, at the end, and use the char +[1] hack, and specialised code; deduce the string length from the event length +and also NUL-terminate (.req.type, .req.intern) + +.sol.buffer: Enter all events initially into internal buffers, and +conditionally send them to the message stream. (.req.pm, .req.control, +.req.perf) + +.sol.variety: In optimized varieties, have internal events (see .sol.buffer) +for a subset of events and no external events; in normal varieties have all +internal events, and the potential for external events. (.req.avail, .req.pm, +.req.perf) + +.sol.kind: Divide events by some coarse type into around 6 groups, probably +related to frequency. (.req.control, .req.pm, but not .req.open) + +.sol.control: Hold flags to determine which events are emitted externally. +(.req.control, .req.perf) + +.sol.dumper: Write a simple tool to dump event logs as text. (.req.examine) + +.sol.msg: Redesign the plinth interface to send and receive messages, based on +any underlying IPC mechanism, e.g. append to file, TCP/IP, messages, shared +memory. (.req.robust, .req.impl, .req.port, .req.multi) + +.sol.block: Buffer the events and send them as fixed size blocks, commencing +with a timestamp, and ending with padding. (.req.robust, .req.perf, but not +.req.small) + +.sol.code: Commence each event with two bytes of event code, and two bytes of +length. (.req.small, .req.back) + +.sol.head: Commence each event stream with a platform-independent header block +giving information about the session, version (see .sol.version), and file +format; file format will be sufficient to decode the (platform-dependent) rest +of the file. (.req.port) + +.sol.exit: Provide a mechanism to flush events in the event of graceful sudden +death. (.req.pm) + +.sol.version: Maintain a three part version number for the file comprising +major (incremented when the format of the entire file changes (other than +platform differences)), median (incremented when an existing event changes its +form or semantics), and minor (incremented when a new event type is added); +tools should normally fail when the median or major is unsupported. +(.req.version, .req.back) + +.sol.relation: Event types will be defined in terms of a relation specifying +their name, code, optimised behaviour (see .sol.variety), kind (see .sol.kind), +and format (see .sol.format); both the MPS and tool can use this by suitable +#define hacks. (.req.simple. .req.share, .req.examine, .req.small (no format +information in messages)) + +.sol.traceback: Provide a mechanism to output recent events (see .sol.buffer) +as a form of backtrace when AVERs fire or from a debugger, or whatever. +(.req.pm) + +.sol.client: Provide a mechanism for user events. (.req.intern) + + + +IMPLEMENTATION: + +Annotation + +.annot: An event annotation is of the form: + EVENT_PAW(FooCreate, pointer, address, word); + +.annot.format: Note that the format is indicated in the macro name. See +.format. + +.annot.string: If there is a string in the format, it must be the last +parameter (and hence there can be only one). There is currrently a maximum +string length, defined by EventMaxStringLength in impl.h.eventcom. + +.annot.type: The event type should be given as the first parameter to the event +macro, as registered in impl.h.eventdef. 
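Purely as an illustration of the annotation and format machinery described above (the real support is generated into impl.h.eventgen, and event codes come from the registration in impl.h.eventdef), an annotation with format PAW might expand along the following lines. The structure layout, the EventWrite hook, and the Event##type code names are all assumptions of this sketch, not the actual implementation.

  /* Hypothetical sketch only; the real macros are generated into
   * impl.h.eventgen.  Assumes the MPM types Addr, Word, Size are in scope;
   * EventWrite and the EventFooCreate code name are invented here. */

  typedef struct EventPAWStruct {   /* one structure per format (.sol.struct) */
    unsigned short code;            /* event code from the registry (.reg.code) */
    unsigned short length;          /* length in bytes (.sol.code) */
    void *p0;                       /* P: pointer */
    Addr a1;                        /* A: address */
    Word w2;                        /* W: word */
  } EventPAWStruct;

  extern void EventWrite(void *event, Size length);  /* assumed buffer hook (.sol.buffer) */

  #define EVENT_PAW(type, p, a, w) \
    do { \
      EventPAWStruct _event; \
      _event.code = Event ## type;            /* e.g. EventFooCreate (assumed name) */ \
      _event.length = sizeof(EventPAWStruct); \
      _event.p0 = (void *)(p); \
      _event.a1 = (Addr)(a); \
      _event.w2 = (Word)(w); \
      EventWrite(&_event, sizeof(_event));    /* copy into the internal buffer */ \
    } while (0)
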
+ +.annot.param: The parameters of the event should be given as the remaining +parameters of the event macro, in order as indicated in the format. + + +Registration + +.reg: All event types should be registered in impl.h.eventdef, in the form of a +relation. + +.reg.just: This use of a relation macro enables great flexibility in the use of +this file. + +.reg.rel: The relation is of the form: + RELATION(FooCreate, 0x1234, TRUE, Arena, PAW) + +.reg.type: The first parameter of the relation is the event type. This needs +no prefix, and should correspond to that used in the annotation. + +.reg.code: The second parameter is the event code, a 16-bit value used to +represent this event type. [Not yet used. GavinM 1997-07-18] + +.reg.code.temp: On an interim basis, new events also have to be registered in +impl.h.eventcom. This will no longer be required when the event file format is +revised. + +.reg.always: The third parameter is a boolean value indicating whether this +event type should be implemented in all varieties. See .control.buffer. +Unless your event is on the critical path (typically per reference or per +object), you will want this to be TRUE. + +.reg.kind: The fourth parameter is a kind keyword indicating what category this +event falls into. See .control. The possible values are: + Arena -- per space or arena or global + Pool -- pool-related + Trace -- per trace or scan + Seg -- per segment + Ref -- per reference or fix + Object -- per object or allocation +This list can be seen in impl.h.event. + +.reg.format: The fifth parameter is the format (see .format) and should +correspond to the annotation (see .annot.format). + +.reg.dup: It is permissible for the one event type to be used for more than one +annotation. There are generally two reasons for this: + - Variable control flow for successful function completion; + - Platform/Otherwise-dependent implementations of a function. +Note that all annotations for one event type must have the same format (as +implied by .reg.format). + + +Format + +.format: Where a format is used to indicate the type, it is a sequence of +letters from the following list: + P -- void * + A -- Addr + W -- Word + U -- unsigned int + S -- char * + D -- double +The corresponding event parameters must be assignment compatible with these +types. + +.format.zero: If there are no parameters for an event, then the special format +"0" should be used. + +.format.types: When an event has parameters whose type is not in the above +list, use the following guidelines: All C pointer types not representing +strings use P; Size and Count use W; *Set use U; others should be obvious. + +.format.support: Every format used needs bespoke support in impl.h.eventgen. +It has not been possible to provide support for all possible formats, so such +support is added when required. .format.support.auto: There is a tool in +impl.pl.eventgen that will generate impl.h.eventgen automatically. It is used +as follows: + 1. Claim the file eventgen.h. + 2. Invoke eventgen.pl. + 3. Check it compiles correctly in all varieties. + 4. Check in eventgen.h. + + +Control + +.control: There are two types of event control, buffer and output. + +.control.buffer: Buffer control affects whether particular events implemented +at all, and is controlled statically by variety using the always value (see +.reg.always) for the event type. This is only relevant to release varieties. +[Not yet implemented. 
GavinM 1997-07-18] + +.control.output: Output control affects whether events written to the internal +buffer are output via the plinth. This is set on a per-kind basis (see +.reg.kind), using a control bit table stored in EventKindControl. By default, +all event kinds are on (in variety.ti). You may switch some kinds off using a +debugger. + +For example, to disable Ref events using gdb (see impl.h.event for numeric +codes): + + break ArenaCreate + run + delete 1 + call BTRes(EventKindControl, 4) + continue + +.control.just: These controls are coarse, but very cheap. + +.control.external: There will be an MPS interface function to control +EventKindControl. + +.control.tool: The tools will be able to control EventKindControl. + + +Dumper Tool + +.dumper: A primitive dumper tool is available in impl.c.eventcnv. For details, +see guide.mps.telemetry. + + +Allocation Replayer Tool + +.replayer: A tool for replaying an allocation sequence from a log is available +in impl.c.replay. For details, see design.mps.telemetry.replayer. + + + +TEXT: + +.notes: + - Set always to FALSE for all Ref and Object events; + - Fix use of BT for size in bytes, guess then check, BTInit; + - Resolve protection transgression in impl.h.eventdef; + - Make EventKind* enum members so they can be used from the debugger. + diff --git a/mps/design/trace/index.txt b/mps/design/trace/index.txt new file mode 100644 index 00000000000..4f35fcac5ec --- /dev/null +++ b/mps/design/trace/index.txt @@ -0,0 +1,95 @@ + TRACER + design.mps.trace + incomplete design + drj 1996-09-25 + + +ARCHITECTURE: + +.instance.limit: There will be a limit on the number of traces that can be +created at any one time. This effectively limits the number of concurrent +traces. This limitation is expressed in the symbol TRACE_MAX [currently set to +1, see request.mps.160020 "Multiple traces would not work" drj 1998-06-15]. + +.rate: [see mail.nickb.1997-07-31.14-37]. [Now revised? See +request.epcore.160062 and change.epcore.minnow.160062. drj 1998-06-15] + +.exact.legal: Exact references should either point outside the arena (to +non-managed address space) or to a tract allocated to a pool. Exact references +that are to addresses which the arena has reserved but hasn't allocated memory +to are illegal (the exact reference couldn't possibly refer to a real object). +Depending on the future semantics of PoolDestroy we might need to adjust our +strategy here. See mail.dsm.1996-02-14.18-18 for a strategy of coping +gracefully with PoolDestroy. We check that this is the case in the fixer. It +may be sensible to make this check CRITICAL in certain configurations. + +.fix.fixed.all: ss->fixedSummary is accumulated (in the fixer) for all the +pointers whether or not they are genuine references. We could accumulate fewer +pointers here; if a pointer fails the TractOfAddr test then we know it isn't a +reference, so we needn't accumulate it into the fixed summary. The design +allows this, but it breaks a useful post-condition on scanning (if the +accumulation of ss->fixedSummary was moved the accuracy of ss->fixedSummary +would vary according to the "width" of the white summary). See +mail.pekka.1998-02-04.16-48 for improvement suggestions. + + +ANALYSIS: + +.fix.copy-fail: Fixing can always succeed, even if copying the referenced +object has failed (due to lack of memory, for example), by backing off to +treating a reference as ambiguous. Assuming that fixing an ambiguous reference +doesn't allocate memory (which is no longer true for AMC for example). 
See +request.dylan.170560 for a slightly more sophisticated way to proceed when you +can no longer allocate memory for copying. + + +IDEAS: + +.flip.after: To avoid excessive barrier impact on the mutator immediately after +flip, we could scan during flip other objects which are "near" the roots, or +otherwise known to be likely to be accessed in the near future. + + +IMPLEMENTATION: + +Speed + +.fix: The fix path is critical to garbage collection speed. Abstractly fix is +applied to all the references in the non-white heap and all the references in +the copied heap. Remembered sets cut down the number of segments we have to +scan. The zone test cuts down the number of references we call fix on. The +speed of the remainder of the fix path is still critical to system +performance. Various modifications to and aspects of the system are concerned +with maintaining the speed along this path. + +.fix.tractofaddr: TractOfAddr is called on every reference that passes the zone +test and is on the critical path, to determine whether the segment is white. +There is no need to examine the segment to perform this test, since whiteness +information is duplicated in tracts, specifically to optimize this test. +TractOfAddr itself is a simple class dispatch function (which dispatches to the +arena class's TractOfAddr method). Inlining the dispatch and inlining the +functions called by VMTractOfAddr makes a small but noticable difference to the +speed of the dylan compiler. + +.fix.noaver: AVERs in the code add bulk to the code (reducing I-cache efficacy) +and add branches to the path (polluting the branch pedictors) resulting in a +slow down. Removing all the AVERs from the fix path improves the overall speed +of the dylan compiler by as much as 9%. + +.fix.nocopy: AMCFix used to copy objects by using the format's copy method. +This involved a function call (through an indirection) and in dylan_copy a call +to dylan_skip (to recompute the length) and call to memcpy with general +parameters. Replacing this with a direct call to memcpy removes these +overheads and the call to memcpy now has aligned parameters. The call to +memcpy is inlined by the (C) compiler. This change results in a 4-5% speed-up +in the dylan compiler. + +.reclaim: Because the reclaim phase of the trace (implemented by TraceReclaim) +examines every segment it is fairly time intensive. rit's profiles presented +in request.dylan.170551 show a gap between the two varieties variety.hi and +variety.wi. + +.reclaim.noaver: Converting AVERs in the loops of TraceReclaim, PoolReclaim, +AMCReclaim (LOReclaim? AWLReclaim) will result in a noticeable speed +improvement [insert actual speed improvement here]. + diff --git a/mps/design/type/index.txt b/mps/design/type/index.txt new file mode 100644 index 00000000000..df14067c13a --- /dev/null +++ b/mps/design/type/index.txt @@ -0,0 +1,397 @@ + THE DESIGN OF THE GENERAL MPS TYPES + design.mps.type + incomplete doc + richard 1996-10-23 + +INTRODUCTION + +.intro: + +See impl.h.mpmtypes. + + +RATIONALE + +Some types are declared to resolve a point of design, such as the best type to +use for array indexing. + +Some types are declared so that the intention of code is clearer. For example, +Byte is necessarily unsigned char, but it's better to say Byte in your code if +it's what you mean. + + +CONCRETE TYPES + + +Bool + +.bool: The Bool type is mostly defined so that the intention of code is +clearer. In C, boolean expressions evaluate to int, so Bool is in fact an +alias for int. 
+ +.bool.value: Bool has two values, TRUE and FALSE. These are defined to be 1 and +0 respectively, for compatibility with C boolean expressions (so one may set a +Bool to the result of a C boolean expression). + +.bool.use: Bool is a type which should be used when a boolean value is +intended, for example, as the result of a function. Using a boolean type in C +is a tricky thing. Non-zero values are "true" (when used as control +conditions) but are not all equal to TRUE. Use with care. + +.bool.check: BoolCheck simply checks whether the argument is TRUE (1) or FALSE +(0). + +.bool.check.inline: The inline macro version of BoolCheck casts the int to +unsigned and checks that it is <= 1. This is safe, well-defined, uses the +argument exactly once, and generates reasonable code. +.bool.check.inline.smaller: In fact we can expect that the "inline" version of +BoolCheck to be smaller than the equivalent function call (on intel for +example, a function call will be 3 instructions (total 9 bytes), the inline +code for BoolCheck will be 1 instruction (total 3 bytes) (both sequences not +including the test which is the same length in either case)). +.bool.check.inline.why: As well as being smaller (see +.bool.check.inline.smaller) it is faster. On 1998-11-16 drj compared +w3i3mv\hi\amcss.exe running with and without the macro for BoolCheck on the PC +Aaron. "With" ran in 97.7% of the time (averaged over 3 runs). + + +Res + +.res: Res is the type of result codes. A result code indicates the success or +failure of an operation, along with the reason for failure. Like Unix error +codes, the meaning of the code depends on the call that returned it. These +codes are just broad categories with mnemonic names for various sorts of +problems. + +ResOK: The operation succeeded. Return parameters may only be updated if OK is +returned, otherwise they must be left untouched. +ResFAIL: Something went wrong which doesn't fall into any of the other +categories. The exact meaning depends on the call. See documentation. +ResRESOURCE: A needed resource could not be obtained. Which resource depends +on the call. See also MEMORY, which is a special case of this. +ResMEMORY: Needed memory (committed memory, not address space) could not be +obtained. +ResLIMIT: An internal limitation was reached. For example, the maximum number +of somethings was reached. We should avoid returning this by not including +static limitations in our code, as far as possible. (See rule.impl.constrain +and rule.impl.limits.) +ResUNIMPL: The operation, or some vital part of it, is unimplemented. This +might be returned by functions which are no longer supported, or by operations +which are included for future expansion, but not yet supported. +ResIO: An I/O error occurred. Exactly what depends on the function. +ResCOMMIT_LIMIT: The arena's commit limit would have been exceeded as a reult +of allocation. +ResPARAM: An invalid parameter was passed. Normally reserved for parameters +passed from the client. + +.res.use: Res should be returned from any function which might fail. Any other +results of the function should be passed back in "return" parameters (pointers +to locations to fill in with the results). [This is documented elsewhere, I +think -- richard].res.use.spec: The most specific code should be returned. + + +Fun + +.fun: Fun is the type of a pointer to a function about which nothing more is +known. + +.fun.use: Fun should be used where it's necessary to handle a function without +calling it in a polymorphic way. 
For example, if you need to write a function g which passes another function f
through to a third function h, where h knows the real type of f but g doesn't.


Word

.word: Word is an unsigned integral type which matches the size of the machine
word, i.e. the natural size of the machine registers and addresses.

.word.use: It should be used where an unsigned integer is required that might
range as large as the machine word.

.word.source: Word is derived from the macro MPS_T_WORD which is declared in
impl.h.mpstd according to the target platform.

.word.conv.c: Word is converted to mps_word_t in the MPS C Interface.


Byte

.byte: Byte is an unsigned integral type corresponding to the unit in which
most sizes are measured, and also the units of sizeof().

.byte.use: Byte should be used in preference to char or unsigned char wherever
it is necessary to deal with bytes directly.

.byte.source: Byte is just a pedagogic version of unsigned char, since char is
the unit of sizeof().


Index

.index: Index is an unsigned integral type which is large enough to hold any
array index.

.index.use: Index should be used where the maximum size of the array cannot be
statically determined. If the maximum size can be determined then the smallest
unsigned integer with a large enough range may be used instead.


Count

.count: Count is an unsigned integral type which is large enough to hold the
size of any collection of objects in the MPS.

.count.use: Count should be used for a number of objects (control or managed)
where the maximum number of objects cannot be statically determined. If the
maximum number can be statically determined then the smallest unsigned integer
with a large enough range may be used instead (although Count may be preferable
for clarity). [ Should Count be used to count things that aren't represented
by objects (e.g. a level)? I would say yes. gavinm 1998-07-21 ] [Only where
it can be determined that the maximum count is less than the number of
objects. pekka 1998-07-21]


Accumulation

.accumulation: Accumulation is an arithmetic type which is large enough to hold
accumulated totals of objects or bytes (e.g. total number of objects allocated,
total number of bytes allocated).

.accumulation.type: Currently it is double, but the reason for the interface is
so that we can more easily change it if we want to (if we decide we need more
accuracy for example).

.accumulation.use: Currently the only way to use an Accumulation is to reset it
(AccumulatorReset) and accumulate (Accumulate) amounts into it. There is no
way to read it at the moment, but that's okay, because no one seems to want to.
.accumulation.future: Probably we should have methods which return the
accumulation into an unsigned long, and also a double; these functions should
return bools to indicate whether the accumulation can fit in the requested
type. Possibly we could have functions which returned scaled accumulations
(e.g. AccumulatorScale(a, d) would divide the Accumulation a by double d and
return the double result if the result fitted into a double).


Addr

.addr: Addr is the type used for "managed addresses", that is, addresses of
objects managed by the MPS.

.addr.def: Addr is defined as struct AddrStruct *, but AddrStruct is never
defined. This means that Addr is always an incomplete type, which prevents
accidental dereferencing, arithmetic, or assignment to other pointer types.
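To make the incomplete-type trick concrete, here is a minimal sketch; the real
declaration lives in impl.h.mpmtypes, and the AddrAdd/AddrOffset signatures
shown in the comment are assumptions (those operations are described under
.addr.ops below).

  /* Sketch only -- see impl.h.mpmtypes for the real declaration. */

  typedef struct AddrStruct *Addr;  /* AddrStruct is deliberately left undefined */

  /* Because Addr points to an incomplete type, the compiler rejects
   * accidental misuse of managed addresses:
   *
   *   Addr a;
   *   *a;            -- error: dereferencing pointer to incomplete type
   *   a++;           -- error: arithmetic on pointer to incomplete type
   *   char *p = a;   -- error/warning: incompatible pointer types
   *
   * Arithmetic is instead confined to AddrAdd and AddrOffset (see .addr.ops
   * below); their assumed shapes are:
   *
   *   Addr AddrAdd(Addr base, Size size);
   *   Size AddrOffset(Addr base, Addr limit);
   */
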
+ +.addr.use: Addr should be used whenever the code needs to deal with addresses. +It should not be used for the addresses of memory manager data structures +themselves, so that the memory manager remains amenable to working in a +separate address space. Be careful not to confuse Addr with void *. + +.addr.ops: Limited arithmetic is allowed on addresses using AddrAdd and +AddrOffset (impl.c.mpm). Addresses may also be compared using the relational +operators ==, !=, <, <=, >, and >=. .addr.ops.mem: We need efficient operators +similar to memset, memcpy, and memcmp on Addr; these are called AddrSet, +AddrCopy, and AddrComp. When Addr is compatible with void *, these are +implemented through the mps_lib_mem* functions in the plinth (impl.h.mpm) [and +in fact, no other implementation exists at present, pekka 1998-09-07]. + +.addr.conv.c: Addr is converted to mps_addr_t in the MPS C Interface. +mps_addr_t is defined to be the same as void *, so using the MPS C Interface +confines the memory manager to the same address space as the client data. + + +Size + +.size: Size is an unsigned integral type large enough to hold the size of any +object which the MPS might manage. + +.size.byte: Size should hold a size calculated in bytes. Warning: This may not +be true for all existing code. + +.size.use: Size should be used whenever the code needs to deal with the size of +managed memory or client objects. Is should not be used for the sizes of the +memory manager's own data structures, so that the memory manager is amenable to +working in a separate address space. Be careful not to confuse it with size_t. + +.size.ops: [Size operations?] + +.size.conv.c: Size is converted to size_t in the MPS C Interface. This +constrains the memory manager to the same address space as the client data. + + +Align + +.align: Align is an unsigned integral type which is used to represent the +alignment of managed addresses. All alignments are positive powers of two. +Align is large enough to hold the maximum possible alignment. + +.align.use: Align should be used whenever the code needs to deal with the +alignment of a managed address. + +.align.conv.c: Align is converted to mps_align_t in the MPS C Interface. + + +Shift + +.shift: Shift is an unsigned integral type which can hold the amount by which a +Word can be shifted. It is therefore large enough to hold the word width (in +bits). + +.shift.use: Shift should be used whenever a shift value (the right-hand operand +of the << or >> operators) is intended, to make the code clear. It should also +be used for structure fields which have this use. + +.shift.conv.c: Shift is converted to mps_shift_t in the MPS C Interface. + + +Ref + +.ref: Ref is a reference to a managed object (as opposed to any old managed +address). Ref should be used where a reference is intended. + +[This isn't too clear -- richard] + + +RefSet + +.refset: RefSet is a conservative approximation to a set of references. See +design.mps.refset. + + +Rank + +.rank: Rank is an enumeration which represents the rank of a reference. The +ranks are: + +RankAMBIG (0): the reference is ambiguous, i.e. must be assumed to be a +reference, and not update in case it isn't; +RankEXACT (1): the reference is exact, and refers to an object; +RankFINAL (2): the reference is exact and final, so special action is required +if only final or weak references remain to the object; +RankWEAK (3): the reference is exact and weak, so should be deleted if only +weak references remain to the object. 
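A sketch of the corresponding C declaration may be useful; the real one is in
impl.h.mpmtypes and may differ in detail (for instance in whether a limit
member is declared), so take this as illustrative only.

  /* Illustrative sketch; see impl.h.mpmtypes for the real declaration. */
  enum {
    RankAMBIG = 0,  /* ambiguous: assumed to be a reference, never updated */
    RankEXACT = 1,  /* exact: refers to an object */
    RankFINAL = 2,  /* exact and final: special action if only final/weak refs remain */
    RankWEAK  = 3   /* exact and weak: deleted if only weak refs remain */
  };
  typedef unsigned int Rank;  /* assumption: an unsigned integral type for the values above */
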
Rank is stored with segments and roots, and passed around.

Rank is converted to mps_rank_t in the MPS C Interface.

The ordering of the ranks is important. It is the order in which the
references must be scanned in order to respect the properties of references of
the ranks. Therefore they are declared explicitly with their integer values.

[Could Rank be a short?]

[This documentation should be expanded and moved to its own document, then
referenced from the implementation more thoroughly.]


Epoch

.epoch: An Epoch is a count of the number of flips of the mutator that have
occurred. [Is it more general than that?] It is used in the implementation of
location dependencies (LDs).

Epoch is converted to mps_word_t in the MPS C Interface, as a field of mps_ld_s.


TraceId

.traceid: A TraceId is an unsigned integer which is less than TRACE_MAX. Each
running trace has a different TraceId which is used to index into tables and
bitfields used to remember the state of that trace.


TraceSet

.traceset: A TraceSet is a bitset of TraceIds, represented in the obvious way,
i.e.

  member(ti, ts) <=> (2^ti & ts) != 0

TraceSets are used to represent colour in the Tracer. [Expand on this.]


AccessSet

.access-set: An AccessSet is a bitset of Access modes, which are AccessREAD and
AccessWRITE. AccessNONE is the empty AccessSet.


Attr

.attr: Pool attributes. A bitset of pool or pool class attributes, which are:

  AttrFMT: the pool contains formatted objects;

  AttrSCAN: the pool contains references and must be scanned for GC;

  AttrPM_NO_READ: the pool may not be read protected;

  AttrPM_NO_WRITE: the pool may not be write protected;

  AttrALLOC: the pool supports the PoolAlloc interface;

  AttrFREE: the pool supports the PoolFree interface;

  AttrBUF: the pool supports the allocation buffer interface;

  AttrBUF_RESERVE: the pool supports the reserve/commit protocol on allocation
buffers;

  AttrBUF_ALLOC: the pool supports the alloc protocol on allocation buffers;

  AttrGC: the pool is garbage collecting, i.e. parts may be reclaimed;

  AttrINCR_RB: the pool is incremental requiring a read barrier;

  AttrINCR_WB: the pool is incremental requiring a write barrier.

There is an attribute field in the pool class (PoolClassStruct) which declares
the attributes of that class. These attributes are only used for consistency
checking at the moment. [no longer true that they are only used for consistency
checking -- drj 1998-05-07]


RootVar

.rootvar: The type RootVar is the type of the discriminator for the union
within RootStruct.


Serial

.serial: A Serial is a number which is assigned to a structure when it is
initialized. The serial number is taken from a field in the parent structure,
which is incremented. Thus, every instance of a structure has a unique "name"
which is a path of structures from the global root. For example:

  space[3].pool[5].buffer[2]

Why? Consistency checking, debugging, and logging. Not well thought out.


Compare

.compare: Compare is the type of tri-state comparison values.

CompareLESS: Indicates that a value compares less than another value.
CompareEQUAL: Indicates that two values compare the same.
CompareGREATER: Indicates that a value compares greater than another value.


ABSTRACT TYPES

.adts: The following types are abstract data types, implemented as pointers to
structures. For example, Ring is a pointer to a RingStruct. They are
described elsewhere [where?].
+ + Ring, Buffer, AP, Format, LD, Lock, Pool, Space, PoolClass, Trace, + ScanState, Seg, Arena, VM, Root, Thread. + + +POINTERS + +.pointer: The type Pointer is the same as "void *", and exists to sanctify +functions such as PointerAdd. + + diff --git a/mps/design/version-library/index.txt b/mps/design/version-library/index.txt new file mode 100644 index 00000000000..fbccf7cfd38 --- /dev/null +++ b/mps/design/version-library/index.txt @@ -0,0 +1,90 @@ + DESIGN OF THE MPS LIBRARY VERSION MECHANISM + design.mps.version-library + incomplete doc + drj 1998-08-19 + +INTRODUCTION + +.intro: This describes the design of a mechanism to be used to determine the +version (that is, product, version, and release) of an MPS library. + + +READERSHIP + +.readership: Any MPS developer. + + +SOURCE + +.source: Various requirements demand such a mechanism. See +request.epcore.160021: There is no way to tell which version and release of the +MM one is using. + + +OVERVIEW + +.overview: See design.mps.version for discussion and design of versions of +other aspects of the software. This document concentrates on a design for +determining which version of the library one has linked with. There are two +aspects to the design, allowing humans to determine the version of an MPS +library, and allowing programs to determine the version of an MPS library. +Only the former is currently designed (a method for humans to determine which +version of an MPS library is being used). + +.overview.impl: The overall design is to have a distinctive string compiled +into the library binary. Various programs and tools will be able to extract +the string and display it. The string will identify the version of the MPS +begin used. + + +ARCHITECTURE + +.arch.structure: The design consists of 3 components: + +.arch.string: A string embedded into any delivered library binaries (which will +encode the necessary information). + +.arch.proc: A process by which the string is modified appropriately whenever +releases are made. + +.arch.tool: A tool and its documentation (it is expected that standard tools +can be used). The tool will be used to extract the version string from a +delivered library or an executable linked with the library. + +.arch.not-here: Only the string component (:arch.string) is directly described +here. The other components are described elsewhere. (where?) + +The string will contain information to identify the following items: +.arch.string.platform: the platform being used. +.arch.string.product: the name of the product. +.arch.string.variety: the variety of the product. +.arch.string.version: the version and release of the product. + + +IMPLEMENTATION + +.impl.file: The string itself is a declared C object in the file version.c +(impl.c.version). It consists of a concatenation of various strings which are +defined in other modules. + +.impl.variety: The string containing the name of the variety is the expansion +of the macro MPS_VARIETY_STRING defined by config.h (impl.h.config). + +.impl.product: The string containing the name of the product is the expansion +of the macro MPS_PROD_STRING defined by config.h (impl.h.config). + +.impl.platform: The string containing the name of the platform is the expansion +of the macro MPS_PF_STRING defined by mpstd.h (impl.h.mpstd). + +.impl.date: The string contains the date and time of compilation by using the +__DATE__ and __TIME__ macros defined by ISO C (ISO C clause 6.8.8). + +.impl.version: The string contains the version and release of the product. 
This is done by the expansion of the macro MPS_RELEASE which is defined in this
module (version.c).

.impl.usage: To make a release, the MPS_RELEASE macro (see
impl.c.version.release) is edited to contain the release name (e.g.,
"release.epcore.brisling"), and then changed back immediately after the release
checkpoint is made.

diff --git a/mps/design/version/index.txt b/mps/design/version/index.txt
new file mode 100644
index 00000000000..19303cd4345
--- /dev/null
+++ b/mps/design/version/index.txt
@@ -0,0 +1,23 @@

 DESIGN OF MPS SOFTWARE VERSIONS
 design.mps.version
 incomplete doc
 drj 1998-08-19

INTRODUCTION

.intro: This is the design of the support in the MPS for describing and
inspecting versions.

OVERVIEW

.overview: There are different sorts of version under consideration: Versions
of the (MPS) library used (linked with), versions of the interface used (header
files in C) when compiling the client's program, versions of the documentation
used when the client was writing the program. There are issues of programmatic
and human access to these versions.

.overview.split: The design is split accordingly. See
design.mps.version-library for the design of a system for determining the
version of the library one is using. And other non-existent documents for the
others.

diff --git a/mps/design/vm/index.txt b/mps/design/vm/index.txt
new file mode 100644
index 00000000000..c66d2f26fe8
--- /dev/null
+++ b/mps/design/vm/index.txt
@@ -0,0 +1,95 @@

 THE DESIGN OF THE VIRTUAL MAPPING INTERFACE
 design.mps.vm
 incomplete design
 richard 1998-05-11

.intro: This is the design of the VM interface. The VM interface provides a
simple, low-level, operating-system independent interface to address-space.
Each call to VMCreate() reserves (from the operating-system) a single
contiguous range of addresses, and returns a VMStruct thereafter used to manage
this address-space. The VM interface has separate implementations for each
platform that supports it (at least conceptually, in practice some of them may
be the same). The VM module provides a mechanism to reserve large (relative to
the amount of RAM) amounts of address space, and functions to map (back with
RAM) and unmap portions of this address space.

.motivation: The VM is used by the VM Arena Class. It provides the basic
substrate to provide sparse address maps. Sparse address maps have at least
two uses: to encode information into the address of an object which is used in
tracing (the Zone Test) to speed things up; to avoid fragmentation at the
segment level and above (since the amount of address space reserved is large
compared to the RAM, the hope is that there will always be enough address space
somewhere to fit any particular segment in).


DEFINITIONS

.def.reserve: The "reserve" operation: Exclusively reserve a portion of the
virtual address space without arranging RAM or backing store for the virtual
addresses. The intention is that no other component in the process will make
use of the reserved virtual addresses, but in practice this may entail assuming
a certain amount of cooperation. When reserving address space, the requester
simply asks for a particular size, not a particular range of virtual
addresses. Accessing (read/write/execute) reserved addresses is illegal unless
those addresses have been mapped.

.def.map: The "map" operation: Arrange that a specified portion of the virtual
address space is mapped from the swap, effectively allocating RAM and/or swap
space for a particular range of addresses. If successful, accessing the
addresses is now legal. Only reserved addresses should be mapped.

.def.unmap: The "unmap" operation: The inverse of the map operation. Arrange
that a specified portion of the virtual address space is no longer mapped,
effectively freeing up the RAM and swap space that was in use. Accessing the
addresses is now illegal. The addresses return to the reserved state.

.def.vm: "VM" stands for Virtual Memory. Various meanings: A processor
architecture's virtual space and structure; The generic idea / interface /
implementation of the MPS VM module; The C structure (struct VMStruct) used to
encapsulate the functionality of the MPS VM module; An instance of such a
structure.

.def.vm.mps: In the MPS, a "VM" is a VMStruct, providing access to the single
contiguous range of address-space that was reserved (from the operating-system)
when VMCreate was called.


INTERFACE

.if.create: Res VMCreate(VM *VMReturn, Size size)

VMCreate is responsible both for allocating a VMStruct and for reserving an
amount of virtual address space. A VM is created and a pointer to it is
returned in the return parameter VMReturn. This VM has at least size bytes of
virtual memory reserved. If there's not enough space to allocate the VM,
ResMEMORY is returned. If there's not enough address space to reserve a block
of the given size, ResRESOURCE is returned. The reserved virtual memory can be
mapped and unmapped using VMMap and VMUnmap.


.if.destroy: void VMDestroy(VM vm)

A VM is destroyed by calling VMDestroy. Any address space that was mapped
through this VM is unmapped.


[lots of interfaces missing here]


NOTES

.diagram:


.testing: It is important to test that a VM implementation will work in extreme
cases. .testing.large: It must be able to reserve a large address space.
Clients will want multi-GB spaces, more than some OSs will allow. If they ask
for too much, mps_arena_create (and hence VMCreate) must fail in a predictable
way. .testing.larger: It must be possible to allocate in a large space;
sometimes committing will fail, because there's not enough space to replace the
"reserve" mapping. See request.epcore.160201 for details. .testing.lots: It
must be possible to have lots of mappings. The OS must either combine adjacent
mappings or have lots of space in the kernel tables. See request.epcore.160117
for ideas on how to test this.

diff --git a/mps/design/vman/index.txt b/mps/design/vman/index.txt
new file mode 100644
index 00000000000..8d4ab8d8b8c
--- /dev/null
+++ b/mps/design/vman/index.txt
@@ -0,0 +1,14 @@

 ANSI FAKE VM
 design.mps.vman
 incomplete doc
 drj 1996-11-07

.intro: The ANSI fake VM is an implementation of the MPS VM interface (see
design.mps.vm) using services provided by the ANSI C Library (standard.ansic.7)
(malloc and free as it happens).

.align: The VM is aligned to VMAN_ALIGN (defined in impl.h.mpmconf) by adding
VMAN_ALIGN to the requested size, mallocing a block that large, then rounding
the pointer to the base of the block. vm->base is the aligned pointer,
vm->block is the pointer returned by malloc (used during VMDestroy).

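For orientation, here is a hedged usage sketch of the generic VM interface
described in design.mps.vm above. VMMap and VMUnmap are among the interfaces
not listed there, so their signatures, and the VMBase accessor, are assumptions
made only for this sketch.

  /* Usage sketch only; VMMap/VMUnmap signatures and VMBase are assumed. */

  static Res vmUsageExample(void)
  {
    VM vm;
    Res res;
    Addr base, limit;

    res = VMCreate(&vm, (Size)1 << 30);   /* reserve 1GiB of address space */
    if (res != ResOK)
      return res;                         /* ResMEMORY or ResRESOURCE */

    base = VMBase(vm);                    /* assumed accessor for the reserved base */
    limit = AddrAdd(base, (Size)1 << 20); /* first 1MiB of the reserved range */

    res = VMMap(vm, base, limit);         /* back [base, limit) with RAM/swap */
    if (res == ResOK) {
      /* ... the range may now be read and written ... */
      VMUnmap(vm, base, limit);           /* return the range to the reserved state */
    }

    VMDestroy(vm);                        /* unmaps anything still mapped */
    return res;
  }
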
+ diff --git a/mps/design/vmo1/index.txt b/mps/design/vmo1/index.txt new file mode 100644 index 00000000000..bd883354562 --- /dev/null +++ b/mps/design/vmo1/index.txt @@ -0,0 +1,34 @@ + VM MODULE ON DEC UNIX + design.mps.vmo1 + incomplete doc + drj 1997-03-25 + +INTRODUCTION + +.readership: Any MPS developer. + +.intro: This is the design of the VM Module for DEC UNIX (aka OSF/1 os.o1). In +general aspects (including interface) the design is as for design.mps.vm. + +DETAILS + +Functions + +.fun.unmap: + +VMUnmap + +It "unmaps" a region by replacing the existing mapping with a mapping using the +vm->none_fd file descriptor (see mumble mumble, VMCreate), and protection set +to PROT_NONE (ie no access). .fun.unmap.justify: Replacing the mapping in this +way means that the address space is still reserved and will not be used by +calls to mmap (perhaps in other libraries) which specify MAP_VARIABLE. +.fun.unmap.offset: The offset for this mapping is the offset of the region +being unmapped in the VM; this gives the same effect as if there was one +mapping of the vm->none_fd from the base to the limit of the VM (but "behind" +all the other mappings that have been created). .fun.unmap.offset.justify: If +this is not done (if for example the offset is always specified as 0) then the +VM will cause the kernel to create a new file reference for each mapping +created with VMUnmap; eventually the kernel refuses the mmap call because it +can't create a new file reference. + diff --git a/mps/design/vmso/index.txt b/mps/design/vmso/index.txt new file mode 100644 index 00000000000..7158682ac6b --- /dev/null +++ b/mps/design/vmso/index.txt @@ -0,0 +1,97 @@ + VM DESIGN FOR SOLARIS + design.mps.vmso + incomplete doc + drj 1998-05-08 + +INTRODUCTION + +.intro: This is the design for the VM implementation on Solaris 2.x (see os.so +for OS details). The implementation is in MMsrc!vmso.c (impl.c.vm). The +design follows the design for and implements the contract of the generic VM +interface (design.mps.vm). To summarize: The VM module provides a mechanism to +reserve large (relative to the amount of RAM) amounts of address space, and +functions to map (back with RAM) and unmap portions of this address space. + +.source: Much of the implementation (and hence the design) was inherited from +the SunOS4 implementation. Not that there's any design for that. You'll find +the mmap(2) (for the system call mmap) and the zero(7d) (for the device +/dev/zero) man pages useful as well. The generic interface and some generic +design is in design.mps.vm. + + +DEFINITIONS + +.def: See design.mps.vm.def.* for definitions common to all VMs. + + +OVERVIEW + +.over: The system calls mmap and munmap are used to access the underlying +functionality. They are used in slightly unusual ways, typically to overcome +baroque features or implementation details of the OS. .over.reserve: In order +to reserve address space, a mapping to a file (/etc/passwd as it happens) is +created with no protection allowed. .over.map: In order to map memory, a +mapping to /dev/zero is created. .over.destroy: When the VM is destroyed, +munmap is used to remove all the mappings previously created. + + +IMPLEMENTATION + +.impl.create: VMCreate + +.impl.create.vmstruct: Enough pages to hold the VMStruct are allocated by +creating a mapping to /dev/zero (a read/write private mapping), and using +initializing the memory as a VMStruct. .impl.create.reserve: The size +parameter is rounded up to page size and this amount of address space is +reserved. 
The address space is reserved by creating a shared mapping to
/etc/passwd with no access allowed (prot argument is PROT_NONE, flags argument
is MAP_SHARED). .impl.create.reserve.mmap.justify: mmap gives us a flexible
way to allocate address space without interfering with any other component in
the process. Because we don't specify MAP_FIXED we are guaranteed to get a
range of addresses that are not in use. Other components must cooperate by not
attempting to create mappings specifying MAP_FIXED and an address in the range
that the MPS has reserved. .impl.create.reserve.passwd.justify: Mapping
/etc/passwd like this worked on SunOS4 (so this implementation inherited it).
Mapping /dev/zero with prot=PROT_NONE and flags=MAP_PRIVATE does not work
because Solaris gratuitously allocates swap (even though you can't use the
memory). .impl.create.reserve.improve: However, it appears that ORing in
MAP_NORESERVE when mapping /dev/zero will reserve address space without
allocating swap, so this might be worth trying. I.e., with prot=PROT_NONE,
flags=MAP_PRIVATE|MAP_NORESERVE. However the following caveat comes from the
original implementation: "Experiments have shown that attempting to reserve
address space by mapping /dev/zero results in swap being reserved. This
appears to be a bug, so we work round it by using /etc/passwd, the only file we
can think of which is pretty much guaranteed to be around." So that might not
work after all.

.impl.map: VMMap

.impl.map.zero: A mapping to /dev/zero is created at the relevant addresses
(overriding the map to /etc/passwd that was previously in place for those
addresses). The prot argument is specified as PROT_READ|PROT_WRITE|PROT_EXEC
(so that any access is allowed), the flags argument as MAP_PRIVATE|MAP_FIXED
(MAP_PRIVATE means that the mapping is not shared with child processes (child
processes will have a mapping, but changes to the memory will not be shared).
MAP_FIXED guarantees that we get the mapping at the specified address). The
zero(7d) man page documents this as a way to create a "zero-initialized unnamed
memory object". .impl.map.error: If there's not enough swap space for the
mapping, mmap will return EAGAIN, not ENOMEM, although you might not think so
from the man page.


.impl.unmap: VMUnmap

.impl.unmap.reserve: The relevant addresses are returned to the reserved state
by creating a mapping to /etc/passwd (overriding the map to /dev/zero that was
previously in place for those addresses). As for VMCreate (see
.impl.create.reserve above) the prot argument is PROT_NONE, but the flags
argument has the additional MAP_FIXED flag (so is MAP_SHARED|MAP_FIXED).
.impl.unmap.reserve.offset: The offset argument is specified to be the offset
of the addresses being unmapped from the base of the reserved VM area.
.impl.unmap.reserve.offset.justify: Not specifying the offset like this makes
Solaris create a separate mapping (in the kernel) each time Unmap is used, and
eventually the call to mmap will fail. Specifying offset like this does not
cause Solaris to create any extra mappings; the existing mapping to /etc/passwd
gets reused.

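A condensed sketch of the three mappings described above may help; descriptor
management, page rounding, the VMStruct bookkeeping, and error paths are
omitted, and the helper names are invented for the sketch.

  /* Condensed sketch of the mappings described above; helper names and
   * error handling are illustrative only. */

  #include <sys/types.h>
  #include <sys/mman.h>

  /* .over.reserve: reserve address space with a no-access shared mapping of
   * /etc/passwd (mapping /dev/zero PROT_NONE would gratuitously reserve swap). */
  static void *vmReserve(int passwdFD, size_t size)
  {
    return mmap(NULL, size, PROT_NONE, MAP_SHARED, passwdFD, 0);
  }

  /* .impl.map.zero: back [base, base+size) with memory by mapping /dev/zero,
   * private and fixed, with all access allowed. */
  static int vmMap(void *base, size_t size, int zeroFD)
  {
    void *p = mmap(base, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_FIXED, zeroFD, 0);
    return p == MAP_FAILED ? -1 : 0;  /* EAGAIN here means not enough swap */
  }

  /* .impl.unmap.reserve: return [base, base+size) to the reserved state by
   * mapping /etc/passwd back over it, at base's offset within the reserved
   * area, so the kernel reuses the original mapping. */
  static int vmUnmap(void *vmBase, void *base, size_t size, int passwdFD)
  {
    off_t offset = (off_t)((char *)base - (char *)vmBase);
    void *p = mmap(base, size, PROT_NONE, MAP_SHARED | MAP_FIXED,
                   passwdFD, offset);
    return p == MAP_FAILED ? -1 : 0;
  }
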
diff --git a/mps/design/writef/index.txt b/mps/design/writef/index.txt
new file mode 100644
index 00000000000..b99c24aac6c
--- /dev/null
+++ b/mps/design/writef/index.txt
@@ -0,0 +1,88 @@

 THE DESIGN OF THE MPS WRITEF FUNCTION
 design.mps.writef
 draft doc
 richard 1996-10-18

INTRODUCTION

.intro: This document describes the WriteF function, which allows formatted
output in a manner similar to ANSI C printf, but allows the MPM to operate in a
freestanding environment (see design.mps.exec-env).

.background: The documents design.mps.exec-env and design.mps.lib describe the
design of the library interface and the reason that it exists.


DESIGN

.no-printf: The dependency on printf has been removed. The MPM only depends on
fputc and fputs, via the Library Interface (design.mps.lib). This makes it much
easier to deploy the MPS in a freestanding environment. This is achieved by
implementing our own internal output routines in mpm.c.

Our output requirements are few, so the code is short. The only output
function which should be used in the rest of the MPM is WriteF, which is
similar to fprintf:

  Res WriteF(mps_lib_FILE *stream, ...);

WriteF expects a format string followed by zero or more items to insert into
the output, followed by another format string, more items, etc., then a NULL
format string, e.g.

  WriteF(stream,
         "Hello: $A\n", address,
         "Spong: $U ($S)\n", number, string,
         NULL);

This makes Describe methods much easier to do, e.g.:

  WriteF(stream,
         "Buffer $P ($U) {\n", (WriteFP)buffer, (WriteFU)buffer->serial,
         " base $A init $A alloc $A limit $A\n",
         (WriteFA)buffer->base, (WriteFA)buffer->ap.init,
         (WriteFA)buffer->ap.alloc, (WriteFA)buffer->ap.limit,
         " Pool $P\n", (WriteFP)buffer->pool,
         " Seg $P\n", (WriteFP)buffer->seg,
         " rank $U\n", (WriteFU)buffer->rank,
         " alignment $W\n", (WriteFW)buffer->alignment,
         " grey $B\n", (WriteFB)buffer->grey,
         " shieldMode $B\n", (WriteFB)buffer->shieldMode,
         " p $P i $U\n", (WriteFP)buffer->p, (WriteFU)buffer->i,
         "} Buffer $P ($U)\n", (WriteFP)buffer, (WriteFU)buffer->serial,
         NULL);

.types: For each format $X that WriteF supports, there is a type defined in
impl.h.mpmtypes WriteFX which is the promoted version of that type. These are
provided both to ensure promotion and to avoid any confusion about what type
should be used in a cast. It is easy to check the casts against the formats to
ensure that they correspond. .types.future: It is possible that this type set
or similar may be used in future in some generalisation of varargs in the MPS.

.formats: The formats supported are as follows.

  code  name       type           example rendering

  $A    address    Addr           9EF60010
  $P    pointer    void *         9EF60100
  $F    function   void *(*)()    9EF60100 (may be platform-specific length
                                  and format)
  $S    string     char *         hello
  $C    character  char           x
  $W    word       unsigned long  00109AE0
  $U    decimal    unsigned long  42
  $B    binary     unsigned long  00000000000000001011011110010001
  $$    dollar     -              $

Note that WriteFC is an int, because that is the default promotion of a char
(see .types).

.snazzy: We should resist the temptation to make WriteF an incredibly snazzy
output engine. We only need it for Describe methods and assertion messages.
At the moment it's a very simple bit of code -- let's keep it that way.

.f: The F code is used for function pointers. They are currently printed as a
hexadecimal string of the appropriate length for the platform, and may one day
be extended to include function name lookup.
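To illustrate the variadic protocol (format string, arguments, ..., NULL), here
is a hedged sketch of the kind of dispatch loop WriteF implies. It is not the
real implementation (that is in mpm.c); it assumes the library-interface names
mps_lib_fputc and mps_lib_fputs and the usual MPM declarations for Res and the
WriteFX types, handles only a few of the codes above, and elides error
checking.

  /* Sketch only -- not the real WriteF in mpm.c.  Assumes mps_lib_fputc and
   * mps_lib_fputs from the library interface; most codes and all error
   * checking are elided. */

  #include <stdarg.h>

  static void writeHex(mps_lib_FILE *stream, unsigned long w, unsigned digits)
  {
    while (digits-- > 0)
      (void)mps_lib_fputc("0123456789ABCDEF"[(w >> (digits * 4)) & 0xF], stream);
  }

  Res WriteFSketch(mps_lib_FILE *stream, ...)
  {
    va_list args;
    const char *format;

    va_start(args, stream);
    /* Alternate format strings and argument groups until a NULL format. */
    while ((format = va_arg(args, const char *)) != NULL) {
      for (; *format != '\0'; ++format) {
        if (*format != '$') {
          (void)mps_lib_fputc(*format, stream);
          continue;
        }
        ++format;                      /* dispatch on the code after '$' */
        switch (*format) {
        case 'W':                      /* word, promoted to WriteFW */
          writeHex(stream, (unsigned long)va_arg(args, WriteFW),
                   sizeof(WriteFW) * 2);
          break;
        case 'S':                      /* string */
          (void)mps_lib_fputs(va_arg(args, WriteFS), stream);
          break;
        case '$':                      /* literal dollar */
          (void)mps_lib_fputc('$', stream);
          break;
        default:                       /* $A, $P, $F, $C, $U, $B elided here */
          break;
        }
      }
    }
    va_end(args);
    return ResOK;
  }
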