Check in incomplete strategy design doc.

Copied from Perforce Change: 182491 ServerID: perforce.ravenbrook.com
2026-03-26 00:34:17 -07:00 · 2013-06-04 15:28:02 +01:00 · 2013-06-04 15:28:02 +01:00 · 1486031ba0
commit 1486031ba0
parent c8362915e1
1 changed files with 466 additions and 0 deletions
--- a/mps/design/strategy.txt
+++ b/mps/design/strategy.txt
@ -0,0 +1,466 @@
+.. mode: -*- rst -*-
+
+MPS Strategy
+============
+
+:Tag: design.mps.strategy
+:Author: Nick Barnes
+:Organization: Ravenbrook Limited
+:Date: 2013-06-04
+       :Revision: $Id$ : //info.ravenbrook.com/project/mps/master/design/txt#8 $
+:Copyright: See section `Copyright and License`_.
+
+
+.. sources:
+
+    `<https://info.ravenbrook.com/project/mps/master/design/strategy/>`_
+
+.. mps:prefix:: design.mps.strategy
+
+Introduction
+------------
+
+_`.intro` This is the design of collection strategy for the MPS.
+
+_`.readership` MPS developers.
+
+
+Overview
+--------
+
+_`.overview` The MPS uses "strategy" code to make three decisions:
+
+- when to start a collection trace;
+
+- what to condemn;
+
+- how to schedule tracing work.
+
+This document describes the current strategy, identifies some
+weaknesses in it, and outlines some possible future development
+directions.
+
+
+Requirements
+------------
+
+[We could source some from req.dylan, or do an up-to-date requirements
+analysis -- NB 2013-03-25]
+
+Garbage collection is a trade-off between time and space: it consumes
+some [CPU] time in order to save some [memory] space.  Strategy shifts
+the balance point.  A better strategy will take less time to produce
+more space.  Examples of good strategy might include:
+
+- choosing segments to condemn which contain high proportions of dead
+  objects;
+
+- starting a trace when a large number of objects have just died;
+
+- doing enough collection soon enough that the client program never
+  suffers low-memory problems;
+
+- using otherwise-idle CPU resources for tracing.
+
+Conversely, it would be bad strategy to do the reverse of each of
+these (condemning live objects; tracing when there's very little
+garbage; not collecting enough; tracing when the client program is
+busy).
+
+Abstracting from these notions, requirements on strategy would
+relate to:
+
+- Maximum pause time and other utilization metrics (for example,
+  bounded mutator utilization, minimum mutator utilization, total MPS
+  CPU usage);
+
+- Collecting enough garbage (for example: overall heap size;
+  low-memory requirements).
+
+- Allowing client control (for example, client recommendations for
+  collection timing or condemnation).
+
+There are other possible strategy considerations which are so far
+outside the scope of current strategy and MPS design that this
+document disregards them. For example, either inferring or allowing
+the client to specify preferred relative object locations ("this
+object should be kept in the same cache line as that one"), to improve
+cache locality.
+
+Generations
+-----------
+
+The largest part of the current MPS strategy implementation is the
+support for generational GC.  Generations are only fully supported for
+AMC (and AMCZ) pools.  See under "Non-AMC Pools", below, for more
+information.
+
+Data Structures
+...............
+
+The fundamental structure of generational GC is the ``Chain``,
+which describes a set of generations.  A chain is created by client
+code calling ``mps_chain_create()``, specifying the "size" and
+"mortality" for each generation.  When creating an AMC pool, the
+client code must specify the chain which will control collections for
+that pool.  The same chain may be used for multiple pools.
+
+Each generation in a chain has a ``GenDesc`` structure,
+allocated in an array pointed to from the chain.  Each AMC pool has a
+set of ``PoolGen`` structures, one per generation.  The PoolGens
+for each generation point to the GenDesc and are linked together in a
+ring on the GenDesc.  These structures are (solely?) used to gather
+information for strategy decisions.
+
+The arena has a unique ``GenDesc`` structure, named
+``topGen`` and described in comments as "the dynamic generation"
+(although in fact it is the *least* dynamic generation).  Each AMC
+pool has one more PoolGen than there are GenDescs in the chain.  The
+extra PoolGen refers to this topGen.
+
+AMC segments have a segment descriptor ``amcSegStruct`` which is
+a ``GCSegStruct`` with two additional fields.  One field
+``segTypeP`` is a pointer either to the per-generation per-pool
+``amcGen`` structure (a subclass of ``PoolGen``), or to a
+nailboard (which then points to an amcGen).  The other field
+``new`` is a boolean used for keeping track of memory usage for
+strategy reasons (see below under 'Accounting').  The ``amcGen``
+is used for statistics (``->segs``) and forwarding buffers
+(``->forward``).
+
+The AMC pool class only ever allocates a segment in order to fill a
+buffer: either the buffer for a client Allocation Point, or a
+forwarding buffer.  In order to support generational collection, there
+is a subclass ``amcBuf`` of ``SegBuf``, with a
+``gen`` field (pointing to a ``amcGen``).  So in
+``AMCBufferFill()`` the generation of the new segment can be
+determined.
+
+When an AMC pool is created, these ``amcGen`` and
+``amcBuf`` structures are all created, and the
+``amcBuf->gen`` fields initialized so that the forwarding buffer
+of each amcGen knows that it belongs to the next "older" amcGen (apart
+from the "oldest" amcGen - that which refers to the topGen - whose
+forwarding buffer belongs to itself).
+
+When copying an object in ``AMCFix()``, the object's current
+generation is determined (``amcSegGen()``), and the object is
+copied to that amcGen's forwarding buffer, using the buffer protocol.
+Thus, objects are "promoted" up the chain of generations until they
+end up in the topGen, which is shared between all chains and all
+pools.
+
+For statistics and reporting purposes, when ``STATISTICS`` is
+on, each AMC pool has an array of ``PageRetStruct``s, one per
+trace.  This structure has many ``Count`` fields, and is
+intended to help to assess AMC page retention code.  See job001811.
+
+Zones
+.....
+
+All collections in the MPS start with condemnation of a complete
+``ZoneSet``.  Each generation in each chain has a zoneset
+associated with it (``chain->gen[N].zones``); the condemned
+zoneset is the union of some number of generation's zonesets.  It is
+condemned by code in the chain system calling
+``TraceCondemnZones()``.  This is either for all chains
+(``ChainCondemnAll()`` called for every chain from
+``traceCondemnAll()``) or for some number of generations in a
+single chain (``ChainCondemnAuto()`` called from
+``TracePoll()``).  Note that the condemnation is of every
+automatic-pool segment in any zone in the zoneset.  It is not limited
+to the segments actually associated with the condemned generation(s).
+
+An attempt is made to use distinct zonesets for different generations.
+Whenever a segment is allocated (``AMCBufferFill()``), a
+``SegPref`` is created containing the generation number
+(obtained from ``amcBuf->gen->pgen->nr``) and passed to
+``SegAlloc()``.  The arena keeps a zoneset for each generation
+number (up to ``VMArenaGenCount``, defined in
+``arenavm.c`` to be ``MPS_WORD_WIDTH/2``), and a
+``freeSet``.  The zoneset for each generation number starts out
+empty, and the ``freeSet`` starts out ``ZoneSetUNIV``.
+When a segment is allocated with a ``SegPref`` with a generation
+number, an attempt is made to allocate it in the corresponding zoneset
+(``pagesFindFreeInZones()``).  If the zoneset is empty, an
+attempt is made to allocate it in the ``freeSet`` zoneset.
+After it is allocated, the zones it occupies are removed from the
+``freeSet`` and (if there's a generation ``SegPref``)
+added to the zoneset for that generation number.
+
+Note that this zone placement code knows nothing of chains,
+generations, pool classes, etc.  It is based solely on the generation
+*number*, so generations with the same number from different chains
+share a zoneset preference for the purpose of placing newly allocated
+segments.  Combined with the fact that condemnation is per-zone, this
+effectively means that generations in distinct chains are collected
+together.  One consequence of this is that we don't have a very fine
+granularity of control over collection: a garbage collection of all
+chains together is triggered by the most eager chain.  There's no way
+for a library or other small part of a client program to arrange
+independent collection of a separate pool or chain.
+
+When ``AMCBufferFill()`` gets the allocated segment back, it
+adds it to the zoneset associated with that generation in the pool's
+controlling chain.  Note that a chain's per-generation zonesets, which
+represent the zones in which segments for that generation in that
+chain have been placed, are quite distinct from the arena-wide
+per-generation-number zonesets, which represent the zones in which
+segments for that generation number in any chain have been placed.
+The arena-wide per-generation-number zoneset
+``vmArena->genZoneSet[N]`` is augmented in
+``vmAllocComm()``.  The per-chain per-generation zoneset
+``chain->gen[N].zones`` is augmented in
+``PoolGenUpdateZones()``.  Neither kind of zoneset can ever
+shrink.
+
+Accounting
+..........
+
+Each PoolGen
+
+gen[N].proflow ??
+
+gen[N].mortality
+gen[N].capacity
+
+pgen->totalSize
+pgen->newSize
+pgen->newSizeAtCreate
+
+amcSeg->new
+
+
+pgen->totalSize incremented by AMCBufferFill(), decremented by
+amcReclaimNailed() and AMCReclaim(), added up by
+GenDescTotalSize(gen).
+
+pgen->newSize incremented by AMCBufferFill() (*when not ramping*) and
+AMCRampEnd(), decremented by AMCWhiten().
+
+gen[N].proflow set to 1.0 by ChainCreate().  arena->topGen.proflow set
+to 0.0 by LocusInit(arena).  This field is never used.
+
+added up by GenDescNewSize(gen).
+
+
+
+
+pgen->newSizeAtCreate is set by traceCopySizes() (that is its
+purpose), and output by TraceStartGenDesc_diag().
+
+
+
+
+
+
+
+
+Ramps
+.....
+The intended semantics of ramping are pretty simple.  It allows the
+client to advise us of periods of large short-lived allocation on a
+particular AP.  Stuff allocated using that AP during its "ramp" will
+probably be dead when the ramp finishes.  How the MPS makes use of this
+advice is up to us, but for instance we might segregate those objects,
+collect them less enthusiastically during the ramp and then more
+enthusiastically soon after the ramp finishes.  Ramps can nest.
+
+A ramp is entered by calling::
+
+    mps_ap_alloc_pattern_begin(ap, mps_alloc_pattern_ramp())
+
+or similar, and left in a similar way.
+
+This is implemented on a per-pool basis, for AMC only (it's ignored by
+the other automatic pools).  PoolAMC throws away the identity of the AP
+specified by the client.  The implementation is intended to work by
+changing the generational forwarding behaviour, so that there is a "ramp
+generation" - one of the regular AMC generations - which forwards to
+itself if collected during a ramp (instead of promoting to an older
+generation).  It also tweaks the strategy calculation code, in a way
+with consequences I am documenting elsewhere.
+
+Right now, the code sets this ramp generation to the last generation
+specified in the pool's "chain": it ordinarily forwards to the
+"after-ramp" generation, which is the "dynamic generation" (i.e. the
+least dynamic generation, i.e. the arena-wide "top generation").  My
+recollection, and some mentions in design/poolamc, suggests that the
+ramp generation used to be chosen differently from this.
+
+So far, it doesn't sound too ghastly, I guess, although the subversion
+of the generational system seems a little daft.  Read on....
+
+An AMC pool has a ``rampMode`` (which is really a state of a state
+machine), taking one of five values: OUTSIDE, BEGIN, RAMPING, FINISH,
+and COLLECTING (actually the enum values are called RampX for these
+X). We initialize in OUTSIDE.  The pool also has a ``rampCount``,
+which is the ramp nesting depth and is used to allow us to ignore ramp
+transitions other than the outermost.  According to design/poolamc,
+there's an invariant (in BEGIN or RAMPING, ``rampCount > 0``; in
+COLLECTING or OUTSIDE, ``rampCount == 0``), but this isn't checked in
+``AMCCheck()`` and in fact is false for COLLECTING (see below).
+
+There is a small set of events causing state machine transitions:
+
+- entering an outermost ramp;
+- leaving an outermost ramp;
+- condemning any segment of a ramp generation (detected in AMCWhiten);
+- reclaiming any AMC segment.
+
+Here's pseudo-code for all the transition events:
+
+Entering an outermost ramp:
+ if not FINISH, go to BEGIN.
+
+Leaving an outermost ramp:
+ if RAMPING, go to FINISH.   Otherwise, go to OUTSIDE.
+
+Condemning a ramp generation segment:
+ If BEGIN, go to RAMPING and make the ramp generation forward
+ to itself (detach the forwarding buffer and reset its generation).
+ If FINISH, go to COLLECTING and make the ramp generation
+ forward to the after-ramp generation.
+
+Reclaiming any AMC segment:
+ If COLLECTING:
+   if ``rampCount > 0``, go to BEGIN.  Otherwise go to OUTSIDE.
+
+Now, some deductions:
+
+1. When OUTSIDE, the count is always zero, because (a) it starts that
+way, and the only ways to go OUTSIDE are (b) by leaving an outermost
+ramp (count goes to zero) or (c) by reclaiming when the count is zero.
+
+2. When BEGIN, the count is never zero (consider the transitions to
+BEGIN and the transition to zero).
+
+3. When RAMPING, the count is never zero (again consider transitions to
+RAMPING and the transition to zero).
+
+4. When FINISH, the count can be anything (the transition to FINISH has
+zero count, but the Enter transition when FINISH can change that and
+then it can increment to any value).
+
+5. When COLLECTING, the count can be anything (from the previous fact,
+and the transition to COLLECTING).
+
+6. **** THE BUG **** the ramp generation is not always reset (to forward
+to the after-ramp generation).  If we get into FINISH and then see
+another ramp before the next condemnation of the ramp generation, we
+will Enter followed by Leave.  The Enter will keep us in FINISH, and the
+Leave will take us back to OUTSIDE, skipping the transition to the
+COLLECTING state which is what resets the ramp generation forwarding
+buffer.
+
+The simplest change to fix this is to change the behaviour of the Leave
+transition, which should only take us OUTSIDE if we are in BEGIN or
+COLLECTING.  We should also update design/poolamc to tell the truth, and
+check the invariants, which will be these:
+
+     OUTSIDE => zero
+     BEGIN => non-zero
+     RAMPING => non-zero
+
+A cleverer change might radically rearrange the state machine
+(e.g. reduce the number of states to three) but that would require
+closer design thought and should probably be postponed until we have a
+clearer overall strategy plan.
+
+While I'm writing pseudo-code versions of ramp-related code, I should
+mention this other snippet, which is the only other code relating to
+ramping (these notes are useful when thinking about the broader strategy
+code):
+
+- In ``AMCBufferFill()``, if we're RAMPING, and filling the forwarding
+buffer of the ramp generation, and the ramp generation is the
+forwarding buffer's generation, set ``amcSeg->new`` to FALSE.  Otherwise,
+add the segment size to ``poolGen.newSize``.
+
+And since I've now mentioned the ``amcSeg->new`` flag, here are the only
+other uses of that:
+
+- it initializes as TRUE.
+
+- When leaving an outermost ramp, go through all the segments in the
+pool.  Any non-white segment in the rampGen with new set to FALSE has
+its size added to ``poolGen->newSize`` and gets new set to TRUE.
+
+- in ``AMCWhiten()``, if new is TRUE, the segment size is deducted
+from ``poolGen.newSize`` and new is set to FALSE.
+
+Non-AMC Pools 
+.............
+
+The implementations of AMS, AWL, and LO pool classes are all aware of
+generations (this is necessary because all tracing is driven by the
+generational data structures described above), but do not make use of
+them.  For LO and AWL, when a pool is created, a chain with a single
+generation is also created, with size and mortality parameters
+hard-wired into the pool-creation function (LOInit, AWLInit).  For
+AMS, a chain is passed as a pool creation parameter into
+``mps_pool_create()``, but this chain must also have only a
+single generation (otherwise ``ResPARAM`` is returned).
+
+Note that these chains are separate from any chain used by an AMC pool
+(except in the trivial case when a single-generation chain is used for
+both AMC and AMS).  Note also that these pools do not use or point to
+the ``arena->topGen``, which applies only to AMC.
+
+Non-AMC pools have no support for ramps.
+
+Starting a Trace
+................
+
+Trace Progress
+..............
+
+Document History
+----------------
+- 2013-06-04 NB Checked this in although it's far from complete.
+  Pasted in my 'ramping notes' from email, which mention some bugs
+  which I may have fixed (TODO: check this).
+
+.. _NB: http://www.ravenbrook.com/consultants/nb/
+
+
+Copyright and License
+---------------------
+Copyright © 2013 Ravenbrook Limited. All rights reserved. 
+<http://www.ravenbrook.com/>. This is an open source license. Contact
+Ravenbrook for commercial licensing options.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+1. Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+
+2. Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+3. Redistributions in any form must be accompanied by information on how
+to obtain complete source code for this software and any
+accompanying software that uses this software.  The source code must
+either be included in the distribution or be available for no more than
+the cost of distribution plus a nominal fee, and must be freely
+redistributable under reasonable conditions.  For an executable file,
+complete source code means the source code for all modules it contains.
+It does not include source code for modules or files that typically
+accompany the major components of the operating system on which the
+executable file runs.
+
+**This software is provided by the copyright holders and contributors
+"as is" and any express or implied warranties, including, but not
+limited to, the implied warranties of merchantability, fitness for a
+particular purpose, or non-infringement, are disclaimed.  In no event
+shall the copyright holders and contributors be liable for any direct,
+indirect, incidental, special, exemplary, or consequential damages
+(including, but not limited to, procurement of substitute goods or
+services; loss of use, data, or profits; or business interruption)
+however caused and on any theory of liability, whether in contract,
+strict liability, or tort (including negligence or otherwise) arising in
+any way out of the use of this software, even if advised of the
+possibility of such damage.**