Richard Kistruck e2a2a48f82 Mps wiki apguide: (tweak) replace field "family" with "tribe"
Copied from Perforce
 Change: 161341
 ServerID: perforce.ravenbrook.com
2006-12-28 13:54:52 +00:00


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>MPS Wiki: Allocation Points -- User's Guide</title>
<style type = "text/css">
<!--
div.banner {text-align:center}
dt {font-weight:bold}
-->
</style>
</head>
<body>
<div class="banner">
<p>
<a href="/">Ravenbrook</a>
/ <a href="/project/">Projects</a>
/ <a href="/project/mps/">Memory Pool System</a>
/ <a href="/project/mps/master/">Master Product Sources</a>
/ <a href="/project/mps/master/manual/">Manuals</a>
/ <a href="/project/mps/master/manual/wiki/">Wiki</a>
</p>
<p><i><a href="/project/mps/">Memory Pool System Project</a></i></p>
<hr />
<h1>MPS Wiki: Allocation Points -- User's Guide</h1>
<address>
<a href="mailto:rhsk@ravenbrook.com">Richard Kistruck</a>,
<a href="http://www.ravenbrook.com/">Ravenbrook Limited</a>,
2006-06-09
</address>
</div>
<p> This wiki article contains incomplete and informal notes about the MPS, the precursor to more formal documentation. Not confidential. Readership: MPS users and developers. </p>
<p>These notes on Allocation Points were written by RHSK between
2006-06-07 and 2006-06-22, following research and discussion with
RB and NB.
<strong>Warning:</strong> the text in this User's Guide
is preliminary, but believed to be 'conservatively correct'.
In other words, I think if you follow these guidelines, your code
will be correct, and will not violate the current or future
definitions of the MPS ap protocol. But this is not (yet) an
accurate statement of the MPS ap protocol. RHSK 2006-06-13.</p>
<p>[<strong>Note:</strong> some constraints
may be mentioned only here, and not yet in other places they
should be mentioned, such as the Reference Manual.
Notably, the client's obligation to ensure there are no exact
references to a failed new-object, before it calls mps_ap_trip,
is suspected to be new. RHSK 2006-06-13]</p>
<h2>Introduction</h2>
<p>Allocation points are an MPS protocol that the client uses to
allocate memory with low overhead, and low synchronization
cost (with an asynchronous collector and with other threads).</p>
<p>The allocation point protocol is designed to work with
incremental collections and multi-threaded clients.
(Even if your particular client is
single-threaded and non-incremental,
it is easiest to code for this general case
when using allocation points.)</p>
<p>The assumption is that, <em>between any
two client instructions</em>, the MPS can interrupt you, move
objects around (if they are in a moving pool) and collect
garbage. To cope with this, allocating is a two-step process.</p>
<p><strong>Important:</strong>
<a id="assume.ambig-workspace" class="tag">.assume.ambig-workspace</a>:
this User's Guide
assumes that you have declared your stack and registers to
be a root that is ambiguously scanned, using <code>mps_root_create_reg</code>
and passing the <code>mps_stack_scan_ambig</code> function to it.
This is the simplest way to write a client.
Other scenarios are possible, but
their implications for correct Allocation Point use
are not yet documented here.</p>
<p>The rest of this Allocation Points User's Guide contains the following
sections:</p>
<ul>
<li>Creating and destroying allocation points</li>
<li>Overview of two-step allocation</li>
<li>The graph of managed references</li>
<li>mps_reserve</li>
<li>Building the object</li>
<li>mps_commit</li>
<li>Common mistakes</li>
<li>Conclusion and further details</li>
</ul>
<h2>Creating and destroying allocation points</h2>
<p>To create an allocation point in a pool,
call <code>mps_ap_create</code>.
This may require additional arguments, depending on the pool
class. See pool class documentation.</p>
<p>An allocation point <strong>MUST NOT</strong> be used by more
than one thread. (Each thread must have its own allocation point or
points.)</p>
<p>Destroy an allocation point with <code>mps_ap_destroy</code>.</p>
<h2>Overview of two-step allocation</h2>
<p>When the client is building (creating and formatting) a new
object, you can think of it as being 'in a race' with the MPS.
The object is 'under
construction', and the MPS cannot manage it in the normal way.
So the client should build the object quickly, and then
commit it to MPS management. Rarely, the MPS has to move other
objects around right in the middle of this build phase: that's
a (small) price you pay for having an asynchronous collector.
If this happens, the MPS will tell the client that it has 'lost the
race'. Objects have moved around, and the new object is invalid.
The client must start building it again from scratch.</p>
<p>The client starts building the new object with
<code>mps_reserve</code>, and
marks it complete by calling <code>mps_commit</code>.
Almost always, <code>mps_commit</code> succeeds. But if the
client did not complete the object in time, then
<code>mps_commit</code> fails (returns 0).</p>
<p>This is how the client should build a new object:</p>
<ol>
<li> <code>mps_reserve</code> some memory,</li>
<li> build a new object in it,</li>
<li> store a reference to the new object in an
ambiguously-scanned place
(but <strong>NOT</strong> in any exactly-scanned place),</li>
<li> <code>mps_commit</code> the new object to MPS management.</li>
</ol>
<p>If commit succeeds, the object is complete, and immediately
becomes just a normal allocated object.
The client may write a reference to the
new object into some older object (thereby connecting the
new object into the client's graph of objects).</p>
<p>If commit fails, the new object no longer exists:
the data has gone and any references that used to refer to it
are now dangling pointers.
The client should simply try to build the
object again.</p>
<p>In pseudo-code, the standard allocation point idiom is:</p>
<blockquote><pre><code>
do
    mps_reserve
    initialize new object
    make an ambiguous reference to new object
while (! mps_commit)
link new object into my object graph
</code></pre></blockquote>
<p>(Do not worry about getting stuck in this loop:
commit usually fails at most once per collection,
so it is very rare for commit to fail even once,
let alone twice.)</p>
<p>In C, this typically looks like this:</p>
<blockquote><pre><code>int make_object(mps_ap_t ap, object *parent)
{
    void *p;
    object *neo = NULL;
    do {
        if (mps_reserve(&amp;p, ap, SIZE_OBJECT) != MPS_RES_OK) {
            goto fail_make_object;
        }
        /* Build the new object */
        neo = p;
        neo-&gt;formatcode = FORMAT_CLIENT;  /* (not fwd or pad) */
        neo-&gt;type = TYPE_OBJECT;
        neo-&gt;size = SIZE_OBJECT;
        neo-&gt;parent = parent;
        neo-&gt;tribe = parent-&gt;tribe;
        neo-&gt;child = NULL;
        /* neo (ambiguous reference) preserves the new object */
    } while (! mps_commit(ap, p, SIZE_OBJECT));
    /* Success: link the new object into my object graph */
    parent-&gt;child = neo;
    return TRUE;
fail_make_object:
    return FALSE;  /* out of memory, etc */
}
</code></pre></blockquote>
<p>Note that, throughout this User's Guide, we assume that
the stack and registers are declared as ambiguous roots
(<a href="#assume.ambig-workspace">.assume.ambig-workspace</a>)
which means that the <code>neo</code> pointer
keeps the new object alive for us.</p>
<p>The rest of this User's Guide goes through these steps in more
detail.</p>
<h2>The graph of managed references</h2>
<p>The MPS is a moving garbage collector: it supports
preserve-by-copying pools, whose objects are 'mobile'.
Whenever the MPS moves an object, it will ensure that
all managed references are updated to point to the
new location -- and this happens instantaneously as far
as the client sees it.</p>
<p>The client should assume that, between <em>any pair of
instructions</em>, the MPS may 'shake' this graph, moving
all the mobile objects, and updating all the managed
references.</p>
<p>Any parts of the graph that are no longer connected
(no longer reachable from declared roots) may be collected,
and the memory that those objects occupied may be unmapped,
or re-used for different objects.</p>
<p>The client usually takes care to ensure that all the
references it
holds are managed. To be managed, the reference must be
in a declared root (such as a scanned stack or a global variable),
or in a formatted object that is reachable from a root.</p>
<p>It is okay for a careful client to hold unmanaged references,
but:</p>
<ul>
<li> they'd better not be to a mobile object! Remember, mobile
objects could move at any time, and unmanaged references
will be left 'dangling'.</li>
<li> they'd better not be the only reference to an object,
or that object might get collected, again leaving a dangling
reference.</li>
</ul>
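<p>The difference between a managed and an unmanaged reference can be
seen in a toy model. The sketch below is not MPS code:
<code>toy_root_declare</code>, <code>toy_shake</code> and the two fixed
arrays are hypothetical stand-ins for root declaration and for a moving
collection, and it assumes that every declared reference is either NULL
or points into <code>from_space</code>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names are part of the MPS interface. */
typedef struct toy_obj_s { int value; } toy_obj;

#define TOY_SLOTS 4
#define TOY_MAX_ROOTS 4

toy_obj from_space[TOY_SLOTS];   /* where mobile objects live before the shake */
toy_obj to_space[TOY_SLOTS];     /* where they live afterwards */

toy_obj **toy_roots[TOY_MAX_ROOTS];  /* addresses of the managed references */
size_t toy_root_count = 0;

/* Tell the toy collector about a reference, which makes it managed. */
void toy_root_declare(toy_obj **ref)
{
    assert(toy_root_count < TOY_MAX_ROOTS);
    toy_roots[toy_root_count++] = ref;
}

/* Move every object, then fix up the managed references (and only those). */
void toy_shake(void)
{
    size_t i;
    for (i = 0; i < TOY_SLOTS; i++) {
        to_space[i] = from_space[i];
        from_space[i].value = -1;        /* the old copy is now garbage */
    }
    for (i = 0; i < toy_root_count; i++) {
        toy_obj *old = *toy_roots[i];
        if (old != NULL) {
            *toy_roots[i] = &to_space[old - from_space];
        }
    }
}

/* Returns 1 if, after a shake, the managed reference still sees the value
 * while the unmanaged copy has been left pointing at garbage. */
int toy_shake_demo(void)
{
    toy_obj *managed = &from_space[0];
    toy_obj *unmanaged = managed;   /* never declared: the collector cannot see it */
    int ok;

    from_space[0].value = 42;
    toy_root_declare(&managed);
    toy_shake();
    ok = (managed->value == 42) && (unmanaged->value == -1);
    toy_root_count--;               /* undeclare before 'managed' goes out of scope */
    return ok;
}
```

<p>In this toy the stale pointer still refers to readable memory. With a
real collector the old location may have been unmapped or re-used, so
even reading through it is unsafe.</p>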
<h2>mps_reserve</h2>
<p>Call <code>mps_reserve</code>, passing the size of the
new object you wish to create.
The size must be aligned to the pool alignment.
This is in contrast to <code>mps_alloc</code>, which (for some pools) allows
unaligned sizes.</p>
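<p>One way to meet the alignment requirement is to round the size up
before reserving. This is a minimal sketch: <code>align_up</code> is a
hypothetical client-side helper, not part of the MPS interface, and it
assumes that the pool alignment is a power of two.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round size up to the next multiple of alignment.
 * Assumes alignment is a non-zero power of two. */
size_t align_up(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}
```

<p>For example, with an alignment of 8, a 13-byte request becomes 16
bytes and a 16-byte request is unchanged. The same rounded size must
then be passed to both reserve and commit.</p>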
<p><em class="note">[Normally, use <code>mps_reserve</code>
(the lower-case C macro).
But if you are using a weak compiler that does not detect common
subexpressions, you may find that using
<code>MPS_RESERVE_BLOCK</code> (functionally identical) generates
faster code.
Or it may generate slower code. It depends on your compiler, and
you will have to conduct tests to find out.]</em></p>
<p><code>mps_reserve</code> returns a reference to a piece
of new memory for the client to build a new object in.
During this build, the MPS pins the piece of memory, and
treats it as raw data.</p>
<p>"Pinned" means: it will not move, be collected, be unmapped,
or anything like that. You may keep an unmanaged reference to
it at this time.</p>
<p>"Raw data" means two things:</p>
<p>Firstly, "raw data" means that any references stored IN the
new object are <em>unmanaged</em>. This means:</p>
<ul>
<li> references in the new object will not get updated if
the graph of managed references to mobile objects is 'shaken';</li>
<li> references in the new object do not
preserve any old objects they point to.</li>
</ul>
<p>Secondly, "raw data" means that any references TO the new
object are treated like other references to unmanaged memory:</p>
<ul>
<li> the MPS will not call the client's format code to answer
questions about the new object.</li>
</ul>
<h2>Building the object</h2>
<p>The client will typically do all these things:</p>
<ul>
<li> write data that makes the new object 'valid' for the
client's format;</li>
<li> write other data into the new object;</li>
<li> store references to existing objects IN
the new object;</li>
<li> keep (in a local variable) an ambiguous reference TO the
new object.</li>
</ul>
<p>However, during the build, there are a couple of restrictions:</p>
<ul>
<li>
<p>Once the client has stored a reference IN the new object,
it <strong>MUST NOT</strong> read it out again &mdash;
any reference stored in the new object is unmanaged,
and may have become stale.</p>
<p>(Actually, the restriction is: the moment a reference to an
existing mobile object is written into the new object, that
reference (in the new object) may become stale. And you'd better
not use (dereference) a stale reference. And you'd better not
write it into any exactly-scanned cell (such as in an existing
object).
Reading it into an ambiguously-scanned cell (such as an ambiguously
scanned register or stack cell) is okay as long as you don't
dereference it.
Writing it back into another part of the new object is okay too.
Just don't trust it to be a valid reference.)</p>
</li>
<li>
<p>The client <strong>MUST NOT</strong> store a reference TO the
new object in any exactly-scanned place.</p>
<p><em class="note">[Note: this is in fact possible, but the
protocol for doing it is more complex, and beyond the scope of
this guide. RHSK 2006-06-22]</em></p>
<p>This means the client should NOT connect the new object into
the graph of managed objects during the build.</p>
</li>
</ul>
<p>Before the end of the build phase:</p>
<ul>
<li> the new object must be validly formatted;</li>
<li> all exactly-scanned cells in the new object must
contain valid references;</li>
<li> the new object must be ambiguously reachable.</li>
</ul>
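<p>During development it can help to check some of these conditions just
before commit. The following is a minimal sketch, assuming the
<code>object</code> layout used in this guide's examples;
<code>UNSET_REF</code>, <code>mark_refs_unset</code>,
<code>ready_to_commit</code> and the numeric values of the format codes
are hypothetical. Comparing a stored reference against a sentinel reads
it without dereferencing it, which the restriction above permits. The
check can only tell that each reference field has been assigned, not
that the reference is valid.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Values assumed for illustration; use your own format's codes. */
enum { FORMAT_CLIENT = 1, FORMAT_FWD = 2, FORMAT_PAD = 3 };

typedef struct object_s {
    int formatcode;
    int type;
    size_t size;
    struct object_s *tribe;
    struct object_s *parent;
    struct object_s *child;
} object;

/* Hypothetical sentinel: the address of a private dummy object, used to
 * recognise reference fields that the build phase forgot to assign. */
object unset_marker;
#define UNSET_REF (&unset_marker)

/* Call immediately after reserve, before building the object. The object
 * is still raw data at this point, so the sentinel is never scanned. */
void mark_refs_unset(object *neo)
{
    neo->tribe = UNSET_REF;
    neo->parent = UNSET_REF;
    neo->child = UNSET_REF;
}

/* Call immediately before commit. Returns 1 if the header looks valid and
 * every reference field has been assigned since mark_refs_unset. */
int ready_to_commit(const object *neo, size_t alignment)
{
    assert(alignment != 0);
    if (neo->formatcode != FORMAT_CLIENT) {
        return 0;
    }
    if (neo->size < sizeof(object) || neo->size % alignment != 0) {
        return 0;
    }
    if (neo->tribe == UNSET_REF || neo->parent == UNSET_REF
        || neo->child == UNSET_REF) {
        return 0;
    }
    return 1;
}
```

<p>A check like this would have caught a build that sets
<code>child</code> only after commit, because that field would still hold
the sentinel when <code>ready_to_commit</code> runs.</p>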
<p>Optionally, for improved robustness to bugs, consider initialising
<em>all</em> parts of the new object, including parts that are not
yet being used to store useful data (such as a string buffer).
You might want to make this compile-time switchable, for debugging.</p>
<blockquote class="note">
<p><em class="note">If you leave these unused parts
uninitialised, they may contain data that
looks like a valid object -- this is called a "spoof object".
(This might be the 'ghost' of a previous object, or just random
junk that happens to look like a valid object).</em></p>
<p><em class="note">This is completely legal: spoof objects
do not cause a problem for the MPS.</em></p>
<p><em class="note">
However, this might leave you with less safety margin than you
want, especially when developing a new client.
If there were to be a bug in your code (or indeed in
the MPS) that resulted in a bogus exact reference to this spoof,
it might go undetected,
and arbitrary corruption might occur before the bug came to light.
So, consider filling these as-yet unused parts with specially
chosen dummy values, at least as an option for debugging.
Choose dummy values that your format code will recognise as
not permitted at the start of a valid formatted object.
You will then detect bogus exact references more promptly.</em></p>
<p><em class="note">[RHSK 2006-06-15:
In poolamc, these ghosts will be forwarding pointers,
and they will usually get unmapped (though
unless we use zeroed / secure / etc VM they may get mapped-in
again intact).
But if the tract is nailed they won't even get unmapped.
And ghost forwarding pointers are just as bad news as any other
spoof.
There's currently no format method "destroy". If there were,
we could call it in the reclaim phase, to allow format code to
safely mark these ghosts as dead. Actually, perhaps that's a
valid use of the 'pad' method?
]</em></p>
</blockquote>
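<p>As an illustration of filling unused parts, here is a minimal sketch
for a string-like object whose buffer is only partly used. The
<code>string_obj</code> layout, the format code values, the fill byte and
the <code>FILL_UNUSED</code> switch are all assumptions made for this
example. The fill byte is chosen so that a word made of it does not match
any of the assumed format codes.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Values assumed for illustration; use your own format's codes. */
enum { FORMAT_CLIENT = 1, FORMAT_FWD = 2, FORMAT_PAD = 3 };

/* Hypothetical object with a buffer that is only partly in use. */
typedef struct string_obj_s {
    int formatcode;
    size_t size;      /* total size of the object */
    size_t length;    /* bytes of buf currently in use */
    char buf[64];
} string_obj;

#define FILL_BYTE 0xEE

#ifndef FILL_UNUSED
#define FILL_UNUSED 1   /* build with -DFILL_UNUSED=0 to switch the fill off */
#endif

/* Build phase for a string object: write the header and the text, and
 * (optionally) overwrite the unused tail of the buffer with FILL_BYTE. */
void init_string(string_obj *s, const char *text)
{
    size_t n = strlen(text);
    assert(n <= sizeof s->buf);
    s->formatcode = FORMAT_CLIENT;
    s->size = sizeof *s;
    s->length = n;
    memcpy(s->buf, text, n);
#if FILL_UNUSED
    memset(s->buf + n, FILL_BYTE, sizeof s->buf - n);
#endif
}

/* The test a format method might apply to the start of an object:
 * returns 1 if the first word is one of the known format codes. */
int plausible_object_start(const void *p)
{
    int code;
    memcpy(&code, p, sizeof code);   /* p need not be aligned for int */
    return code == FORMAT_CLIENT || code == FORMAT_FWD || code == FORMAT_PAD;
}
```

<p>With the fill enabled, a bogus reference into the unused tail of
<code>buf</code> fails <code>plausible_object_start</code> at once,
instead of being mistaken for a ghost of some earlier object.</p>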
<h2>mps_commit</h2>
<p>When you call <code>mps_commit</code>, it will either fail or succeed.</p>
<p>Almost always, <code>mps_commit</code> succeeds.
If it succeeds, that means: </p>
<ul>
<li> all the references written IN the new object are valid (in
other words, a successful commit is the MPS's way of telling
you that these references did not become stale while they were
sitting unmanaged in the new object);</li>
<li> all the references TO the new object are valid;</li>
<li> the new object is now just a normal object like any other;</li>
<li> it may get collected if there are no references
to it;</li>
<li> if the pool supports <code>mps_free</code>, you may manually free the
new object.</li>
</ul>
<p>Rarely, <code>mps_commit</code> fails.
This means:</p>
<ul>
<li> the new object no longer exists &mdash;
the memory may even be unmapped by the time
<code>mps_commit</code> returns;</li>
<li> any remaining reference to the new object is now dangling,
which is why the client must not have made any exact references
to it during the build.</li>
</ul>
<p>If commit fails, the client usually tries making the object again
(although this is not required: it is allowed to just give up!).
This is why the standard allocation point idiom has a
do...while loop.</p>
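<p>Both outcomes can be exercised in a toy model. The sketch below is not
the MPS implementation and does not use <code>mps.h</code>:
<code>toy_ap</code>, <code>toy_reserve</code>, <code>toy_flip</code>,
<code>toy_commit</code> and <code>toy_make</code> are hypothetical names,
and a single flag stands in for whatever the collector does to invalidate
a build. It illustrates only the control flow of the idiom.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names are part of the MPS interface. */
typedef struct toy_obj_s { int value; } toy_obj;

#define TOY_HEAP_SLOTS 16

typedef struct toy_ap_s {
    toy_obj heap[TOY_HEAP_SLOTS];
    size_t init;     /* slots below this index hold committed objects */
    int reserved;    /* non-zero between reserve and commit */
    int flipped;     /* set when a (toy) collection interrupts a build */
} toy_ap;

void toy_ap_init(toy_ap *ap)
{
    ap->init = 0;
    ap->reserved = 0;
    ap->flipped = 0;
}

/* Returns 1 and sets *p_out on success, 0 if the toy heap is full. */
int toy_reserve(toy_obj **p_out, toy_ap *ap)
{
    assert(!ap->reserved);
    if (ap->init == TOY_HEAP_SLOTS) {
        return 0;
    }
    ap->reserved = 1;
    *p_out = &ap->heap[ap->init];
    return 1;
}

/* Stands in for collector activity that lands in the middle of a build. */
void toy_flip(toy_ap *ap)
{
    if (ap->reserved) {
        ap->flipped = 1;
    }
}

/* Returns 1 if the object is now committed, 0 if the build lost the race. */
int toy_commit(toy_ap *ap, toy_obj *p)
{
    assert(ap->reserved && p == &ap->heap[ap->init]);
    ap->reserved = 0;
    if (ap->flipped) {
        ap->flipped = 0;
        p->value = -1;      /* the failed object's data is gone */
        return 0;
    }
    ap->init++;
    return 1;
}

/* The standard idiom. If flip_first_build is non-zero, a flip is simulated
 * during the first attempt. Returns the number of attempts, or 0 if the
 * toy heap is full. */
int toy_make(toy_ap *ap, int value, int flip_first_build, toy_obj **result)
{
    toy_obj *neo;
    int attempts = 0;
    do {
        if (!toy_reserve(&neo, ap)) {
            return 0;
        }
        neo->value = value;             /* build the object */
        attempts++;
        if (flip_first_build && attempts == 1) {
            toy_flip(ap);
        }
    } while (!toy_commit(ap, neo));
    *result = neo;
    return attempts;
}
```

<p>Without a flip, <code>toy_make</code> succeeds on the first attempt.
With a flip during the first build, the first commit fails, the object is
rebuilt from scratch, and the second commit succeeds. In the toy the
rebuilt object happens to land at the same address; a real client must
not rely on that.</p>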
<h2>Common mistakes</h2>
<p>Here are some examples of mistakes to avoid:</p>
<blockquote><pre><strong>The example below is INCORRECT.</strong>
<code>typedef struct object_s {
    int formatcode;  /* FORMAT_CLIENT, _FWD, or _PAD */
    int type;
    size_t size;
    struct object_s *tribe;
    struct object_s *parent;
    struct object_s *child;
} object;
int make_object(mps_ap_t ap, object *parent)
{
    void *p;
    object *neo = NULL;
    do {
        if (mps_reserve(&amp;p, ap, SIZE_OBJECT) != MPS_RES_OK) {
            goto fail_make_object;
        }
        /* Build the new object */
        neo = p;
        neo-&gt;formatcode = FORMAT_CLIENT;
        neo-&gt;type = TYPE_OBJECT;
        neo-&gt;size = SIZE_OBJECT;
        neo-&gt;parent = parent;
        neo-&gt;tribe = neo-&gt;parent-&gt;tribe;  /*--- incorrect-1 ---*/
        parent-&gt;child = neo;             /*--- incorrect-2 ---*/
        /* neo (ambiguous reference) preserves the new object */
    } while (! mps_commit(ap, p, SIZE_OBJECT));
    neo-&gt;child = NULL;                   /*--- incorrect-3 ---*/
    return TRUE;
fail_make_object:
    return FALSE;  /* out of memory, etc */
}
<strong>The example above is INCORRECT.</strong>
</code></pre></blockquote>
<p>Incorrect-1: do not read references from the new object.
Dereferencing <code>neo-&gt;parent</code> is illegal.
(The code should use <code>parent-&gt;tribe</code>).</p>
<p>Incorrect-2: making an exact reference to the new object is illegal.
(The code should only do this after a successful commit).</p>
<p>Incorrect-3: the <code>child</code> slot (in this example) is
exactly scanned, and it MUST be initialised before the call to
commit.
(The code shown is initialising it too late).</p>
<h2>Conclusion and further details</h2>
<p>Although this User's Guide explains the protocol in terms of the
pre-packaged macros
<code>mps_reserve</code> and <code>mps_commit</code>,
that is a simplification.
The MPS allocation point protocol is designed as a
binary protocol, defined at the level of atomic machine operations.
The precise specification of the binary protocol is beyond the
scope of this document.</p>
<p>For further discussion of Allocation Points, see
<a href="apinternals.html">Allocation Points -- Internals</a>.
</p>
<h2><a id="section-B" name="section-B">B. Document History</a></h2>
<pre>
(As part of GC article)
2006-06-09 RHSK Allocation points: how it's supposed to be, from RB
2006-06-09.
2006-06-09 RHSK Allocation points: must make at least one ambiguous ref
and no exact refs to new object before calling commit.
2006-06-12 RHSK Allocation points: minor edits for clarity.
2006-06-12 RHSK Allocation point internals: Scenario; A mutator with
ambiguous references only (to the new object); How
bad is a bogus exact reference?; quick notes on: with
exact refs; flipping at other times.
2006-06-13 RHSK Allocation point user's guide: Tidy; clarify-- not
yet complete.
Distinguish it from section on AP internals.
2006-06-14 RHSK Allocation point user's guide: standard reserve-commit
idiom: corrections, and now assume ambiguous reg+stack:
transitioning to this assumption.
Add .talk.RB.2006-06-13 about standard r-c idiom
Add .talk.RB.2006-06-14 about Modes of use of MPS,
and Promises to not flip (single-threaded).
Add .think.RHSK.2006-06-14 about how we could support
unscanned reg+stack, even with multi-threaded, moving
collector.
2006-06-15 RHSK Allocation Points: clarifications and further notes
into User Guide and Internals, including initializing
as-yet unused parts of new objects to reduce spoofing.
2006-06-22 RHSK Move rough notes .talk.RB.2006-06-13 etc elsewhere.
Make AP User Guide consistently .assume.ambig-workspace,
moving notes to APInternals; add Intro and Mistakes;
check example code.
(This article)
2006-06-22 RHSK Created from GC article.
2006-06-23 RHSK Move AP Internals discussion to its own article.
</pre>
<h2><a id="section-C" name="section-C">C. Copyright and License</a></h2>
<p> This document is copyright &copy; 2006 <a href="http://www.ravenbrook.com/">Ravenbrook Limited</a>. All rights reserved. This is an open source license. Contact Ravenbrook for commercial licensing options. </p>
<p> Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: </p>
<ol>
<li> Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. </li>
<li> Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. </li>
<li> Redistributions in any form must be accompanied by information on how to obtain complete source code for this software and any accompanying software that uses this software. The source code must either be included in the distribution or be available for no more than the cost of distribution plus a nominal fee, and must be freely redistributable under reasonable conditions. For an executable file, complete source code means the source code for all modules it contains. It does not include source code for modules or files that typically accompany the major components of the operating system on which the executable file runs. </li>
</ol>
<p> <strong> This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement, are disclaimed. In no event shall the copyright holders and contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage. </strong> </p>
<hr />
<div class="banner">
<p><code>$Id$</code></p>
</div>
</body>
</html>