1
Fork 0
mirror of git://git.sv.gnu.org/emacs.git synced 2026-01-01 01:41:01 -08:00

Mps wiki: split ap internals into its own article

Copied from Perforce
 Change: 159392
 ServerID: perforce.ravenbrook.com
This commit is contained in:
Richard Kistruck 2006-06-23 16:20:20 +01:00
parent 9232833382
commit 49dcec4d8f
3 changed files with 489 additions and 371 deletions

View file

@ -42,16 +42,6 @@
<p> This wiki article contains incomplete and informal notes about the MPS, the precursor to more formal documentation. Not confidential. Readership: MPS users and developers. </p>
<h2>Contents:</h2>
<ul>
<li><a href="#Allocation-Point-User">Allocation Points -- User's Guide</a></li>
<li><a href="#Allocation-Point-Internals">Allocation Point Internals</a></li>
</ul>
<h2><a id="Allocation-Point-User">Allocation Points -- User's Guide</a></h2>
<p>These notes on Allocation Points were written by RHSK between
2006-06-07 and 2006-06-22, following research and discussion with
RB and NB.
@ -69,7 +59,7 @@
references to a failed new-object, before it calls mps_ap_trip,
is suspected to be new. RHSK 2006-06-13]</p>
<h3>Introduction</h3>
<h2>Introduction</h2>
<p>Allocation points are an MPS protocol that the client uses to
allocate memory with low overhead, and low synchronization
@ -94,19 +84,12 @@
be a root that is ambiguously scanned, using mps_root_create_reg
and passing the mps_stack_scan_ambig function to it.
This is the simplest way to write a client.
Other scenarios are possible, but are not yet documented here.</p>
Other scenarios are possible, but
their implications for correct Allocation Point use
are not yet documented here.</p>
<p><em class="note">
Note: This document explains the protocol in terms of the
pre-packaged macros
<code>mps_reserve</code> and <code>mps_commit</code>,
but this is a simplification.
The MPS allocation point protocol is actually designed as
binary protocol, defined at the level of atomic machine operations.
The precise specification of the binary protocol is beyond the
scope of this document.</em></p>
<p>The rest of this Allocation Points User's Guide contains the following sub-sections:</p>
<p>The rest of this Allocation Points User's Guide contains the following
sections:</p>
<ul>
<li>Creating and destroying allocation points</li>
<li>Overview of two-step allocation</li>
@ -118,7 +101,7 @@
</ul>
<h3>Creating and destroying allocation points</h3>
<h2>Creating and destroying allocation points</h2>
<p>To create an allocation point in a pool,
call <code>mps_ap_create</code>.
@ -131,7 +114,7 @@
<p>Destroy an allocation point with <code>mps_ap_destroy</code>.</p>
<h3>Overview of two-step allocation</h3>
<h2>Overview of two-step allocation</h2>
<p>When the client is building (creating and formatting) a new
object, you can think of it as being 'in a race' with the MPS.
@ -231,7 +214,7 @@ fail_make_object:
detail.</p>
<h3>The graph of managed references</h3>
<h2>The graph of managed references</h2>
<p>The MPS is a moving garbage collector: it supports
preserve-by-copying pools, whose objects are 'mobile'.
@ -267,7 +250,7 @@ fail_make_object:
reference.</li>
</ul>
<h3>mps_reserve</h3>
<h2>mps_reserve</h2>
<p>Call <code>mps_reserve</code>, passing the size of the
new object you wish to create.
@ -311,7 +294,7 @@ fail_make_object:
questions about the new object.</li>
</ul>
<h3>Building the object</h3>
<h2>Building the object</h2>
<p>The client will typically do all these things:</p>
<ul>
@ -409,7 +392,7 @@ fail_make_object:
</blockquote>
<h3>mps_commit</h3>
<h2>mps_commit</h2>
<p>When you call <code>mps_commit</code>, it will either fail or succeed.</p>
@ -443,7 +426,7 @@ fail_make_object:
do...while loop.</p>
<h3>Common mistakes</h3>
<h2>Common mistakes</h2>
<p>Here are some examples of mistakes to avoid:</p>
@ -501,351 +484,22 @@ fail_make_object:
commit.
(The code shown is initialising it too late).</p>
<h2><a id="Allocation-Point-Internals">Allocation Point Internals</a></h2>
<p><strong>This section should be moved out of this article. RHSK 2006-06-22</strong></p>
<h2>Conclusion and further details</h2>
<p>The synchronization issues are a little tricky. This following
notes on these issues were written by RHSK between
2006-06-12 and 2006-06-12.</p>
<p>Although this User's Guide explains the protocol in terms of the
pre-packaged macros
<code>mps_reserve</code> and <code>mps_commit</code>,
that is a simplification.
The MPS allocation point protocol is designed as a
binary protocol, defined at the level of atomic machine operations.
The precise specification of the binary protocol is beyond the
scope of this document.</p>
<h3>scenario</h3>
<p>The MPS allocation point protocol is a binary protocol. You don't
have to use the mps_reserve and mps_commit macros, but they are
conventional and useful pre-packaged implementations.
I'm using them as an example of correct protocol, referring to
<a href="http://www.ravenbrook.com/project/mps/master/code/mps.h">mps.h#17</a>.
It looks like this:</p>
<pre><code>#define mps_reserve(_p_o, _mps_ap, _size) \
((char *)(_mps_ap)-&gt;alloc + (_size) &gt; (char *)(_mps_ap)-&gt;alloc &amp;&amp; \
(char *)(_mps_ap)-&gt;alloc + (_size) &lt;= (char *)(_mps_ap)-&gt;limit ? \
((_mps_ap)-&gt;alloc = \
(mps_addr_t)((char *)(_mps_ap)-&gt;alloc + (_size)), \
*(_p_o) = (_mps_ap)-&gt;init, \
MPS_RES_OK) : \
mps_ap_fill(_p_o, _mps_ap, _size))
#define mps_commit(_mps_ap, _p, _size) \
((_mps_ap)-&gt;init = (_mps_ap)-&gt;alloc, \
(_mps_ap)-&gt;limit != 0 || mps_ap_trip(_mps_ap, _p, _size))
</code></pre>
<p>Abstractly, the mutator code has a cycle of four operations on the
allocation point:</p>
<dl>
<dt> Lr </dt>
<dd> read Limit at reserve </dd>
<dt> A </dt>
<dd> write Alloc at reserve </dd>
<dt> I </dt>
<dd> write Init at commit </dd>
<dt> Lc </dt>
<dd> read Limit at commit </dd>
</dl>
<p>The normal cycle is Lr A I Lc.</p>
<p>After Lr, the mutator checks that A + size &lt;= Lr. If this fails,
then the mutator does mps_ap_fill instead of A. So Lr mps_ap_fill I Lc
is also a valid cycle.</p>
<p>After Lc, the mutator checks that Lc != 0. If this fails, the
mutator will call mps_ap_trip before beginning a new cycle. So
Lr A I Lc mps_ap_trip is also a valid cycle.</p>
<p>The mutator cannot interrupt the
collector.</p>
<p>The collector can interrupt the mutator (between any
two mutator instructions). The collector can only see the A, I,
and mps_ap_trip (if present) operations; Lr and Lc are invisible
to it.</p>
<p>If the ap is not going to trip, the collector sees two
equivalence classes:</p>
<ul>
<li> (A)</li>
<li> (I Lc Lr)</li>
</ul>
<p>If the ap is going to trip, the collector sees four
equivalence classes:</p>
<ul>
<li> (A)</li>
<li> (I Lc)</li>
<li> (the call to mps_ap_trip)</li>
<li> (mps_ap_trip Lr)</li>
</ul>
<p>The collector has two responsibilities:</p>
<dl>
<dt> FIX </dt>
<dd> to fix (or not fix) references TO the object;</dd>
<dt> SCAN </dt>
<dd> to scan (or not scan) references IN the object.</dd>
</dl>
<h3>A mutator with ambiguous references only (to the new object)</h3>
<p>Let's see what's required for a mutator that promises to make no
exact references to the new object until Lc succeeds,
and promises to have at least one ambiguous reference to the
new object when it sets I.</p>
<p>Consider a flip -- f1 -- in state A.</p>
<p>The collector at f1 sees (A), and recognises that the I..A region
is unmanaged memory.
The object is never going to be valid,
because it may contain stale references from before the flip, and the
collector can't fix them.</p>
<p>The f1 collector writes L = 0, tripping the allocation
point, and the rest of this collection is easy.
The FIX responsibility is easy:
all references to the new object are ambiguous, so the
collector can fix them or not -- it
doesn't matter.
For SCAN, the collector must not scan the new object
(because it is not yet formatted), so
it must not scan the buffered pool beyond I.</p>
<p>Further collections at this point are idempotent.</p>
<p>Now, suppose the mutator then restarts, but only gets as far
as (I Lc) before we have a second collection and a
second flip -- f2.</p>
<p>For FIX, f2 can fix or not -- it still doesn't matter.</p>
<p>But for SCAN, f2 <strong>MUST NOT</strong> scan the new object.
Why not?</p>
<p>The mutator's format code will be okay: the fact
that I is set tells f2 that the object
is fully formatted.</p>
<p>But there is a problem: the object may contain stale exact
references -- unfixed references from before f1.
These stale exact references will be bogus.
Opinions differ on the implications of bogus exact references.
See the next section for a discussion.</p>
<p>Additionally, the collector is not <em>obliged</em> to scan the
object: this object is dead!
The mutator will later notice that Lr == 0 and realise the object
is invalid. The mutator is not relying
on this object to keep anything else alive.</p>
<p>It is easy to avoid scanning the invalid object:
the first flip in state (A), f1, must record the value of I.
Call the recorded pointer If1.
Collections must not scan beyond If1.
If1 remains set until the mutator calls mps_ap_trip,
which resets If1 to 0.
[Note: I have not checked whether the MPS
records and uses If1 -- RHSK 2006-06-12]</p>
<h3>How bad is a bogus exact reference?</h3>
<p>Tucker said:</p>
<blockquote><p>I believe the collector must be prepared to deal
with exact dangling pointers anyways, so this complex
mechanism is unnecessary.
[<a href="/project/mps/mail/1997/05/12/12-46/1.txt">//info.ravenbrook.com/project/mps/mail/1997/05/12/12-46/1.txt:X-MMInfo-Tag: mail.ptw.1997-05-12.12-46(1)</a>]
</p></blockquote>
<p>I'm not convinced.</p>
<p>Fixing a bogus exact reference is bad news.
The worst case is spoofing:
Imagine the bogus exact reference points at part of a
JPEG that <em>just happens</em> to look like a validly formatted
object.
If this JPEG is in the middle of an object in a moving MPS
pool, fixing the bogus exact reference will overwrite a portion
of the JPEG with a forwarding object.
It gets worse: the spoof object is preserved and will later be
scanned, producing more bogus exact references!
For "JPEG", you can substitute "encrypted string".</p>
<p>The only safe way to cope with such a bogus reference is for
Fix to verify that the reference points to the start of a real
object, which it could do by iterating the format skip method
from some known point in the pool.</p>
<p>The consequences of fixing a bogus exact
reference may be:</p>
<ul>
<li> Referent is not in MPS memory:
<ul>
<li> MPS must detect and ignore these; there should be a
debug option to report it;</li>
</ul>
</li>
<li> Referent is not validly formatted:
<ul>
<li> mutator format code crashes; or</li>
<li> mutator format code does something stupid or
illegal, and causes corruption or makes MPS crash; or</li>
<li> (if we're lucky) mutator format code detects this
and avoids any harmful action.</li>
</ul>
</li>
<li> Referent appears validly formatted, but is a spoof
(that is, it is actually data inside some other
object):
<ul>
<li> in a moving pool: the object is silently corrupted
with a forwarding pointer, and the spoof object is
preserved and will later be scanned, producing more
bogus exact references!</li>
<li> in a non-moving pool: object is safe from corruption,
but MPS greying/blackening code needs to be robust to
this case.</li>
</ul>
</li>
<li> Referent is a validly formatted but dead object:
<ul>
<li>it gets needlessly preserved;</li>
<li>if it had been dead for some time, this dead object may
itself contain bogus exact references, and these will get
'fixed' too.</li>
</ul>
</li>
</ul>
<p>Also, more arcanely, I wonder whether there's a problem if the
referent violates the tri-colour invariant (when scanning at
black)? Does the MPS care about this, or does it 'safely' just
cause more conservatism? Does MPS detect and flag this in
debug versions? Could it?</p>
<p>If MPS or format code detects a bogus exact reference, it could
assert, and/or to fix them to a canonical special value.</p>
<p>Some possible contracts between mutator and collector:</p>
<ol>
<li>Mutator and collector promise never to have bogus exact
references. The collector is permitted to crash if it
encounters one.</li>
<li>Mutator agrees that bogus exact references are wrong.
But mutator would like a helpful error message, please.
(Collector is not permitted to crash).</li>
<li>Mutator wants permission to keep bogus exact references
in certain circumstances, such as on the stack. Collector
must cope safely. Need to specify how this is guaranteed,
based on intimate knowledge of the mutator's behaviour.
Dodgy.</li>
<li>Mutator wants permission to keep bogus exact references
in general. Collector must safely cope with all the cases
of bogus exact references listed above. I believe the MPS
does not currently do this. It would require a very fast
check at fix time that the reference is to the start of a
valid object in an MPS client pool.</li>
</ol>
<h3>with exact refs</h3>
<p>For an all-exact mode of use, mutator must make an exact pointer
before setting I=A, and must keep it at least until it after
it has read Lc to see whther the commit succeeded.</p>
<p>So, some additions to the AP User's Guide, in the section on mps_reserve:</p>
<blockquote>
<p>Secondly, "raw data" means that any references TO the new
object are treated like other references to unmanaged memory.
[.belief.refs-to-uninit-safe:
We're 'sure', but I need to check this. What does Fix
actually do with a pointer into the init-alloc zone? We
hope it ignores it. RHSK 2006-06-09]
Because of this, you are permitted to connect the new object
into your graph of managed objects immediately. The MPS gives
you these guarantees:</p>
<ul>
<li> the new object is pinned (won't move), so references to it
that you write in old objects won't become stale at this time
(but see <code>mps_commit</code> below!)</li>
<li> although the new object is 'reachable' by these references,
at this stage (while I &lt; A) the MPS knows it
is not yet managed, and will not try to preserve it;</li>
<li> the MPS will not call the client's format code to answer
questions about the new object.</li>
</ul>
</blockquote>
<p>And to the section on mps_commit:</p>
<blockquote>
<p>When this happens the client should take care to clear up any
managed references to the (now vanished) new object.</p>
<p>[But there's a hole here, before the client does this.
Are managed (aka scanned) references TO
it still safe? They were safe during building
(by .belief.refs-to-uninit-safe). But now the AP pointers have
gone away. Are they still safe?
Clearly, if they are only RankAMBIG, they are safe.
What if they are RankEXACT?
RHSK 2006-06-09]</p>
<p>[Discussion with RB 2006-06-09: yes, that's a problem for exact
references. Must not make any exact refs to a new object. And
unmanaged refs are not sufficient, because they won't preserve the
new object during commit. So must make at least one ambiguous
ref to new object before calling commit. That's the truth
currently. There are various ways to solve this to allow
purely-exact mutators. For instance,
keep the old init..alloc address-space flagged as a zombie zone,
until some communication with mutator (perhaps another reserve
from same AP?) indicates that mutator has removed all those
pesky exact refs to the now-dead ex-new object. RHSK 2006-06-09]</p>
</blockquote>
<p>See <a href="unmanaged.html">issues with unmanaged workspace</a> for
a discussion of the "abort" phase, which we need to define
for allocation points.</p>
<p>The MPS needs to keep the buffer mapped for some time --
until the mutator has stopped writing into it.
The MPS needs to remember If1 forever.
The MPS needs to remember the rest of the
about-to-be-detached buffer structure for some time --
when is the MPS permitted to detach the buffer?</p>
<p>Keeping the buffer mapped consumes resource.
There must be a contract with the mutator about when the
MPS is permitted to free it.
<p>For further discussion of Allocation Points, see
<a href="apinternals.html">Allocation Points -- Internals</a>.
</p>
<p>We could define the protocol such that there is a window:
after the mutator discovers that it has lost the race, but
while the collector is still remembering that this object
is invalid. The mutator could use this window to null-out
exact references to the new object.</p>
<p>This window could be after Lc and before calling mps_ap_trip.
During this time,
the collector must keep the new object as pinned unmanaged
memory. By calling mps_ap_trip, the mutator promises that
it has nulled-out any exact references to the new object,
and the collector in mps_ap_trip may unmap the memory or
safely re-use it without fear of spoofing.</p>
<p>Or, we could leave the window open until mps_ap_fill is
called. We just need to define it.</p>
<h3>flipping at other times</h3>
<p>Perhaps the collector wants to flip even when the ap is not
at (A). Need to analyse synchronization issues here.</p>
<h2><a id="section-B" name="section-B">B. Document History</a></h2>
<pre>
@ -881,6 +535,7 @@ it has read Lc to see whther the commit succeeded.</p>
(This article)
2006-06-22 RHSK Created from GC article.
2006-06-23 RHSK Move AP Internals discussion to its own article.
</pre>

View file

@ -0,0 +1,456 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>MPS Wiki: Allocation Points -- Internals</title>
<style type = "text/css">
<!--
div.banner {text-align:center}
dt {font-weight:bold}
-->
</style>
</head>
<body>
<div class="banner">
<p>
<a href="/">Ravenbrook</a>
/ <a href="/project/">Projects</a>
/ <a href="/project/mps/">Memory Pool System</a>
/ <a href="/project/mps/master/">Master Product Sources</a>
/ <a href="/project/mps/master/manual/">Manuals</a>
/ <a href="/project/mps/master/manual/wiki/">Wiki</a>
</p>
<p><i><a href="/project/mps/">Memory Pool System Project</a></i></p>
<hr />
<h1>MPS Wiki: Allocation Points -- Internals</h1>
<address>
<a href="mailto:rhsk@ravenbrook.com">Richard Kistruck</a>,
<a href="http://www.ravenbrook.com/">Ravenbrook Limited</a>,
2006-06-09
</address>
</div>
<p> This wiki article contains incomplete and informal notes about the MPS, the precursor to more formal documentation. Not confidential. Readership: MPS developers. </p>
<p>These notes on Allocation Points were written by RHSK between
2006-06-12 and 2006-06-23, following research and discussion with
RB and NB.</p>
<p><strong>Warning:</strong> this article is more of a technical
discussion than a set of facts. It is not authoritative.
This is not (yet) an accurate statement of the MPS ap protocol.
RHSK 2006-06-23.</p>
<h2><a id="synchronization">Synchronization</a></h2>
<p>The synchronization issues are a little tricky.</p>
<h3>scenario</h3>
<p>The MPS allocation point protocol is a binary protocol. You don't
have to use the mps_reserve and mps_commit macros, but they are
conventional and useful pre-packaged implementations.
I'm using them as an example of correct protocol, referring to
<a href="http://www.ravenbrook.com/project/mps/master/code/mps.h">mps.h#17</a>.
It looks like this:</p>
<pre><code>#define mps_reserve(_p_o, _mps_ap, _size) \
((char *)(_mps_ap)-&gt;alloc + (_size) &gt; (char *)(_mps_ap)-&gt;alloc &amp;&amp; \
(char *)(_mps_ap)-&gt;alloc + (_size) &lt;= (char *)(_mps_ap)-&gt;limit ? \
((_mps_ap)-&gt;alloc = \
(mps_addr_t)((char *)(_mps_ap)-&gt;alloc + (_size)), \
*(_p_o) = (_mps_ap)-&gt;init, \
MPS_RES_OK) : \
mps_ap_fill(_p_o, _mps_ap, _size))
#define mps_commit(_mps_ap, _p, _size) \
((_mps_ap)-&gt;init = (_mps_ap)-&gt;alloc, \
(_mps_ap)-&gt;limit != 0 || mps_ap_trip(_mps_ap, _p, _size))
</code></pre>
<p>Abstractly, the mutator code has a cycle of four operations on the
allocation point:</p>
<dl>
<dt> Lr </dt>
<dd> read Limit at reserve </dd>
<dt> A </dt>
<dd> write Alloc at reserve </dd>
<dt> I </dt>
<dd> write Init at commit </dd>
<dt> Lc </dt>
<dd> read Limit at commit </dd>
</dl>
<p>The normal cycle is Lr A I Lc.</p>
<p>After Lr, the mutator checks that A + size &lt;= Lr. If this fails,
then the mutator does mps_ap_fill instead of A. So Lr mps_ap_fill I Lc
is also a valid cycle.</p>
<p>After Lc, the mutator checks that Lc != 0. If this fails, the
mutator will call mps_ap_trip before beginning a new cycle. So
Lr A I Lc mps_ap_trip is also a valid cycle.</p>
<p>The mutator cannot interrupt the
collector.</p>
<p>The collector can interrupt the mutator (between any
two mutator instructions). The collector can only see the A, I,
and mps_ap_trip (if present) operations; Lr and Lc are invisible
to it.</p>
<p>If the ap is not going to trip, the collector sees two
equivalence classes:</p>
<ul>
<li> (A)</li>
<li> (I Lc Lr)</li>
</ul>
<p>If the ap is going to trip, the collector sees four
equivalence classes:</p>
<ul>
<li> (A)</li>
<li> (I Lc)</li>
<li> (the call to mps_ap_trip)</li>
<li> (mps_ap_trip Lr)</li>
</ul>
<p>The collector has two responsibilities:</p>
<dl>
<dt> FIX </dt>
<dd> to fix (or not fix) references TO the object;</dd>
<dt> SCAN </dt>
<dd> to scan (or not scan) references IN the object.</dd>
</dl>
<h3>A mutator with ambiguous references only (to the new object)</h3>
<p>Let's see what's required for a mutator that promises to make no
exact references to the new object until Lc succeeds,
and promises to have at least one ambiguous reference to the
new object when it sets I.</p>
<p>Consider a flip -- f1 -- in state A.</p>
<p>The collector at f1 sees (A), and recognises that the I..A region
is unmanaged memory.
The object is never going to be valid,
because it may contain stale references from before the flip, and the
collector can't fix them.</p>
<p>The f1 collector writes L = 0, tripping the allocation
point, and the rest of this collection is easy.
The FIX responsibility is easy:
all references to the new object are ambiguous, so the
collector can fix them or not -- it
doesn't matter.
For SCAN, the collector must not scan the new object
(because it is not yet formatted), so
it must not scan the buffered pool beyond I.</p>
<p>Further collections at this point are idempotent.</p>
<p>Now, suppose the mutator then restarts, but only gets as far
as (I Lc) before we have a second collection and a
second flip -- f2.</p>
<p>For FIX, f2 can fix or not -- it still doesn't matter.</p>
<p>But for SCAN, f2 <strong>MUST NOT</strong> scan the new object.
Why not?</p>
<p>The mutator's format code will be okay: the fact
that I is set tells f2 that the object
is fully formatted.</p>
<p>But there is a problem: the object may contain stale exact
references -- unfixed references from before f1.
These stale exact references will be bogus.
Opinions differ on the implications of bogus exact references.
See the next section for a discussion.</p>
<p>Additionally, the collector is not <em>obliged</em> to scan the
object: this object is dead!
The mutator will later notice that Lr == 0 and realise the object
is invalid. The mutator is not relying
on this object to keep anything else alive.</p>
<p>It is easy to avoid scanning the invalid object:
the first flip in state (A), f1, must record the value of I.
Call the recorded pointer If1.
Collections must not scan beyond If1.
If1 remains set until the mutator calls mps_ap_trip,
which resets If1 to 0.
[Note: I have not checked whether the MPS
records and uses If1 -- RHSK 2006-06-12]</p>
<h3>How bad is a bogus exact reference?</h3>
<p>Tucker said:</p>
<blockquote><p>I believe the collector must be prepared to deal
with exact dangling pointers anyways, so this complex
mechanism is unnecessary.
[<a href="/project/mps/mail/1997/05/12/12-46/1.txt">//info.ravenbrook.com/project/mps/mail/1997/05/12/12-46/1.txt:X-MMInfo-Tag: mail.ptw.1997-05-12.12-46(1)</a>]
</p></blockquote>
<p>I'm not convinced.</p>
<p>Fixing a bogus exact reference is bad news.
The worst case is spoofing:
Imagine the bogus exact reference points at part of a
JPEG that <em>just happens</em> to look like a validly formatted
object.
If this JPEG is in the middle of an object in a moving MPS
pool, fixing the bogus exact reference will overwrite a portion
of the JPEG with a forwarding object.
It gets worse: the spoof object is preserved and will later be
scanned, producing more bogus exact references!
For "JPEG", you can substitute "encrypted string".</p>
<p>The only safe way to cope with such a bogus reference is for
Fix to verify that the reference points to the start of a real
object, which it could do by iterating the format skip method
from some known point in the pool.</p>
<p>The consequences of fixing a bogus exact
reference may be:</p>
<ul>
<li> Referent is not in MPS memory:
<ul>
<li> MPS must detect and ignore these; there should be a
debug option to report it;</li>
</ul>
</li>
<li> Referent is not validly formatted:
<ul>
<li> mutator format code crashes; or</li>
<li> mutator format code does something stupid or
illegal, and causes corruption or makes MPS crash; or</li>
<li> (if we're lucky) mutator format code detects this
and avoids any harmful action.</li>
</ul>
</li>
<li> Referent appears validly formatted, but is a spoof
(that is, it is actually data inside some other
object):
<ul>
<li> in a moving pool: the object is silently corrupted
with a forwarding pointer, and the spoof object is
preserved and will later be scanned, producing more
bogus exact references!</li>
<li> in a non-moving pool: object is safe from corruption,
but MPS greying/blackening code needs to be robust to
this case.</li>
</ul>
</li>
<li> Referent is a validly formatted but dead object:
<ul>
<li>it gets needlessly preserved;</li>
<li>if it had been dead for some time, this dead object may
itself contain bogus exact references, and these will get
'fixed' too.</li>
</ul>
</li>
</ul>
<p>Also, more arcanely, I wonder whether there's a problem if the
referent violates the tri-colour invariant (when scanning at
black)? Does the MPS care about this, or does it 'safely' just
cause more conservatism? Does MPS detect and flag this in
debug versions? Could it?</p>
<p>If MPS or format code detects a bogus exact reference, it could
assert, and/or to fix them to a canonical special value.</p>
<p>Some possible contracts between mutator and collector:</p>
<ol>
<li>Mutator and collector promise never to have bogus exact
references. The collector is permitted to crash if it
encounters one.</li>
<li>Mutator agrees that bogus exact references are wrong.
But mutator would like a helpful error message, please.
(Collector is not permitted to crash).</li>
<li>Mutator wants permission to keep bogus exact references
in certain circumstances, such as on the stack. Collector
must cope safely. Need to specify how this is guaranteed,
based on intimate knowledge of the mutator's behaviour.
Dodgy.</li>
<li>Mutator wants permission to keep bogus exact references
in general. Collector must safely cope with all the cases
of bogus exact references listed above. I believe the MPS
does not currently do this. It would require a very fast
check at fix time that the reference is to the start of a
valid object in an MPS client pool.</li>
</ol>
<h3>with exact refs</h3>
<p>For an all-exact mode of use, mutator must make an exact pointer
before setting I=A, and must keep it at least until it after
it has read Lc to see whther the commit succeeded.</p>
<p>So, some additions to the AP User's Guide, in the section on mps_reserve:</p>
<blockquote>
<p>Secondly, "raw data" means that any references TO the new
object are treated like other references to unmanaged memory.
[.belief.refs-to-uninit-safe:
We're 'sure', but I need to check this. What does Fix
actually do with a pointer into the init-alloc zone? We
hope it ignores it. RHSK 2006-06-09]
Because of this, you are permitted to connect the new object
into your graph of managed objects immediately. The MPS gives
you these guarantees:</p>
<ul>
<li> the new object is pinned (won't move), so references to it
that you write in old objects won't become stale at this time
(but see <code>mps_commit</code> below!)</li>
<li> although the new object is 'reachable' by these references,
at this stage (while I &lt; A) the MPS knows it
is not yet managed, and will not try to preserve it;</li>
<li> the MPS will not call the client's format code to answer
questions about the new object.</li>
</ul>
</blockquote>
<p>And to the section on mps_commit:</p>
<blockquote>
<p>When this happens the client should take care to clear up any
managed references to the (now vanished) new object.</p>
<p>[But there's a hole here, before the client does this.
Are managed (aka scanned) references TO
it still safe? They were safe during building
(by .belief.refs-to-uninit-safe). But now the AP pointers have
gone away. Are they still safe?
Clearly, if they are only RankAMBIG, they are safe.
What if they are RankEXACT?
RHSK 2006-06-09]</p>
<p>[Discussion with RB 2006-06-09: yes, that's a problem for exact
references. Must not make any exact refs to a new object. And
unmanaged refs are not sufficient, because they won't preserve the
new object during commit. So must make at least one ambiguous
ref to new object before calling commit. That's the truth
currently. There are various ways to solve this to allow
purely-exact mutators. For instance,
keep the old init..alloc address-space flagged as a zombie zone,
until some communication with mutator (perhaps another reserve
from same AP?) indicates that mutator has removed all those
pesky exact refs to the now-dead ex-new object. RHSK 2006-06-09]</p>
</blockquote>
<p>See <a href="unmanaged.html">issues with unmanaged workspace</a> for
a discussion of the "abort" phase, which we need to define
for allocation points.</p>
<p>The MPS needs to keep the buffer mapped for some time --
until the mutator has stopped writing into it.
The MPS needs to remember If1 forever.
The MPS needs to remember the rest of the
about-to-be-detached buffer structure for some time --
when is the MPS permitted to detach the buffer?</p>
<p>Keeping the buffer mapped consumes resource.
There must be a contract with the mutator about when the
MPS is permitted to free it.
</p>
<p>We could define the protocol such that there is a window:
after the mutator discovers that it has lost the race, but
while the collector is still remembering that this object
is invalid. The mutator could use this window to null-out
exact references to the new object.</p>
<p>This window could be after Lc and before calling mps_ap_trip.
During this time,
the collector must keep the new object as pinned unmanaged
memory. By calling mps_ap_trip, the mutator promises that
it has nulled-out any exact references to the new object,
and the collector in mps_ap_trip may unmap the memory or
safely re-use it without fear of spoofing.</p>
<p>Or, we could leave the window open until mps_ap_fill is
called. We just need to define it.</p>
<h3>flipping at other times</h3>
<p>Perhaps the collector wants to flip even when the ap is not
at (A). Need to analyse synchronization issues here.</p>
<h2><a id="section-B" name="section-B">B. Document History</a></h2>
<pre>
(As part of GC article)
2006-06-12 RHSK Allocation point internals: Scenario; A mutator with
ambiguous references only (to the new object); How
bad is a bogus exact reference?; quick notes on: with
exact refs; flipping at other times.
2006-06-15 RHSK Allocation Points: clarifications and further notes
into User Guide and Internals, including initializing
as-yet unused parts of new objects to reduce spoofing.
2006-06-22 RHSK Move rough notes .talk.RB.2006-06-13 etc elsewhere.
Make AP User Guide consistently .assume.ambig-workspace,
moving notes to APInternals; add Intro and Mistakes;
check example code.
(As part of AP User Guide article)
2006-06-22 RHSK Created from GC article.
(This article)
2006-06-23 RHSK Created from AP User Guide article.
</pre>
<h2><a id="section-C" name="section-C">C. Copyright and License</a></h2>
<p> This document is copyright &copy; 2006 <a href="http://www.ravenbrook.com/">Ravenbrook Limited</a>. All rights reserved. This is an open source license. Contact Ravenbrook for commercial licensing options. </p>
<p> Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: </p>
<ol>
<li> Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. </li>
<li> Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. </li>
<li> Redistributions in any form must be accompanied by information on how to obtain complete source code for the this software and any accompanying software that uses this software. The source code must either be included in the distribution or be available for no more than the cost of distribution plus a nominal fee, and must be freely redistributable under reasonable conditions. For an executable file, complete source code means the source code for all modules it contains. It does not include source code for modules or files that typically accompany the major components of the operating system on which the executable file runs. </li>
</ol>
<p> <strong> This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement, are disclaimed. In no event shall the copyright holders and contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage. </strong> </p>
<hr />
<div class="banner">
<p><code>$Id$</code></p>
<p>
<a href="/">Ravenbrook</a>
/ <a href="/project/">Projects</a>
/ <a href="/project/mps/">Memory Pool System</a>
/ <a href="/project/mps/master/">Master Product Sources</a>
/ <a href="/project/mps/master/manual/">Manuals</a>
/ <a href="/project/mps/master/manual/wiki/">Wiki</a>
</p>
</div>
</body>
</html>

View file

@ -67,11 +67,17 @@
</ul>
<p>Internal or arcane:</p>
<ul>
<li><a href="unmanaged.html">issues with unmanaged workspace</a>.</li>
<li><a href="apinternals.html">discussion of Allocation Point internals</a></li>
<li><a href="unmanaged.html">issues with unmanaged workspace</a></li>
</ul>
<p>Not much use yet:</p>
<ul>
<li>The available <a href="pool_classes.html">pool classes</a>.</li>
@ -139,6 +145,7 @@
2006-06-06 RHSK Article: Modes of use of MPS.
2006-06-22 RHSK Article: Allocation Point User Guide.
2006-06-23 RHSK Section: About the MPS Wiki
2006-06-23 RHSK Article: Allocation Point Internals
</pre>
<h2><a id="section-C" name="section-C">C. Copyright and License</a></h2>