.. sources:

    `<https://info.ravenbrook.com/project/mps/master/design/object-debug/>`_

.. mps:prefix:: design.mps.object-debug

Debugging features for client objects
=====================================


Introduction
------------

:mps:tag:`intro` This is the design for all the various debugging
features that MPS clients (and sometimes MPS developers) can use to
discover what is happening to their objects and the memory space.

:mps:tag:`readership` MPS developers.


History
-------

:mps:tag:`hist.0` The first draft merely records all the various ideas
about fenceposting that came up in discussions in June, July and
September 1998. This includes the format wrapping idea from
:mps:ref:`mail.ptw.1998-06-19.21-13(0)`. Pekka Pirinen, 1998-09-10.

:mps:tag:`hist.1` Converted from MMInfo database design document.
Richard Brooksby, 2002-06-07.

:mps:tag:`hist.2` Converted to reStructuredText. Gareth Rees,
2013-04-14.


Overview
--------

:mps:tag:`over.fenceposts` In its current state, this document mostly
talks about fenceposts, straying a little into tagging where these
features have an effect on each other. [There exist other documents
that list other required features, and propose interfaces and
implementations. These will eventually be folded into this one. pekka
1998-09-10]


Requirements
------------

:mps:tag:`req.fencepost` Try to detect overwrites and underwrites of
allocated blocks by adding fenceposts (source req.product.??? VC++,
:mps:ref:`req.epcore.fun.debug.support`).

:mps:tag:`req.fencepost.size` The fenceposts should be at least 4
bytes on either side or 8 bytes if on one side only, with an
adjustable content (although VC++ only has 4 bytes with pattern
0xFDFDFDFD, having unwisely combined the implementation with other
debug features).

:mps:tag:`req.fencepost.check` There should be a function to check all
the fenceposts (source :mps:ref:`req.epcore.fun.debug.support`).

:mps:tag:`req.free-block` Try to detect attempts to write and read
free blocks.

:mps:tag:`req.walk` There should be a way to map ("walk") a user
function over all allocated objects (except PS VM objects), possibly
only in a separate debugging variety/mode (source
:mps:ref:`req.epcore.fun.debug.support`).

:mps:tag:`req.tag` There should be a way to store at least a word of
user data (a "tag", borrowing the SW term) with every object in
debugging mode, to be used in memory dumps (source req.product.???
VC++).

:mps:tag:`req.tag.walk` The walking function (as required by
:mps:ref:`.req.walk`) should have access to this data (source
:mps:ref:`req.epcore.fun.debug.support`).

:mps:tag:`req.dump.aver` It must be possible to perform a memory dump
after an :c:func:`AVER` has fired (naturally, if the information
required for the dump has been corrupted, it will fail, as softly as
possible). (source @@@@)

[There are more, especially about memory dumps and allocation
locations. pekka 1998-09-10]


Solution ideas
--------------

:mps:tag:`note.assumptions` I've tried not to assume anything about
the coincidence of manual/automatic, formatted/unformatted, and
ap/mps_alloc. I think those questions deserve to be decided on their
own merits, instead of being constrained by a debug feature.

:mps:tag:`fence.content.repeat` The content of a fencepost could be
specified as a byte/word which is used repeatedly to fill the
fencepost.

:mps:tag:`fence.content.template` The content could be given as a
template which is of the right size and is simply copied onto the
fencepost.
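
The two content options above can be compared in a minimal sketch. The function names here are illustrative assumptions, not part of any MPS interface; the sketch only shows that the repeat option needs one byte of configuration while the template option needs a buffer exactly as large as the fencepost.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill a fencepost by repeating a single byte
 * (the .fence.content.repeat idea). */
void fence_fill_repeat(void *fence, size_t fence_size, unsigned char byte)
{
    memset(fence, byte, fence_size);
}

/* Hypothetical helper: fill a fencepost by copying a template that is
 * exactly fence_size bytes long (the .fence.content.template idea). */
void fence_fill_template(void *fence, const void *templ, size_t fence_size)
{
    memcpy(fence, templ, fence_size);
}

/* Return 1 if the fencepost still matches the template, 0 if it has
 * been overwritten. */
int fence_matches_template(const void *fence, const void *templ,
                           size_t fence_size)
{
    return memcmp(fence, templ, fence_size) == 0;
}
```

With a template, checking is a single comparison against the same buffer that was used for filling, which is one reason the template form is convenient.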

:mps:tag:`fence.walk` :mps:ref:`.req.fencepost.check` requires the
ability to find all the allocated objects. In formatted pools, this is
not a problem. In unformatted pools, we could use the walker. It's a
feasible strategy to bet that any pool that might have to support
fenceposting will also have a walking requirement.

:mps:tag:`fence.tag` Fenceposting also needs to keep track of which
objects have fenceposts, unless we manage to do them all. It would be
easiest to put this in the tags.

:mps:tag:`fence.check.object` A function to check the fenceposts on a
given object would be nice.

:mps:tag:`fence.ap` APs could support fenceposting transparently by
having a mode where :c:func:`mps_reserve` always goes out-of-line and
fills in the fenceposts (the pool's :c:func:`BufferFill` method isn't
involved). This would leave the MPS with more freedom of
implementation, especially when combined with some of the other ideas.
We think doing a function call for every allocation is not too bad for
debugging.

:mps:tag:`fence.outside-ap` We could also let the client insert their
own fenceposts outside the MPS allocation mechanism. Even if
fenceposting were done like this, we'd still want it to be an MPS
feature, so we'd offer sample C macros for adding the size of the
fencepost and filling in the fencepost pattern. Possibly something
like this (while we could still store the parameters in the pool or
allocation point, there seems little point in doing so in this case,
and having them as explicit parameters to the macros allows the client
to specify constants to gain efficiency)::

    #define mps_add_fencepost(size, fp_size)
    #define mps_fill_fenceposts(obj, size, fp_size, fp_pattern)

The client would need to supply their own fencepost checking function,
obviously, but again we could offer one that matches the sample
macros.
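
One way the sample macros and a matching checker might be filled in is sketched below. The macro bodies are assumptions, since the design leaves them unspecified. The sketch assumes fenceposts on both sides, a single-byte pattern, and that ``obj`` is the start of the raw block (including the head fencepost) rather than the client-visible object. ``check_fenceposts`` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Assumed body: room for one fencepost on each side of the object. */
#define mps_add_fencepost(size, fp_size) ((size) + 2 * (fp_size))

/* Assumed body: obj is the start of the raw block, size is the
 * client-visible object size, fp_pattern is a single byte. */
#define mps_fill_fenceposts(obj, size, fp_size, fp_pattern)            \
    do {                                                                \
        unsigned char *fp_b_ = (unsigned char *)(obj);                  \
        memset(fp_b_, (fp_pattern), (fp_size));                         \
        memset(fp_b_ + (fp_size) + (size), (fp_pattern), (fp_size));    \
    } while (0)

/* Hypothetical matching checker: returns 1 if both fenceposts are
 * intact, 0 if either has been overwritten. */
int check_fenceposts(const void *obj, size_t size, size_t fp_size,
                     unsigned char fp_pattern)
{
    const unsigned char *b = obj;
    size_t i;
    for (i = 0; i < fp_size; ++i) {
        if (b[i] != fp_pattern || b[fp_size + size + i] != fp_pattern)
            return 0;
    }
    return 1;
}
```

Because the sizes and pattern are macro arguments, a client that passes constants lets the compiler fold the arithmetic, which is the efficiency gain mentioned above.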

:mps:tag:`fence.tail-only` In automatic pools, the presence of a
fencepost at the head of the allocated block results in the object
reference being an internal pointer. This means that the format or the
pool would need to know about fenceposting and convert between
references and pointers. This would slow down the critical path when
fenceposting is used. This can be ameliorated by putting a fencepost
at the tail of the block only: this obviates the internal pointer
problem and could provide almost the same degree of checking (provided
the size was twice as large), especially in copying pools, where there
are normally no gaps between allocated blocks. In addition to the
inescapable effects on allocation and freeing (including copying and
reclaim thereunder), only scanning would have to know about
fenceposts.

:mps:tag:`fence.tail-only.under` Walking over all the objects in the
pool would be necessary to detect underwrites, as one couldn't be sure
that there is a fencepost before any given object (or where it's
located exactly). If the pool were doing the checking, it could be
sure: it would know about alignments and it could put fenceposts in
padding objects (free blocks will have them because they were once
allocated) so there'd be one on either side of any object (except at
the head of a segment, which is not a major problem, and could be
fixed by adding a padding object at the beginning of every segment).
This requires some cleverness to avoid splinters smaller than the
fencepost size, but it can be done.

:mps:tag:`fence.wrapper` On formatted pools, fenceposting could be
implemented by "wrapping" the client-supplied format at creation time.
The wrapper can handle the conversion from the fenceposted object and
back. This will be invisible to the client and gives the added benefit
that the wrapper can validate fenceposts on every format operation,
should it desire. That is, the pool would see the fenceposts as part
of the client object, but the client would only see its object; the
format wrapper would translate between the two. Note that hiding the
fenceposts from scan methods, which are required to take a contiguous
range of objects, is a bit complicated.
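
The translation a format wrapper performs can be sketched for the skip method. Everything here is an assumption made for illustration: the ``wrapped_format`` structure, the ``skip_fn`` type (modelled loosely on a format skip method that returns the address just past an object), and a layout with a head and a tail fencepost of equal size.

```c
#include <stddef.h>

/* Hypothetical skip method type: returns the address just past obj. */
typedef void *(*skip_fn)(void *obj);

/* Hypothetical wrapper state: the client's method plus fencepost size. */
typedef struct wrapped_format {
    skip_fn client_skip;  /* client-supplied skip method */
    size_t fp_size;       /* size of each fencepost, in bytes */
} wrapped_format;

/* Pool view (base of fenceposted block) to client view (the object). */
void *wrap_client_of_base(const wrapped_format *wf, void *base)
{
    return (char *)base + wf->fp_size;
}

/* Client view back to pool view. */
void *wrap_base_of_client(const wrapped_format *wf, void *client)
{
    return (char *)client - wf->fp_size;
}

/* Wrapped skip: step over head fencepost, client object, tail fencepost. */
void *wrap_skip(const wrapped_format *wf, void *base)
{
    void *client = wrap_client_of_base(wf, base);
    void *client_limit = wf->client_skip(client);
    return (char *)client_limit + wf->fp_size;
}

/* Example client skip method for fixed-size 16-byte objects. */
void *example_skip16(void *obj)
{
    return (char *)obj + 16;
}
```

The pool only ever calls ``wrap_skip`` and so sees one contiguous object including both fenceposts, while the client method is only ever handed its own object address.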

:mps:tag:`fence.client-format` The MPS would supply such a wrapper,
but clients could also be allowed to write their own fenceposted
formats (provided they coordinate with allocation, see below). This
would make scanning fenceposted segments more efficient.

:mps:tag:`fence.wrapper.variable` Furthermore, you could create
different classes of fencepost within a pool, because the fencepost
itself could have a variable format. For instance, you might choose to
have the fencepost be minimal (one to two words) for small objects,
and more detailed/complex for large objects (imagining that large
objects are likely vector-ish and subject to overruns). You could get
really fancy and have the fencepost class keyed to the object class
(for example, different allocation points create different classes of
fenceposting).

:mps:tag:`fence.wrapper.alloc` Even with a wrapped format, allocation
and freeing would still have to know about the fenceposts. If
allocation points are used, either MPS-side (:mps:ref:`.fence.ap`) or
client-side (:mps:ref:`.fence.outside-ap`) fenceposting could be used,
with the obvious modifications.

:mps:tag:`fence.wrapper.alloc.format` We could add three format
methods, to adjust the pointer and the size for alloc and free, to put
down the fenceposts during alloc, and to check them; to avoid slowing
down all allocation, this would require some MOPping to make the
format class affect the choice of the alloc and free methods (see
:mps:ref:`mail.pekka.1998-06-11.18-18`).

:mps:tag:`fence.wrapper.alloc.size` We could just communicate the size
of the fenceposts between the format and the allocation routines, but
then you couldn't use variable fenceposts
(:mps:ref:`.fence.wrapper.variable`). [All this applies to copying and
reclaim in a straightforward manner, I think.]

:mps:tag:`fence.pool.wrapper` Pools can be wrapped as well. This could
be a natural way to represent/implement the fenceposting changes to
the Alloc and Free methods. [@@@@alignment]

:mps:tag:`fence.pool.new-class` We could simply offer a debugging
version of each pool class (e.g., :c:func:`mps_pool_class_mv_debug`).
As we have seen, debugging features have synergies which make it
advantageous to have a coordinated implementation, so splitting them
up would not just complicate the client interface, it would also be an
implementation problem; we can turn features on or off with pool init
parameters.

:mps:tag:`fence.pool.abstract` We could use pool init parameters alone
to control all debugging features (optargs would be useful here).
While there might be subclasses and wrappers internally, the client
would only see a single pool class; in the internal view, this would
be an abstract class, and the parameters would determine which
concrete class actually gets instantiated.

:mps:tag:`tag.out-of-line` It would be nice if tags were stored
out-of-line, so they can be used to study allocation patterns and
fragmentation behaviours. Such an implementation of tagging could also
easily be shared among several pools.


Architecture
------------

:mps:tag:`pool` The implementation is at the pool level, because pools
manage allocated objects. A lot of the code will be generic,
naturally, but the data structures and the control interfaces attach
to pools. In particular, clients will be able to use tagging and
fenceposting separately on each pool.

:mps:tag:`fence.size` Having fenceposts of adjustable size and pattern
is quite useful. We feel that restricting the size to an integral
multiple of the [pool or format?] alignment is harmless and simplifies
the implementation enormously.
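
A small sketch of why the alignment restriction simplifies things: if the fencepost size is a multiple of the alignment, an aligned block start gives an aligned object start with no extra arithmetic. The helper names are hypothetical, and the alignment is assumed to be a power of two.

```c
#include <stddef.h>

/* Round size up to a multiple of alignment (assumed a power of two). */
size_t size_align_up(size_t size, size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Return 1 if fence_size is a non-zero integral multiple of alignment. */
int fence_size_is_valid(size_t fence_size, size_t alignment)
{
    return fence_size != 0 && fence_size % alignment == 0;
}

/* Total block size for an object with a fencepost on each side.
 * Because fence_size is a multiple of alignment, the object that
 * follows the head fencepost is aligned whenever the block is. */
size_t fenced_block_size(size_t obj_size, size_t fence_size,
                         size_t alignment)
{
    return fence_size + size_align_up(obj_size, alignment) + fence_size;
}
```

For example, with 8-byte alignment and 8-byte fenceposts, a 13-byte request occupies 8 + 16 + 8 = 32 bytes.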

:mps:tag:`fence.template` We use templates
(:mps:ref:`.fence.content.template`) to fill in the fenceposts, but we
do not give any guarantees about the location of the fenceposts, only
that they're properly aligned. This leaves us the opportunity to do
tail-only fenceposting, if we choose.

:mps:tag:`fence.slop` [see :mps:ref:`impl.c.dbgpool.FenceAlloc` @@@@]

:mps:tag:`fence.check.free` We check the fenceposts when freeing an
object.

:mps:tag:`unified-walk` Combine the walking and tagging requirements
(:mps:ref:`.req.tag.walk` and @@@@) into a generic facility for
walking and tagging objects with just one interface and one name:
tagging. Also combine the existing formatted object walker into this
metaphor, but allow the format and tag parameters of the step
function to be optional [this part has not been implemented yet pekka
1998-09-10].

:mps:tag:`init` It simplifies the implementation of both tagging and
fenceposting if they are always on, so that we don't have to keep
track of which objects have been fenceposted and which have not, and
don't have to have three kinds of tags: for user data, for
fenceposting, and for both. So we determine this at pool init time
(and let fenceposting turn on tagging, if necessary).

:mps:tag:`pool-parameters` Fencepost templates and tag formats are
passed in as pool parameters.

:mps:tag:`modularity` While a combined generic implementation of tags
and fenceposts is provided, it is structured so that each part of it
could be implemented by a pool-specific mechanism with a minimum of
new protocol. [This will be improved, when we figure out formatted
pools -- they don't need tags for fenceposting.]

:mps:tag:`out-of-space` If there's no room for tags, we will not dip
into the reservoir, just fail to allocate the tag. If the alloc call
had a reservoir permit, we let it succeed even without a tag, and just
make sure the free method will not complain if it can't find a tag. If
the call didn't have a reservoir permit, we free the block allocated
for the object and fail the allocation, so that the client gets a
chance to do whatever low-memory actions they might want to do.
[Should this depend on whether there is anything in the reservoir?]
This breaks the one-to-one relationship between tags and objects, so
some checks cannot be made, but we do count the "lost" tags.
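
The out-of-space policy above amounts to a three-way decision, sketched here as a pure function. The enumeration, the function name, and the counter parameter are all hypothetical. The sketch models only the decision and leaves out the actual allocation and freeing.

```c
/* Possible results of a debug allocation once the object block itself
 * has been obtained (hypothetical names). */
typedef enum {
    ALLOC_OK_TAGGED,    /* object and tag both allocated */
    ALLOC_OK_UNTAGGED,  /* object kept, tag lost */
    ALLOC_FAILED        /* caller frees the object block and fails */
} alloc_outcome;

/* Decide what to do given whether the tag allocation succeeded and
 * whether the alloc call had a reservoir permit. lost_tags counts
 * objects that were allowed to exist without a tag. */
alloc_outcome debug_alloc_outcome(int tag_allocated,
                                  int has_reservoir_permit,
                                  unsigned long *lost_tags)
{
    if (tag_allocated)
        return ALLOC_OK_TAGGED;
    if (has_reservoir_permit) {
        ++*lost_tags;
        return ALLOC_OK_UNTAGGED;
    }
    return ALLOC_FAILED;
}
```

The free path then has to tolerate a missing tag for exactly the objects counted in ``lost_tags``.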

[need to hash out how to do fenceposting in formatted pools]


Client interface
----------------

:mps:tag:`interface.fenceposting.check`
:c:func:`mps_pool_check_fenceposts` is a function to check all
fenceposts in a pool (it fires an :c:func:`AVER` if a problem is
found).

[from here on, these are tentative and incomplete]

.. c:function:: mps_res_t mps_fmt_fencepost_wrap(mps_fmt_t *format_return, mps_arena_t arena, mps_fmt_t format, [fp parameters])

:mps:tag:`interface.fenceposting.format` A function to wrap a format
(class) to provide fenceposting.

.. c:function:: void (*mps_fmt_adjust_fencepost_t)(size_t *size_io)

:mps:tag:`interface.fenceposting.add` A format method to adjust the
size of a block about to be allocated to allow for fenceposts.

.. c:function:: void (*mps_fmt_put_fencepost_t)(mps_addr_t * addr_io, size_t size)

:mps:tag:`interface.fenceposting.add` A format method to add a
fencepost around a block about to be allocated [the ``NULL`` method
adds a tail fencepost].

.. c:function:: mps_bool_t (*mps_fmt_check_fenceposts_t)(mps_addr_t)

:mps:tag:`interface.fenceposting.add` A format method to check the
fenceposts around an object [the ``NULL`` method checks tails].

.. c:function:: mps_class_t mps_debug_class(mps_class_t class)

:mps:tag:`interface.fenceposting.pool` A function to wrap a pool class
to provide fenceposting (note absence of arena parameter).

.. c:function:: mps_res_t mps_alloc(mps_addr_t *, mps_pool_t, size_t);
.. c:function:: mps_res_t mps_alloc_dbg(mps_addr_t *, mps_pool_t, size_t, ...);
.. c:function:: mps_res_t mps_alloc_dbg_v(mps_addr_t *, mps_pool_t, size_t, va_list);

:mps:tag:`interface.tags.alloc` Three functions to replace existing
:c:func:`mps_alloc` (request.???.??? proposes to remove the varargs).

.. c:function:: void (*mps_objects_step_t)(mps_addr_t addr, size_t size, mps_fmt_t format, mps_pool_t pool, void *tag_data, void *p)
.. c:function:: void mps_pool_walk(mps_arena_t arena, mps_pool_t pool, mps_objects_step_t step, void *p)
.. c:function:: void mps_arena_walk(mps_arena_t arena, mps_objects_step_t step, void *p)

:mps:tag:`interface.tags.walker` Functions to walk all the allocated
objects in a pool or an arena (only client pools in this case).
``format`` and ``tag_data`` can be ``NULL`` (``tag_data`` really wants
to be ``void *``, not :c:type:`mps_addr_t`, because it's stored
together with the internal tag data in an MPS internal pool).
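
A step function for the tentative walker interface might look like the sketch below. To keep it compilable without ``mps.h``, it uses stand-in types with a ``demo_`` prefix; these are assumptions that mirror the shape of the proposed ``mps_objects_step_t`` signature and are not real MPS declarations. The step function accumulates totals through the closure pointer ``p`` and tolerates a ``NULL`` ``format`` and ``tag_data``.

```c
#include <stddef.h>

/* Stand-ins for the MPS types, so the sketch compiles on its own. */
typedef void *demo_addr_t;
typedef struct demo_fmt_s *demo_fmt_t;
typedef struct demo_pool_s *demo_pool_t;

/* Same shape as the proposed mps_objects_step_t. */
typedef void (*demo_objects_step_t)(demo_addr_t addr, size_t size,
                                    demo_fmt_t format, demo_pool_t pool,
                                    void *tag_data, void *p);

/* Closure data accumulated over a walk. */
typedef struct walk_totals {
    size_t objects;   /* number of objects visited */
    size_t bytes;     /* total size of the objects visited */
    size_t untagged;  /* objects visited with no tag data */
} walk_totals;

/* Step function: counts objects and bytes, and notes missing tags. */
void count_step(demo_addr_t addr, size_t size, demo_fmt_t format,
                demo_pool_t pool, void *tag_data, void *p)
{
    walk_totals *totals = p;
    (void)addr;
    (void)format;  /* may be NULL for unformatted pools */
    (void)pool;
    totals->objects += 1;
    totals->bytes += size;
    if (tag_data == NULL)
        totals->untagged += 1;
}
```

A client would pass ``count_step`` and a pointer to a zeroed ``walk_totals`` to the pool or arena walker and read the totals afterwards.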


Examples
--------

:mps:tag:`example.debug-alloc` ::

    #define MPS_ALLOC_DBG(res_io, addr_io, pool, size) \
      MPS_BEGIN \
        static mps_tag_A_s _ts = { __FILE__, __LINE__ }; \
        *res_io = mps_alloc(addr_io, pool, size, _ts); \
      MPS_END


Implementation
--------------

:mps:tag:`new-pool` The client interface to control fenceposting
consists of the new classes :c:func:`mps_pool_class_mv_debug`,
:c:func:`mps_pool_class_epdl_debug`, and
:c:func:`mps_pool_class_epdr_debug`, and their new init parameter of
type :c:type:`mps_pool_debug_option_s`. [This is a temporary solution,
to get it out without writing lots of new interface. pekka 1998-09-10]

:mps:tag:`new-pool.impl` The debug pools are implemented using the
"class wrapper" :c:func:`EnsureDebugClass`, which produces a subclass
with modified ``init``, ``finish``, ``alloc``, and ``free`` methods.
These methods are implemented in the generic debug class code
(:mps:ref:`impl.c.dbgpool`), and are basically wrappers around the
superclass methods (invoked through the ``pool->class->super`` field).
To find the data stored in the class for the debugging features, they
use the ``debugMixin`` method provided by the subclass. So to make a
debug subclass, three things should be provided: a structure
definition of the instance containing a
:c:type:`PoolDebugMixinStruct`, a pool class function that uses
:c:func:`EnsureDebugClass`, and a ``debugMixin`` method that locates
the :c:type:`PoolDebugMixinStruct` within an instance.
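
Two of the three pieces, the instance structure and the ``debugMixin`` method, can be sketched in isolation. All the ``Demo`` names and the mixin's fields are assumptions for illustration and are not the real MPS structures. The class function that calls :c:func:`EnsureDebugClass` is left out because its details are not given here.

```c
#include <stddef.h>

/* Stand-in for the debug mixin data (fields are assumed). */
typedef struct DemoDebugMixinStruct {
    size_t fenceSize;           /* size of the fencepost template */
    const void *fenceTemplate;  /* template copied into each fencepost */
} DemoDebugMixinStruct;

/* Stand-in for the superclass instance structure. */
typedef struct DemoPoolStruct {
    int serial;
} DemoPoolStruct;

/* Instance structure of the debug subclass: the superclass part comes
 * first, followed by the mixin. */
typedef struct DemoMVDebugStruct {
    DemoPoolStruct poolStruct;
    DemoDebugMixinStruct debug;
} DemoMVDebugStruct;

/* The debugMixin method: locate the mixin within an instance. The cast
 * is valid because poolStruct is the first member of the instance. */
DemoDebugMixinStruct *DemoMVDebugMixin(DemoPoolStruct *pool)
{
    DemoMVDebugStruct *mvd = (DemoMVDebugStruct *)pool;
    return &mvd->debug;
}
```

The generic wrappers for ``alloc`` and ``free`` would call such a method to reach the fencepost and tag settings without knowing the concrete layout of the subclass.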

:mps:tag:`tags.splay` The tags are stored in a splay tree of tags
allocated from a subsidiary MFS pool. The client needs to specify the
(maximum) size of the client data in a tag, so that the pool can be
created.

[Lots more should be said, eventually. pekka 1998-09-10]