mirror of
git://git.sv.gnu.org/emacs.git
synced 2025-12-25 15:00:45 -08:00
Convert [notes in square brackets] to .. note:: to make them stand out more and be easier to find and fix. Modernize config a bit. Copied from Perforce Change: 182162 ServerID: perforce.ravenbrook.com
462 lines
19 KiB
ReStructuredText
462 lines
19 KiB
ReStructuredText
.. mode: -*- rst -*-
|
|
|
|
Debugging features for client objects
|
|
=====================================
|
|
|
|
:Tag: design.mps.object-debug
|
|
:Author: Pekka P. Pirinen
|
|
:Date: 1998-09-10
|
|
:Status: incomplete design
|
|
:Revision: $Id$
|
|
:Copyright: See `Copyright and License`_.
|
|
|
|
|
|
Introduction
|
|
------------
|
|
|
|
_`.intro`: This is the design for all the various debugging features
|
|
that MPS clients (and sometimes MPS developers) can use to discover
|
|
what is happening to their objects and the memory space.
|
|
|
|
_`.readership`: MPS developers.
|
|
|
|
|
|
Overview
|
|
--------
|
|
|
|
_`.over.fenceposts`: In its current state, this document mostly talks
|
|
about fenceposts, straying a little into tagging where theses features
|
|
have an effect on each other.
|
|
|
|
.. note::
|
|
|
|
There exist other documents that list other required features, and
|
|
propose interfaces and implementations. These will eventually be
|
|
folded into this one. Pekka P. Pirinen, 1998-09-10.
|
|
|
|
|
|
Requirements
|
|
------------
|
|
|
|
_`.req.fencepost`: Try to detect overwrites and underwrites of
|
|
allocated blocks by adding fenceposts (source req.product.??? VC++,
|
|
req.epcore.fun.debug.support).
|
|
|
|
_`.req.fencepost.size`: The fenceposts should be at least 4 bytes on
|
|
either side or 8 bytes if on one side only, with an adjustable content
|
|
(although VC++ only has 4 bytes with pattern 0xFDFDFDFD, having
|
|
unwisely combined the implementation with other debug features).
|
|
|
|
_`.req.fencepost.check`: There should be a function to check all the
|
|
fenceposts (source req.epcore.fun.debug.support).
|
|
|
|
_`.req.free-block`: Try to detect attempts to write and read free
|
|
blocks.
|
|
|
|
_`.req.walk`: There should be a way to map ("walk") a user function
|
|
over all allocated objects (except PS VM objects), possibly only in a
|
|
separate debugging variety/mode (source req.epcore.fun.debug.support).
|
|
|
|
_`.req.tag`: There should be a way to store at least a word of user
|
|
data (a "tag", borrowing the SW term) with every object in debugging
|
|
mode, to be used in memory dumps (source req.product.??? VC++).
|
|
|
|
_`.req.tag.walk`: The walking function (as required by `.req.walk`_)
|
|
should have access to this data (source req.epcore.fun.debug.support).
|
|
|
|
_`.req.dump.aver`: It must be possible to perform a memory dump after
|
|
an ``AVER()`` has fired. Naturally, if the information required for
|
|
the dump has been corrupted, it will fail, as softly as possible
|
|
(source @@@@).
|
|
|
|
.. note::
|
|
|
|
There are more requirements, especially about memory dumps and
|
|
allocation locations. Pekka P. Pirinen, 1998-09-10.
|
|
|
|
|
|
Solution ideas
|
|
--------------
|
|
|
|
_`.note.assumptions`: I've tried not to assume anything about the
|
|
coincidence of manual/automatic, formatted/unformatted, and
|
|
ap/mps_alloc. I think those questions deserve to be decided on their
|
|
own merits. instead of being constrained by a debug feature.
|
|
|
|
_`.fence.content.repeat`: The content of a fencepost could be
|
|
specified as a byte/word which used repeatedly to fill the fencepost.
|
|
|
|
_`.fence.content.template`: The content could be given as a template
|
|
which is of the right size and is simply copied onto the fencepost.
|
|
|
|
_`.fence.walk`: `.req.fencepost.check`_ requires the ability to find
|
|
all the allocated objects. In formatted pools, this is not a problem.
|
|
In unformatted pools, we could use the walker. It's a feasible
|
|
strategy to bet that any pool that might have to support fenceposting
|
|
will also have a walking requirement.
|
|
|
|
_`.fence.tag`: Fenceposting also needs to keep track which objects
|
|
have fenceposts. unless we manage to do them all. It would be easiest
|
|
to put this in the tags.
|
|
|
|
_`.fence.check.object`: A function to check the fenceposts on a given
|
|
object would be nice.
|
|
|
|
_`.fence.ap`: AP's could support fenceposting transparently by having
|
|
a mode where ``mps_reserve()`` always goes out-of-line and fills in the
|
|
fenceposts (the pool's ``BufferFill()`` method isn't involved). This
|
|
would leave the MPS with more freedom of implementation, especially
|
|
when combined with some of the other ideas. We think doing a function
|
|
call for every allocation is not too bad for debugging.
|
|
|
|
_`.fence.outside-ap`: We could also let the client insert their own
|
|
fenceposts outside the MPS allocation mechanism. Even if fenceposting
|
|
were done like this, we'd still want it to be an MPS feature, so we'd
|
|
offer sample C macros for adding the size of the fencepost and filling
|
|
in the fencepost pattern. Possibly something like this (while we could
|
|
still store the parameters in the pool or allocation point, there
|
|
seems little point in doing so in this case, and having them as
|
|
explicit parameters to the macros allows the client to specify
|
|
constants to gain effiency)::
|
|
|
|
#define mps_add_fencepost(size, fp_size)
|
|
#define mps_fill_fenceposts(obj, size, fp_size, fp_pattern)
|
|
|
|
The client would need to supply their own fencepost checking function,
|
|
obviously, but again we could offer one that matches the sample
|
|
macros.
|
|
|
|
_`.fence.tail-only`: In automatic pools, the presence of a fencepost
|
|
at the head of the allocated block results in the object reference
|
|
being an internal pointer. This means that the format or the pool
|
|
would need to know about fenceposting and convert between references
|
|
and pointers. This would slow down the critical path when fenceposting
|
|
is used. This can be ameliorated by putting a fencepost at the tail of
|
|
the block only: this obviates the internal pointer problem and could
|
|
provide almost the same degree of checking (provided the size was
|
|
twice as large), especially in copying pools, where there are normally
|
|
no gaps between allocated blocks. In addition to the inescapable
|
|
effects on allocation and freeing (including copying and reclaim
|
|
thereunder), only scanning would have to know about fenceposts.
|
|
|
|
_`.fence.tail-only.under`: Walking over all the objects in the pool
|
|
would be necessary to detect underwrites, as one couldn't be sure that
|
|
there is a fencepost before any given object (or where it's located
|
|
exactly). If the pool were doing the checking, it could be sure: it
|
|
would know about alignments and it could put fenceposts in padding
|
|
objects (free blocks will have them because they were once allocated)
|
|
so there'd be one on either side of any object (except at the head of
|
|
a segment, which is not a major problem, and could be fixed by adding
|
|
a padding object at the beginning of every segment). This requires
|
|
some cleverness to avoid splinters smaller than the fencepost size,
|
|
but it can be done.
|
|
|
|
_`.fence.wrapper`: On formatted pools, fenceposting could be
|
|
implemented by "wrapping" the client-supplied format at creation time.
|
|
The wrapper can handle the conversion from the fenceposted object and
|
|
back. This will be invisible to the client and gives the added benefit
|
|
that the wrapper can validate fenceposts on every format operation,
|
|
should it desire. That is, the pool would see the fenceposts as part
|
|
of the client object, but the client would only see its object; the
|
|
format wrapper would translate between the two. Note that hiding the
|
|
fenceposts from scan methods, which are required to take a contiguous
|
|
range of objects, is a bit complicated.
|
|
|
|
_`.fence.client-format`: The MPS would supply such a wrapper, but
|
|
clients could also be allowed to write their own fenceposted formats
|
|
(provided they coordinate with allocation, see below). This would make
|
|
scanning fenceposted segments more efficient.
|
|
|
|
_`.fence.wrapper.variable`: Furthermore, you could create different
|
|
classes of fencepost within a pool, because the fencepost itself could
|
|
have a variable format. For instance, you might choose to have the
|
|
fencepost be minimal (one to two words) for small objects, and more
|
|
detailed/complex for large objects (imagining that large objects are
|
|
likely vector-ish and subject to overruns). You could get really fancy
|
|
and have the fencepost class keyed to the object class (for example,
|
|
different allocation points create different classes of fenceposting).
|
|
|
|
_`.fence.wrapper.alloc`: Even with a wrapped format, allocation and
|
|
freeing would still have know about the fenceposts. If allocation
|
|
points are used, either MPS-side (`.fence.ap`_) or client-side
|
|
(`.fence.outside-ap`_) fenceposting could be used, with the obvious
|
|
modifications.
|
|
|
|
_`.fence.wrapper.alloc.format`: We could add three format methods, to
|
|
adjust the pointer and the size for alloc and free, to put down the
|
|
fenceposts during alloc, and to check them; to avoid slowing down all
|
|
allocation, this would require some MOPping to make the format class
|
|
affect the choice of the alloc and free methods (see
|
|
mail.pekka.1998-06-11.18-18).
|
|
|
|
_`.fence.wrapper.alloc.size`: We could just communicate the size of
|
|
the fenceposts between the format and the allocation routines, but
|
|
then you couldn't use variable fenceposts (.fence.wrapper.variable).
|
|
|
|
.. note::
|
|
|
|
All this applies to copying and reclaim in a straight-forward
|
|
manner, I think.
|
|
|
|
_`.fence.pool.wrapper`: Pools can be wrapped as well. This could be a
|
|
natural way to represent/implement the fenceposting changes to the
|
|
Alloc and Free methods. [@@@@alignment]
|
|
|
|
_`.fence.pool.new-class`: We could simply offer a debugging version of
|
|
each pool class (e.g., ``mps_pool_class_mv_debug()``). As we have seen,
|
|
debugging features have synergies which make it advantageous to have a
|
|
coordinated implementation, so splitting them up would not just
|
|
complicate the client interface, it would also be an implementation
|
|
problem; we can turn features on or off with pool init parameters.
|
|
|
|
_`.fence.pool.abstract`: We could simply use pool init parameters only
|
|
to control all debugging features (optargs would be useful here).
|
|
While there migh be subclasses and wrappers internally, the client
|
|
would only see a single pool class; in the internal view, this would
|
|
be an abstract class, and the parameters would determine which
|
|
concrete class actually gets instantiated.
|
|
|
|
_`.tag.out-of-line`: It would be nice if tags were stored out-of-line,
|
|
so they can be used to study allocation patterns and fragmentation
|
|
behaviours. Such an implementation of tagging could also easily be
|
|
shared among several pools.
|
|
|
|
|
|
Architecture
|
|
------------
|
|
|
|
_`.pool`: The implementation is at the pool level, because pools
|
|
manage allocated objects. A lot of the code will be generic,
|
|
naturally, but the data structures and the control interfaces attach
|
|
to pools. In particular, clients will be able to use tagging and
|
|
fenceposting separately on each pool.
|
|
|
|
_`.fence.size`: Having fenceposts of adjustable size and pattern is
|
|
quite useful. We feel that restricting the size to an integral
|
|
multiple of the [pool or format?] alignment is harmless and simplifies
|
|
the implementation enormously.
|
|
|
|
_`.fence.template`: We use templates (`.fence.content.template`_) to
|
|
fill in the fenceposts, but we do not give any guarantees about the
|
|
location of the fenceposts, only that they're properly aligned. This
|
|
leaves us the opportunity to do tail-only fenceposting, if we choose.
|
|
|
|
_`.fence.slop`: [see impl.c.dbgpool.FenceAlloc @@@@]
|
|
|
|
_`.fence.check.free`: We check the fenceposts when freeing an object.
|
|
|
|
_`.unified-walk`: Combine the walking and tagging requirements
|
|
(`.req.tag.walk`_ and @@@@) into a generic facility for walking and
|
|
tagging objects with just one interface and one name: tagging. Also
|
|
combine the existing formatted object walker into this metaphor, but
|
|
allowing the format and tag parameters of the step function be
|
|
optional.
|
|
|
|
.. note::
|
|
|
|
This part has not been implemented yet Pekka P. Pirinen,
|
|
1998-09-10.
|
|
|
|
_`.init`: It simplifies the implementation of both tagging and
|
|
fenceposting if they are always on, so that we don't have to keep
|
|
track of which objects have been fenceposted and which have not, and
|
|
don't have to have three kinds of tags: for user data, for
|
|
fenceposting, and for both. So we determine this at pool init time
|
|
(and let fenceposting turn on tagging, if necessary).
|
|
|
|
_`.pool-parameters`: Fencepost templates and tag formats are passed in
|
|
as pool parameters.
|
|
|
|
_`.modularity`: While a combined generic implementation of tags and
|
|
fenceposts is provided, it is structured so that each part of it could
|
|
be implemented by a pool-specific mechanism with a minimum of new
|
|
protocol.
|
|
|
|
.. note::
|
|
|
|
This will be improved, when we figure out formatted pools -- they
|
|
don't need tags for fenceposting.
|
|
|
|
_`.out-of-space`: If there's no room for tags, we will not dip into
|
|
the reservoir, just fail to allocate the tag. If the alloc call had a
|
|
reservoir permit, we let it succeed even without a tag, and just make
|
|
sure the free method will not complain if it can't find a tag. If the
|
|
call didn't have a reservoir permit, we free the block allocated for
|
|
the object and fail the allocation, so that the client gets a chance
|
|
to do whatever low-memory actions they might want to do.
|
|
|
|
.. note::
|
|
|
|
Should this depend on whether there is anything in the reservoir?
|
|
|
|
This breaks the one-to-one relationship between tags and objects, so
|
|
some checks cannot be made, but we do count the "lost" tags.
|
|
|
|
.. note::
|
|
|
|
Need to hash out how to do fenceposting in formatted pools.
|
|
|
|
|
|
Client interface
|
|
----------------
|
|
|
|
_`.interface.fenceposting.check`:
|
|
``mps_pool_check_fenceposts()`` is a function to check all
|
|
fenceposts in a pool (``AVER()`` if a problem is found)
|
|
|
|
.. note::
|
|
|
|
From here on, these are tentative and incomplete.
|
|
|
|
``mps_res_t mps_fmt_fencepost_wrap(mps_fmt_t *format_return, mps_arena_t arena, mps_fmt_t format, ...)``
|
|
|
|
_`.interface.fenceposting.format`: A function to wrap a format
|
|
(class) to provide fenceposting.
|
|
|
|
``void (*mps_fmt_adjust_fencepost_t)(size_t *size_io)``
|
|
|
|
_`.interface.fenceposting.adjust`: A format method to adjust size of a
|
|
block about to be allocted to allow for fenceposts.
|
|
|
|
``void (*mps_fmt_put_fencepost_t)(mps_addr_t * addr_io, size_t size)``
|
|
|
|
_`.interface.fenceposting.add`: A format method to add a fencepost
|
|
around a block about to be allocated. The ``NULL`` method adds a tail
|
|
fencepost.
|
|
|
|
``mps_bool_t (*mps_fmt_check_fenceposts_t)(mps_addr_t)``
|
|
|
|
_`.interface.fenceposting.checker`: A format method to check the
|
|
fenceposts around an object. The ``NULL`` method checks tails.
|
|
|
|
``mps_class_t mps_debug_class(mps_class_t class)``
|
|
|
|
_`.interface.fenceposting.pool`: A function to wrap a pool class
|
|
to provide fenceposting (note absence of arena parameter).
|
|
|
|
``mps_res_t mps_alloc(mps_addr_t *, mps_pool_t, size_t);``
|
|
``mps_res_t mps_alloc_dbg(mps_addr_t *, mps_pool_t, size_t, ...);``
|
|
``mps_res_t mps_alloc_dbg_v(mps_addr_t *, mps_pool_t, size_t, va_list);``
|
|
|
|
_`.interface.tags.alloc`: Three functions to replace existing
|
|
``mps_alloc()`` (request.???.??? proposes to remove the varargs)
|
|
|
|
``void (*mps_objects_step_t)(mps_addr_t addr, size_t size, mps_fmt_t format, mps_pool_t pool, void *tag_data, void *p)``
|
|
``void mps_pool_walk(mps_arena_t arena, mps_pool_t pool, mps_objects_step_t step, void *p)``
|
|
``void mps_arena_walk(mps_arena_t arena, mps_objects_step_t step, void *p)``
|
|
|
|
_`.interface.tags.walker`: Functions to walk all the allocated
|
|
objects in a pool or an arena (only client pools in this case),
|
|
``format`` and ``tag_data`` can be ``NULL`` (``tag_data`` really wants
|
|
to be ``void *``, not ``mps_addr_t``, because it's stored
|
|
together with the internal tag data in an MPS internal pool)
|
|
|
|
|
|
Examples
|
|
--------
|
|
|
|
_`.example.debug-alloc`: ::
|
|
|
|
#define MPS_ALLOC_DBG(res_io, addr_io, pool, size)
|
|
MPS_BEGIN
|
|
static mps_tag_A_s _ts = { __FILE__, __LINE__ };
|
|
|
|
*res_io = mps_alloc(addr_io, pool, size, _ts_)
|
|
MPS_END
|
|
|
|
|
|
Implementation
|
|
--------------
|
|
|
|
_`.new-pool`: The client interface to control fenceposting
|
|
consists of the new classes ``mps_pool_class_mv_debug()``,
|
|
``mps_pool_class_epdl_debug()``, and
|
|
``mps_pool_class_epdr_debug()``, and their new init parameter of
|
|
type ``mps_pool_debug_option_s``.
|
|
|
|
.. note::
|
|
|
|
This is a temporary solution, to get it out without writing lots
|
|
of new interface. Pekka P. Pirinen, 1998-09-10.
|
|
|
|
_`.new-pool.impl`: The debug pools are implemented using the "class
|
|
wrapper" ``EnsureDebugClass()``, which produces a subclass with
|
|
modified ``init``, ``finish``, ``alloc``, and ``free`` methods. These
|
|
methods are implemented in the generic debug class code
|
|
(``impl.c.dbgpool``), and are basically wrappers around the superclass
|
|
methods (invoked through the ``pool->class->super`` field). To find
|
|
the data stored in the class for the debugging features, they use the
|
|
``debugMixin`` method provided by the subclass. So to make a debug
|
|
subclass, three things should be provided: a structure definition of
|
|
the instance containing a ``PoolDebugMixinStruct``, a pool class
|
|
function that uses ``EnsureDebugClass()``, and a ``debugMixin`` method
|
|
that locates the ``PoolDebugMixinStruct`` within an instance.
|
|
|
|
_`.tags.splay`: The tags are stored in a splay tree of tags
|
|
allocated from a subsidiary MFS pool. The client needs to specify the
|
|
(maximum) size of the client data in a tag, so that the pool can be
|
|
created.
|
|
|
|
.. note::
|
|
|
|
Lots more should be said, eventually. Pekka P. Pirinen,
|
|
1998-09-10.
|
|
|
|
|
|
Document History
|
|
----------------
|
|
|
|
- 1998-09-10 Pekka Pirinen The first draft merely records all the
|
|
various ideas about fenceposting that came up in discussions in
|
|
June, July and September 1998. This includes the format wrapping
|
|
idea from mail.ptw.1998-06-19.21-13(0).
|
|
|
|
- 2002-06-07 RB_ Converted from MMInfo database design document.
|
|
|
|
- 2013-04-14 GDR_ Converted to reStructuredText.
|
|
|
|
.. _RB: http://www.ravenbrook.com/consultants/rb/
|
|
.. _GDR: http://www.ravenbrook.com/consultants/gdr/
|
|
|
|
|
|
Copyright and License
|
|
---------------------
|
|
|
|
Copyright © 2013 Ravenbrook Limited. All rights reserved.
|
|
<http://www.ravenbrook.com/>. This is an open source license. Contact
|
|
Ravenbrook for commercial licensing options.
|
|
|
|
Redistribution and use in source and binary forms, with or without
|
|
modification, are permitted provided that the following conditions are
|
|
met:
|
|
|
|
1. Redistributions of source code must retain the above copyright
|
|
notice, this list of conditions and the following disclaimer.
|
|
|
|
2. Redistributions in binary form must reproduce the above copyright
|
|
notice, this list of conditions and the following disclaimer in the
|
|
documentation and/or other materials provided with the distribution.
|
|
|
|
3. Redistributions in any form must be accompanied by information on how
|
|
to obtain complete source code for this software and any
|
|
accompanying software that uses this software. The source code must
|
|
either be included in the distribution or be available for no more than
|
|
the cost of distribution plus a nominal fee, and must be freely
|
|
redistributable under reasonable conditions. For an executable file,
|
|
complete source code means the source code for all modules it contains.
|
|
It does not include source code for modules or files that typically
|
|
accompany the major components of the operating system on which the
|
|
executable file runs.
|
|
|
|
**This software is provided by the copyright holders and contributors
|
|
"as is" and any express or implied warranties, including, but not
|
|
limited to, the implied warranties of merchantability, fitness for a
|
|
particular purpose, or non-infringement, are disclaimed. In no event
|
|
shall the copyright holders and contributors be liable for any direct,
|
|
indirect, incidental, special, exemplary, or consequential damages
|
|
(including, but not limited to, procurement of substitute goods or
|
|
services; loss of use, data, or profits; or business interruption)
|
|
however caused and on any theory of liability, whether in contract,
|
|
strict liability, or tort (including negligence or otherwise) arising in
|
|
any way out of the use of this software, even if advised of the
|
|
possibility of such damage.**
|