The Disclaim Patch
- Update 2009-08-15. Adjusted to match the latest version of the
patch, along with clarifications and corrections of typos. The
functionality of the patch was previously called "reclaim
notification".
These are rather technical notes about a patch to the
Boehm-Demers-Weiser Conservative Garbage Collector which adds a
low-level feature which is aimed to support efficient hash-consing and
finalisation.
The Issues
First off, it should be noted that the performance finalisation is
usually a non-issue if recommendations for their use is followed. The main
use of finalisation is to free up non-critical resources. (Critical
resources should be explicitly managed, since finalisation is not
guaranteed.) Since such resources are typically few compared to the
abundance of system memory, the overhead of finalisation is small. So why
bother?
My interest in this stems from a more exotic use of the collector,
hash-consing. This mechanism can be implemented in terms of finalisers or
in terms of weak-pointers, but since neither is a good fit, it's useful to
think of hash-consing as an independent mechanism.
Hash-consing is a
technique that allows sharing of objects constructed independently but which
happen to be identical. In addition to reducing memory overhead in
applications where such sharing is likely, it also means that equality over
complex expression trees reduces to a simple pointer comparison. The
technique can be used for expression trees, as exemplified by the ATerm tree-handling
library which implements its own garbage collector, and my own cuex
library of culibs which uses the Boehm-Demers-Weiser
collector.
The official libgc codebase handles weak pointers and schedules
finalisation at the end of a collection, when the world is stopped and
mark-bits are in a consistent state.
The actual finalisation happens later, but if the number of finalisers are
comparable to the number of heap objects, there is still a significant
overhead at this critical phase when parallel threads cannot use the
collector.
There is also a significant time and memory overhead if finalisation is used
extensively for small objects, due to hash-tables used for internal
book-keeping.
In heavy tree-processing application, the main part of the memory may be
such small objects.
A Solution
The two things we want to accomplish is
- Implement a mechanism to support finalisation and hash-consing entirely
outside the critical period when the collector marks are kept consistent.
- Reduce time and memory overhead.
Roughly one phase of garbage collection is
- Follow the root set and recursively mark all accessible objects.
Recursivity is handled by pushing objects on a stack, and iteratively
marking objects on the stack and pushing links.
- Do a final mark from dirty pages, with world stopped.
- When there is nothing more to mark, lock the collector. Marks are now
consistent.
- Walk through finalisers, schedule them and mark accessible objects.
Also, make disappearing links disappear.
- Unlock the collector.
- Process finalisation in parallel (preferred), or trigger them from
allocation functions. After each finaliser is run, remove it so that the
finalised object can be collected on the next phase.
The proposed solution is to add a light-weight low-level feature upon
which finalisation and hash-consing can be implemented:
- Add the following properties for objects kinds:
- A flag which indicates that all objects of this kind shall be
followed during the mark phase, even if they are not themselves marked,
essentially treating them as a root set.
This is not needed for hash-consing, but may be desirable in order to
retain memory rechable from finalisers.
- A closure, the disclaim function, which is guaranteed to be
called for objects of this kind after the world is restarted and
before the object is reclaimed.
The collector calls this function to make an explicit request to the
client code to disclaim the object.
This ability to deny a reclaim (aka resurrect) was added to avoid
checking mark-bits of objects upon hash-consing, as this operation can
be slow when not done in bulk.
- The mark phase is extended to support the follow-unconditionally
flag.
- I suggest one of the following two methods for running disclaim
functions:
- On demand.
Call the notifiers when a heap block is reclaimed.
This has the disadvantage that the notifiers are called from threads
allocating memory, potentially introducing concurrency issues for
finalisers.
More precisely, finalisers must not invoke locks which may be held by
the allocating code.
On the other hand, with a bit of care, this can be gracefully handled
by a hash-consing implementation.
- Parallel.
Schedule blocks of kinds with disclaim functions for traversal in a
dedicated finalisation thread.
Each block is marked so that their objects cannot be reclaimed before
the finaliser thread is done with it.
Some care must be taken so that the finaliser thread does not lag
behind the other threads.
Finalisers can now safely allocate memory and share locks with
allocating threads.
What we gain from this:
- There is no per-finaliser or per-hash-cons work to do in the
stopped-world state.
- Objects with finalisers can be reclaimed in the same collection that
they become reclaimable.
- The memory overhead per finaliser is only what is needed to store the
finaliser.
Normally a single pointer to either a function or a closure struct should
suffice.
- Languages which use vtables or similar per-class finalisers can write
a disclaim function which calls the appropriate vtable entry.
Thus, there is no extra per-object overhead for finalisation.
- Hash-consing doesn't require per-object finalisers, either.
Thus, the memory overhead is only that of the hash-table, which includes a
chain-link within the object itself.
Limitations of disclaim-based finalisation:
- Objects with disclaim-based finalisers must have a GC kind with a
suitable disclaim function, and the objects themselves must contain
information about their finaliser.
Thus, it's not possible to register these finalisers on object allocated
with the default kinds.
I suspect this is not a big issue for new code, since finalisation is
usually tied to the type of the object, which is usually know at
allocation time.
- It does not seem to suitable for implementing disappearing links.
Disappearing links must disappear while the world is stopped.
- One inconvenient technicality is that the disclaim functions must be
able to recognise and ignore objects which are part of the free-links.
A solution is to store an odd number in the first word of live objects.
This is where the collector stores the free-link next-pointer, so it is
an even pointer for all free-linked objects.
In case of finalisers, this word can be used to store the address of the
finaliser closure, adding one byte to its address.
Alternatively, the fact that free-linked objects are cleared, except for
the first word, can be utilised.
The high-level finalisation interface in the patch hides this detail from
the end-user.
Better solutions for handling this are welcome.
The Patch
In the following is a summary of what the patch does.
To make sense of it, you should have the sources to the collector, and the
patch itself, which is found in the
download directory.
The patch is named
gc-GC_VERSION-disclaim-PATCH_VERSION.patch.
Extension to Object Kinds and Heap Block Headers
Reclaim notification can be enabled for user-defined object-kinds. Most
importantly, the callback is registered along with closure data. In
addition there is a flag to follow pointers from unmarked objects in
associated heap blocks during the mark phase. The latter is optional and
prevents collection of objects reachable from disclaim functions.
The patch also includes a default kind for finalisable objects.
A debug-enabled version should be added if this patch is accepted by the GC
developers.
The Essential Algorithm Changes
This part of the patch makes sure disclaim functions are called before
objects of the associated kinds are reclaimed. The notifiers are allowed to
resurrect objects.
I found it necessary to support resurrection to support hash-consing,
because of the late callback to the notifier.
Since the notifiers are not called while the world is stopped, they can not
remove unmarked objects from the hash-consing hash-table in time.
That is, during the time between the world is stopped and the time objects
are reclaimed, the application may receive potentially recyclable objects.
The notifier detects this case and prevents these object from being
reclaimed.
Support for Finalization
Build Infrastructure and Test Case
In addition to the above there are two test cases and miscellaneous
changes to the build files.
Last updated 2009-08-14 by
Petter Urkedal.