|
Posted 10 days ago by Erik Brangs
Page edited by Erik Brangs - "typo"

NOTE: This page is a work-in-progress! (incomplete)

TODO list
- review
- link this page from the sidebar once it's done

General project status

Jikes RVM is currently the most popular platform for virtual machine research. This popularity is reflected in the participation on the mailing lists, where most questions can be answered.

Memory management research is a particular strength of the Jikes RVM. The Memory Management Toolkit (MMTk) provides a well-rounded selection of garbage collectors, and the compiler replay feature enables researchers to control mutator variation. The MMTk test harness can be used to test collectors.

In contrast, the compilers are currently a weak spot in the Jikes RVM. For example, the Static Single Assignment (SSA) form in the compilers is currently disabled because of bugs.

The Jikes RVM is not state-of-the-art in some areas. In particular, Jikes RVM currently does not provide 64-bit Intel support. Another big limitation is the lack of support for the OpenJDK class library. The project has received community contributions that address those shortcomings, but the code is not yet in the mainline.

The Jikes RVM would also profit from efforts directed at stability improvements and bug fixes. For example, Jikes RVM currently cannot run all of the DaCapo 9.12 benchmarks.

The Jikes RVM team aims to provide at least one release every year.

Note: The information on this page refers to the status in the code repository and not to the status in the current release.

Note: If you want to help, please see How to Help or inquire via the mailing lists.

Near-term goals
- Add support for the OpenJDK class libraries
- Add Intel 64-bit support
- Get DaCapo 9.12 running on the Jikes RVM

Preliminary long-term goals (still need further discussion)
- Improve stability
- Improve compliance with JVM spec
- Write unit tests for all classes
- Improve and extend test suites
- Add support for relevant new platforms (ARM?)

Detailed project status

This section provides more detailed project status information for the components. If you think an important point is missing here, please contact us via the mailing lists.

Community
- Jikes RVM has a large community in its intended audience (researchers)
- Core team consists wholly of volunteers: no paid developers
- Jikes RVM is currently not packaged for any major distribution

Memory Management
- Generational Immix (the default collector) is very stable
- The other collectors are reasonably stable but have some bugs (as shown by the regression tests)
- Notable omissions in the collector choices include Baker-style collectors, the Compressor and on-the-fly collectors

Runtime
- The runtime is reasonably modular but it doesn't make very good use of interfaces
- Jikes RVM currently does not follow the JVM specification
- Several features normally found in commercial JVMs are not implemented: strictfp, JMX and JVMTI are currently unsupported

Adaptive Optimization System
- The AOS provides a good level of control via compiler replay
- The AOS provides clear extension points
- Jikes RVM currently uses only one compilation thread at runtime and the current AOS model does not support multiple compilation threads
- The provided AOS models do not support Feedback-Directed Optimizations

Compilers
- SSA form is disabled. Scalar SSA form may be fixable; Heap SSA form is considered too broken
- Many optimizations are disabled because they rely on SSA or are considered too buggy
- Some standard optimizations are missing, e.g. Global Array Bound check elimination
- Java Memory Model (JMM) is not correctly implemented

Infrastructure
- Regression tests are run regularly. The results are displayed with Cattrack, a Ruby-on-Rails application.
- There's currently no infrastructure for CI: core team members need to ensure they run the pre-commit tests themselves.
- More regression machines would be useful, in particular PowerPC machines that can be accessed by all team members
- Currently no code review tools in use
- Some unit tests (via JUnit) exist but most classes don't have unit tests
|
||||||
|
Posted 12 days ago by Robin Garner
Page edited by Robin Garner

*** Work in progress ***

This page gives a brief outline of the major control flows in the execution of a garbage collector in MMTk. For simplicity, we focus on the MarkSweep collector, although much of the discussion will be relevant to other collectors. This page assumes you have a basic knowledge of garbage collection; if you don't, please see one of the standard texts such as The Garbage Collection Handbook.

Structure of a Plan

An MMTk Plan is required to provide 5 classes. Their names must share a common prefix and end with a suffix that indicates which class each one inherits from. In the case of the MarkSweep plan, the prefix is "MS".

- MS - a singleton class that is a subclass of org.mmtk.plan.Plan. This class encapsulates data structures that are shared among multiple threads.
- MSMutator - subclass of org.mmtk.plan.MutatorContext. This class encapsulates data structures that are local to a single mutator thread. In the case of Jikes RVM, a Thread is actually a subclass of this class for efficiency reasons.
- MSCollector - subclass of org.mmtk.plan.CollectorContext. This provides thread-local data structures specific to a garbage collector thread.
- MSConstraints - subclass of org.mmtk.plan.PlanConstraints. This provides configuration information that the host virtual machine might need. It is separated out from the Plan class in order to prevent circular class loading dependencies.
- MSTraceLocal - subclass of org.mmtk.plan.TraceLocal. This provides thread-local data structures specific to a particular way of traversing the heap. In a simple collector like MarkSweep, there is only one of these classes, but in more complex collectors there may be several. For example, in a generational collector, there will be one TraceLocal class for a nursery collection, and another for a full-heap collection.

The basic architecture of MMTk is that virtual address space is divided into chunks (of 4MB in a 32-bit memory model) that are managed according to a specific policy. A policy is implemented by an instance of the Space class, and it is in the policy class that the mechanics of a particular mechanism (like mark-sweep) are implemented. The task of a Plan is to create the policy (Space) objects that manage the heap, and to integrate them into the MMTk framework.

MMTk exposes some of this memory management policy to the host VM by allowing the VM to specify an allocator (represented by a small integer) when allocating space. The interface exposed to the VM allows it to choose whether an object will move during collection or not, whether the object is large enough to require special handling, etc. The MMTk plan is free (within the semantic guarantees exposed to the VM) to direct each of these allocators to a particular policy.

Policies

A policy describes how a range of virtual address space is managed. The base class of all policies is org.mmtk.policy.Space, and a particular instance of a policy is known generically as a space. The static initializers of a Plan and its subclasses define the spaces that make up an MMTk plan.

MS.java

    public static final MarkSweepSpace msSpace = new MarkSweepSpace("ms", VMRequest.create());
    public static final int MARK_SWEEP = msSpace.getDescriptor();

In this code fragment, we see the mark-sweep space of the MS plan defined. Note that we generally also define a static final space descriptor. This is an optimization that allows some rapid operations on spaces.

A Space is a global object, shared among multiple mutator threads. Each policy will also have one or more thread-local classes which provide unsynchronized allocation. These classes are subclasses of org.mmtk.utility.alloc.Allocator; in the case of MarkSweep, the class is called MarkSweepLocal. Instances of MarkSweepLocal are created as part of a mutator context, like this:

MSMutator.java

    protected MarkSweepLocal ms = new MarkSweepLocal(MS.msSpace);

The design pattern is that the local Allocator will allocate space from a thread-local buffer, and when that is exhausted it will allocate a new buffer from the global Space, performing appropriate locking. The constructor of the MarkSweepLocal specifies the space from which the allocator will allocate global memory.

Allocation

MMTk provides two methods for allocating an object. These are provided by the MSMutator class, to give each plan the opportunity to use fast, unsynchronized thread-local allocation before falling back to a slower, synchronized slow path. The version implemented in MarkSweep looks like this:

MSMutator.java

    public Address alloc(int bytes, int align, int offset, int allocator, int site) {
      if (allocator == MS.ALLOC_DEFAULT) {
        return ms.alloc(bytes, align, offset);
      }
      return super.alloc(bytes, align, offset, allocator, site);
    }

The basic structure of this method is common to all MMTk plans. First they decide whether the operation applies to this level of abstraction (if (allocator == MS.ALLOC_DEFAULT)), and if so, delegate to the appropriate place; otherwise they pass it up the chain to the super-class. In the case of MarkSweep, MSMutator delegates the allocation to its thread-local MarkSweepLocal object ms.
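
The "handle it here or pass it up the chain" structure of MSMutator.alloc can be tried out in isolation. The following is only a minimal sketch: BaseMutatorSketch, MarkSweepMutatorSketch, AllocDispatchDemo and the ALLOC_LARGE constant are hypothetical stand-ins invented for illustration, not MMTk classes, and the methods return a description string rather than an Address.

```java
/** Hypothetical base layer (not an MMTk class): knows one allocator, rejects the rest. */
class BaseMutatorSketch {
  static final int ALLOC_DEFAULT = 0;
  static final int ALLOC_LARGE = 1; // assumed extra allocator, for illustration only

  String alloc(int bytes, int allocator) {
    if (allocator == ALLOC_LARGE) {
      return "large-object allocator: " + bytes + " bytes";
    }
    // Top of the chain: nobody claimed this allocator.
    throw new IllegalArgumentException("No such allocator: " + allocator);
  }
}

/** Hypothetical plan layer: handles its own allocator and defers everything else upward. */
class MarkSweepMutatorSketch extends BaseMutatorSketch {
  @Override
  String alloc(int bytes, int allocator) {
    if (allocator == ALLOC_DEFAULT) {
      // This level of abstraction owns the default allocator.
      return "mark-sweep allocator: " + bytes + " bytes";
    }
    // Not ours: pass the request up the chain to the super-class.
    return super.alloc(bytes, allocator);
  }
}

class AllocDispatchDemo {
  public static void main(String[] args) {
    BaseMutatorSketch mutator = new MarkSweepMutatorSketch();
    System.out.println(mutator.alloc(16, BaseMutatorSketch.ALLOC_DEFAULT));
    System.out.println(mutator.alloc(8192, BaseMutatorSketch.ALLOC_LARGE));
  }
}
```

Each layer only names the allocators it created, so adding a new layer does not require touching the layers above it.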

The alloc method of MarkSweepLocal is inherited from SegregatedFreeListLocal (mark-sweep is not the only way of managing free-list allocation), and looks like this:

SegregatedFreeListLocal.java (simplified)

    public final Address alloc(int bytes, int align, int offset) {
      int sizeClass = getSizeClass(bytes);
      Address cell = freeList.get(sizeClass);
      if (!cell.isZero()) {
        freeList.set(sizeClass, cell.loadAddress());
        /* Clear the free list link */
        cell.store(Address.zero());
        return cell;
      }
      return allocSlow(bytes, align, offset);
    }

This is a standard pattern for thread-local allocation: first we look in the thread-local space (line 3), and if successful return the result (lines 4-8). If unsuccessful, we request space from the global policy via the method Allocator.allocSlow. This is the common interface that all Allocators use to request space from the global policy. This will eventually call the allocator-specific allocSlowOnce method. The workings of the allocSlowOnce method are very policy-specific, so it is not appropriate to look at them at this stage, but eventually all policies will attempt to acquire fresh virtual memory via the Space.acquire method. Space.acquire is the only correct way for a policy to allocate new virtual memory for its own use.

Space.java (simplified)

    public final Address acquire(int pages) {
      /* Check page budget */
      int pagesReserved = pr.reservePages(pages);

      /* Poll, either fixing budget or requiring GC */
      if (VM.activePlan.global().poll(false, this)) {
        VM.collection.blockForGC();
        return Address.zero(); // GC required, return failure
      }

      /* Page budget is ok, try to acquire virtual memory */
      Address rtn = pr.getNewPages(pagesReserved, pages, zeroed);
      if (rtn.isZero()) {
        /* Failed, so force a GC */
        boolean gcPerformed = VM.activePlan.global().poll(true, this);
        VM.collection.blockForGC();
        return Address.zero();
      }
      return rtn;
    }

The logic of Space.acquire is:

- First, poll the plan to find out whether the heap is full. This logic is performed by the plan, because it has knowledge of copy reserves etc.
- The 'poll' method will request a GC if required, and return true if it has done so.
- Then we wait for GC if required. 'poll' can't wait, because it is called in circumstances that aren't GC safe.
- If Plan.poll(...) returns false (we are within the allowed heap size), we call pr.getNewPages to allocate virtual memory. At this stage we can find that we have run out of virtual memory, and if so, we force a GC.
- If a GC is performed, we return Address.zero(), rather than retrying locally. In many plans, the next allocation request will be satisfied by re-using space in a page that already belongs to a policy, so the post-GC allocation must be performed further up in the call stack.

The retry logic is handled in Allocator.allocSlowInline.

Allocator.java (simplified)

    public final Address allocSlowInline(int bytes, int alignment, int offset) {
      boolean emergencyCollection = false;
      while (true) {
        Address result = allocSlowOnce(bytes, alignment, offset);
        if (!result.isZero()) {
          return result;
        }
        if (emergencyCollection) {
          VM.collection.outOfMemory();
        }
        emergencyCollection = Plan.isEmergencyCollection();
      }
    }

This code fragment shows the retry logic in the allocator. We try allocating using allocSlowOnce, which may recycle partially-used blocks and eventually call Space.acquire. If a GC occurred, we try again. Eventually the plan will request an emergency collection which will (for example) cause soft references to be dropped. If this fails we throw an OutOfMemoryError.

Collection

Scheduling

In a stop-the-world garbage collector like MarkSweep, the mutator threads run until memory is exhausted, then all mutator threads are suspended, the collector threads are activated, and they perform a garbage collection. After the GC is complete, the collector threads are suspended and the mutator threads resume.
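
To connect the retry loop in allocSlowInline with the stop-the-world cycle just described, a small single-threaded model may help. It is a sketch under loud assumptions: HeapSketch, RetryAllocatorSketch and RetryDemo are hypothetical names, the "heap" is just a counter of free and reclaimable cells, the collection runs inline on the allocating thread (standing in for the mutator being blocked while collectors run), and an emergency is declared simply when a collection reclaims nothing. None of this is MMTk's actual policy; only the loop shape mirrors the code above.

```java
/** Hypothetical heap model: free cells plus garbage that a collection can reclaim. */
class HeapSketch {
  private int freeCells;
  private int garbageCells;
  private boolean lastGcWasEmergency = false;
  int collections = 0;

  HeapSketch(int freeCells, int garbageCells) {
    this.freeCells = freeCells;
    this.garbageCells = garbageCells;
  }

  /** Loosely models Space.acquire: true on success, false if a GC ran instead. */
  boolean acquire() {
    if (freeCells > 0) {
      freeCells--;
      return true;
    }
    collect(); // the "mutator" is blocked while this runs
    return false; // report failure; the caller decides whether to retry
  }

  private void collect() {
    collections++;
    // Assumption: a collection that reclaims nothing counts as an emergency.
    lastGcWasEmergency = (garbageCells == 0);
    freeCells += garbageCells;
    garbageCells = 0;
  }

  boolean isEmergencyCollection() {
    return lastGcWasEmergency;
  }
}

/** Hypothetical allocator with the same loop shape as allocSlowInline. */
class RetryAllocatorSketch {
  private final HeapSketch heap;

  RetryAllocatorSketch(HeapSketch heap) {
    this.heap = heap;
  }

  void allocSlow() {
    boolean emergencyCollection = false;
    while (true) {
      if (heap.acquire()) {
        return; // allocation succeeded, possibly after a GC
      }
      if (emergencyCollection) {
        // Even an emergency collection did not help: give up.
        throw new OutOfMemoryError("sketch heap exhausted");
      }
      emergencyCollection = heap.isEmergencyCollection();
    }
  }
}

class RetryDemo {
  public static void main(String[] args) {
    HeapSketch heap = new HeapSketch(1, 2);
    RetryAllocatorSketch allocator = new RetryAllocatorSketch(heap);
    for (int i = 0; i < 3; i++) {
      allocator.allocSlow();
    }
    System.out.println("collections so far: " + heap.collections);
  }
}
```

In this model a fourth call to allocSlow finds nothing left to reclaim, retries once under the emergency flag, and then throws OutOfMemoryError.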

MMTk also has some support for concurrent collectors, in which one or more collector threads can be scheduled to run alongside the mutator, either exclusively or in addition to (hopefully briefer) stop-the-world phases.

Thread scheduling in MMTk is handled by a GC controller thread, implemented in the singleton class org.mmtk.plan.ControllerCollectorContext, which is held in the static field Plan.controlCollectorContext. Whenever a collection is initiated, it is done by calling methods on this object.

Initiating

As mentioned above, every attempt to allocate fresh virtual memory calls the current plan's poll(...) method. When a collection is required, poll initiates it by calling controlCollectorContext.request(), which in a stop-the-world collector like MarkSweep pauses the mutator threads and then wakes the collector threads. The main loop of the garbage collector is simply the run() method of ParallelCollector, shown below.

ParallelCollector

    public void run() {
      while (true) {
        park();
        collect();
      }
    }

The collect() method is specific to the type of collector, and in StopTheWorldCollector it looks like this:

StopTheWorldCollector

    public void collect() {
      Phase.beginNewPhaseStack(Phase.scheduleComplex(global().collection));
    }

Collector Phases

Every garbage collection consists of a series of steps. Each step is either executed once (e.g. updating the mark state before marking the heap), or in parallel on all available collector threads (e.g. the parallel mark phase). The actual work of a step is done by the collectionPhase method of the global, collector or mutator class of a plan.

In early versions of MMTk, the main collection method was a template method, calling individual methods for each phase of the collection. As the number of collectors in MMTk grew, this became unwieldy and has been replaced with a configurable mechanism of phases.

The class org.mmtk.plan.Simple defines the basic structure of most of MMTk's garbage collectors. First it defines the phases themselves:

Simple.java

    public static final short SET_COLLECTION_KIND = Phase.createSimple("set-collection-kind", null);
    public static final short INITIATE = Phase.createSimple("initiate", null);
    public static final short PREPARE = Phase.createSimple("prepare");
    ...

Each phase of the collection is represented by a 16-bit integer, an index into a table of Phase objects. Simple phases are scheduled, and combined into sequences, or complex phases.

Simple.java

    /** Ensure stacks are ready to be scanned */
    protected static final short prepareStacks = Phase.createComplex("prepare-stacks", null,
        Phase.scheduleMutator(PREPARE_STACKS),
        Phase.scheduleGlobal(PREPARE_STACKS));

A simple phase can be scheduled in one of 4 ways:

- Global. One collector thread is chosen to run the collectionPhase method of the global Plan object.
- Collector. All collector threads run collectionPhase of the plan's CollectorContext object(s).
- Mutator. The collector threads run in parallel and iterate over the available MutatorContext objects (i.e. the mutator threads), and run the mutator's collectionPhase method. Note that the collector threads are performing work on a per-mutator basis, because in general the mutator threads are stopped at this point.
- Concurrent. The controller is requested to start a concurrent collector thread.

Between every phase of a collection, the collector threads rendezvous at a synchronization barrier. The actual execution of a collector's phases is done in the method Phase.processPhaseStack. This method handles resuming a concurrent collection as well as running a full stop-the-world collection.

The actual work of a collection phase is done (as mentioned above) in the collectionPhase method of the major Plan classes.

MS.java

    @Inline
    @Override
    public void collectionPhase(short phaseId) {
      if (phaseId == PREPARE) {
        super.collectionPhase(phaseId);
        msTrace.prepare();
        msSpace.prepare(true);
        return;
      }

      if (phaseId == CLOSURE) {
        msTrace.prepare();
        return;
      }

      if (phaseId == RELEASE) {
        msTrace.release();
        msSpace.release();
        super.collectionPhase(phaseId);
        return;
      }

      super.collectionPhase(phaseId);
    }

This excerpt shows how the global MS plan implements collectionPhase, illustrating the key phases of a simple stop-the-world collector. The prepare phase performs tasks such as changing the mark state, the closure phase performs a transitive closure over the heap (the mark phase of a mark-sweep algorithm) and the release phase performs any post-collection steps. Where possible, a plan is structured so that each layer of inheritance deals only with the objects it creates, i.e. the MS class operates on the msSpace and delegates work on all other spaces to the super-class where they are defined. By convention the PREPARE phase is performed outside-in (super-class preparation first) and RELEASE is done inside-out (local first, super-class second).

Tracing the heap

The main operation of a tracing collector is the transitive closure operation, where all (or a subset) of the object graph is visited. Some collectors, such as generational collectors, perform these operations in more than one way, e.g. a nursery collection in a generational collector does not trace through pointers into the mature space, while a full-heap collection does. All MMTk collectors are designed to run using several parallel threads, using data structures that have unsynchronized thread-local and synchronized global components in the same way as MMTk's policy classes.

MMTk's trace operation uses the following terminology:

- An edge is a reference in the heap from a reference field to the object (or node) it points to.
- Tracing an object is the policy-defined operation performed by the collector on an object. In a mark-sweep policy this means setting the mark state of the object. In a copying policy this means moving the object to its new location.
- Scanning is the process of identifying the reference fields of an object and processing the objects reachable from each of them.

Each distinct transitive closure operation is defined as a subclass of TraceLocal. The closure is performed in the collectionPhase method of the plan-specific CollectorContext class:

MSCollector.java

    public void collectionPhase(short phaseId, boolean primary) {
      ...
      if (phaseId == MS.CLOSURE) {
        fullTrace.completeTrace();
        return;
      }
      ...
    }

The initial starting point for the closure is computed by the STACK_ROOTS and ROOTS phases, which add root locations to a buffer by calling TraceLocal.reportDelayedRootEdge. The closure operation proceeds by invoking traceObject on each root location (in method processRootEdge), and then invoking scanObject on each heap object encountered. Note that the CLOSURE operation is performed multiple times in each GC, due to processing of reference types.
|
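
The trace terminology used on this page (edge, tracing, scanning) can be illustrated with a very small mark-style closure. This is a hedged, single-threaded sketch: NodeSketch, TraceSketch and TraceDemo are hypothetical classes written for this example, the method names merely echo traceObject, scanObject and completeTrace, and nothing here reflects how MMTk's TraceLocal is actually implemented (which is parallel and uses shared work queues).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical heap object: a mark bit plus its outgoing reference fields (edges). */
class NodeSketch {
  final String name;
  final List<NodeSketch> fields = new ArrayList<>();
  boolean marked = false;

  NodeSketch(String name) {
    this.name = name;
  }
}

/** Hypothetical single-threaded trace; not MMTk's TraceLocal. */
class TraceSketch {
  private final Deque<NodeSketch> workQueue = new ArrayDeque<>();

  /** Tracing under a mark-sweep-like policy: set the mark state, enqueue on first visit. */
  NodeSketch traceObject(NodeSketch object) {
    if (object != null && !object.marked) {
      object.marked = true;
      workQueue.add(object);
    }
    return object;
  }

  /** Scanning: follow each reference field (edge) and trace its target. */
  void scanObject(NodeSketch object) {
    for (NodeSketch target : object.fields) {
      traceObject(target);
    }
  }

  /** Transitive closure: trace the roots, then scan until no work remains. */
  void completeTrace(List<NodeSketch> roots) {
    for (NodeSketch root : roots) {
      traceObject(root);
    }
    while (!workQueue.isEmpty()) {
      scanObject(workQueue.poll());
    }
  }
}

class TraceDemo {
  public static void main(String[] args) {
    NodeSketch a = new NodeSketch("a");
    NodeSketch b = new NodeSketch("b");
    NodeSketch unreachable = new NodeSketch("unreachable");
    a.fields.add(b);
    b.fields.add(a); // a cycle: the mark bit stops the trace from looping
    unreachable.fields.add(a);

    new TraceSketch().completeTrace(Arrays.asList(a));
    for (NodeSketch n : Arrays.asList(a, b, unreachable)) {
      System.out.println(n.name + " marked=" + n.marked);
    }
  }
}
```

Objects left unmarked after the closure are the ones a sweep would reclaim; a copying policy would replace the mark step with a move and return the new location from traceObject.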
||||||
|
Posted
15 days
ago
by
Robin Garner
Page edited by Robin Garner *** Work in progress ***This page gives a brief outline of the major control flows in the execution of a garbage collector in MMTk. For ... [More] simplicity, we focus on the MarkSweep collector, although much of the discussion will be relevant to other collectors. This page assumes you have a basic knowledge of garbage collection, for those that don't, please see one of the standard texts such as The Garbage Collection Handbook. Structure of a PlanAn MMTk Plan is required to provide 5 classes. They are required to have consistent names which start with the same name and have a suffix that indicates which class it inherits from. in the case of the MarkSweep plan, the name is "MS". MS - this is a singleton class that is a subclass of org.mmtk.plan.Plan. This class encapsulates data structures that are shared among multiple threads.MSMutator - subclass of org.mmtk.plan.MutatorContext. This class encapsulates data structures that are local to a single mutator thread. In the case of Jikes RVM, a Thread is actually a subclass of this class for efficiency reasons.MSCollector - subclass of org.mmtk.plan.CollectorContext. This provides thread-local data structures specific to a garbage collector thread.MSConstraints - subclass of org.mmtk.plan.PlanConstraints. This provides configuration information that the host virtual machine might need. It is separated out from the Plan class in order to prevent circular class loading dependencies.MSTraceLocal - subclass of org.mmtk.plan.TraceLocal. This provides thread-local data structures specific to a particular way of traversing the heap. In a simple collector like MarkSweep, there is only one of these classes, but in more complex collectors there may be several. 
For example, in a generational collector, there will be one TraceLocal class for a nursery collection, and another for a full-heap collection.The basic architecture of MMTk is that virtual address space is divided into chunks (of 4MB in a 32-bit memory model) that are managed according to a specific policy. A policy is implemented by an instance of the Space class, and it is in the policy class that the mechanics of a particular mechanism (like mark-sweep) is implemented. The task of a Plan is to create the policy (Space) objects that manage the heap, and to integrate them into the MMTk framework. MMTk exposes some of this memory management policy to the host VM, by allowing the VM to specify an allocator (represented by a small integer) when allocating space. The interface exposed to the VM allows it to choose whether an object will move during collection or not, whether the object is large enough to require special handling etc. The MMTk plan is free (within the semantic guarantees exposed to the VM) to direct each of these allocators to a particular policy.PoliciesA policy describes how a range of virtual address space is managed. The base class of all policies is org.mmtk.policy.Space, and a particular instance of a policy is known generically as a space. The static initializer of a Plan and its subclasses define the spaces that make up an MMTk plan. MS.java public static final MarkSweepSpace msSpace = new MarkSweepSpace("ms", VMRequest.create()); public static final int MARK_SWEEP = msSpace.getDescriptor(); In this code fragment, we see the MS plan defined. Note that we generally also define a static final space descriptor. This is an optimization that allows some rapid operations on spaces. A Space is a global object, shared among multiple mutator threads. Each policy will also have one or more thread-local classes which provide unsynchronized allocation. 
These classes are subclasses of org.mmtk.utility.alloc.Allocator, and in the case of MarkSweep, it is called MarkSweepLocal. Instances of MarkSweepLocal are created as part of a mutator context, like this MSMutator.java protected MarkSweepLocal ms = new MarkSweepLocal(MS.msSpace); The design pattern is that the local Allocator will allocate space from a thread-local buffer, and when that is exhausted it will allocate a new buffer from the global Space, performing appropriate locking. The constructor of the MarkSweepLocal specifies the space from which the allocator will allocate global memory. AllocationMMTk provides two methods for allocating an object. These are provided by the MSMutator class, to give each plan the opportunity to use fast, unsynchronized thread-local allocation before falling back to a slower synchronized slow-path. The version implemented in MarkSweep looks like this: MSMutator.java public Address alloc(int bytes, int align, int offset, int allocator, int site) { if (allocator == MS.ALLOC_DEFAULT) { return ms.alloc(bytes, align, offset); } return super.alloc(bytes, align, offset, allocator, site); } The basic structure of this method is common to all MMTk plans. First they decide whether the operation applies to this level of abstraction (if (allocator == MS.ALLOC_DEFAULT)), and if so, delegate to the appropriate place, otherwise pass it up the chain to the super-class. In the case of MarkSweep, MSMutator delegates the allocation to its thread-local MarkSweepLocal object ms. 
The alloc method of MarkSweepLocal is inherited from SegregatedFreeListLocal (mark-sweep is not the only way of managing free-list allocation), and looks like this SegregatedFreeListLocal.java (simplified) public final Address alloc(int bytes, int align, int offset) { int sizeClass = getSizeClass(bytes); Address cell = freeList.get(sizeClass); if (!cell.isZero()) { freeList.set(sizeClass, cell.loadAddress()); /* Clear the free list link */ cell.store(Address.zero()); return cell; } return allocSlow(bytes, align, offset); } This is a standard pattern for thread-local allocation: first we look in the thread-local space (line 3), and if successful return the result (lines 4-8). If unsuccessful, we request space from the global policy via the method Allocator.allocSlow. This is the common interface that all Allocators use to request space from the global policy. This will eventually call the allocator-specific allocSlowOnce method. The workings of the allocSlowOnce method are very policy-specific, so not appropriate to look at at this stage, but eventually all policies will attempt to acquire fresh virtual memory via the Space.acquire method. Space.acquire is the only correct way for a policy to allocate new virtual memory for its own use. Space.java (simplified) public final Address acquire(int pages) { pr.reservePages(pages); /* Poll, either fixing budget or requiring GC */ if (VM.activePlan.global().poll(false, this)) { VM.collection.blockForGC(); return Address.zero(); // GC required, return failure } /* Page budget is ok, try to acquire virtual memory */ Address rtn = pr.getNewPages(pagesReserved, pages, zeroed); if (rtn.isZero()) { /* Failed, so force a GC */ boolean gcPerformed = VM.activePlan.global().poll(true, this); VM.collection.blockForGC(); return Address.zero(); } return rtn; } The logic of space.acquire is: First, poll the plan to find out whether the heap is full. 
This logic is performed by the plan, because it has knowledge of copy reserves etc.The 'poll' method will request a GC if required, and return true if it has done so.Then we wait for GC if required. 'poll' can't wait, because it is called in circumstances that aren't GC safe.If Plan.poll(...) returns false (we are within the allowed heap size), we call pr.getNewPages to allocate virtual memory. At this stage we can find that we have run out of virtual memory, and if so, we force a GCIf a GC is performed, we return Address.zero(), rather than retrying locally. In many plans, the next allocation request will be satisfied by re-using space in a page that already belongs to a policy, so the post-GC allocation must be performed further up in the call stack. The retry logic is handled in Allocator.allocSlowInline. Allocator.java (simplified) public final Address allocSlowInline(int bytes, int alignment, int offset) { boolean emergencyCollection = false; while (true) { Address result = allocSlowOnce(bytes, alignment, offset); if (!result.isZero()) { return result; } if (emergencyCollection) { VM.collection.outOfMemory(); } emergencyCollection = Plan.isEmergencyCollection(); } } This code fragment shows the retry logic in the allocator. We try allocating using allocSlowOnce, which may recycle partially-used blocks and eventually call Space.acquire. If a GC occurred, we try again. Eventually the plan will request an emergency collection which will (for example) cause soft references to be dropped. If this fails we throw an OutOfMemoryError. CollectionSchedulingIn a stop-the-world garbage collector like MarkSweep, the mutator threads run until memory is exhausted, then all mutator threads are suspended, the collector threads are activated, and they perform a garbage collection. After the GC is complete, the collector threads are suspended and the mutator threads resume. 
MMTk also has some support for concurrent collectors, in which one or more collector threads can be scheduled to run alongside the mutator, either exclusively or in addition to (hopefully briefer) stop-the-world phases. Thread scheduling in MMTk is handled by a GC controller thread, implemented in the singleton class org.mmtk.plan.ControllerCollectorContext held in the static field Plan.controlCollectorContext. Whenever a collection is initiated, it is done by calling methods on this object. InitiatingAs mentioned above, every attempt to allocate fresh virtual memory calls the current plan's poll(...) method. This initiates a GC by calling controlCollectorContext.request(), which in a stop-the-world collector like MarkSweep pauses the mutator threads and then wakes the collector threads. The main loop of the garbage collector is simply the run() method of ParallelCollector, shown below. ParallelCollector public void run() { while(true) { park(); collect(); } } The collect() method is specific to the type of collector, and in StopTheWorldCollector it looks like this StopTheWorldCollector public void collect() { Phase.beginNewPhaseStack(Phase.scheduleComplex(global().collection)); } Collector PhasesEvery garbage collection consists of a series of steps. Each step is either executed once (e.g. updating the mark state before marking the heap), or in parallel on all available collector threads (e.g. the parallel mark phase). The actual work of a step is done by the collectionPhase method of the global, collector or mutator class of a plan. In early versions of MMTk, the main collection method was a template method, calling individual methods for each phase of the collection. As the number of collectors in MMTk grew, this became unwieldy and has been replaced with a configurable mechanism of phases. The class org.mmtk.plan.Simple defines the basic structure of most of MMTk's garbage collectors. 
First it defines the phases themselves, Simple.java public static final short SET_COLLECTION_KIND = Phase.createSimple("set-collection-kind", null); public static final short INITIATE = Phase.createSimple("initiate", null); public static final short PREPARE = Phase.createSimple("prepare"); ... Each phase of the collection is represented by a 16-bit integer, an index into a table of Phase objects. Simple phases are scheduled, and combined into sequences, or complex phases. Simple.java /** Ensure stacks are ready to be scanned */ protected static final short prepareStacks = Phase.createComplex("prepare-stacks", null, Phase.scheduleMutator (PREPARE_STACKS), Phase.scheduleGlobal (PREPARE_STACKS)); A simple phase can be scheduled in one of 4 ways: Global. One collector thread is chosen to run the collectionPhase method of the global Plan object.Collector. All collector threads run collectionPhase of the plan's CollectorContext object(s).Mutator. The collector threads run in parallel and iterate over the available MutatorContext objects (ie the mutator threads), and run the mutator's collectionPhase method. Note that the collector threads are performing work on a per-mutator basis, because in general the mutator threads are stopped at this point.Concurrent. The controller is requested to start a concurrent collectcor thread.Between every phase of a collection, the collector threads rendezvous at a synchronization barrier. The actual execution of a collector's phases is done in the method Phase.processPhaseStack. This method handles resuming a concurrent collection as well as running a full stop-the-world collection. The actual work of a collection phase is done (as mentioned above) in the collectionPhase method of the major Plan classes. 
MS.java

    @Inline
    @Override
    public void collectionPhase(short phaseId) {
      if (phaseId == PREPARE) {
        super.collectionPhase(phaseId);
        msTrace.prepare();
        msSpace.prepare(true);
        return;
      }
      if (phaseId == CLOSURE) {
        msTrace.prepare();
        return;
      }
      if (phaseId == RELEASE) {
        msTrace.release();
        msSpace.release();
        super.collectionPhase(phaseId);
        return;
      }
      super.collectionPhase(phaseId);
    }

This excerpt shows how the global MS plan implements collectionPhase, illustrating the key phases of a simple stop-the-world collector. The prepare phase performs tasks such as changing the mark state, the closure phase performs a transitive closure over the heap (the mark phase of a mark-sweep algorithm) and the release phase performs any post-collection steps. Where possible, a plan is structured so that each layer of inheritance deals only with the objects it creates, i.e. the MS class operates on the msSpace and delegates work on all other spaces to the super-class where they are defined. By convention the PREPARE phase is performed outside-in (super-class preparation first) and RELEASE is done inside-out (local first, super-class second).

Tracing the heap

The main operation of a tracing collector is the transitive closure operation where all (or a subset) of the object graph is visited. Some collectors such as generational collectors perform these operations in more than one way, e.g. a nursery collection in a generational collector does not trace through pointers into the mature space, while a full-heap collection does. All MMTk collectors are designed to run using several parallel threads, using data structures that have unsynchronized thread-local and synchronized global components in the same way as MMTk's policy classes.

MMTk's trace operation uses the following terminology:
- An edge is a reference in the heap from one reference field to the object (or node) it points to.
- Tracing an object is the policy-defined operation performed by the collector on an object.
In a mark-sweep policy this means setting the mark state of the object. In a copying policy this means moving the object to its new location.
- Scanning ...

Each distinct transitive closure operation is defined as a subclass of TraceLocal. The closure is performed in the collectionPhase method of the CollectorContext class.

MSCollector.java

    public void collectionPhase(short phaseId, boolean primary) {
      ...
      if (phaseId == MS.CLOSURE) {
        fullTrace.completeTrace();
        return;
      }
      ...
    }

The initial starting point for the closure is computed by the STACK_ROOTS and ROOTS phases, which add root locations to a buffer by calling TraceLocal.reportDelayedRootEdge. The closure operation proceeds by invoking traceObject on each root location (in method processRootEdge), and then invoking scanObject on each heap object encountered.
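The closure described above (trace each root, then scan every object reached) can be illustrated with a small single-threaded sketch. This is not MMTk code: the Node class, the work queue and the demo graph are hypothetical stand-ins, and only the method names traceObject, scanObject and completeTrace are borrowed from the text. The policy assumed here is mark-sweep, so tracing an object just sets its mark bit.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Toy single-threaded transitive closure; a sketch, not MMTk's TraceLocal. */
final class ClosureSketch {
  /** Hypothetical heap object: a name, outgoing reference fields, and a mark bit. */
  static final class Node {
    final String name;
    final List<Node> fields = new ArrayList<>();
    boolean marked;
    Node(String name) { this.name = name; }
  }

  private final Deque<Node> workQueue = new ArrayDeque<>();
  int markedCount;

  /** Trace one edge: mark the target the first time it is seen and queue it for scanning. */
  void traceObject(Node target) {
    if (target != null && !target.marked) {
      target.marked = true;
      markedCount++;
      workQueue.add(target);
    }
  }

  /** Scan one object: trace every edge leaving it. */
  void scanObject(Node object) {
    for (Node target : object.fields) {
      traceObject(target);
    }
  }

  /** Trace the roots, then scan queued objects until no work remains. */
  void completeTrace(List<Node> roots) {
    for (Node root : roots) {
      traceObject(root);
    }
    while (!workQueue.isEmpty()) {
      scanObject(workQueue.poll());
    }
  }

  /** Builds the cycle a -> b -> c -> a plus an unreachable d; returns the number of marked objects. */
  static int demo() {
    Node a = new Node("a");
    Node b = new Node("b");
    Node c = new Node("c");
    Node d = new Node("d");
    a.fields.add(b);
    b.fields.add(c);
    c.fields.add(a);
    d.fields.add(a); // d points into the graph, but nothing points to d
    ClosureSketch trace = new ClosureSketch();
    trace.completeTrace(Collections.singletonList(a));
    return trace.markedCount;
  }

  public static void main(String[] args) {
    System.out.println("marked objects: " + demo());
  }
}
```

The mark bit doubles as the "already visited" check, which is what lets the closure terminate on cyclic graphs. A parallel collector would need the shared work queue to be synchronized; that is left out of this sketch.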
Posted 18 days ago by Robin Garner
Page edited by Robin Garner

*** Work in progress ***

This page gives a brief outline of the major control flows in the execution of a garbage collector in MMTk. For simplicity, we focus on the MarkSweep collector, although much of the discussion will be relevant to other collectors. This page assumes you have a basic knowledge of garbage collection; if you don't, please see one of the standard texts such as The Garbage Collection Handbook.

Structure of a Plan

An MMTk Plan is required to provide 5 classes. Their names must be consistent: they all start with the same prefix and end with a suffix that indicates which class each one inherits from. In the case of the MarkSweep plan, the prefix is "MS".
- MS - this is a singleton class that is a subclass of org.mmtk.plan.Plan. This class encapsulates data structures that are shared among multiple threads.
- MSMutator - subclass of org.mmtk.plan.MutatorContext. This class encapsulates data structures that are local to a single mutator thread. In the case of Jikes RVM, a Thread is actually a subclass of this class for efficiency reasons.
- MSCollector - subclass of org.mmtk.plan.CollectorContext. This provides thread-local data structures specific to a garbage collector thread.
- MSConstraints - subclass of org.mmtk.plan.PlanConstraints. This provides configuration information that the host virtual machine might need. It is separated out from the Plan class in order to prevent circular class loading dependencies.
- MSTraceLocal - subclass of org.mmtk.plan.TraceLocal. This provides thread-local data structures specific to a particular way of traversing the heap. In a simple collector like MarkSweep, there is only one of these classes, but in more complex collectors there may be several.
For example, in a generational collector, there will be one TraceLocal class for a nursery collection, and another for a full-heap collection.

The basic architecture of MMTk is that virtual address space is divided into chunks (of 4MB in a 32-bit memory model) that are managed according to a specific policy. A policy is implemented by an instance of the Space class, and it is in the policy class that the mechanics of a particular mechanism (like mark-sweep) are implemented. The task of a Plan is to create the policy (Space) objects that manage the heap, and to integrate them into the MMTk framework. MMTk exposes some of this memory management policy to the host VM, by allowing the VM to specify an allocator (represented by a small integer) when allocating space. The interface exposed to the VM allows it to choose whether an object will move during collection or not, whether the object is large enough to require special handling etc. The MMTk plan is free (within the semantic guarantees exposed to the VM) to direct each of these allocators to a particular policy.

Policies

A policy describes how a range of virtual address space is managed. The base class of all policies is org.mmtk.policy.Space, and a particular instance of a policy is known generically as a space. The static initializers of a Plan and its subclasses define the spaces that make up an MMTk plan.

MS.java

    public static final MarkSweepSpace msSpace = new MarkSweepSpace("ms", VMRequest.create());
    public static final int MARK_SWEEP = msSpace.getDescriptor();

In this code fragment, we see the mark-sweep space of the MS plan defined. Note that we generally also define a static final space descriptor. This is an optimization that allows some rapid operations on spaces. A Space is a global object, shared among multiple mutator threads. Each policy will also have one or more thread-local classes which provide unsynchronized allocation.
These classes are subclasses of org.mmtk.utility.alloc.Allocator, and in the case of MarkSweep, it is called MarkSweepLocal. Instances of MarkSweepLocal are created as part of a mutator context, like this MSMutator.java protected MarkSweepLocal ms = new MarkSweepLocal(MS.msSpace); The design pattern is that the local Allocator will allocate space from a thread-local buffer, and when that is exhausted it will allocate a new buffer from the global Space, performing appropriate locking. The constructor of the MarkSweepLocal specifies the space from which the allocator will allocate global memory. AllocationMMTk provides two methods for allocating an object. These are provided by the MSMutator class, to give each plan the opportunity to use fast, unsynchronized thread-local allocation before falling back to a slower synchronized slow-path. The version implemented in MarkSweep looks like this: MSMutator.java public Address alloc(int bytes, int align, int offset, int allocator, int site) { if (allocator == MS.ALLOC_DEFAULT) { return ms.alloc(bytes, align, offset); } return super.alloc(bytes, align, offset, allocator, site); } The basic structure of this method is common to all MMTk plans. First they decide whether the operation applies to this level of abstraction (if (allocator == MS.ALLOC_DEFAULT)), and if so, delegate to the appropriate place, otherwise pass it up the chain to the super-class. In the case of MarkSweep, MSMutator delegates the allocation to its thread-local MarkSweepLocal object ms. 
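The layered dispatch described above, where each level handles the allocators it owns and passes everything else to its super-class, can be sketched in isolation. The class names, the allocator ids and the string return values below are hypothetical; real MMTk code returns an Address and takes more parameters.

```java
/** Toy allocator dispatch; all names and ids here are illustrative, not MMTk's. */
final class AllocDispatchSketch {
  static final int ALLOC_IMMORTAL = 0; // hypothetical id owned by the base layer
  static final int ALLOC_DEFAULT = 1;  // hypothetical id owned by the mark-sweep layer

  /** Base layer: handles the allocators it owns and rejects everything else. */
  static class BaseMutator {
    String alloc(int bytes, int allocator) {
      if (allocator == ALLOC_IMMORTAL) {
        return "immortal space";
      }
      throw new IllegalArgumentException("unknown allocator " + allocator);
    }
  }

  /** Plan-specific layer: handles its own allocator, then passes the request up the chain. */
  static final class MarkSweepMutator extends BaseMutator {
    @Override
    String alloc(int bytes, int allocator) {
      if (allocator == ALLOC_DEFAULT) {
        return "mark-sweep space";
      }
      return super.alloc(bytes, allocator);
    }
  }

  /** Returns the name of the space that would serve a 16-byte request for the given allocator. */
  static String route(int allocator) {
    return new MarkSweepMutator().alloc(16, allocator);
  }

  public static void main(String[] args) {
    System.out.println(route(ALLOC_DEFAULT));
    System.out.println(route(ALLOC_IMMORTAL));
  }
}
```

Because every layer only tests for its own ids, a new plan can add an allocator without touching the classes above it.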
The alloc method of MarkSweepLocal is inherited from SegregatedFreeListLocal (mark-sweep is not the only way of managing free-list allocation), and looks like this:

SegregatedFreeListLocal.java (simplified)

    public final Address alloc(int bytes, int align, int offset) {
      int sizeClass = getSizeClass(bytes);
      Address cell = freeList.get(sizeClass);
      if (!cell.isZero()) {
        freeList.set(sizeClass, cell.loadAddress());
        /* Clear the free list link */
        cell.store(Address.zero());
        return cell;
      }
      return allocSlow(bytes, align, offset);
    }

This is a standard pattern for thread-local allocation: first we look in the thread-local space (line 3), and if successful return the result (lines 4-8). If unsuccessful, we request space from the global policy via the method Allocator.allocSlow. This is the common interface that all Allocators use to request space from the global policy. This will eventually call the allocator-specific allocSlowOnce method. The workings of the allocSlowOnce method are very policy-specific, so it is not appropriate to look at them at this stage, but eventually all policies will attempt to acquire fresh virtual memory via the Space.acquire method. Space.acquire is the only correct way for a policy to allocate new virtual memory for its own use.

Space.java (simplified)

    public final Address acquire(int pages) {
      int pagesReserved = pr.reservePages(pages);
      /* Poll, either fixing budget or requiring GC */
      if (VM.activePlan.global().poll(false, this)) {
        VM.collection.blockForGC();
        return Address.zero(); // GC required, return failure
      }
      /* Page budget is ok, try to acquire virtual memory */
      Address rtn = pr.getNewPages(pagesReserved, pages, zeroed);
      if (rtn.isZero()) {
        /* Failed, so force a GC */
        boolean gcPerformed = VM.activePlan.global().poll(true, this);
        VM.collection.blockForGC();
        return Address.zero();
      }
      return rtn;
    }

The logic of Space.acquire is:
- First, poll the plan to find out whether the heap is full. This logic is performed by the plan, because it has knowledge of copy reserves etc.
- The 'poll' method will request a GC if required, and return true if it has done so.
- Then we wait for GC if required. 'poll' can't wait, because it is called in circumstances that aren't GC safe.
- If Plan.poll(...) returns false (we are within the allowed heap size), we call pr.getNewPages to allocate virtual memory. At this stage we can find that we have run out of virtual memory, and if so, we force a GC.
- If a GC is performed, we return Address.zero(), rather than retrying locally. In many plans, the next allocation request will be satisfied by re-using space in a page that already belongs to a policy, so the post-GC allocation must be performed further up in the call stack. The retry logic is handled in Allocator.allocSlowInline.

Allocator.java (simplified)

    public final Address allocSlowInline(int bytes, int alignment, int offset) {
      boolean emergencyCollection = false;
      while (true) {
        Address result = allocSlowOnce(bytes, alignment, offset);
        if (!result.isZero()) {
          return result;
        }
        if (emergencyCollection) {
          VM.collection.outOfMemory();
        }
        emergencyCollection = Plan.isEmergencyCollection();
      }
    }

This code fragment shows the retry logic in the allocator. We try allocating using allocSlowOnce, which may recycle partially-used blocks and eventually call Space.acquire. If a GC occurred, we try again. Eventually the plan will request an emergency collection which will (for example) cause soft references to be dropped. If this fails we throw an OutOfMemoryError.

Collection

Scheduling

In a stop-the-world garbage collector like MarkSweep, the mutator threads run until memory is exhausted, then all mutator threads are suspended, the collector threads are activated, and they perform a garbage collection. After the GC is complete, the collector threads are suspended and the mutator threads resume.
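The interplay described above (allocate until the page budget is exhausted, collect, then retry the allocation further up the call stack) can be modelled with a toy page counter. Everything in this sketch is a simplifying assumption: it is single-threaded, the "collection" just subtracts pages that the caller has declared garbage, and a second consecutive failure stands in for the emergency-collection case. None of the names below are MMTk APIs.

```java
/** Toy page-budget model of poll, collect and retry; not MMTk code. */
final class PollRetrySketch {
  private final int budgetPages;
  private int reservedPages;
  private int garbagePages; // pages a collection could reclaim in this toy model
  int collections;

  PollRetrySketch(int budgetPages) {
    this.budgetPages = budgetPages;
  }

  /** Declares that some currently reserved pages no longer hold live data. */
  void markGarbage(int pages) {
    garbagePages = Math.min(reservedPages, garbagePages + pages);
  }

  /** Returns true when granting the request would exceed the budget, i.e. a GC is required. */
  private boolean poll(int pages) {
    return reservedPages + pages > budgetPages;
  }

  /** Stand-in for a stop-the-world collection: reclaim everything declared garbage. */
  private void collect() {
    reservedPages -= garbagePages;
    garbagePages = 0;
    collections++;
  }

  /** One attempt: either grant the pages, or run a collection and report failure. */
  private boolean acquireOnce(int pages) {
    if (poll(pages)) {
      collect();
      return false; // like returning Address.zero(): the caller must retry
    }
    reservedPages += pages;
    return true;
  }

  /** Retry loop: a second failure in a row is treated as out of memory. */
  boolean acquire(int pages) {
    boolean retried = false;
    while (true) {
      if (acquireOnce(pages)) {
        return true;
      }
      if (retried) {
        return false;
      }
      retried = true;
    }
  }

  public static void main(String[] args) {
    PollRetrySketch heap = new PollRetrySketch(4);
    System.out.println("first request granted: " + heap.acquire(3));
    heap.markGarbage(2);
    System.out.println("second request granted: " + heap.acquire(3));
    System.out.println("collections so far: " + heap.collections);
  }
}
```

Keeping the retry in the outer loop rather than inside acquireOnce mirrors the point made in the text: after a collection, the request may be satisfiable in a different way than the one that failed.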
MMTk also has some support for concurrent collectors, in which one or more collector threads can be scheduled to run alongside the mutator, either exclusively or in addition to (hopefully briefer) stop-the-world phases. Thread scheduling in MMTk is handled by a GC controller thread, implemented in the singleton class org.mmtk.plan.ControllerCollectorContext held in the static field Plan.controlCollectorContext. Whenever a collection is initiated, it is done by calling methods on this object. InitiatingAs mentioned above, every attempt to allocate fresh virtual memory calls the current plan's poll(...) method. This initiates a GC by calling controlCollectorContext.request(), which in a stop-the-world collector like MarkSweep pauses the mutator threads and then wakes the collector threads. The main loop of the garbage collector is simply the run() method of ParallelCollector, shown below. ParallelCollector public void run() { while(true) { park(); collect(); } } The collect() method is specific to the type of collector, and in StopTheWorldCollector it looks like this StopTheWorldCollector public void collect() { Phase.beginNewPhaseStack(Phase.scheduleComplex(global().collection)); } Collector PhasesEvery garbage collection consists of a series of steps. Each step is either executed once (e.g. updating the mark state before marking the heap), or in parallel on all available collector threads (e.g. the parallel mark phase). In early versions of MMTk, the main collection method was a template method, calling individual methods for each phase of the collection. As the number of collectors in MMTk grew, this became unwieldy and has been replaced with a configurable mechanism of phases. The class org.mmtk.plan.Simple defines the basic structure of most of MMTk's garbage collectors. 
First it defines the phases themselves:

Simple.java

    public static final short SET_COLLECTION_KIND = Phase.createSimple("set-collection-kind", null);
    public static final short INITIATE = Phase.createSimple("initiate", null);
    public static final short PREPARE = Phase.createSimple("prepare");
    ...

Each phase of the collection is represented by a 16-bit integer, an index into a table of Phase objects.
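The idea of a phase id being a 16-bit index into a table of Phase objects can be illustrated with a minimal registry. This is a sketch under stated assumptions, not the real org.mmtk.plan.Phase: the one-argument createSimple, the nameOf helper and the list-backed table are hypothetical simplifications.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy phase registry: phases live in a table and are referred to by a short id. */
final class PhaseTableSketch {
  /** A phase here only carries a name; real phases carry scheduling information too. */
  static final class Phase {
    final String name;
    Phase(String name) { this.name = name; }
  }

  // Declared before the phase constants so that it is initialized first.
  private static final List<Phase> TABLE = new ArrayList<>();

  static final short PREPARE = createSimple("prepare");
  static final short CLOSURE = createSimple("closure");
  static final short RELEASE = createSimple("release");

  /** Registers a phase and returns its table index as a 16-bit id. */
  static short createSimple(String name) {
    if (TABLE.size() >= Short.MAX_VALUE) {
      throw new IllegalStateException("phase table is full");
    }
    TABLE.add(new Phase(name));
    return (short) (TABLE.size() - 1);
  }

  /** Looks a phase up by id and returns its name. */
  static String nameOf(short phaseId) {
    return TABLE.get(phaseId).name;
  }

  public static void main(String[] args) {
    System.out.println(PREPARE + " -> " + nameOf(PREPARE));
    System.out.println(RELEASE + " -> " + nameOf(RELEASE));
  }
}
```

Because ids are small integers, a collectionPhase method can compare them with == as in the plan code, and schedules can be stored compactly.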
Posted 18 days ago by Robin Garner
Page edited by Robin Garner

The garbage collectors for Jikes RVM are provided by MMTk. The page "MMTk: The Memory Manager Toolkit" describes MMTk, gives a tutorial on how to use and edit it, and is the best place to start. A detailed description of the call chain from the compilers through to MMTk is another good place to start understanding how MMTk integrates with Jikes RVM.

The RVM can be configured to employ various different allocation managers taken from the MMTk memory management toolkit. Managers divide the available space up as they see fit. However, they normally subdivide the available address range to provide:
- a metadata area which enables the manager to track the status of allocated and unallocated storage in the rest of the heap.
- an immortal data area used to service allocations of objects which are expected to persist across the whole lifetime of the RVM runtime.
- a large object space used to service allocations of objects which are larger than some specified size (e.g. a virtual memory page) - the large object space may employ a different allocation and reclamation strategy to that used for other objects.
- a small object allocation area which may be divided into e.g. two semi spaces, a nursery space and a mature space, a set of generations, a non-relocatable buddy hierarchy etc. depending upon the allocation and reclamation strategy employed by the memory manager.

Virtual memory pages are lazily mapped into the RVM's memory image as they are needed.

The main class which is used to interface to the memory manager is called Plan. Each flavor of the manager is implemented by substituting a different implementation of this class. Most plans inherit from class StopTheWorldGC which ensures that all active mutator threads (i.e. ones which do not perform the job of reclaiming storage) are suspended before reclamation is commenced. The argument passed to -X:processors determines the number of parallel collector threads that will be used for collection.
Generational collectors employ a plan which inherits from class Generational. Inter alia, this class ensures that a write barrier is employed so that updates from old to new spaces are detected.

The RVM does not currently support concurrent garbage collection.

Jikes RVM may also use the GCSpy visualization framework. GCSpy allows developers to observe the behavior of the heap and related data structures.
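The write barrier mentioned above for generational plans can be sketched as follows. This is a generic illustration of an old-to-new barrier with a remembered set, not the barrier that MMTk's Generational class installs; the Obj class, the inNursery flag and the list-based remembered set are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy generational write barrier; a sketch of the general technique only. */
final class WriteBarrierSketch {
  /** Hypothetical object with one reference field and a flag for the space it lives in. */
  static final class Obj {
    final boolean inNursery;
    Obj field;
    Obj(boolean inNursery) { this.inNursery = inNursery; }
  }

  /** Mature objects that may hold a pointer into the nursery. */
  final List<Obj> rememberedSet = new ArrayList<>();

  /** Every reference store goes through here; old-to-new stores are recorded first. */
  void writeField(Obj source, Obj target) {
    if (!source.inNursery && target != null && target.inNursery) {
      rememberedSet.add(source);
    }
    source.field = target;
  }

  /** Performs three stores and returns how many were recorded by the barrier. */
  static int demo() {
    Obj oldA = new Obj(false);
    Obj oldB = new Obj(false);
    Obj young = new Obj(true);
    WriteBarrierSketch barrier = new WriteBarrierSketch();
    barrier.writeField(oldA, young); // old -> new: recorded
    barrier.writeField(young, oldA); // new -> old: not recorded
    barrier.writeField(oldB, oldA);  // old -> old: not recorded
    return barrier.rememberedSet.size();
  }

  public static void main(String[] args) {
    System.out.println("remembered entries: " + demo());
  }
}
```

A nursery collection can then treat the remembered set as additional roots instead of scanning the whole mature space.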
Posted 21 days ago by Erik Brangs
Page edited by Erik Brangs - "added Carl Ritson"

Jikes RVM Team

The Jikes RVM team is responsible for creating or applying the improvements that go into the releases of Jikes RVM. The current team is:
- Steve Blackburn, Australian National University (Jikes RVM steering committee)
- Michael Bond, Ohio State University
- Erik Brangs
- Robin Garner, Australian National University
- David Grove, IBM Research (Jikes RVM steering committee)
- Andrew John Hughes, Red Hat
- J. Eliot B. Moss, University of Massachusetts

Past Jikes RVM Team Members

The following people have greatly contributed to the success of the Jikes RVM project by having served as Jikes RVM team members:
- Steve Augart
- Perry Cheng, IBM Research
- Julian Dolby, IBM Research
- Peter Donald, La Trobe University
- Stephen Fink, IBM Research
- Daniel Frampton, Australian National University
- Matthias Hauswirth, University of Colorado at Boulder
- Michael Hind, IBM Research
- Chris Hoffmann, University of Massachusetts
- Filip Pizlo, Purdue University
- Feng Qian, McGill University
- Ian Rogers, University of Manchester
- Peter F. Sweeney, IBM Research
- Kris Venstermans, University of Ghent

Contributions From Jikes RVM Community

The Jikes RVM project sincerely thanks the following people who have made contributions to the system: Eddie Aftandilian, Steven Augart, Michael Baer, James Bornholt, Greg Borota, Erik Brangs, Shane Brewer, Brian D. Carlstrom, Peter Donald, Philippe Faes, Da Feng, Chapman Flack, Daniel Frampton, Robin Garner, Georgios Gousios, Andrew Gray, Jungwoo Ha, Matthias Hauswirth, Laurence Hellyer, Matthew Hertz, Mark Hindess, Chris Hoffmann, Kenneth Hoste, Xianglong Huang, Richard Jones, Garrett Kolpin, Christos-Efthymois Kotselidis, Sergiy Kyrylkov, Alan Lawrence, Kien Le, Han Lee, John Leuner, Lukas Loehrer, Dmitri Makarov, Avery Moon, J. Eliot B. Moss, Elias Naur, Anders Biehl Norgaard, Jeff Palm, Tuan Phan, Filip Pizlo, Kathiravelu Pradeeban, Carl Ritson, João Reys Santos, Andreas Sewe, Rifat Shahriyar, Aleksey Shipilev, Jeremy Singer, Stephen Smaldone, Sunil Soman, Darko Stefanovic, Suryia Subramanian, Tom VanDrunen, Ian Warrington, Mark Wielaard, Xi Yang, Yuval Yarom, Lingli Zhang, Jisheng Zhao, Lei Zhao.

Software Used by Jikes RVM

Jikes RVM uses either the class libraries produced by the GNU Classpath project or Apache Harmony. Thanks to David R. Hanson, Christopher W. Fraser, and Todd Proebsting for making available the iburg tool, which we've enhanced for use in Jikes RVM. Thanks to Codehaus and Sourceforge for providing hosting services.

The Jalapeño Research Project

Jikes RVM was independently developed as part of the Jalapeño research project at the IBM T.J. Watson Research Center. The following IBM Research employees, academic visitors, and student co-ops contributed to the early releases of Jikes RVM: Bowen Alpern, Matthew Arnold, Dick Attanasio, David Bacon, John J. Barton, Steve Blackburn, Rastislav Bodik, Maria Butrico, Perry Cheng, Jong-Deok Choi, Anthony Cocchi, Julian Dolby, Tracy Ferguson, Stephen Fink, Eugene Gluzberg, David Grove, Michael Hind, Dave Hovemeyer, Susan Hummel, Sergiy Kyrylkov, Han Lee, Derek Lieber, Mark Mergen, Ton Ngo, Jeff Palm, Igor Pechtchanski, Vivek Sarkar, Mauricio Serrano, Arvin Shepherd, Stephen Smith, Janice Shepherd, Manu Sridharan, Peter F. Sweeney, Martin Trapp, Kris Venstermans, John Whaley.
Posted 26 days ago by Jeremy Singer
Page edited by Jeremy Singer - "GNU classpath link"

As in previous years, Jikes RVM has applied to Google Summer of Code. We were not accepted as a mentoring organization for GSoC in 2013.

Hints for students

You can still take part in GSoC and work for another mentoring organization. Projects closely related to Jikes RVM include those linked with GNU Classpath. Students interested in GSoC should read at least the GSoC FAQ and Google's advice for students. There's also a lot of helpful information about GSoC that's provided by mentoring organizations and previous GSoC students (use your favorite search engine to find it). We will likely apply for GSoC again next year.

Archived stuff that's no longer relevant for GSoC 2013

Hints for students specific to Jikes RVM

If the Jikes RVM project sounds like it could be a good fit for you, you can already start to get familiar with Jikes RVM. Check out the source code and read the user guide. Build the Jikes RVM, run some tests and take a look at the codebase. You should have your development environment already set up before applying. Don't hesitate to ask on the mailing lists if you need help. If/when we've been accepted as a mentoring organization, students can take a look at our guidelines on making an application. We expect that students get familiar with the Jikes RVM and the Jikes RVM community before applying.

What's in it for me?

All mentors get a smart Google t-shirt. The project gets some new contributors, and the students get substantial amounts of cash and experience in open source development. The Jikes RVM project has seen substantial benefits from GSoC in previous years, including a new native threading model and additions to the core team of developers.

Adding new project suggestions

Please add your ideas to the wiki below, ideally using the template shown below. You can also contribute an idea if you don't have a wiki account: just describe it on the researchers mailing list.
Project Title: foo
RVM area: {compiler, classlib, MMTk, infrastructure...}
outline: paragraph description of project idea
references: links to related projects, academic papers, etc
needed skills: skills that the student should have and/or needs to acquire during the project
difficulty: a difficulty estimate for the project, e.g. easy, medium, difficult
interested mentor: your name and a contact link or "TBD" if you can't mentor

Project suggestions (aka ideas list)

Project Title: Implement the Java Management extensions (JMX) API
RVM area: classlib, MMTk, threading, compiler
outline: The Java Management Extensions provide a way to monitor and manage Java Virtual machines. The Jikes RVM can collect significantly more profiling data and runtime statistics than an average JVM as a result of it being focussed on research activities. These statistics would ideally be exported using JMX and could potentially provide a standard mechanism for monitoring the performance and health of the virtual machine and its components. As a first step, students should focus on laying groundwork for a suitable implementation of the API:
- Update the JMX code from GSoC 2007 (written by Andrew John Hughes) to work with the current Jikes RVM
- Implement all the parts of the API that are required
- Write automated tests that can be integrated into our test frameworks
After that has happened, students should implement the optional parts of the API and Jikes RVM-specific features. The exact set of features will need to be discussed with the community at the appropriate time.
references:
- Blog post about JMX with links to further documentation
- Our bugtracker entry for JMX
- Web view of the JMX branch in the historic svn repository (use "svn checkout svn://svn.code.sf.net/p/jikesrvm/svn/rvmroot/branches/RVM-127-JMX" to check out the code)
needed skills: Java
difficulty: easy
interested mentor: Tony Hosking, Steve Blackburn

Project Title: Implement the JVM Tool Interface (JVMTI)
RVM area: threading, JNI
outline: The JVM Tool Interface is part of the Java Platform Debugger Architecture that JVMs can implement to provide debugging capabilities. We would like to have a full JVMTI implementation to allow debugging of Java applications that are running on Jikes RVM. Another benefit of JVMTI is that it allows low-level access to JVM internals. This can be used for instrumenting and monitoring the Jikes RVM using low-level code. There is already a partial JVMTI implementation that was written by James Bornholt for GSoC 2011. His implementation should be used as a basis for this project.
references:
- Writeup of James Bornholt on the GSoC 2011 page
- Web view of previous GSoC 2011 JVMTI work
needed skills: C, C++, Java, willingness to deal with low-level details
difficulty: medium
interested mentor: Eliot Moss

Project Title: Improve (or rethink) CatTrack
RVM area: infrastructure
outline: The Jikes RVM project currently uses the Ruby-on-Rails application CatTrack to track test results. CatTrack is also responsible for sending regression mails. This project would consist of upgrading CatTrack to use a more recent version of Ruby-on-Rails instead of the ancient Ruby-on-Rails 1.2.6. Potential candidates must have experience in Ruby-on-Rails to be able to take on this project: we don't have the necessary expertise to provide adequate mentoring to RoR beginners. The exact version numbers and the upgrade process need to be coordinated with our contacts at the Australian National University (where the servers are located). New features can be added to CatTrack after the upgrade is complete.
An alternative approach would be to get rid of CatTrack completely and move to a more general open-source solution. Note that this would be a much more disruptive change. A student proposal that suggests this approach must be very convincing to be considered. Another constraint for proposals of this kind is that we want to avoid using yet another language in our build infrastructure (see RVM-85). It might be acceptable to replace Ruby with another language but this must be discussed before such a proposal is submitted. The X10 language project also uses CatTrack for their regression mails. Candidates should be prepared to coordinate with somebody from the X10 team to ensure that an upgraded version of CatTrack (or a replacement) is also a viable solution for them.
references: Web view of the CatTrack repository
needed skills: prior Ruby-on-Rails experience, XML, basic knowledge about databases
difficulty: difficult
interested mentor: Steve Blackburn

Project Title: Improve testing situation for Jikes RVM
RVM area: infrastructure, (additional subsystems as chosen by the student)
outline: An infrastructure for unit tests and some initial unit tests were added to the Jikes RVM in GSoC 2012 by João Reys Santos. However, there is still a lot of work to do before we can consider the codebase to be reasonably well tested. We need more unit tests as well as more black-box tests. Students should pick an area of interest (e.g. compilers, JNI, MMTk, classloading, threading, ...) and/or motivating use cases (e.g. get benchmark X or application Y running on Jikes RVM), identify and fix the corresponding problems and introduce adequate (unit) tests. Note that extra effort will be required if students want to use mocking. Mockito does not work with GNU Classpath-based JVMs and our OpenJDK-port is not yet ready.
references:
- book: Michael Feathers, Working effectively with legacy code, Pearson Education, 2004
- book: Andreas Zeller, Why programs fail: A guide to systematic debugging, Morgan Kaufmann, 2009
needed skills: Java, Ant, debugging without a debugger
difficulty: easy to difficult (depends on the chosen tasks and areas)
interested mentor: Tony Hosking

Project Title: Bytecode verifier
RVM area: classloading, compilers
outline: Jikes RVM currently does not implement a bytecode verifier. We would like to have a bytecode verifier that can verify classfiles at least up until Java 6 (we don't support Java 7 yet). The verifier needs to work with the baseline compiler as well as the optimizing compiler. Ideally, it should be possible to use the verifier from the command line without actually running the bytecode that is supposed to be checked. This could be done in a similar way to the OptTestHarness which enables users to control the compilers. The verifier should be designed to support custom verification constraints. Some examples for possible constraints are listed in the JIRA issue linked below.
references:
- JIRA issue for the bytecode verifier
- Jasmin assembler (useful for creating test cases that trigger verifier errors)
- Chapter 4 of the JVM Specification
needed skills: Java, reading papers and specifications, implementing algorithms from specifications and papers
difficulty: medium
interested mentor: Eliot Moss

Project Title: GCspy
RVM area: MMTk
outline: The GCspy framework for garbage collector visualization has become out-of-date and needs improving [RVM-388]. GCSpy 2.0 should update the framework to work properly with the latest version of Jikes RVM. In particular, it should support discontiguous spaces and provide drivers for more of MMTk's GC algorithms. In addition, the rewrite might improve modularity, and possibly integrate with the Eclipse Rich Client Platform.
references: GCspy: an adaptable heap visualisation framework
needed skills: Java, C++, willingness to deal with low-level details, some understanding of garbage collection
difficulty: fairly straightforward
interested mentor: Richard Jones

Project Title: Implement the Compressor garbage collector
RVM area: MMTk
outline: A Stop-The-World variant of the Compressor garbage collector was implemented for Jikes RVM in GSoC 2010 by Michael Gendelman. The goal of this project is to improve the stability of Michael Gendelman's implementation so that it can be merged to mainline. Performance tuning may also be needed. Students that are interested in this project should be familiar with garbage collection and MMTk. Hint for very ambitious students: You can work on more complex variants of the Compressor (e.g. a concurrent version) once the Stop-The-World variant has been merged and the project has determined that its quality is suitable.
references:
- paper: Haim Kermany and Erez Petrank, The Compressor: concurrent, incremental, and parallel compaction (ACM link)
- JIRA issue for the compressor (includes a link to the paper)
- Michael Gendelman's work on the Compressor (includes a design document on the wiki)
needed skills: Java, willingness to deal with low-level details, familiarity with garbage collection
difficulty: difficult
interested mentor: Richard Jones or Steve Blackburn

Project Title: Port Jikes RVM compilers to ARM
RVM area: compilers
outline: Jikes RVM currently supports 32-bit Intel, and 32- and 64-bit Power PC architectures. This project involves porting Jikes RVM (initially the baseline compiler) to the 32-bit ARM architecture. Ambitious students could consider the optimizing compiler too, as a second step. Lots of preparatory work has been done in this area. I had an undergraduate student project at Glasgow University this year. He has done excellent refactoring work and we should be able to build on this to get a fully functional baseline compiler.
references: When my student's project is marked, I will put a PDF document up here. In the meantime, here is an out-of-date thesis about an earlier porting effort to ARM.
needed skills: Java, willingness to deal with low-level details, familiarity with ARM instruction set
difficulty: moderate to difficult
interested mentor: Jeremy Singer, also support available from colleagues at ARM directly. Please get in touch to discuss details!

Project Title: Implement new Quick compiler
RVM area: compilers
outline: A Quick Compiler that does simple optimization and some use of registers, possibly modeled after existing dynamic code generation systems such as that of QEMU or valgrind. The point of such a compiler is to exploit "low hanging fruit". Previous work (10 years ago) took a somewhat different approach, but suggested that a compiler that applied a simple approach to register allocation and some simple optimizations reduced total execution over the current Baseline or Baseline+Opt compilation systems.
references: QEMU and valgrind are good starting points for developing the compilation / optimization strategy.
needed skills: compiler expertise (analyses and optimizations)
difficulty: moderate to difficult
interested mentor: Eliot Moss

Project Title: Reengineer Command Line Flag parsing in Jikes RVM
RVM area: infrastructure
outline: Jikes RVM has lots of command-line flags, as you can see from the command line help. Researchers add new flags all the time, when they are working on experimental features. The current techniques for adding new flags, parsing flags, etc. are quite complex. This project is about refactoring the code to make command line flag handling more straightforward. Possibilities (negotiable) include autocomplete for flags, suggestions for unrecognised flags, etc.
references: Maybe look at Google commandline flags library, and at bash completion for some ideas.
needed skills: Java
difficulty: easy to moderate
interested mentor: Jeremy Singer. Please get in touch to discuss details!
Posted 27 days ago by Robin Garner
Page edited by Robin Garner

Overview
The MMTk harness is a debugging tool. It allows you to run MMTk with a simple client - a simple Java-like scripting language - which can explicitly allocate objects, create and delete references, etc. This allows MMTk to be run and debugged stand-alone, without the entire VM, greatly simplifying initial debugging and reducing the edit-debug turnaround time. This is all accessible through the command line or an IDE such as Eclipse.

Running the test harness
The harness can be run standalone or via Eclipse (or another IDE).

Standalone
ant mmtk-harness
java -jar target/mmtk/mmtk-harness.jar <script-file> [options...]

There is a collection of sample scripts in the MMTk/harness/test-scripts directory. There is also a simple wrapper script that runs all the available scripts against all the collectors:
bin/test-mmtk [options...]
This script prints a PASS/FAIL line as it goes, and puts detailed output in results/mmtk.

In Eclipse
bin/buildit localhost --mmtk-eclipse
Or, in versions before 3.1.1:
ant mmtk-harness && ant mmtk-harness-eclipse-project

Refresh the project (or import it into Eclipse), and then run 'Project > Clean'. Define a new run configuration with main class org.mmtk.harness.Main:
Click Run (actually the down-arrow next to the green button) and choose 'Run Configurations...'.
Select "Java Application" from the left-hand panel, and click the "new" icon (top left).
Fill out the Main tab [screenshot omitted].
Fill out the Arguments tab [screenshot omitted]. The harness makes extensive use of the Java 'assert' keyword, so you should run the harness with '-ea' in the VM options.
Click 'Apply' and then 'Run' to test the configuration. Eclipse will prompt for a value for the 'script' variable - enter the name of one of the available test scripts, such as 'Lists', and click OK. The scripts provided with MMTk are in the directory MMTk/harness/test-scripts.
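When the standalone harness is driven from a script, it can help to assemble the command line programmatically. The following is a minimal sketch, not part of MMTk: the function name harness_command and the file name my-test.script are invented for the example. It assumes the jar location shown in the standalone invocation above, renders options as keyword=value pairs, and passes -ea to the JVM because the harness relies on Java assertions.

```python
def harness_command(script, options=None,
                    jar="target/mmtk/mmtk-harness.jar", assertions=True):
    """Build the argument list for a standalone MMTk harness run.

    `script` is the path to a harness script file; `options` maps harness
    option names to values and is rendered as keyword=value arguments.
    """
    cmd = ["java"]
    if assertions:
        cmd.append("-ea")  # enable Java assertions, which the harness uses heavily
    cmd += ["-jar", jar, script]
    for key, value in (options or {}).items():
        cmd.append("%s=%s" % (key, value))
    return cmd


# "my-test.script" is a hypothetical file name used only for this example.
example = harness_command("my-test.script",
                          {"gcEvery": "SAFEPOINT", "variableSizeHeap": "false"})
print(" ".join(example))
```

The returned list can be handed to subprocess.run(cmd, check=True) once the harness jar has been built with ant mmtk-harness.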
You can configure Eclipse to display vmmagic values (Address/ObjectReference/etc) using their toString method through the Eclipse -> Preferences... -> Java -> Debug -> Detail Formatters menu. The simplest option is to check the box to use toString 'As the label for all variables'.

Test harness options
Options are passed to the test harness as 'keyword=value' pairs. The standard MMTk options that are available through JikesRVM are accepted (leave off the "-X:gc:"), as well as the following harness-specific options:

Option | Meaning
plan | The MMTk plan class. Defaults to org.mmtk.plan.marksweep.MS
collectors | The number of concurrent collector threads (default: 1)
initHeap | Initial heap size. It is also a good idea to use 'variableSizeHeap=false', since the heap growth manager uses elapsed time to make its decisions, and time is seriously dilated by the MMTk Harness.
maxHeap | Maximum heap size (default: 64 pages)
trace | Debugging messages from the MMTk Harness. Useful trace options include:
    ALLOC - trace object allocation
    AVBYTE - mutations of the 'available byte' in each object header
    COLLECT - detailed information during GC
    HASH - hash code operations
    MEMORY - page-level memory operations (map, unmap, zero)
    OBJECT - trace object mutation events
    REFERENCES - reference type processing
    REMSET - remembered set processing
    SANITY - detailed information during Harness sanity checking
    TRACEOBJECT - traces every call to traceObject during GC (requires MMTk support)
  See the class org.mmtk.harness.lang.Trace for more details and trace options - most of the remaining options are only of interest to maintainers of the Harness itself.
watchAddress | Set a watchpoint on a given address or comma-separated list of addresses. The harness will display every load and store to that address.
watchObject | Watch modifications to a given object or comma-separated list of objects, identified by object ID (sequence number).
gcEvery | Force frequent GCs.
  Options are:
    ALLOC - GC after every object allocation
    SAFEPOINT - GC at every GC safepoint
scheduler | Optionally use the deterministic scheduler. Options are:
    JAVA (default) - Threads in the script are Java threads, scheduled by the host JVM
    DETERMINISTIC - Threads are scheduled deterministically, with yield points at every memory access.
schedulerPolicy | Select from several scheduling policies:
    FIXED - Threads yield every 'nth' yield point
    RANDOM - Threads yield according to a pseudo-random policy
    NEVER - Threads only yield at mandatory yieldpoints
yieldInterval | For the FIXED scheduling policy, the yield frequency.
randomPolicyLength, randomPolicySeed, randomPolicyMin, randomPolicyMax | Parameters for the RANDOM scheduler policy. Whenever a thread is created, the scheduler fixes a yield pattern of 'length' integers between 'min' and 'max'. These numbers are used as yield intervals in a circular manner.
policyStats | Dump statistics for the deterministic scheduler's yield policy.
bits=32|64 | Select between 32 and 64-bit memory models.
dumpPcode | Dump the pseudo-code generated by the harness interpreter.
timeout | Abort collection if a GC takes longer than this value (seconds). Defaults to 30.

Scripts
The MMTk/harness/test-scripts directory contains several test scripts.

Script | Purpose | Description
Alignment | Test allocator alignment behaviour | Tests alignment by creating a list of objects aligned to a mixture of 4-byte and 8-byte boundaries.
CyclicGarbage | Test cycle detector in Reference Counting collectors | Creates large amounts of cyclic garbage in the form of circular linked lists.
FixedLive | General collection test | Harness version of the FixedLive GC micro-benchmark. Creates a binary tree, then allocates short-lived objects to force garbage collections.
HashCode | Hash code test | Creates objects and verifies that their hashcode is unchanged after a GC.
LargeObject | Large object allocator test | Creates objects with sizes ranging from 2 to 32 pages (8k to 128k bytes).
Lists | Generational collector stress test | Creates a set of lists of varying lengths, and then allocates to force collections. Ensures that there are Mature->Nursery, Nursery->Mature, Stack->Nursery and Stack->Mature pointers at every GC. Remsets get a serious workout.
OutOfMemory | Tests out-of-memory handling | Allocates a linked list that grows until the heap fills up.
Quicksort | General collection test | Implements a list-based quicksort.
ReferenceTypes | Reference type test | Creates Weak references, forces collections and ensures that they are correctly handled.
Spawn | Concurrency test | Creates lots of threads which allocate objects.
SpreadAlloc | Free-list allocator test | Creates large numbers of objects with random size distributions, keeping a fraction of the objects alive.
SpreadAlloc16 | Concurrent free-list allocator test | A multithreaded version of SpreadAlloc.

Scripting language

Basics
The language has three types: integer, object and user-defined. The object type behaves essentially like a pair of arrays, one of pointers and one of integers (odd, I know, but the scripting language is basically concerned with filling up the heap with objects of a certain size and reachability). User-defined types are like Java objects without methods, 'C' structs, Pascal record types etc.

Objects and user-defined types are allocated with the 'alloc' statement: alloc(p,n,align) allocates an object with 'p' pointers, 'n' integers and the given alignment; alloc(type) allocates an object of the given type.

Variables are declared 'C' style, and are optionally initialized at declaration.

User-defined types are declared as follows:
type list { int value; list next; }
and fields are accessed using Java-style "dot" notation, e.g.
list l = alloc(list); l.value = 0; l.next = null;
At this stage, fields can only be dereferenced to one level, e.g. 'l.next.next' is not valid syntax - you need to introduce a temporary variable to achieve this.
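To make the shape of the built-in object type concrete, here is a toy model of it. This is an illustration only, not how the harness represents objects internally: the class name HarnessObject is invented, and the initial field values (null references, zero integers) and the optional alignment argument are assumptions of the sketch.

```python
class HarnessObject:
    """Toy model of the scripting language's 'object' type: one array of
    reference fields and one array of integer fields."""

    def __init__(self, pointers, ints, align=None):
        self.object = [None] * pointers  # reference fields, assumed to start as null
        self.int = [0] * ints            # integer fields, assumed to start at zero
        self.align = align               # requested alignment; recorded, not modelled


def alloc(p, n, align=None):
    """Mimic alloc(p, n, align): an object with p pointers and n integers."""
    return HarnessObject(p, n, align)


# An object with 2 reference fields and 6 integer fields, 8-byte aligned.
tmp = alloc(2, 6, 8)
tmp.int[5] = 42        # write the last integer field
child = alloc(0, 1)
tmp.object[0] = child  # make tmp point at another object
```

The size of an object, and therefore how quickly a script fills the heap, is controlled by the two counts, while reachability is controlled by what is stored in the reference fields.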
Object fields are referenced using syntax like "tmp.int[5]" or "tmp.object[i*3]", i.e. like a struct of arrays of the appropriate types.

Syntax
script ::= (method|type)...
method ::= ident "(" [ type ident { "," type ident }... ] ")" ( "{" statement... "}" | "intrinsic" "class" name "method" name "signature" "(" java-class { "," java-class } ")" )
type ::= "type" ident "{" field... "}"
field ::= type ident ";"
statement ::= "if" "(" expr ")" block { "elif" "(" expr ")" block } [ "else" block ]
  | "while" "(" expr ")" block
  | [ [ type ] ident "=" ] "alloc" "(" expr "," expr [ "," expr ] ")" ";"
  | [ ident "=" ] "hash" "(" expr ")" ";"
  | "gc" "(" ")"
  | "spawn" "(" ident [ "," expr ]... ")" ";"
  | type ident [ "=" expr ] ";"
  | lvalue "=" expr ";"
lvalue ::= ident "=" expr ";" | ident "." type "[" expr "]"
type ::= "int" | "object" | ident
expr ::= expr binop expr | unop expr | "(" expr ")" | ident | ident "." type "[" expr "]" | ident "." ident | int-const | intrinsic
intrinsic ::= "alloc" ( "(" expr "," expr [ "," expr ] ")" | type ) | "(" expr ")" | "gc" "(" ")"
binop ::= "+" | "-" | "*" | "/" | "%" | "&&" | "||" | "==" | "!="
unop ::= "!" | "-"

MMTk Unit Tests
There is a small set of unit tests available for MMTk, using the harness as scaffolding. These tests can be run in the standard test infrastructure using the 'mmtk-unit-tests' test set, or the shell script 'bin/unit-test-mmtk'. Possibly more usefully, they can be run from Eclipse.

To run the unit tests in Eclipse, build the mmtk harness project (see above), and add the directory testing/tests/mmtk/src to your build path (navigate to the directory in the package explorer pane in Eclipse, right-click > Build Path > Use as Source Folder). Either open one of the test classes, or highlight it in the package explorer and press the 'run' button.
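The FIXED and RANDOM policies of the deterministic scheduler, described under the test harness options, both reduce to a list of yield intervals that is consumed in a circular manner. The sketch below illustrates that idea only; it is not the harness implementation. The function names are invented, and the sketch assumes that 'min' and 'max' are inclusive bounds, that intervals are at least 1, and that one seeded generator produces one pattern.

```python
import itertools
import random


def random_yield_pattern(length, seed, lo, hi):
    """Fix a pattern of `length` pseudo-random intervals in [lo, hi]
    (bounds assumed inclusive), reproducible for a given seed."""
    rng = random.Random(seed)
    return [rng.randint(lo, hi) for _ in range(length)]


def yield_points(intervals, total):
    """Return the 1-based yield points, out of `total`, at which a thread
    actually yields when `intervals` is used circularly.

    A FIXED policy with interval n corresponds to intervals == [n].
    Every interval must be at least 1.
    """
    yields = []
    cycle = itertools.cycle(intervals)
    next_yield = next(cycle)
    for point in range(1, total + 1):
        if point == next_yield:
            yields.append(point)
            next_yield = point + next(cycle)
    return yields


# FIXED policy with yieldInterval=3 over ten yield points.
print(yield_points([3], 10))        # [3, 6, 9]

# RANDOM policy: same seed, same pattern, hence a reproducible schedule.
pattern = random_yield_pattern(length=4, seed=1234, lo=2, hi=5)
print(yield_points(pattern, 20))
```

Because the pattern depends only on the seed and the bounds, a failing interleaving can be reproduced by rerunning with the same parameters, which is the point of a deterministic scheduler.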
Posted 29 days ago by Erik Brangs
Page edited by Erik Brangs - "add link for releases page"

The following are the steps required to make a release of the Jikes RVM. All commits are to tip (default branch).

1. Update the release number in build.xml (it will continue to have the +hg suffix) and commit the change.
2. Export the userguide from Confluence. Update the HTML and PDF versions of the userguide and commit.
3. Update JIRA version management to indicate that the version has been released.
4. Generate text release notes from JIRA and put them in NEWS.txt. Commit.
5. Generate javadoc (apidoc target). If needed, fix errors and commit the changes.
6. Upload the javadoc to the static webspace on SourceForge (htdocs/apidocs/version). Switch the "latest" symlink to point to version.
7. In a clean hg repository (no incoming/outgoing changesets), perform the following steps:
   Switch to the release branch (hg update release)
   Merge tip to the release branch (hg merge default; hg commit)
   Edit build.xml to remove the +hg from the release number and set the hg.version field. Commit.
   Tag the release (hg tag <version>; hg push)
8. Clone a new hg repository and create the release tarballs:
   hg clone http://hg.code.sourceforge.net/p/jikesrvm/code -b release jikesrvm-version
   rm -rf jikesrvm-version/.hg
   tar cjf jikesrvm-version.tar.bz2 jikesrvm-version; tar czf jikesrvm-version.tar.gz jikesrvm-version
9. Extract the portion of NEWS.txt relevant to this release into README.txt (it will be used for ReleaseNotes on the SF file download).
10. Publish and announce the release:
   Upload the release tarballs and README.txt to SourceForge; set the release as the default download using the Files GUI.
   Update the Confluence Releases page to link to the new download version.
   Send out mail announcements to jikesrvm-announce and jikesrvm-researchers.
   Also post the announcement in SF news and Confluence news.
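The tarball step is mechanical enough to generate from the version number. The sketch below only builds the command lines and does not run them; the function name release_tarball_commands and the version string 1.2.3 are invented for the example. It assumes that the .hg directory to delete is the one inside the freshly cloned jikesrvm-version directory.

```python
def release_tarball_commands(
        version, repo="http://hg.code.sourceforge.net/p/jikesrvm/code"):
    """Return the commands that clone the release branch and pack the
    release tarballs, as argument lists."""
    name = "jikesrvm-%s" % version
    return [
        ["hg", "clone", repo, "-b", "release", name],
        ["rm", "-rf", name + "/.hg"],  # drop repository metadata from the clone
        ["tar", "cjf", name + ".tar.bz2", name],
        ["tar", "czf", name + ".tar.gz", name],
    ]


# "1.2.3" is a hypothetical version number used only for this example.
for command in release_tarball_commands("1.2.3"):
    print(" ".join(command))
```

Printing the commands first gives a dry run that can be checked by eye; each list can then be passed to subprocess.run(command, check=True) to execute it.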
Copyright © 2013 Black Duck Software, Inc. and its contributors, Some Rights Reserved. Unless otherwise marked, this work is licensed under a Creative Commons Attribution 3.0 Unported License. Ohloh® and the Ohloh logo are trademarks of Black Duck Software, Inc. in the United States and/or other jurisdictions. All other trademarks are the property of their respective holders.