The rheise.os type system

Overview

rheise.os extends the Java type system to support direct object sharing between processes. Because each process defines a separate set of classes through its own primordial class loader, the standard Java type system does not allow one process to interact with an object of another process. While the rheise.os type system behaves exactly like the standard Java type system within a single process, classes from different processes may be treated as equivalent if they have the same name and the system can ensure type safety.

To achieve this, rheise.os introduces global types. A global type consists of a set of classes with the same name (defined by different class loaders) that are known to be type equivalent between processes. The system can determine that a set of classes is type equivalent between processes when they were loaded from the same class file and certain other conditions are met (these are described below).

Class definitions

The rheise.os virtual machine structures classes in memory so that the text part can be shared between all classes that were loaded from the same class file. This is achieved by having those classes point to a common class definition structure in memory. Sharing class definitions not only saves memory but allows the global type of classes to be easily determined.

When linking a class with a class definition, rheise.os must determine which class file was used to load the class. Primordial class loaders can efficiently determine whether a class file has been loaded before by checking the class file location and modification date. rheise.os cannot know where custom-loaded classes came from, so the only option in that case is to compare the actual data of the class files byte for byte (although it is probably acceptable to ignore the potential for custom-loaded classes to share class file data).
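
The lookup by location and modification date described above could be sketched roughly as follows. This is only an illustration: ClassDefinition, ClassFileKey, ClassDefinitionCache and the link method are hypothetical names, not the actual structures of the rheise.os virtual machine, and the byte-for-byte comparison for custom-loaded classes is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical model of a shared class definition; not the VM's real layout.
final class ClassDefinition {
    final String className;
    final byte[] classFileData; // stands in for the shareable text part

    ClassDefinition(String className, byte[] classFileData) {
        this.className = className;
        this.classFileData = classFileData.clone();
    }
}

// Identifies a class file by its location and modification date.
final class ClassFileKey {
    final String location;
    final long lastModified;

    ClassFileKey(String location, long lastModified) {
        this.location = location;
        this.lastModified = lastModified;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ClassFileKey)) return false;
        ClassFileKey other = (ClassFileKey) o;
        return location.equals(other.location) && lastModified == other.lastModified;
    }

    @Override
    public int hashCode() {
        return Objects.hash(location, lastModified);
    }
}

// Assumed helper: hands out one definition per distinct class file.
final class ClassDefinitionCache {
    private final Map<ClassFileKey, ClassDefinition> definitions = new HashMap<>();

    // Returns the existing definition if this class file was loaded before;
    // otherwise creates and remembers a new one.
    synchronized ClassDefinition link(String className, String location,
                                      long lastModified, byte[] classFileData) {
        ClassFileKey key = new ClassFileKey(location, lastModified);
        return definitions.computeIfAbsent(
                key, k -> new ClassDefinition(className, classFileData));
    }
}
```

Two classes linked from the same location with the same modification date end up pointing at the same ClassDefinition instance, while a changed modification date yields a fresh definition.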

Global types

A global type consists of a set of classes with the same name that are known to be type equivalent between processes. Naturally these classes will be defined by different class loaders. This type equivalence is only valid between processes because it should not be possible to cast an object between two classes of the same name within the same process as this would violate the Java Virtual Machine Specification. It should be noted that this is only an issue if it is made possible for a custom-loaded class to share a global type with other classes. There does not appear to be much gain in providing this feature, and not implementing it would make type checking simpler.

The following rules are used to determine global types for classes. Classes C1 and C2 are defined to have the same global type if the following conditions are met:

In the case of circular references, if all pairs of corresponding classes involved in circular references would be considered of the same global type without these references, then they are understood as having the same global type. (I think)

Global types can be determined on-the-fly each time a comparison needs to be done, but it is more efficient to determine global type of a class just once. This can be done either at class loading time, or on the first time a comparison needs to be done.
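
A minimal sketch of the second option, determining the global type on the first comparison and reusing it afterwards, might look like this. LoadedClass and its members are hypothetical, and the Supplier is only a stand-in for the real global type rules, which are not modelled here.

```java
import java.util.function.Supplier;

// Hypothetical sketch: a loaded class that determines its global type lazily.
final class LoadedClass {
    private final String name;
    private final Supplier<Object> globalTypeRule; // stand-in for the real rules
    private Object globalType;                      // null until first needed
    private int computations;                       // kept for illustration only

    LoadedClass(String name, Supplier<Object> globalTypeRule) {
        this.name = name;
        this.globalTypeRule = globalTypeRule;
    }

    // Determines the global type once and caches the result.
    synchronized Object globalType() {
        if (globalType == null) {
            globalType = globalTypeRule.get();
            computations++;
        }
        return globalType;
    }

    // Classes match if they share a name and the same global type token.
    boolean hasSameGlobalType(LoadedClass other) {
        return name.equals(other.name) && globalType() == other.globalType();
    }

    synchronized int computations() {
        return computations;
    }
}
```

Determining the global type at class loading time instead would amount to calling globalType() from the constructor, trading some load-time work for a cheaper first comparison.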

Type compatibility

This section defines the rules for allowing casts between different global types. The notation <C, P> refers to a class that is defined in a class loader of process P. If it is clear which process the class belongs to, just the class name is used.

A cast of an object of type <C, P1> to type <T, P2> is permitted in the following cases, where P1 and P2 are different processes:

Resolution of symbolic references

In a normal JVM, symbolic references are resolved using the class loader of the class making the reference. When an object is shared between two processes, the same rules are used. Even though a shared object may be accessed using two equivalent classes, only one class is the true class of the object. Therefore symbolic references from a shared object will be resolved based on the true class of that object.

An interesting case is resolution of references to static fields. Consider an object obj of type C that is shared by process P1 to process P2. Class C also defines a static field called stat. For sharing to occur, <C, P1> and <C, P2> must of course belong to the same global type. However, they are different classes and each has its own statics. If each process has its own statics for C, we would like to know which statics to use for an object that is shared between two processes. According to the rule that symbolic references are resolved using the class loader of the class making the reference, references to stat are resolved as follows:

  1. If P1 references C.stat, the static field is resolved via class <C, P1>
  2. If P2 references C.stat, the static field is resolved via class <C, P2>
  3. If P1 references obj.stat, the static field is resolved via obj's class which is <C, P1>
  4. If P2 references obj.stat, the static field is resolved via obj's class which is <C, P1>
Cases 1, 3 and 4 reference the static of process P1. Only case 2 accesses the static of process P2, because P2 refers to its own class directly.
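
The four cases can be modelled with a small simulation. This is a sketch of the rules as stated in this section, not of the virtual machine itself: OsProcess, ProcessClass, SharedObject and the resolve methods are hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: one process's copy of a class, written <name, process>.
final class ProcessClass {
    final String name;
    final String process;
    int stat; // models this copy's own static field "stat"

    ProcessClass(String name, String process) {
        this.name = name;
        this.process = process;
    }

    @Override
    public String toString() {
        return "<" + name + ", " + process + ">";
    }
}

// A shared object keeps exactly one true class, whichever process uses it.
final class SharedObject {
    final ProcessClass trueClass;

    SharedObject(ProcessClass trueClass) {
        this.trueClass = trueClass;
    }
}

final class OsProcess {
    final String id;
    private final Map<String, ProcessClass> classes = new HashMap<>();

    OsProcess(String id) {
        this.id = id;
    }

    ProcessClass define(String className) {
        ProcessClass c = new ProcessClass(className, id);
        classes.put(className, c);
        return c;
    }

    // C.stat: resolved through the referencing process's own class (cases 1 and 2).
    ProcessClass resolveByClassName(String className) {
        return classes.get(className);
    }

    // obj.stat: resolved through the object's true class (cases 3 and 4).
    ProcessClass resolveByObject(SharedObject obj) {
        return obj.trueClass;
    }
}
```

With processes P1 and P2 each defining C, and obj created with <C, P1> as its true class, resolveByObject returns <C, P1> from either process, while resolveByClassName("C") called on P2 returns <C, P2>.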