ObjectStore C++ Advanced API User Guide

Chapter 9 Advanced Schema Evolution

This chapter provides information about the ObjectStore schema evolution facility. For a basic understanding of tasks you must perform to complete a schema evolution project, see Chapter 8, Schema Evolution, in the ObjectStore C++ API User Guide.

Phases of the Schema Evolution Process

The schema evolution process has two phases:

Schema modification: modification of the schema information associated with the database(s) being evolved
Instance migration: modification of any existing instances of the modified classes

(In this chapter, the term process is used in the ordinary nontechnical sense. The phrase schema evolution process refers to what the evolution facility does when invoked. This is not a system process separate from the execution of the application that calls the evolution function.)

Instance migration itself has two phases:

Instance Initialization

Instance initialization modifies existing instances of modified classes so that their representations conform to the new class definitions. This might involve adding or deleting fields or subobjects, changing the type of a field, or deleting entire objects. This phase of migration also initializes any storage components that have been added or that have changed type.

In most cases, new fields are initialized with zeros. There is one useful exception to this, however. In the case where a field has changed type, and the old and new types are assignment compatible, the new field is initialized by assignment from the old field value.
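The rule can be pictured with a small stand-alone sketch in plain C++. It does not use the ObjectStore API; the names old_part, new_part, and initialize_from are hypothetical and only model what the initialization phase is described as doing: start from zeroed storage, then assign any field whose old and new types are assignment compatible.

```cpp
// Hypothetical stand-ins for the old and new layouts of one class.
struct old_part {
    short part_id;      // field whose type changes
};

struct new_part {
    long  part_id;      // short -> long: assignment compatible
    char *label;        // newly added field: no old counterpart
};

// Models the initialization rule described above (not ObjectStore code):
// start from all-zero storage, then assign the fields whose old and new
// types are assignment compatible.
inline new_part initialize_from(const old_part &old_obj) {
    new_part result = {};               // every field starts as zero / null
    result.part_id = old_obj.part_id;   // standard conversion short -> long
    return result;
}
```

In this model, a field such as label, which has no old counterpart, is left null, and only a transformer function could give it a meaningful value.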

The initialization rules are discussed in Instance Initialization Rules.

Instance Transformation

For some schema changes, the instance initialization phase is all that is needed. But in other cases, further modification of class instances or associated data structures is required to complete the schema evolution. This further modification is generally application dependent, so ObjectStore allows you to define your own functions, transformer functions, to perform the task.

Transformer Functions

You associate exactly one transformer with each class whose instances you want to be transformed. During the transformation phase of instance migration, the schema evolution facility invokes each transformer function on each instance of the function's associated class, including instances that are subobjects of other objects.

Transformer functions are particularly useful when you want to set the value of some field of a migrated instance based on the values of some field or fields of the corresponding old instance. For this purpose, the evolution facility provides a function that allows you to retrieve the old instance corresponding to a given new instance.

You can also use a transformer function to adjust local references (see Instance Initialization Rules). A transformer associated with a class containing an os_reference_local or os_reference_protected_local could perform the adjustment by retrieving the new version of each local reference's referent, and assigning it to the reference.

In addition, transformers are useful for updating data structures that depend on the addresses of migrated instances. A hash table, for example, that hashes on addresses should be rebuilt using a transformer. Note that you do not need to rebuild a data structure if the position of an entry in the structure does not depend on the address of an object pointed to by the entry, but depends instead, for example, on the value of some field of the object pointed to. Such data structures will still be correct after the instance initialization phase.
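To see why an address-keyed structure goes stale, consider the following stand-alone sketch. It uses only the standard library, not ObjectStore collections; part, address_index, and rebuild_index are hypothetical names, and copying a vector stands in for migration moving instances to new addresses.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

struct part { long part_id; };

// An index keyed on object addresses: it is valid only while the
// objects stay where they are.
typedef std::unordered_map<const part*, std::size_t> address_index;

// Rebuilds the address-keyed index from scratch for the current
// locations of the objects. A transformer would do the equivalent
// work for a persistent hash table (this is a stand-alone model).
inline address_index rebuild_index(const std::vector<part> &parts) {
    address_index index;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        index[&parts[i]] = i;
    }
    return index;
}
```

An index built over the original vector has no entry for the address of any element in a copy of that vector, so it must be rebuilt for the copy. An index keyed on part_id, by contrast, would remain valid across the copy.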

Once the transformation phase is complete, all the old unmigrated instances are deleted. (If the old instances of a given class are not needed for the transformation phase, you can direct ObjectStore to delete them during the initialization phase. See Recycling Old Storage.)

Using transformers is discussed in Using Transformer Functions.

Initiating Evolution with evolve()

To perform schema evolution, you make and execute an application that invokes the static member function os_schema_evolution::evolve(). The function must be called outside the dynamic scope of a transaction. The application must include the header file ostore/schmevol.hh and link with the libraries libosse.a, liboscol.a, and libos.a.

The function evolve() has two overloadings, declared as follows:

      static void evolve(
            const char *workdb_name,
            const char *db_to_evolve
      );
      static void evolve(
            const char *workdb_name,
            const os_collection &dbs_to_evolve
      );

The evolution process depends on three parameters:

Databases to evolve
Schema modifications
Work database

Databases to Evolve

You specify the database or databases to be evolved as the second argument to evolve(). If you are evolving just a single database, you supply a char*, the pathname of the database. If you are evolving more than one database, you supply an os_collection& or os_Collection<char*>&, a set containing the databases' pathnames.

If you do not specify any database to evolve (that is, if you supply 0 for the first overloading, or an empty collection for the second overloading), err_schema_evolution is signaled.

Schema Modifications

The schema modifications are, by default, specified by the schema of the application that calls evolve(). So the schema source file for this executable should contain a new class definition for each class that you want to modify.

If you want, you can specify the schema modifications with a call to the static member function os_schema_evolution::set_evolved_schema_db_name() before calling evolve(). This function takes a const char* as argument, the pathname of a compilation or application schema database (the compilation or application schema database for some other application).

Removed Classes

You must also specify the classes that are to be removed from the schema, that is, the classes present in the old schema but not in the new schema. (Removing a class from a schema results in deletion of all of its instances.) You do this with one call to the static member function os_schema_evolution::augment_classes_to_be_removed() for each removed class. This function is declared as follows:

      static void augment_classes_to_be_removed(
            const char *name_of_class_to_be_removed 
      );

The calls should precede the call to evolve().

You can also call this function once for all the classes to be removed by passing an os_Collection<char*> that contains the names of those classes. In this case you use the overloading:

      static void augment_classes_to_be_removed(
            const os_Collection<char*>
                        &names_of_classes_to_be_removed
      );

Again, this call should precede the call to evolve().

Work Database

As the first argument to evolve(), you specify the pathname of the work database, a database that the schema evolution facility creates and uses internally as a scratch pad. This database holds the intermediate results of the evolution process, which makes the process restartable after an interruption (due to network or system failure, or due to detection of an illegal pointer; see Illegal Pointers).

When evolution is interrupted, the work database records a consistent intermediate state of the evolution process. Subsequently calling evolve() using the same work database will cause evolution to be resumed from the point of interruption.
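The resume behavior follows a general checkpoint pattern, which the following stand-alone sketch illustrates. It is not how ObjectStore implements the work database; work_state, migrate, and the fail_at parameter are hypothetical, and an exception stands in for an interruption.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical scratch-pad state; stands in for the work database.
struct work_state {
    std::size_t next_index;   // first instance not yet migrated
    work_state() : next_index(0) {}
};

// Migrates shorts to longs, recording progress in `work` after each
// instance. If `fail_at` is a valid index, the run is interrupted
// there to simulate a failure. Calling again with the same `work`
// resumes at the recorded position instead of starting over.
inline void migrate(const std::vector<short> &old_values,
                    std::vector<long> &new_values,
                    work_state &work,
                    std::size_t fail_at) {
    new_values.resize(old_values.size());
    for (std::size_t i = work.next_index; i < old_values.size(); ++i) {
        if (i == fail_at) {
            throw std::runtime_error("simulated interruption");
        }
        new_values[i] = old_values[i];
        work.next_index = i + 1;   // consistent checkpoint
    }
}
```

Passing a fresh work_state to the second call would redo the work from the beginning, which is why the same work database must be named when evolve() is called again.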

After evolution successfully completes, you should delete the work database.

Note that when you remove a class, C, you must also remove or modify any class that mentions C in its definition. Otherwise err_se_cannot_delete_class is signaled.

Resolution of Local References

As mentioned earlier, you are given an option regarding the resolution, during evolution, of local ObjectStore references. (Recall that the referent's database must be specified for resolution of local references.) If you call the static member function os_schema_evolution::set_local_references_are_db_relative(), supplying a nonzero int (true) as argument, local references will be resolved using the database in which the reference itself resides. Otherwise local references will not be adjusted during the instance initialization phase (see Illegal Pointers).

Example: Changing the Value Type of a Data Member

Consider an example that involves changing the value type of a data member.

Suppose the schema for the database /example/partsdb starts out with the following definition of the class part:

Existing part class definition

      class part {
            public:
                  short part_id;
                  part(short id) { part_id = id; }
                  static os_typespec *get_os_typespec();
      };

And you want to change the definition to be as follows:

New part class definition

      class part {
            public:
                  long part_id;
                  part(long id) { part_id = id; }
                  static os_typespec *get_os_typespec();
      };

Here, the value type of the data member part_id has changed from short to long. The constructor's argument type has also changed. Since C++ provides a standard conversion from short to long, migrated instances of the class part will have their part_id fields initialized by assignment from the value of part_id in the corresponding old unmigrated instance.

Example: schema evolution application program

The application program that invokes the evolution process might look like this:

      #include <ostore/ostore.hh>
      #include <ostore/coll.hh>
      #include <ostore/schmevol.hh>
      #include "part1new.hh" /* the new definition */
      int main() {
            objectstore::initialize();
            os_schema_evolution::evolve(
                  "/example/workdb", 
                  "/example/partsdb"
            ); 
      }

Note that the header file ostore/schmevol.hh is included.

Here, the argument /example/workdb is a name for the scratch pad database, and the argument /example/partsdb specifies the database to be evolved.

Example: evolution program for multiple databases

An application that evolves several databases might look like this:

      #include <ostore/ostore.hh>
      #include <ostore/coll.hh>
      #include <ostore/schmevol.hh>
      #include "part1new.hh" /* the new definition */
      int main() {
            objectstore::initialize();
            os_collection::initialize();
            os_Collection<char*> the_dbs_to_evolve;
            the_dbs_to_evolve |= "/example/partsdb1";
            the_dbs_to_evolve |= "/example/partsdb2";
            the_dbs_to_evolve |= "/example/partsdb3";
            os_schema_evolution::evolve(
                  "/example/workdb", 
                  the_dbs_to_evolve
            ); 
      }

Note that both versions of the main() program include the new definition of the modified class. The schema source file for this executable should also contain the new definition of the class part.

Schema source file with new definition of part class

      #include <ostore/ostore.hh>
      #include <ostore/coll.hh>
      #include <ostore/manschem.hh>
      #include "part1new.hh" /* this contains the new definition */
      void dummy() {
            OS_MARK_SCHEMA_TYPE(part);
      }

The instance migration phase of the schema evolution process will migrate the parts in /example/partsdb (for the first version of main()), changing the size of the part_id field from the size of a short to the size of a long. As mentioned, the instance migration process will also initialize the field by assignment from the preevolution value. This happens for all instances of the class part.

Note that the constructor for the new version of the class has no bearing on the initialization of migrated instances. The existing instances of the modified class are initialized according to the default initialization rules (see Instance Initialization Rules). The new constructor initializes only those instances of the class that are created after evolution has occurred.

Using ossevol for Simple Schema Evolution

For a simple evolution like this one, one that involves no transformers or user-defined handler functions, you can also use the ObjectStore utility ossevol instead of an application program. The utility takes arguments for the pathname of a work database, the pathname of a compilation or application schema database specifying the new schema, and the pathnames of the databases to evolve. For example:

      ossevol /example/workdb /example/ex1.comp_schema_db /example/partsdb

For information on the ossevol utility, see Schema Evolution with ossevol in Chapter 8 of the ObjectStore C++ API User Guide.

Using Transformer Functions

The instance initialization phase leaves migrated instances in a well-defined state. But if you want to perform further application-specific processing on these instances as part of the migration process, you can supply transformer functions to accomplish this.

To do this, you define a transformer function for each class whose instances are to be transformed, and you then associate the function with the class on whose instances the function will operate (see Associating a Transformer with a Class).

As part of the instance migration process, the ObjectStore schema evolution facility invokes each transformer function on each instance of its associated class. This includes each instance that is embedded in some other object, either as the value of a data member or as the subobject corresponding to a base class of the object's class.

The order of execution of transformers on embedded objects follows the same pattern as constructors. When the transformer for a given class is invoked, the transformers for base classes of the given class are executed first (in declaration order), followed by the transformers for class-valued members of the given class (in declaration order), after which the transformer for the given class itself is executed.
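The ordering rule can be sketched as a short recursive walk. The following stand-alone code is only a model: class_desc and transformer_order are hypothetical names, and the sketch assumes that the rule applies recursively to base classes and class-valued members, as it does for constructors.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified class description: base classes and
// class-valued members, each in declaration order.
struct class_desc {
    std::string name;
    std::vector<const class_desc*> bases;
    std::vector<const class_desc*> members;
};

// Appends to `order` the sequence in which transformers would run for
// one instance of `c`: bases first, then class-valued members, then the
// class itself, mirroring constructor order.
inline void transformer_order(const class_desc &c,
                              std::vector<std::string> &order) {
    for (std::size_t i = 0; i < c.bases.size(); ++i)
        transformer_order(*c.bases[i], order);
    for (std::size_t i = 0; i < c.members.size(); ++i)
        transformer_order(*c.members[i], order);
    order.push_back(c.name);
}
```

For a class with one base class and one class-valued member, the walk lists the base class first, the member's class second, and the class itself last.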

Signature of Transformer Functions

Transformers are functions with no return value and one argument of type void*. This argument is a pointer to the object being transformed, an instance of the new class that has already undergone instance initialization.

Form of the call

      void my_transform_function(void *the_new_obj)

Transformer functions frequently perform processing that is based on the state of the old unevolved object corresponding to the object being operated on. The evolution facility provides a means of accessing the old object. This is discussed in the next section, Accessing Unevolved Objects.

Associating a Transformer with a Class

With the transform function defined, you can associate the function with a class and invoke the evolution process. The association is made by calling the static member function os_schema_evolution::augment_post_evol_transformers() in the application performing evolution. The call should be made before the call to os_schema_evolution::evolve().

augment_post_evol_transformers() function

The function augment_post_evol_transformers() has the following two overloadings:

      static void 
            os_schema_evolution::augment_post_evol_transformers(
                  const os_transformer_binding&
            );
      static void
            os_schema_evolution::augment_post_evol_transformers(
                  const os_Collection<os_transformer_binding*>&
            );

os_transformer_binding() function

You can construct an instance of os_transformer_binding by supplying a class name and a function pointer as arguments to the constructor, as in

      os_transformer_binding("part", part_transform)

So a typical call to augment_post_evol_transformers() would be

      os_schema_evolution::augment_post_evol_transformers (
            os_transformer_binding("part", part_transform)
      );

Recycling Old Storage

For classes whose instances' old state does not need to be accessed by any transformer, and for removed classes, you can increase space efficiency during the evolution process by having their old unevolved instances deleted during the instance initialization phase, allowing their space to be used for new instances. You do this with one call to os_schema_evolution::augment_classes_to_be_recycled() for each class whose old instances can be deleted.

augment_classes_to_be_recycled() function

This function is declared as follows:

      static void 
            os_schema_evolution::augment_classes_to_be_recycled(
                  const char *name_of_class_to_be_recycled 
            );

The calls should precede the call to evolve().

You can also call this function once for all the classes to be recycled by passing an os_Collection<char*> that contains the names of those classes. In this case you use the overloading:

      static void
            os_schema_evolution::augment_classes_to_be_recycled(
                  const os_Collection<char*>
                  &names_of_classes_to_be_recycled 
            );

Again, this call should precede the call to os_schema_evolution::evolve().

Note that the old unevolved instances of each modified class are deleted following completion of the transformation phase, whether or not you have specified the class as one to be recycled.

Accessing Unevolved Objects

Transformer functions (see Using Transformer Functions), as well as reclassification functions (see Instance Reclassification), often perform processing that is based on the state of the old unevolved object corresponding to the object being operated on. This section tells you how to access that state.

Given a pointer, the_new_obj, to an initialized object, retrieving a data member value for the corresponding old unevolved object has the following steps:

Retrieve an os_typed_pointer_void that refers to the old object. An os_typed_pointer_void is a special container object that encapsulates a void* pointer to the old instance and an object representing the instance's type.
Retrieve a void* pointer to the old object.
Retrieve a pointer to the object representing the type of the old object.
Given the type object, retrieve a pointer to the object representing the data member whose value you want to access.
Given the old object and the data member object, retrieve the old data member value.

These steps are necessary because the new schema provides the type universe for transformer and reclassification functions. The old class definitions are not part of a transformer's schema, and therefore you cannot use the usual member access notation, .member-name, to access fields of the old instance.
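The idea of reaching a field through a run-time type description, rather than through a compiled class definition, can be modeled in plain C++. The sketch below is not the ObjectStore metaobject protocol; old_type_desc, its find_member(), and fetch_short() are hypothetical stand-ins for the type object, the member lookup, and the fetch step.

```cpp
#include <cstddef>
#include <cstring>
#include <map>
#include <string>

// Hypothetical run-time description of an old class: member name -> byte
// offset. It plays the role that the type object plays in the steps above.
struct old_type_desc {
    std::map<std::string, std::size_t> offsets;

    // Returns a pointer to the offset of the named member, or 0 if the
    // old class has no such member.
    const std::size_t *find_member(const std::string &name) const {
        std::map<std::string, std::size_t>::const_iterator it =
            offsets.find(name);
        return it == offsets.end() ? 0 : &it->second;
    }
};

// Copies a short-valued member out of the raw storage of an old object.
// memcpy avoids assuming anything about the alignment of `old_obj`.
inline short fetch_short(const void *old_obj, std::size_t offset) {
    short value;
    std::memcpy(&value, static_cast<const char*>(old_obj) + offset,
                sizeof value);
    return value;
}
```

The caller never names the old class in its own code. It holds only a void* and a description obtained at run time, which is the situation a transformer is in with respect to the old schema.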

Retrieving os_typed_pointer_void and void* pointers

You can retrieve an os_typed_pointer_void to the old unevolved instance using the static member function os_schema_evolution::get_unevolved_object(). To retrieve the pointer itself you simply assign the os_typed_pointer_void to a void* variable, as in

      os_typed_pointer_void old_obj_typed_ptr = 
            os_schema_evolution::get_unevolved_object(the_new_obj);
      void *the_old_obj = old_obj_typed_ptr;

This works because the class os_typed_pointer_void defines operator void*() to return the pointer.

Retrieving the type and the data member

You can retrieve the type with the member function os_typed_pointer_void::get_type().

      const os_class_type &c = old_obj_typed_ptr.get_type();

You retrieve a pointer to the object representing the data member of a specified name defined by a specified type using os_class_type::find_member().

Retrieving the data member value

Finally, you retrieve the value of a specified data member for a specified object using os_fetch():

      os_fetch(the_old_obj, *c.find_member("part_id"), the_old_val);

As mentioned earlier, the instance initialization phase of evolution automatically modifies all pointers to instances of modified classes so that they reference the new migrated instances. This is true even for pointers contained in old unmigrated instances. So if you access an old data member during the instance transformation phase, and the value of the member is a pointer to an instance of a class that was also modified, the value you retrieve will point to the new migrated instance (see Example: Changing Inheritance).

Functions used to access unevolved objects

Here are the declarations of the functions used to access unevolved objects:

      static os_typed_pointer_void os_schema_evolution::
            get_unevolved_object(void *new_obj);
      os_typed_pointer_void::operator void*() const;
      const os_type &os_typed_pointer_void::get_type() const;
      const os_member *os_class_type::
            find_member(const char *name) const;

There is also a function for retrieving the address of the new version of a specified unevolved object, get_evolved_object().

os_fetch() global function

The global function os_fetch() has an overloading for each built-in C++ type:

      void *os_fetch(
            const void *p, const os_member_variable&, void *&value);
      unsigned long os_fetch(
            const void *p, const os_member_variable&,
            unsigned long &value);
      long os_fetch(
            const void *p, const os_member_variable&, long &value);
      unsigned int os_fetch(
            const void *p, const os_member_variable&,
            unsigned int &value);
      int os_fetch(
            const void *p, const os_member_variable&, int &value);
      unsigned short os_fetch(
            const void *p, const os_member_variable&,
            unsigned short &value);
      short os_fetch(
            const void *p, const os_member_variable&, short &value);
      unsigned char os_fetch(
            const void *p, const os_member_variable&,
            unsigned char &value);
      char os_fetch(
            const void *p, const os_member_variable&, char &value);
      float os_fetch(
            const void *p, const os_member_variable&, float &value);
      double os_fetch(
            const void *p, const os_member_variable&, double &value);
      long double os_fetch(
            const void *p, const os_member_variable&,
            long double &value);

os_store() global function

Once you have retrieved an old data member value, you can usually just assign it to the new data member. But if the value type of the new data member is a const or reference type, you should use os_store() to set the new member value.

      os_store(the_new_obj, *c.find_member("part_id"), the_old_val);

Like os_fetch(), os_store() has an overloading for each built-in C++ type:

      void os_store(
            void *p, const os_member_variable&, const void *value);
      void os_store(
            void *p, const os_member_variable&,
            const unsigned long value);
      void os_store(
            void *p, const os_member_variable&, const long value);
      void os_store(
            void *p, const os_member_variable&, 
            const unsigned int value);
      void os_store(
            void *p, const os_member_variable&, const int value);
      void os_store(
            void *p, const os_member_variable&, 
            const unsigned short value);
      void os_store(
            void *p, const os_member_variable&, const short value);
      void os_store(
            void *p, const os_member_variable&, 
            const unsigned char value);
      void os_store(
            void *p, const os_member_variable&, const char value);
      void os_store(
            void *p, const os_member_variable&, const float value);
      void os_store(
            void *p, const os_member_variable&, const double value);
      void os_store(
            void *p, const os_member_variable&, 
            const long double value);

os_fetch_address() global function

You can get the address of a data member value with os_fetch_address(), declared

      void *os_fetch_address(void *p, const os_member_variable&);

os_member_variable::get_type() function

And you can get the value type of a data member with os_member_variable::get_type(), declared

      const os_type &os_member_variable::get_type() const;

Together with os_fetch(), these functions allow you to access not only data members, but also data members of data member values, and so on.

Accessing an inherited data member

To access an inherited data member, the following functions are useful:

      const os_base_class &os_class_type::find_base_class(
            char *base_class_name) const;
      void *os_fetch_address(
            void *p, const os_base_class_variable&);
      const os_class_type &os_base_class::get_class() const;

Header file requirement

To use the functions described in this section, you must include the header file <ostore/mop.hh>.

Example: Using Transformers

Now consider an example that uses a transformer function.

Suppose that instead of changing the value type of the data member part_id of the class part (see the previous example) from short to long, you want to change it from short to char*, so that arbitrary strings can be used for part IDs:

Existing part class definition

      class part {
            public:
                  short part_id;
                  part(short id) { part_id = id; }
                  static os_typespec *get_os_typespec();
      };

And you want to change the definition to be as follows:

New part class definition

      class part {
            public:
                  char *part_id;
                  part(char *id) { 
                        int len = strlen(id) + 1;
                        part_id = new(
                              os_segment::of(this), 
                              os_typespec::get_char(), 
                              len
                        ) char[len];
                        strcpy(part_id, id); 
                  }
                  static os_typespec *get_os_typespec();
      };

Since there is no standard C++ conversion from short to char*, the new field will be initialized to (char*) (0) during the instance initialization phase of schema evolution. But we can direct the evolution facility to overwrite this initialization during the transformation phase, and establish a new part_id value for a migrated instance based on the value of part_id for the corresponding unmigrated instance.

To do this, supply a transformer function and associate it with the class part. As part of the instance migration process, the ObjectStore schema evolution facility will invoke this transformer function on each instance of the class.

part_transform() transformer function

Here is how such a transformer function might be defined.

      #include <ostore/ostore.hh>
      #include <ostore/coll.hh>
      #include <ostore/schmevol.hh>
      #include <ostore/mop.hh>
      #include <stdio.h>
      #include <string.h>
      #include "part2new.hh"
      static void part_transform(void *the_new_obj) {
            /* get a typed ptr to the old obj */
            os_typed_pointer_void old_obj_typed_ptr = 
                  os_schema_evolution::get_unevolved_object(
                        the_new_obj);

            /* get a void* ptr to the old obj; implicit operator void*() call */
            void *the_old_obj = old_obj_typed_ptr;
            /* get the type of the old obj */
            const os_class_type &c = old_obj_typed_ptr.get_type();
            /* get the old data member value; the old member is a short, */
            /* so it is fetched with the short overloading of os_fetch() */
            short the_old_val; 
            os_fetch(the_old_obj, *c.find_member("part_id"), 
                        the_old_val);

            /*  convert the old value to string form */
            char conv_buf[16];
            sprintf(conv_buf, "%d", the_old_val);
            int len = strlen(conv_buf) + 1;
            part *part_ptr = (part *)the_new_obj;
            part_ptr->part_id =
                  new(os_segment::of(the_new_obj), 
                        os_typespec::get_char(), len) char[len];
            strcpy(part_ptr->part_id, conv_buf);
      }

This function, part_transform(), sets the value of part_id in the new instance to the string denoting the integer value of part_id in the old unevolved instance. So, for example, if the old part_id was the integer 1138, the transformer sets the new part_id to point to the character array "1138".
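The conversion step can be tried in isolation, without a database. The following stand-alone sketch uses ordinary new[] where the transformer above allocates persistent storage with os_segment::of() and os_typespec::get_char(); part_id_to_string is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Formats an old short part_id as decimal text and returns a freshly
// allocated copy. Ordinary new[] is used here; the transformer above
// allocates in the database segment instead. The caller owns the result
// and must release it with delete[].
inline char *part_id_to_string(short old_id) {
    char conv_buf[16];                              // ample for any short
    std::snprintf(conv_buf, sizeof conv_buf, "%d",
                  static_cast<int>(old_id));
    std::size_t len = std::strlen(conv_buf) + 1;    // include the terminator
    char *result = new char[len];
    std::memcpy(result, conv_buf, len);
    return result;
}
```

The sketch uses snprintf rather than sprintf so that the write is bounded by the size of the buffer.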

With the transform function defined, you can associate the function with the class part and invoke the evolution process. As mentioned above, the association is made using a function call from within the application that invokes schema evolution.

main() function

The main() function associates part_transform() with the class part by creating an os_transformer_binding for the function and the class, and invoking augment_post_evol_transformers() on it.

Once the association between transformer and class is made, evolution is invoked.

      #include <ostore/ostore.hh>
      #include <ostore/coll.hh>
      #include <ostore/schmevol.hh>
      #include "part2new.hh"
      int main() {
            objectstore::initialize();
            /*  associate part_transform() with the class part */
            os_schema_evolution::augment_post_evol_transformers(
                  os_transformer_binding("part", part_transform)
            );
            /*  initiate evolution */
            os_schema_evolution::evolve(
                  "/example/workdb", "/example/partsdb"
            ); 
      }

Note that if the class part has classes derived from it, the instances of these derived classes must also be migrated, since each instance of the derived classes has a subobject corresponding to the base class part. The transformer part_transform() is run on these subobjects as well.

Example: Changing Inheritance

Here is an example that involves deleting some data members from a class, as well as changing the class to inherit from a new base class.

Consider a database schema that uses the classes epart, for electrical part, and mpart, for mechanical part, and suppose these classes both have data members for part_id and responsible_engineer. The example below shows how to add a common base class, part, to these two classes, and move the common data members out of the definitions of epart and mpart and into the definition of part.

This schema change involves redefining epart and mpart by

Deleting the members epart::part_id, epart::responsible_engineer, mpart::part_id, and mpart::responsible_engineer
Making the classes inherit from the new class part, which has members part::part_id and part::responsible_engineer.

Changing epart and mpart to inherit from part

Note that the schema evolution facility does not view the old member epart::part_id as related to the new member part::part_id (and similarly for part::responsible_engineer). It would be undesirable for the facility to make any assumptions about the semantic relationship between the two members based merely on sameness of name, since this is an application-dependent matter.

Consequently, moving a data member from subtype to supertype should be viewed as deletion of the data member from the subtype, together with addition of a new, distinct data member to the supertype. Similar remarks apply for moving members the other way, from supertype to subtype.

Here are the old and new class definitions:

Old epart class definition

      class epart {
            public:
                  int part_id;
                  employee *responsible_engineer;
                  os_Collection<cell*> cells;
                  . . . 
                  epart(int id, employee *eng) {
                        part_id = id;
                        responsible_engineer = eng;
                  }
      };

Old mpart class definition

      class mpart {
            public:
                  int part_id;
                  employee *responsible_engineer;
                  os_Collection<brep*> boundaries;
                  . . . 
                  mpart(int id, employee *eng) {
                        part_id = id;
                        responsible_engineer = eng;
                  }
      };

New part class definition

      class part {
            public:
                  int part_id;
                  employee *responsible_engineer;
                  part(int id, employee *eng) {
                        part_id = id;
                        responsible_engineer = eng;
                  }
      };

New epart class definition

      class epart : public part {
            public:
                  os_Collection<cell*> cells;
                  . . . 
                  epart(int id, employee *eng) : part(id, eng) {}
      };

New mpart class definition

      class mpart : public part {
            public:
                  os_Collection<brep*> boundaries;
                  . . . 
                  mpart(int id, employee *eng) : part(id, eng) {}
      };
  
New schema source file

The schema source file for this executable should contain the new definitions of epart and mpart, as well as the definition of part.

      #include <ostore/ostore.hh>
      #include <ostore/coll.hh>
      #include <ostore/manschem.hh>
      /* these contain the new definitions */
      #include "part.hh" 
      #include "new_epart.hh" 
      #include "new_mpart.hh"
      static void dummy() {
            OS_MARK_SCHEMA_TYPE(epart);
            OS_MARK_SCHEMA_TYPE(mpart);
            OS_MARK_SCHEMA_TYPE(part);
            . . . 
      }

The instance migration phase of the schema evolution process modifies the instances of epart and mpart by eliminating the part_id and responsible_engineer fields from the subobject corresponding to the derived class. It also adds to each instance a subobject corresponding to the base class, and initializes it as if by a constructor that initializes each member to 0.
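
The default behavior described above can be pictured with a small stand-alone model. The following sketch is plain C++ and does not call the ObjectStore API; the types old_epart, new_part, and new_epart, the cell_count member, and the function migrate_default() are hypothetical names used only for illustration. It assumes, as a simplification, that unchanged derived-class state is carried over while the newly added base subobject starts out zeroed.

```cpp
#include <cassert>

struct employee { int badge; };

// Hypothetical stand-in for the pre-evolution layout of epart.
struct old_epart {
    int part_id;
    employee *responsible_engineer;
    int cell_count;  // placeholder for the state that stays in epart
};

// Hypothetical stand-ins for the post-evolution layout.
struct new_part {
    int part_id = 0;
    employee *responsible_engineer = nullptr;
};
struct new_epart : new_part {
    int cell_count = 0;
};

// Models the default migration: state that remains in the derived class
// is copied, and the new base subobject keeps its zero initialization
// because its members are treated as brand-new, unrelated members.
new_epart migrate_default(const old_epart &old_obj) {
    new_epart result;  // base members start at 0 / nullptr
    result.cell_count = old_obj.cell_count;
    return result;
}
```

In this model, an old instance whose part_id is 42 yields a new instance whose part_id is 0, which is why a transformer is needed to recover the old value.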

Supplying a transformer function for each derived class

Suppose you want to overwrite the default initialization performed by the schema evolution facility, and initialize part::part_id and part::responsible_engineer for a migrated instance based on the values of the old part_id and responsible_engineer fields for the corresponding unmigrated instance.

To do this, you supply a transformer function for each derived class, epart and mpart.

      static void epart_transform(void *the_new_obj) {
            /*  get a typed ptr to the old instance */
            os_typed_pointer_void old_obj_typed_ptr =
                  os_schema_evolution::get_unevolved_object(
                        the_new_obj
                  );
            /* get a void* ptr to the old obj */
            void *the_old_obj = old_obj_typed_ptr;
            /* get the type of the old obj */
            os_class_type &c = old_obj_typed_ptr.get_type();
            /* get the old data member values */
            int the_old_id_val;
            os_fetch(
                  the_old_obj, 
                  *c.find_member("part_id"), 
                  the_old_id_val
            );
            void *the_old_resp_eng_val;
            os_fetch(
                  the_old_obj, 
                  *c.find_member("responsible_engineer"), 
                  the_old_resp_eng_val
            );
            /* set the new data member values */
            epart *epart_ptr = (epart*)the_new_obj;
            epart_ptr->part_id = the_old_id_val;
            epart_ptr->responsible_engineer = 
                  (employee*)the_old_resp_eng_val;
      }
      static void mpart_transform(void *the_new_obj) {
            /*  get a typed ptr to the old instance */
            os_typed_pointer_void old_obj_typed_ptr =
                  os_schema_evolution::get_unevolved_object(
                        the_new_obj
                  );
            /*  get a void* ptr to the old obj */
            void *the_old_obj = old_obj_typed_ptr;
            /*  get the type of the old obj */
            os_class_type &c = old_obj_typed_ptr.get_type();
            /*  get the old data member values */
            int the_old_id_val;
            os_fetch(
                  the_old_obj, 
                  *c.find_member("part_id"), 
                  the_old_id_val
            );
            void *the_old_resp_eng_val;
            os_fetch(
                  the_old_obj, 
                  *c.find_member("responsible_engineer"), 
                  the_old_resp_eng_val
            );
            /*  set the new data member values */
            mpart *mpart_ptr = (mpart*)the_new_obj;
            mpart_ptr->part_id = the_old_id_val;
            mpart_ptr->responsible_engineer = 
                  (employee*)the_old_resp_eng_val;
      }

Here, the transformer functions for the two classes need to do essentially the same thing. Each function retrieves the old values for part_id and responsible_engineer in the derived class, and sets the new values for part::part_id and part::responsible_engineer accordingly.
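
Since the two transformers differ only in the class they cast to, the shared copying step could be factored into one helper. The sketch below shows that idea in plain C++, without the ObjectStore API: the old_epart and old_mpart structs and the function copy_common_fields() are hypothetical, and in a real transformer the old values would still have to be read with os_fetch() as in the listing above.

```cpp
#include <cassert>
#include <type_traits>

struct employee { int badge; };

// Hypothetical pre-evolution layouts (common members only).
struct old_epart { int part_id; employee *responsible_engineer; };
struct old_mpart { int part_id; employee *responsible_engineer; };

// Hypothetical post-evolution layouts.
struct part {
    int part_id = 0;
    employee *responsible_engineer = nullptr;
};
struct epart : part {};
struct mpart : part {};

// Copies the members that moved into the base class. One template
// serves every class that now derives from part.
template <class Old, class New>
void copy_common_fields(const Old &old_obj, New &new_obj) {
    static_assert(std::is_base_of<part, New>::value,
                  "New must derive from part");
    new_obj.part_id = old_obj.part_id;
    new_obj.responsible_engineer = old_obj.responsible_engineer;
}
```

A design like this keeps the per-class transformers down to a cast plus one call, so adding a third derived class does not require a third copy of the logic.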

Note that, if the current evolution calls for the migration of instances of the class employee, the value of responsible_engineer retrieved from the old instance will be a pointer to the new employee instance corresponding to the original data member value. This is because pointers to migrated objects are modified during the initialization phase to point to the new instances. This turns out to be convenient, since we are usually interested in the evolved version of the old data member value.

Example: associating transformers with their classes and invoking evolution

Here is an application that associates the transformers with their classes and invokes evolution.

      #include <ostore/ostore.hh>
      #include <ostore/coll.hh>
      #include <ostore/schmevol.hh>
      #include "part.hh"
      #include "new_epart.hh"
      #include "new_mpart.hh"
      int main() {
            objectstore::initialize();
            /* associate epart_transform() with the class epart */
            os_schema_evolution::augment_post_evol_transformers(
                  os_transformer_binding("epart", epart_transform)
            );
            /*  associate mpart_transform() with the class mpart */
            os_schema_evolution::augment_post_evol_transformers(
                  os_transformer_binding("mpart", mpart_transform)
            );
            /* perform the evolution process */
            os_schema_evolution::evolve(
                  "/example/workdb", "/example/partsdb"
            ); 
      }

For databases undergoing the evolution described in this example, ObjectStore detects as illegal any pointers to eparts or mparts typed as void*. This is because, for example, before evolution such a pointer to an epart could also be interpreted as referring to the value of epart::part_id (since this int object starts at the same point as the epart), while after evolution it could no longer be interpreted as referring to that object. For more information on illegal pointers, see Illegal Pointers.

If the example is modified to include a leftmost base class for epart and mpart, both before and after evolution, void* pointers to eparts and mparts will not be illegal.

Instance Reclassification

As described above, the ObjectStore schema evolution facility allows you to migrate an instance to a subclass of its original class. This is particularly useful when a schema gains new derived classes that are more appropriate for existing instances of the base class.

To reclassify an instance, you must define a reclassification function and associate it with the class whose instances are to be reclassified. As part of the instance initialization phase of schema evolution, ObjectStore will execute the reclassification function on each instance of the function's associated class and reclassify the instance according to the return value of the function.

Signature of Reclassification Functions

Reclassifiers are static functions with a return type of char* and one argument of type os_typed_pointer_void& (see Using Transformer Functions). This argument is a reference to a typed pointer to the object to be reclassified, an unevolved instance of the original class.

      static char * my_reclassification_function(
            os_typed_pointer_void &old_obj_typed_ptr
      );

The return value, for a given instance, should be a string naming the new class the instance is to have. If the return value is 0, the instance will retain its current type.
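
The return-value contract can be modeled in a few lines of plain C++. This sketch does not use the ObjectStore API: old_record, choose_class(), and resolve_class() are hypothetical names, and the two pointer fields merely stand in for whatever state a real reclassifier would inspect.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an unevolved instance with two optional parts.
struct old_record {
    const void *electrical_data;
    const void *mechanical_data;
};

// Mirrors the reclassifier contract: return the name of the new class,
// or 0 to let the instance keep its current type.
const char *choose_class(const old_record &r) {
    if (r.electrical_data) return "epart";
    if (r.mechanical_data) return "mpart";
    return 0;
}

// Models how the return value is interpreted for one instance.
std::string resolve_class(const char *current, const char *selected) {
    assert(current != nullptr);
    return selected ? std::string(selected) : std::string(current);
}
```

In this model, an instance with neither field set resolves to its original class name, matching the rule that a return value of 0 leaves the type unchanged.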

As with transformers, the schema for reclassification functions is the new schema. So to access fields of the object being reclassified, you must use os_typed_pointer_void::get_type(), os_class_type::find_member(), and os_fetch(). See Using Transformer Functions and the example in Example: Reclassifying Instances.

Associating a Reclassifier with a Class

With the reclassification function defined, you can associate the function with a class and invoke the evolution process. You make the association by calling the static member function os_schema_evolution::augment_subtype_selectors() in the application performing evolution. The call should be made before the call to evolve().

augment_subtype_selectors() function

The function augment_subtype_selectors() takes an instance of os_evolve_subtype_fun_binding as argument. You can construct an instance of this class by supplying a class name and a function pointer as arguments to the constructor, as in

      os_evolve_subtype_fun_binding("part", part_reclassifier)

So a typical call to augment_subtype_selectors() would be

      os_schema_evolution::augment_subtype_selectors (
            os_evolve_subtype_fun_binding("part", part_reclassifier)
      );

Example: Reclassifying Instances

Consider a schema containing the class part with data members cells (a pointer to the collection of subcircuits of an electrical part) and boundary_rep (a pointer to the geometric representation of the boundary of a mechanical part). Suppose that the parts that have a nonnull value for cells have 0 for boundary_rep, and the parts that have a nonnull value for boundary_rep have 0 for cells.

In such a case, it might be desirable to modify this schema to include two new classes derived from part, epart (for electrical part) and mpart (for mechanical part). The data member cells can be moved out of part and into epart, and the member boundary_rep can be moved out of part and into mpart.

In addition to adding the subclasses to the schema, we should migrate existing instances of part so that those with a nonnull value for cells are reclassified as eparts, and those with a nonnull value for boundary_rep are reclassified as mparts.

The schema change in this example involves

Deleting the members part::cells and part::boundary_rep
Deriving two new classes, epart and mpart, from part

Moving data members of part to new subtypes

Again, note that moving a data member from supertype to subtype should be viewed as deletion of the data member from the supertype, together with addition of a new, distinct data member to the subtype.

Existing part class definition

Here is the original definition of the class part:

      class part {
            public:
                  int part_id;
                  employee *responsible_engineer;
                  os_Collection<cell*> *cells;
                  brep *boundary_rep;
                  part(int id, employee *eng) {
                        part_id = id;
                        responsible_engineer = eng;
                        cells = 0;
                        boundary_rep = 0;
                  }
      };

New class definitions

Here are the class definitions of the new schema:

      class part {
            public:
                  int part_id;
                  employee *responsible_engineer;
                  part(int id, employee *eng) {
                        part_id = id;
                        responsible_engineer = eng;
                  }
      };
      class epart : public part {
            public:
                  os_Collection<cell*> *cells;
                  . . . 
                  epart(int id, employee *eng) : part(id, eng) { cells = 0; }
      };
      class mpart : public part {
            public:
                  brep *boundary_rep;
                  . . . 
                  mpart(int id, employee *eng) : part(id, eng) { boundary_rep = 0; }
      };

Schema source file

The schema source file for this executable should contain the definitions of epart and mpart, as well as the new definition of part.

      #include <ostore/ostore.hh>
      #include <ostore/coll.hh>
      #include <ostore/manschem.hh>
      /* these contain the new definitions */
      #include "new_part.hh" 
      #include "epart.hh" 
      #include "mpart.hh"
      static void dummy() {
            OS_MARK_SCHEMA_TYPE(epart);
            OS_MARK_SCHEMA_TYPE(mpart);
            OS_MARK_SCHEMA_TYPE(part);
      }

The instance migration phase of the schema evolution process will modify the instances of part by eliminating the cells and boundary_rep fields. But first, you would like each part to be reclassified according to whether it uses the cells field or the boundary_rep field.

Reclassification function

To do this, you define a reclassification function and associate it with the class part. Here is the function definition:

      static char *part_reclassifier(
                  os_typed_pointer_void &old_obj_typed_ptr
      ) {
            /* get a void* ptr to the old obj */
            void *the_old_obj = old_obj_typed_ptr;
            /* get the type of the old obj */
            os_class_type &c = old_obj_typed_ptr.get_type();
            /* get the old cells value */
            os_Collection<cell*> *the_old_cells_val;
            os_fetch(
                  the_old_obj, 
                  *c.find_member("cells"), 
                  the_old_cells_val
            );
            if (the_old_cells_val)
                  return "epart"; /*  make it an epart */
            /* get the old boundary_rep value */
            brep *the_old_boundary_rep_val;
            os_fetch(
                  the_old_obj, 
                  *c.find_member("boundary_rep"), 
                  the_old_boundary_rep_val
            );
            if (the_old_boundary_rep_val)
                  return "mpart"; /*  make it an mpart */
            return 0; /*  leave it alone */
      }

The reclassification of each part essentially amounts to supplementing it with a subobject corresponding to the derived class, epart or mpart. The subobject is initialized as if by a constructor that initializes each member to 0. We can overwrite this initialization by defining transformer functions for the derived classes.

Note that the reclassification function is associated with the original class (the base class) of the instances it operates on, while the transformer functions (see below) are associated with the new classes (the derived classes) of the instances they operate on.

Transformer functions

Here are the transformer functions that allow you to set the values of cells and boundary_rep for the new instances according to their values in the old instances.

      static void epart_transform(void *the_new_obj) {
            /* get a typed ptr to the old instance */
            os_typed_pointer_void old_obj_typed_ptr =
                  os_schema_evolution::get_unevolved_object(
                  the_new_obj);
            /* get a void* ptr to the old obj */
            void *the_old_obj = old_obj_typed_ptr;
            /* get the type of the old obj */
            os_class_type &c = old_obj_typed_ptr.get_type();
            /* get the old data member values */
            os_Collection<cell*> *the_old_cells_val;
            os_fetch(the_old_obj,*c.find_member("cells"),
                  the_old_cells_val);
            /*  set the new data member value */
            ((epart*)the_new_obj)->cells = the_old_cells_val;
      }
      static void mpart_transform(void *the_new_obj) {
            /*  get a typed ptr to the old instance */
            os_typed_pointer_void old_obj_typed_ptr =
                  os_schema_evolution::get_unevolved_object(
                  the_new_obj);
            /*  get a void* ptr to the old obj */
            void *the_old_obj = old_obj_typed_ptr;
            /* get the type of the old obj */
            os_class_type &c = old_obj_typed_ptr.get_type();
            /* get the old data member values */
            brep *the_old_boundary_rep_val;
            os_fetch(
                  the_old_obj, 
                  *c.find_member("boundary_rep"), 
                  the_old_boundary_rep_val
            );
            /* set the new data member value */
            ((mpart*)the_new_obj)->boundary_rep = the_old_boundary_rep_val;
      }

Example application

Now here is an application that associates the reclassifier and transformers with their classes and invokes evolution:

      #include <ostore/ostore.hh>
      #include <ostore/coll.hh>
      #include <ostore/schmevol.hh>
      #include "new_part.hh"
      #include "epart.hh"
      #include "mpart.hh"
      int main() {
            objectstore::initialize();
            os_collection::initialize();
            /*  associate part_reclassifier() with the class part */
            os_schema_evolution::augment_subtype_selectors(
                  os_evolve_subtype_fun_binding("part", part_reclassifier)
            );
            /*  associate epart_transform() with the class epart */
            os_schema_evolution::augment_post_evol_transformers(
                  os_transformer_binding("epart", epart_transform)
            );
            /*  associate mpart_transform() with the class mpart */
            os_schema_evolution::augment_post_evol_transformers(
                  os_transformer_binding("mpart", mpart_transform)
            );
            /*  perform the evolution process */
            os_schema_evolution::evolve(
                  "/example/workdb", 
                  "/example/partsdb"
            ); 
      }

Illegal Pointers

During the instance initialization phase of schema evolution, ObjectStore adjusts all pointers and references to instances of modified classes so that they point to the new, migrated instances of these classes. During this process, ObjectStore might detect various kinds of illegal pointers or references. For example, it might detect a pointer to the value of a data member that has been removed in the new schema. By default, an exception is signaled when an illegal pointer or reference is encountered.

Ignoring Illegal Pointers During Schema Evolution

If you want evolution to continue after detection of an illegal pointer or reference, you can specify that illegal pointers be ignored, by calling os_schema_evolution::set_ignore_illegal_pointers() with a nonzero argument, before calling evolve(). This function is declared as follows:

      static void os_schema_evolution::set_ignore_illegal_pointers(
            os_boolean);

Using a Handler Function for Illegal Pointers

Alternatively, you can provide a handler function associated with one or more of the following categories of illegal pointers and references:

Illegal pointers and C++ references to objects
Illegal ObjectStore local references
Illegal ObjectStore nonlocal references
Illegal pointers and C++ references to members
Illegal database root values

Each time an illegal pointer or reference of the associated kind is detected, the handler function is executed on it, and then schema evolution is resumed. A handler function cannot modify any data in the databases being evolved, except for the illegal pointer or reference itself, which can be assigned a new value. The function can, however, generate text output. For example, you can record the location of an illegal pointer by creating a transient ObjectStore reference to the illegal pointer, and then dumping its text representation to a file (see os_reference::dump() in the ObjectStore C++ API Reference). This text representation can be used by a subsequent process to create another ObjectStore reference to the same illegal pointer (see os_reference::os_reference() in the ObjectStore C++ API Reference).

Creating a Handler Function

To associate a handler function with a category of illegal pointer or reference:

Define a function with the appropriate signature.
Register the function with a call to the static member function os_schema_evolution::set_illegal_pointer_handler().

The signatures of the handler functions for each category are as follows:

Illegal pointers and C++ references to objects

      void  function_name(
            objectstore_exception &exc, 
            char *msg, 
            void *&the_bad_ptr
      );

Illegal ObjectStore local references

      void  function_name(
            objectstore_exception &exc, 
            char *msg, 
            os_reference_local &the_bad_ref
      );

Illegal ObjectStore nonlocal references

      void  function_name(
            objectstore_exception &exc, 
            char *msg, 
            os_reference &the_bad_ref
      );

Illegal pointers and C++ references to members

      void  function_name(
            objectstore_exception &exc, 
            char *msg, 
            os_canonical_ptom &the_bad_ptr
      );

Illegal database root values

      void  function_name(
            objectstore_exception &exc, 
            char *msg, 
            os_database_root &the_bad_root
      );

The set_illegal_pointer_handler() Function

The function os_schema_evolution::set_illegal_pointer_handler() has five overloadings corresponding to the five categories of illegal pointers and references listed above. Each takes one argument, a pointer to the handler function of the appropriate signature.

Function arguments

For each kind of illegal pointer handler, the exc argument is a reference to the exception that would have been signaled had you not provided a handler. The exception is always a child exception of err_se_illegal_pointer. The msg argument is the error message that would have been sent to stderr. The last argument, the_bad_ref or the_bad_ptr, is a C++ reference to the illegal pointer or illegal ObjectStore reference.
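
Because the last argument is a C++ reference, an assignment inside the handler changes the stored pointer itself rather than a copy. The sketch below models only that mechanism in plain C++. The address_map type and the function repair_pointer() are hypothetical; the lookup table takes the place of whatever means a real handler would use to find a replacement address.

```cpp
#include <cassert>
#include <map>

// Hypothetical table from pre-evolution addresses to migrated addresses.
using address_map = std::map<void *, void *>;

// Models the repair step of a handler: bad_ptr is taken by reference,
// so a successful lookup overwrites the caller's pointer in place.
// Returns false, leaving bad_ptr untouched, when no replacement is known.
bool repair_pointer(const address_map &evolved, void *&bad_ptr) {
    address_map::const_iterator it = evolved.find(bad_ptr);
    if (it == evolved.end()) {
        return false;
    }
    bad_ptr = it->second;
    return true;
}
```

A handler built along these lines would fall back to signaling the original exception whenever the lookup fails, so that unanticipated illegal pointers are not silently accepted.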

Identifying Illegal Pointers Passed to a Handler

To help you identify an illegal pointer passed to a handler function, the class os_schema_evolution provides three useful functions not yet introduced:

os_schema_evolution::get_path_to_member()
os_schema_evolution::path_name()
os_schema_evolution::get_evolved_address()

get_path_to_member() function

get_path_to_member() performed on a void* returns an instance of os_path representing the data member whose value is pointed to by the void*.

path_name() function

path_name() performed on an os_path returns a string naming this data member.

get_evolved_address() function

get_evolved_address(), like get_evolved_object(), returns the address of the new version of a specified unmigrated object. get_evolved_address() is used here because get_evolved_object() signals an exception when performed on an illegal pointer. (get_unevolved_address(), like get_unevolved_object(), returns the address of the old version of the specified migrated object.)

The os_schema_evolution class is described in Chapter 2, Class Library, of the ObjectStore C++ API Reference.

Besides the categorization we have been discussing, there is another, orthogonal way of dividing illegal pointers and references into categories. This division will help you understand what pointers and references get counted as illegal.

Typed pointers and references to deleted subobjects

The instance migration process deletes subobjects of instances of a given class when either

The subobject is the value of a data member that has been removed from the class.
The subobject corresponds to a class that the given class previously inherited from, but no longer does.

Any pointer or reference to such a deleted subobject is illegal and can result in the exception err_se_deleted_object or err_se_deleted_component.

void* pointers and collocation ambiguities

A void* pointer in an ObjectStore database has an associated set of objects, the objects collocated at the region of memory it points to. These are all the objects to which the pointer can be interpreted as referring, instances of the types to which the pointer can legitimately be cast.

For example, a void* pointer to an instance of the class epart from the preevolution schema of Example: Changing Inheritance also points to the beginning of memory occupied by an int, the value of the member epart::part_id.
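
The collocation itself is ordinary C++ behavior and can be checked without ObjectStore. The following sketch uses a hypothetical, simplified old_epart struct; it relies on the language rule that a standard-layout object and its first data member share an address.

```cpp
#include <cassert>
#include <type_traits>

struct employee;  // only used through a pointer

// Hypothetical stand-in for the pre-evolution epart layout.
struct old_epart {
    int part_id;
    employee *responsible_engineer;
};

static_assert(std::is_standard_layout<old_epart>::value,
              "first-member collocation is only guaranteed for "
              "standard-layout types");

// True when a void* to the object is also the address of part_id,
// that is, when the two objects are collocated.
bool collocated_with_part_id(const old_epart &e) {
    const void *as_object = &e;
    const void *as_member = &e.part_id;
    return as_object == as_member;
}
```

For this struct the function returns true, so a bare void* carries no information about whether the whole object or only its first member was meant.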

If a void* pointer is associated, before evolution, with an object with which it is not associated after evolution, the pointer is illegal and can result in the exception err_se_ambiguous_void_pointer.

Consider again Example: Changing Inheritance. After evolution, the void* pointer to an instance of epart now also points to a part, as well as an int, the value of the member part::part_id. But while before evolution the pointer could be interpreted as referring to the value of epart::part_id, after evolution it could no longer be interpreted as referring to this object. Since the value of epart::part_id is no longer one of the pointer's associated objects, the pointer becomes illegal. (Remember that ObjectStore makes no semantic connection between epart::part_id and part::part_id.)

Note that void* pointers appear in every database, since the values of database roots are typed as void*. They might be common in some databases, since in the underlying representations of ObjectStore collections, elements are typed as void*.

Pointers and references to transient or freed memory, and type-mismatched pointers and references

These are pointers and references that are illegal even before schema evolution, but ObjectStore will detect them during instance initialization. Pointers and references to transient objects, or to objects that have been deleted, are illegal. Pointers and references with particular types that are not actually the addresses of some objects of that type are also illegal.

Example: Using Illegal Pointer Handlers

Consider the schema change made in Example: Changing Inheritance.

Changing epart and mpart to inherit from part; factoring out the common state to the base type.

As described above, if a database undergoes this schema change, and it contains void* pointers to eparts or mparts, these pointers will be detected as illegal, and should be handled with an illegal pointer handler.

A void* pointer to (for example) an epart is illegal because it could be interpreted, before evolution, as referring to the value of epart::part_id, which does not exist after evolution. But if we know this interpretation is never intended, then we can use the following illegal pointer handler.

Example: using an illegal pointer handler

#include <ostore/ostore.hh>
#include <ostore/coll.hh>
#include <ostore/schmevol.hh>
#include <ostore/mop.hh>
#include <stdio.h>
#include <string.h>
#include "part5new.hh"
static void my_illegal_pointer_handler(
                  objectstore_exception& exc, 
                  char* explanation, 
                  void*& illegalp
) {
      if (& exc == & err_se_ambiguous_void_pointer) 
      {
            os_path * member_path =
                   os_schema_evolution::get_path_to_member(illegalp);
            if (member_path) 
            {
                  char * path_string = os_schema_evolution::path_name(
                        * member_path);
                  if (strcmp(path_string, "epart.part_id") == 0 ||
                        strcmp(path_string, "mpart.part_id") == 0) 
                  {
                        /* We know that these void * pointers in the */
                        /* pre-evolved world should be void * pointers */
                        /* to parts in the post-evolved world, so we set */
                        /* the pointer to the evolved object */
                        illegalp = (void *)
                         os_schema_evolution::
                        get_evolved_address(illegalp);
                        return;
                  } /* end if */
            } /* end if */
      } /* end if */
      /* an unanticipated illegal pointer, signal the exception */
      exc.signal(explanation);
}

Using transformers with illegal pointer handlers

For this example, we use the same transformers as in Example: Changing Inheritance. Below is an application that associates the transformers with their classes, registers the illegal pointer handler, and invokes evolution.

#include <ostore/ostore.hh>
#include <ostore/coll.hh>
#include <ostore/schmevol.hh>
#include <ostore/mop.hh>
#include <stdio.h>
#include <string.h>
#include "part5new.hh"
int main(int, char * argv[]) {
      objectstore::initialize();
      /* register the illegal pointer handler */
      os_schema_evolution::set_illegal_pointer_handler(
            my_illegal_pointer_handler
      );
      /* associate epart_transform with the class epart */
      os_schema_evolution::augment_post_evol_transformers(
            os_transformer_binding("epart", epart_transform)
      );
      /* associate mpart_transform with the class mpart */
      os_schema_evolution::augment_post_evol_transformers(
            os_transformer_binding("mpart", mpart_transform)
      );
      /* perform the evolution process */
      os_schema_evolution::evolve(argv[2], argv[1]);
}

Obsolete Index and Query Handlers

When the selection criterion of a query or the path of an index makes reference to a removed class or data member, or makes incorrect type assumptions in light of a schema change, the query or index becomes obsolete. ObjectStore detects all obsolete queries and indexes. In the case of an obsolete query, ObjectStore internally marks the query so that subsequent attempts to use it result in the exception err_os_query_evaluation_error.

As with illegal pointers, you can handle obsolete queries or indexes by providing a special handler function for each purpose. If you do not supply handlers, ObjectStore signals an exception when it detects an obsolete query or index.

Handling obsolete queries or indexes

To handle obsolete queries or indexes:

Define a function with the appropriate signature.
Register the function with a call to the static member function os_schema_evolution::set_obsolete_index_handler() or os_schema_evolution::set_obsolete_query_handler().

Form of obsolete query handler call

The signature for an obsolete query handler is

      void  function_name(
            os_coll_query &query, 
            const char *query_expr
      );

A reference to the obsolete query is passed in, together with a string expressing the query's selection criterion.

Form of obsolete index handler call

The signature for an obsolete index handler is

      void  function_name(
            os_collection &coll, 
            const char *path_string
      );

A reference to the collection indexed by the obsolete index is passed in, together with a string expressing the index's path (key).
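
One simple use of such a handler is to record which index paths became obsolete so that they can be reviewed after evolution. The sketch below is an illustration under stated assumptions, not ObjectStore code: the empty os_collection struct is only a stand-in so that the example compiles without the ObjectStore headers, and record_obsolete_index() and obsolete_index_paths are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in so this sketch compiles without ObjectStore headers.
// With the real headers, remove this line and include <ostore/coll.hh>.
struct os_collection {};

// Hypothetical transient log of index paths that became obsolete.
static std::vector<std::string> obsolete_index_paths;

// Has the obsolete index handler signature described above.
void record_obsolete_index(os_collection &coll, const char *path_string) {
    (void)coll;  // the collection is not inspected in this sketch
    assert(path_string != nullptr);
    obsolete_index_paths.push_back(path_string);
}
```

The log lives in transient memory, so the handler leaves the databases being evolved untouched.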

Task List Reporting

Before initiating evolution for a particular schema change, you might want to generate a task list to verify your expectations concerning the instance initialization phase. The task list contains a function definition for each class whose instances will be migrated.

Form of the call

Each function has a name of the form

       class-name@[1]::initializer()

where class-name names the function's associated class.

Statements for data members and their classes

Each function definition contains a statement or comment for each data member of its associated class. For a member with value type T, this statement or comment is one of the following:

Assignment statement
Call to T@[1]::copy_initializer()
Call to T@[2]::construct_initializer()
Call to T@[1]::initializer()
Comment indicating that the field will be initialized to zero

Assignment statements

An assignment statement is used when the old and new value types of the member are assignment compatible. The initializer functions are used as follows:

T@[1]::copy_initializer() is used when the member has not been modified by the schema change, and the new value can be copied bit by bit from the old value.
T@[2]::construct_initializer() is used when the value type has been modified and the new value type is a class.
T@[1]::initializer() is used when the member has not been modified by the schema change, but instances of the value type of the member will be migrated.

Definitions for all these functions appear in the task list.

A program to generate a task list is just like a program to perform evolution, except that the static member function os_schema_evolution::task_list() is called instead of os_schema_evolution::evolve().

task_list() function

The function task_list() has two overloadings analogous to the two overloadings of evolve(), declared as follows:

      static void task_list(
            const char *workdb_name, 
            const char *db_to_evolve
      );
      static void task_list(
            const char *workdb_name, 
            const os_collection &dbs_to_evolve
      );

Using task_list()

Prior to calling task_list(), you use os_schema_evolution::set_task_list_file_name() to specify the file to which the task list is to be sent. This function is declared as follows:

      static void set_task_list_file_name(const char *file_name);

As with evolve(), the new schema is, by default, the schema of the application that calls task_list(), but you can specify the new schema with os_schema_evolution::set_evolved_schema_db_name() before calling task_list().

Also as with evolve(), you must specify the classes that are to be removed from the schema with os_schema_evolution::augment_classes_to_be_removed(). The calls should precede the call to task_list().
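The order of calls described above can be sketched as follows. The os_schema_evolution struct here is only a stand-in that records calls, so that the sketch compiles without ObjectStore; the real class is declared in the ObjectStore headers. The parameter type of augment_classes_to_be_removed(), the driver function report_tasks(), and all file, database, and class names are assumptions made for illustration. A real program would also need whatever setup a program that calls evolve() needs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in that records each call in order. Not the real class.
struct os_schema_evolution {
    static std::vector<std::string> &calls()
    {
        static std::vector<std::string> log;
        return log;
    }
    static void set_task_list_file_name(const char *file_name)
    {
        calls().push_back(std::string("set_task_list_file_name ") + file_name);
    }
    // Parameter type is an assumption; it is not shown in this section.
    static void augment_classes_to_be_removed(const char *class_name)
    {
        calls().push_back(std::string("augment_classes_to_be_removed ") + class_name);
    }
    static void task_list(const char *workdb_name, const char *db_to_evolve)
    {
        assert(!calls().empty());   // configuration calls come first
        calls().push_back(std::string("task_list ") + workdb_name + " " + db_to_evolve);
    }
};

// Hypothetical driver: configure the report, then request the task list.
void report_tasks()
{
    os_schema_evolution::set_task_list_file_name("tasks.txt");
    os_schema_evolution::augment_classes_to_be_removed("obsolete_part");
    os_schema_evolution::task_list("/example/work.db", "/example/parts.db");
}
```

Replacing the final call with evolve() and dropping the file name call would turn the same driver into an evolution program, which is the symmetry the text describes.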

Instance Initialization Rules

This section starts with a description of the various categories of schema evolution. Following this discussion, the initialization rules for each category are described.

Kinds of schema modifications

The different kinds of schema modification can be divided into three broad (not entirely disjoint) categories:

Class creation
Class redefinition
Class deletion

Kinds of class redefinitions

The kinds of class redefinition, in turn, can be divided into three subcategories: changes relating to

Inheritance
Data members
Member functions

Categories and subcategories of schema modification

Class Creation

Adding a class to a database's schema never, by itself, requires the use of the schema evolution facility. This is because a new class cannot have any previously existing instances. Since there cannot be any existing instances, instance migration is not necessary, and adding the class to the database's schema is handled automatically when an application using the new class opens the database.

Inheritance Redefinition

But, although adding a class does not by itself require using the evolution facility, sometimes adding a class involves also redefining another existing class. This is the case when you add a new class as a base class of another existing class, for example. The definition of the existing class must be changed to specify inheritance from the new class. And the representation of instances of the derived class must be supplemented with a subobject corresponding to the new base class. Such schema changes fall under the category of inheritance redefinition.

In general, inheritance redefinition includes changing a class to inherit from a new or existing class, changing a class so that it no longer inherits from an existing class, and changing class inheritance from virtual to nonvirtual or the reverse. See Instance Reclassification.

Data Member Redefinition

Class redefinition relating to data members includes changing the definition of a class by adding or deleting members, changing the value type of a data member, and changing the order of data members. (To change the name of a data member, you delete it and then add a new one with the desired name.) See Instance Reclassification.

Member Function Redefinition

There are only two kinds of member function-related changes that require schema evolution: changing the definition of a class by adding the first virtual function, and changing the definition of a class by removing the only virtual function. These modifications require schema evolution because they change the representation of any instances of the modified class. Other changes related to member functions have no effect on the layout of class instances, and so do not require schema evolution. See Instance Reclassification.

Class Deletion

In the case of class deletion, instance migration consists of the deletion of existing instances of the deleted classes. Any pointers typed as pointers to a deleted class are detected before instance initialization, and result in an err_schema_evolution exception. Any void* pointer to an instance of a deleted class (or pointer to a subobject of such an instance) is detected as an illegal pointer.

As with class creation, deleting a class might at the same time involve changing the inheritance structure of some other class. This is the case, for example, when you delete a class that serves as a base class of another class that is to remain in the schema. The definition of the remaining class must be changed so that it no longer specifies inheritance from the deleted class. And the representation of the remaining class's instances must have the subobject corresponding to the base class removed. Such schema changes fall under the category of inheritance redefinition as well as class deletion. See Instance Reclassification.

Instance Reclassification

As mentioned earlier, the schema evolution facility provides a special capability for reclassifying instances of a base class so that they become instances of classes derived from the base class. This form of instance migration is never actually required by a schema change, but it is often desirable.

The sections that follow discuss the default initialization rules for each of these categories (except class creation, which, as explained, does not require the use of the evolution facility). See Instance Reclassification.

Schema Changes Related to Data Members

The sections that follow consider the different types of schema modification related to data members, with particular attention to the instance migration phase of schema evolution for each kind of modification.

Categories of data member redefinition

Notice that indirect instances of a modified class are migrated just as are direct instances. That is, if you change the definition of base class B, then instances of class D, derived from B, will be migrated just as are direct instances (if there are any) of B.

Adding Data Members

When you add a data member to a class, the schema evolution process changes the representation of any of its instances by adding a field to hold the value of the new member. How this field is initialized depends on the value type of the new member.

If the value type is a built-in, nonarray type (integral type, floating type, pointer type, reference type, enumeration type, or pointer to member type), the field is initialized with the appropriate representation of 0. If the value type is a class, the field is initialized as if by a constructor that initializes each member to 0.

If the value type is an array type, each element of the array is initialized with 0 (for arrays of built-ins) or as if by a constructor that initializes each member to 0 (for arrays of class instances). For arrays of arrays, these rules are applied recursively. In other words, an array is initialized by initializing each of its elements as if it were a separate data member.

As with all modified classes, the class with the new data member can have an associated transformer function that you supply. If you want, this function can overwrite these default initializations, supplying a value for the new field in whatever way meets your needs.
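The default rules for added members can be pictured in plain C++. The following sketch does not use ObjectStore; it only models the effect of instance initialization on one object. The class names and the function initialize_added_members() are hypothetical.

```cpp
#include <cassert>

struct Point { int x; int y; };   // hypothetical class used as a value type

struct PartOld { int id; };       // class before the schema change

struct PartNew {                  // class after the schema change
    int    id;
    double weight;       // added built-in member
    Point *origin;       // added pointer member
    Point  extent;       // added class-valued member
    int    history[3];   // added array member
};

// Models the default rule: the existing member keeps its value and each
// added member starts out as the appropriate representation of 0.
PartNew initialize_added_members(const PartOld &old_part)
{
    PartNew fresh{};          // value-initialization zeroes every field
    fresh.id = old_part.id;   // the unchanged member carries over
    return fresh;
}
```

A transformer function would then be the place to replace any of these zeros with values computed from application data.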

Deleting Data Members

When you delete a data member from a class, the schema evolution process changes the representation of any of its instances by removing the field that held the value of the deleted member. Since no new storage is created by this schema change, the issue of initialization does not arise. Note however that a transformer function for the modified class can still access the value of the removed member in the unevolved instance. See Example: Changing Inheritance.

By default, pointers to members being removed result in an illegal pointer exception during evolution. You can, however, supply an illegal pointer handler to process the illegal pointer and resume evolution. See Illegal Pointers.

Changing the Value Type of a Data Member

When you change the value type of a data member, the schema evolution process changes the representation of any of its instances by adjusting the size of the member's associated storage (if necessary) and reinitializing that storage. How this storage is initialized depends on the new and old value types.

Consider first the case in which the new value type is not an array type.

Assignment-compatible value types

Old and new member declarations with assignment-compatible value types.

If the new and old types are assignment compatible, the new field is initialized by assignment. That is, ObjectStore assigns the value of the old data member to the storage associated with the new member, applying any standard conversions defined by the C++ language.

For example, if you change the value type of a data member from int to float, an old instance with the value (int)(17) for this member will be changed to have value (float)(17.0).
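The int to float case can be modeled in plain C++ as follows. This is only an illustration of initialization by assignment; the struct names and the function initialize_by_assignment() are hypothetical and no ObjectStore call is involved.

```cpp
#include <cassert>

struct ReadingOld { int   level; };   // old member declaration
struct ReadingNew { float level; };   // new member declaration

// Models initialization by assignment: the old value is assigned to the
// new field, and the standard int-to-float conversion is applied.
ReadingNew initialize_by_assignment(const ReadingOld &old_reading)
{
    ReadingNew fresh{};
    fresh.level = old_reading.level;
    return fresh;
}
```

With an old value of 17, the new field holds 17.0f after the assignment.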

In some cases schema evolution considers types assignment compatible when C++ would not. For example, if D is derived from B, schema evolution will assign a B* to a D* if it knows that the B is also an instance of D.

If the new and old types are not assignment compatible, there are two cases.

New value type is a built-in

Old and new member declarations with assignment-incompatible value types, where the new value type is a built-in.

If the new value type is a built-in, nonarray type (integral type, floating type, pointer type, reference type, enumeration type, or pointer to member type), the field is initialized with 0.

New value type is a class

Old and new member declarations, where the new value type is a class.

If the new value type is a class, the field is initialized as if by a constructor that initializes each member to 0.

If you change the value type of a data member by changing it from a signed integer type to an unsigned integer type, or the reverse, you do not need to perform schema evolution. This is because such a change does not change the size of the associated field, and does not change how (sufficiently small) positive numbers are represented.
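The reason no evolution is needed can be checked in plain C++. The sketch below reads the stored bytes of an int through an unsigned int, which is roughly how an unevolved field would be seen through the new declaration. The function name view_as_unsigned() is hypothetical.

```cpp
#include <cassert>
#include <cstring>

// The field does not change size when only the signedness changes.
static_assert(sizeof(int) == sizeof(unsigned int),
              "signed and unsigned int occupy the same storage");

// Copies the bytes of a stored int into an unsigned int without any
// value conversion.
unsigned int view_as_unsigned(int stored)
{
    unsigned int seen = 0;
    std::memcpy(&seen, &stored, sizeof seen);
    return seen;
}
```

Nonnegative values read back unchanged. A negative stored value would read back as a large positive number, which is why the text limits the guarantee to sufficiently small positive numbers.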

Now consider the case in which the new value type is an array type.

Array values with compatible types

Old and new member declarations with array value types whose elements are assignment compatible.

If the old value type is also an array type, and if the element types of the arrays are assignment compatible, the new field is initialized by assignment. That is, ObjectStore assigns the value of the i-th element of the old array to the i-th element of the new array, applying any standard conversions defined by the C++ language. This is done for all i between 0 and one less than the size of the smaller array.

If the new array has n more elements than the old array, the trailing n elements of the new array are initialized with 0 (if the element type is a built-in, nonarray type) or as if by a constructor that initializes each member to 0 (if the element type is a class). If the old array has n more elements than the new array, the trailing n elements of the old array are ignored.
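The element-wise rule for arrays with assignment-compatible element types can be modeled in plain C++. The function initialize_array() is hypothetical and uses std::array only for convenience; it is a sketch of the rule, not of how ObjectStore implements it.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Copies the overlapping elements by assignment and leaves any trailing
// elements of the new array at 0. Extra old elements are ignored.
template <typename New, std::size_t NewLen, typename Old, std::size_t OldLen>
std::array<New, NewLen> initialize_array(const std::array<Old, OldLen> &old_values)
{
    std::array<New, NewLen> fresh{};   // every element starts as 0
    const std::size_t shared = std::min(OldLen, NewLen);
    for (std::size_t i = 0; i < shared; ++i) {
        fresh[i] = old_values[i];      // standard conversion applies here
    }
    return fresh;
}
```

For example, evolving an array of three int values into an array of five float values would keep the first three values and zero the last two.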

Array values with incompatible types

Old and new member declarations with array value types whose elements are not assignment compatible.

If the old value type is also an array type, but the element types are not assignment compatible, then each element of the new array is initialized with 0 (if the element type is a built-in, nonarray type) or as if by a constructor that initializes each member to 0 (if the element type is a class).

Non-array to array type

Old and new member declarations; the old value type is a nonarray type and the new value type is an array type.

If the old value type is not an array type, each element of the new array is initialized with 0 (if the element type is a built-in, nonarray type) or as if by a constructor that initializes each member to 0 (if the element type is a class).

In general, arrays are initialized by initializing each array element as if it were a separate data member.

For a multidimensional array, these rules apply to the first dimension, and recursively to the other dimensions if the length of each other dimension is not changed by evolution. If the length of one of these other dimensions changes, every element of the multidimensional array is initialized with 0 (if the element type is a built-in) or as if by a constructor that initializes each member to 0 (if the element type is a class).
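For the two-dimensional case, the rule can be modeled in plain C++ as follows. The function initialize_2d() is hypothetical and fixes the element type at int to keep the sketch short. It reflects one reading of the rule: rows are carried over only when the length of the second dimension is unchanged.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

template <std::size_t NewRows, std::size_t NewCols,
          std::size_t OldRows, std::size_t OldCols>
std::array<std::array<int, NewCols>, NewRows>
initialize_2d(const std::array<std::array<int, OldCols>, OldRows> &old_values)
{
    std::array<std::array<int, NewCols>, NewRows> fresh{};   // all zeros
    if (NewCols != OldCols) {
        return fresh;   // inner dimension changed: every element stays 0
    }
    const std::size_t shared_rows = std::min(OldRows, NewRows);
    for (std::size_t r = 0; r < shared_rows; ++r) {
        for (std::size_t c = 0; c < NewCols; ++c) {
            fresh[r][c] = old_values[r][c];   // carry the row over
        }
    }
    return fresh;
}
```

Growing a 2 by 2 array to 3 by 2 keeps the old rows and zeroes the new one, while changing it to 2 by 3 zeroes everything.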

As with all modified classes, the class with the modified data member can have an associated transformer function that you supply. If you want, this function can overwrite these default initializations, supplying a value for the new field in whatever way meets your needs.

Bit fields are evolved according to the default signed/unsigned rules of the implementation that built the evolution application. This can lead to unexpected results when an evolution application built with one default rule evolves a database originally populated by an application built by an implementation whose default rule differs. The unexpected results occur when the evolution application attempts to increase the width of a bit field.
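One way to see why widening is sensitive to the signedness rule is to read the same stored bits both ways. The sketch below is plain C++ and assumes two's complement for the signed reading; the function names are hypothetical, and the sketch does not claim to reproduce what ObjectStore does internally.

```cpp
#include <cassert>
#include <cstdint>

// Reads the low `width` bits of `raw` as an unsigned bit field.
std::int32_t read_as_unsigned(std::uint32_t raw, unsigned width)
{
    assert(width > 0 && width < 32);
    const std::uint32_t mask = (std::uint32_t{1} << width) - 1;
    return static_cast<std::int32_t>(raw & mask);
}

// Reads the same bits as a signed (two's complement) bit field.
std::int32_t read_as_signed(std::uint32_t raw, unsigned width)
{
    assert(width > 0 && width < 32);
    const std::uint32_t mask = (std::uint32_t{1} << width) - 1;
    const std::uint32_t bits = raw & mask;
    std::int64_t value = bits;
    if (bits & (std::uint32_t{1} << (width - 1))) {
        value -= std::int64_t{1} << width;   // sign bit set: subtract 2^width
    }
    return static_cast<std::int32_t>(value);
}
```

A 3-bit field holding the bits 111 reads as 7 under the unsigned rule and as -1 under the signed rule, so the value placed in a wider field depends on which rule the evolution application was built with.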

Changing the Order of Data Members

When you change the order of the data members defined by a class (by changing the order in which their declarations appear within the definition of the class), the schema evolution process changes the representation of any of its instances by reordering the storage fields associated with the members. Since there is no new storage created by this schema change, the issue of initialization does not arise.

Summary of Data Member Changes Not Requiring Explicit Evolution

Note that you do not need to invoke schema evolution to make the following kinds of data member modifications:

Changing the value type of a data member from a signed type to unsigned type and the reverse
Changing the access specified for a data member (private, public, or protected)
Changing the value type of a data member from a const to non-const type and the reverse
Adding or removing static data members

Schema Changes Related to Member Functions

As mentioned earlier, there are only two kinds of member-function-related changes that require schema evolution: changing the definition of a class by adding the first virtual function, and changing the definition of a class by removing the only virtual function. These modifications require schema evolution because they change the representation of any instances of the modified class. Other changes related to member functions have no effect on the layout of class instances, and so do not require schema evolution.
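The effect of the first virtual function on instance layout can be observed in plain C++. The class names below are hypothetical. The size comparisons hold on common compilers, which store a hidden virtual table pointer in each instance, but the C++ language itself does not guarantee them.

```cpp
#include <cassert>
#include <type_traits>

struct ShapeNoVirtual {
    int id;
    int area() const { return 0; }
};

struct ShapeOneVirtual {
    int id;
    virtual int area() const { return 0; }
};

struct ShapeTwoVirtual {
    int id;
    virtual int area() const { return 0; }
    virtual int sides() const { return 0; }
};

static_assert(!std::is_polymorphic<ShapeNoVirtual>::value,
              "no virtual functions yet");
static_assert(std::is_polymorphic<ShapeOneVirtual>::value,
              "first virtual function makes the class polymorphic");

// Adding the first virtual function typically enlarges each instance.
bool first_virtual_changes_size()
{
    return sizeof(ShapeOneVirtual) > sizeof(ShapeNoVirtual);
}

// Adding a further virtual function typically leaves the size alone.
bool second_virtual_keeps_size()
{
    return sizeof(ShapeTwoVirtual) == sizeof(ShapeOneVirtual);
}
```

This matches the rule in the text: only the change between zero and one virtual function alters the representation of instances.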

Schema Changes Related to Class Inheritance

Changes relating to class inheritance include adding base classes, removing base classes, and changing class inheritance from virtual to nonvirtual, or the reverse. Each of these is discussed in the sections that follow.

Adding Base Classes

When you modify a database's schema by adding a base class, say B, to an existing class, say D, instances of D must be supplemented with a B part.

Adding a base class to an existing class

When the class D is modified to inherit from a base class, B, its instances must be modified to include a B part.

The instance initialization phase of schema evolution will add the B part to each instance of D, and initialize that part as if by a constructor that initializes each member to 0.

If you provide a transformer function for D, it will be run during the instance transformation phase.
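The effect of instance initialization when a base class is added can be modeled in plain C++. In this sketch Audited plays the role of B and Order plays the role of D; all names, including add_base_part(), are hypothetical, and no ObjectStore call is involved.

```cpp
#include <cassert>

struct Audited { long created; long modified; };   // the new base class

struct OrderOld { int number; };                   // class before the change

struct OrderNew : Audited { int number; };         // class after the change

// Models the default rule: the added base part is zeroed and the
// existing member keeps its value.
OrderNew add_base_part(const OrderOld &old_order)
{
    OrderNew fresh{};                  // value-initialization zeroes the base part
    fresh.number = old_order.number;   // existing data carries over
    return fresh;
}
```

A transformer function for the derived class would be the place to fill the new base part with meaningful values.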

Note that this category of schema change covers more cases than might be suggested by the illustration above.

In particular,

Schema evolution works just the same if B is added as a base class to more than one existing class. Each instance of each existing class must be supplemented with a B part.
Indirect as well as direct instances of a class made to inherit from a base class must be migrated (see "When changing a class requires migrations").
The class that is added as a base class might or might not be part of the old schema. In either case, no instance migration need be performed for the base class, unless it too has evolved (but see Instance Reclassification).

When changing a class requires migrations

When you change the definition of B so that it inherits from A, instances of C (derived from B) must be migrated.

Removing Base Classes

When you change a class, D, so that it no longer inherits from a given class, B, each instance of D is migrated by removing the subobject corresponding to B.

Modifying other instances when removing a class

When the class D is modified so that it no longer inherits from a base class, B, its instances must be modified to remove the B part.

Pointers to the subobject being removed, if they are typed as B* rather than D*, cause an illegal pointer exception to be signaled during evolution. (Pointers typed as D* are, of course, automatically adjusted to point to the migrated instance of D.) The same is true for pointers (so typed) to data members of the deleted subobject.

Changing Between Virtual and Nonvirtual Inheritance

Consider a class X that inherits nonvirtually from a class B. If you change X to inherit virtually from B, instances of X must be migrated. In particular, for each instance of X, the nonvirtual B subobject is eliminated and a virtual (shared) B subobject is introduced. Each instance of X will have its virtual B subobject initialized as if by a constructor that sets each member to 0. This applies to all instances of X, including instances that are subobjects of other objects, either as a data member value or as a subobject corresponding to a base class. The figure below illustrates one such case. In general, every virtual subobject introduced by the inheritance change is initialized as if by a constructor that sets each field to 0.

Virtual inheritance

When you change both X and Y to inherit virtually from B, instances of Z (derived from both X and Y) are migrated so that they have only a single B part.
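The difference in the number of B parts can be observed in plain C++. The namespaces before and after are hypothetical and stand for the old and new schema; the helper shares_one_b() is also hypothetical.

```cpp
#include <cassert>

struct B { int tag = 0; };

namespace before {            // nonvirtual inheritance: Z has two B parts
struct X : B {};
struct Y : B {};
struct Z : X, Y {};
}

namespace after {             // virtual inheritance: Z has one shared B part
struct X : virtual B {};
struct Y : virtual B {};
struct Z : X, Y {};
}

// Reports whether the B part reached through X is the same subobject as
// the B part reached through Y.
template <typename ZType, typename XType, typename YType>
bool shares_one_b(ZType &z)
{
    B *via_x = static_cast<XType *>(&z);
    B *via_y = static_cast<YType *>(&z);
    return via_x == via_y;
}
```

Under the old schema the two paths lead to different B subobjects; under the new schema they lead to the same one, which is the single B part that migration introduces.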

Updated: 03/31/98 15:31:20