ObjectStore C++ Advanced API User Guide

Chapter 6

Compaction

Compaction Overview

ObjectStore databases consist of segments containing persistent data. As persistent objects are allocated and deallocated in a segment, deallocation leaves holes, and internal fragmentation in the segment can increase. The ObjectStore allocation algorithms recycle deleted storage when new objects are allocated, but you might still need to compact persistent data by squeezing out the deleted space. Compaction frees persistent storage space so that other segments can use it.

Compaction API - objectstore::compact()

The programming interface to compaction is provided by a single function, objectstore::compact(). The function is described in detail in objectstore::compact() in Chapter 2, Class Library, of the ObjectStore C++ API Reference.

Declaration
The function is declared this way:

      static void objectstore::compact (
            char **dbs_to_be_compacted, 
            os_pathname_and_segment_number
                        **segments_to_be_compacted = 0,
            char **dbs_referring_to_compacted_ones = 0,
            os_pathname_and_segment_number
                        **segs_referring_to_compacted_ones = 0
      );
Function arguments
Here, dbs_to_be_compacted is a null-terminated array of the pathnames of all the databases that are to be compacted in their entirety. That is, ObjectStore will compact every segment of every database named by an element of dbs_to_be_compacted. A simple call to compact() can supply only this one argument:

      char *compact_dbs[NUM_COMPACT_DBS];
      . . . 
      objectstore::compact (compact_dbs);
The argument segments_to_be_compacted, like dbs_to_be_compacted, specifies what data you want compacted, but it does so at segment granularity rather than database granularity. This argument is a null-terminated array of pointers to instances of the class os_pathname_and_segment_number. Each such instance identifies a particular segment by encapsulating the pathname of the database containing the segment together with the segment number of that segment within the database. The constructor for this class is declared this way:

      os_pathname_and_segment_number::
            os_pathname_and_segment_number(
                        const char *db, 
                        os_unsigned_int32 seg_number
      );
You can obtain the segment number of a segment with os_segment::get_number(). Its return value can be passed directly as the seg_number argument of the constructor. Here is how you might create an os_pathname_and_segment_number to identify a particular segment:

      os_database *db1 = os_database::lookup("/user/parts/db1");
      . . . 
      /* retrieve obj1 from db1*/
      . . . 
      os_segment *seg1 = os_segment::of(obj1);
      os_pathname_and_segment_number *seg1_identifier = 
            new os_pathname_and_segment_number(
                        "/user/parts/db1",
                        seg1->get_number()
      );

Cross-Database Pointers and References

The argument dbs_referring_to_compacted_ones can be understood as follows. When ObjectStore compacts a segment, it must adjust all pointers and ObjectStore references to the objects in that segment. ObjectStore always adjusts any pointers and references to these objects that are in the same database, but if there are pointers or references to these objects in other databases, you must specify these other databases explicitly. You do this with a null-terminated array of the databases' pathnames.

If you know that the pointers and references to compacted objects are restricted to certain segments in certain databases, you can specify just these segments rather than the entire databases. This can speed up the compaction operation. You do this with the argument segs_referring_to_compacted_ones, which is a null-terminated array of pointers to instances of os_pathname_and_segment_number.

Compaction Example

Here is a program fragment that calls this function with all four arguments (a full program is presented at the end of this section):

      char *compact_dbs[NUM_COMPACT_DBS];
      char *reference_dbs[NUM_REFERENCE_DBS];
      os_pathname_and_segment_number *
                  compact_segs[NUM_COMPACT_SEGS];
      os_pathname_and_segment_number *
                  reference_segs[NUM_REFERENCE_SEGS];
      . . . 
      objectstore::compact(
            compact_dbs, compact_segs, reference_dbs,
                  reference_segs
      );

Null Termination

Remember that each argument is a null-terminated array. In each array, you must set to 0 the element immediately following the last element specifying a database or segment. In addition, it is the caller's responsibility to delete the storage associated with the arguments when the function returns.

The function will signal the exception err_os_compaction if any invalid arguments are supplied.
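
The null-termination and ownership rules above can be handled in one place. The following is a minimal sketch, assuming a C++11 compiler. The class name PathnameArray is hypothetical and is not part of ObjectStore. It keeps the pathname strings alive and exposes them as a char** whose last element is 0, the shape described for dbs_to_be_compacted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not an ObjectStore class): owns a set of pathnames
// and presents them as a null-terminated array of char*.
class PathnameArray {
public:
    explicit PathnameArray(const std::vector<std::string>& paths)
        : storage_(paths) {
        for (std::string& s : storage_)
            pointers_.push_back(&s[0]);   // points into storage_, which we own
        pointers_.push_back(nullptr);     // the required terminating 0 element
    }

    // Copying would leave pointers_ aimed at another object's strings.
    PathnameArray(const PathnameArray&) = delete;
    PathnameArray& operator=(const PathnameArray&) = delete;

    char** data() { return pointers_.data(); }
    std::size_t count() const { return pointers_.size() - 1; }

private:
    std::vector<std::string> storage_;  // declared first, so built first
    std::vector<char*> pointers_;
};
```

Under these assumptions, the result of data() could be passed where a char** argument is expected, and the storage is released when the PathnameArray object goes out of scope after the call returns.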

Compaction and Transactions

The function objectstore::compact() must be invoked outside any ObjectStore transaction. The function itself initiates a transaction, and does all its work within that one transaction. During this time, all the specified databases and segments will be locked, preventing access by other processes.

You can control the amount of time that other applications are locked out of data access by compacting a few segments at a time. For example, to compact a single segment in a particular database, you can use the following code:

      char* referencing_dbs[2]; 
      os_pathname_and_segment_number* compact_segs[2]; 
      os_pathname_and_segment_number seg(
            "database_foo", 
            segment_to_be_compacted->get_number()
      ); 
      referencing_dbs[0] = "database_foo" ; 
      referencing_dbs[1] = 0 ; 
      compact_segs[0] = &seg ; 
      compact_segs[1] = 0 ; 
      objectstore::compact (0, compact_segs, referencing_dbs); 
      . . . 
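To apply the same idea to many segments, one option is to split the list of segments into small batches and make one compact() call per batch, so that each call holds its locks for a shorter time. The following is a minimal sketch of the batching step only, assuming a C++11 compiler. The function name make_batches is hypothetical and is not part of ObjectStore.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: splits `items` into consecutive batches of at most
// `batch_size` elements. A batch_size of 0 is treated as 1 so that the
// loop always makes progress.
template <typename T>
std::vector<std::vector<T> > make_batches(const std::vector<T>& items,
                                          std::size_t batch_size) {
    if (batch_size == 0) batch_size = 1;
    std::vector<std::vector<T> > batches;
    std::size_t start = 0;
    while (start < items.size()) {
        // Take batch_size elements, or whatever is left at the end.
        std::size_t count = std::min(batch_size, items.size() - start);
        batches.emplace_back(items.begin() + start,
                             items.begin() + start + count);
        start += count;
    }
    return batches;
}
```

Each batch of segment numbers would then be turned into its own null-terminated array of segment identifiers and passed to a separate call, in the manner of the single-segment example.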
If you want to run compaction in a separate process, the application can use the UNIX exec facility to start up another process that calls the function.
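
Starting such a process can be sketched with the POSIX fork, execvp, and waitpid calls. This is a minimal sketch, assuming a POSIX system and a C++11 compiler. The function name run_in_child is hypothetical, and the program to run is whatever compaction program your application supplies.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs args[0] with the given arguments in a child
// process and waits for it to finish. Returns the child's exit status,
// 127 if the program could not be executed, or -1 if the child could not
// be created or did not exit normally.
int run_in_child(const std::vector<std::string>& args) {
    if (args.empty()) return -1;

    // execvp expects a null-terminated array of char*.
    std::vector<char*> argv;
    for (const std::string& a : args)
        argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) return -1;            // fork failed
    if (pid == 0) {
        execvp(argv[0], argv.data());  // returns only on failure
        _exit(127);
    }

    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) return -1; // retry only when interrupted
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, a parent application might call run_in_child({"apicompact", "-d", "/user/parts/db1"}) to run the example program from this chapter, assuming it has been built as an executable named apicompact that is on the search path.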

Measuring Unused Space with os_segment::unused_space()

To serve as a rough guide in determining whether a segment needs to be compacted, ObjectStore provides the function os_segment::unused_space():

      os_unsigned_int32 os_segment::unused_space() const;
This function returns the amount of space (in bytes) in the segment not currently occupied by any object. It accounts for space resulting from objects that have been deleted as well as space that cannot be used as a result of internal ObjectStore alignment considerations. Here is an example of its use:

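      /* seg1 is in /user/parts/db1, as in the earlier example */
      if ((float) (seg1->unused_space()) / (float) (seg1->get_size()) > .10)
            compact_segs[i++] = new os_pathname_and_segment_number(
                  "/user/parts/db1", seg1->get_number());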
See also os_segment::unused_space() in Chapter 2, Class Library, of the ObjectStore C++ API Reference.
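
The threshold test can be isolated in a small function that also guards against a zero-size segment. This is a minimal sketch, assuming a C++11 compiler. The function name needs_compaction and the default threshold of 10% are illustrative choices, not ObjectStore recommendations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true when the unused fraction of a segment
// is strictly greater than `threshold` (0.10 means 10%).
bool needs_compaction(std::uint32_t unused_bytes,
                      std::uint32_t total_bytes,
                      double threshold = 0.10) {
    if (total_bytes == 0) return false;  // avoid dividing by zero
    double unused_fraction = static_cast<double>(unused_bytes) /
                             static_cast<double>(total_bytes);
    return unused_fraction > threshold;
}
```

In an ObjectStore application, the two byte counts would come from seg1->unused_space() and seg1->get_size(), as in the fragment above.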

Header File for Compaction

Programs using compaction must include the header file <ostore/compact.hh>, and link with the compaction library.

Compaction Example

Here is a complete program that uses the compact() function:

      #include <iostream.h>
      #include <ostore/ostore.hh>
      #include <ostore/compact.hh>
      #include <stdlib.h>
      #include <string.h>
      static void printUsage() {
            cout << "Usage: apicompact [-s] [-bs] (-d <dbname>)+ "
                  << "(-r <dbname>)+ \n"
                  << " (-ds <dbname> <seg>)+ (-rs <dbname> <seg>)+ \n"
                  << " -s is for silent operation.\n"
                  << " -bs is for batch schema installation.\n"
                  << " -d database to compact.\n"
                  << " -r database with ref's to compacted data.\n"
                  << " -ds database segment to compact.\n"
                  << " -rs database segment with ref's to compacted data.\n";
            cout.flush();
      }
      int apicompact_main(int argc , char* argv[]) {
            char* compact_dbs[16];
            int compact_dbs_no = 0;
            char* reference_dbs[16];
            int reference_dbs_no = 0;
            os_pathname_and_segment_number* compact_segs[16];
            int compact_segs_no = 0;
            os_pathname_and_segment_number* reference_segs[16];
            int reference_segs_no = 0;
            int silent = 0;
            int inc_schema = 1;
            for( int i = 1; i < argc; i++ ) {
                  if ( strcmp(argv[i], "-d") == 0 && i < argc - 1 ) {
                        i++;
                        compact_dbs[compact_dbs_no++] = argv[i];
                  } /* end if */
                  else if ( strcmp(argv[i], "-r") == 0 && i < argc - 1 ) {
                        i++;
                        reference_dbs[reference_dbs_no++] = argv[i];
                  } /* end else if */
                  else if ( strcmp(argv[i], "-ds") == 0 && i < argc - 2 ) {
                        compact_segs[compact_segs_no++] =
                              new os_pathname_and_segment_number(
                              argv[i+1], atoi( argv[i+2]));
                        i += 2;
                  }  /* end else if */
                  else if ( strcmp(argv[i], "-rs") == 0 && i < argc - 2 ) {
                        reference_segs[reference_segs_no++] =
                              new os_pathname_and_segment_number( 
                              argv[i+1], atoi( argv[i+2]));
                        i += 2;
                  } /* end else if */
                  else if ( strcmp( argv[i], "-s" ) == 0 ) {
                        silent = 1;
                  }  /* end else if */
                  else if ( strcmp( argv[i], "-bs" ) == 0 ) {
                        inc_schema = 0;
                  }  /* end else if */
                  else {
                        printUsage();
                        cout << "No option \"" << argv[i] << "\"\n";
                  } /* end else */
            }  /* end for loop */
            compact_dbs[compact_dbs_no] = 0;
            reference_dbs[reference_dbs_no] = 0;
            compact_segs[compact_segs_no] = 0;
            reference_segs[reference_segs_no] = 0;
            if( !silent )
                  cout << "Starting " << argv[0] << endl << flush;
            objectstore::initialize();
            objectstore::set_incremental_schema_installation(
                  inc_schema );
            objectstore::compact (
                  compact_dbs, compact_segs,
                  reference_dbs, reference_segs);
            if( !silent )
                  cout << "Finished " << argv[0] << endl << flush;
            return 0;
      } /* end apicompact_main */
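
The program above converts segment numbers with atoi(), which returns 0 for non-numeric input and so cannot distinguish a bad argument from segment 0. A stricter conversion could look like the following minimal sketch, assuming a C++11 compiler. The function name parse_segment_number is hypothetical, and std::uint32_t stands in for os_unsigned_int32.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: parses a decimal segment number. Returns true and
// stores the value in `out` only if the whole string is a valid unsigned
// 32-bit number. Leaves `out` untouched on failure.
bool parse_segment_number(const char* text, std::uint32_t& out) {
    if (text == nullptr) return false;
    // strtoul accepts leading whitespace and signs, so require a digit first.
    if (*text < '0' || *text > '9') return false;

    errno = 0;
    char* end = nullptr;
    unsigned long value = std::strtoul(text, &end, 10);
    if (errno == ERANGE) return false;      // does not fit in unsigned long
    if (*end != '\0') return false;         // trailing characters, e.g. "3x"
    if (value > UINT32_MAX) return false;   // does not fit in 32 bits

    out = static_cast<std::uint32_t>(value);
    return true;
}
```

With a check like this, the -ds and -rs branches could print the usage message for a malformed segment number instead of silently using 0.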

Compactor Limitations

The compactor operates under the following restrictions.

The compactor compacts all C and C++ persistent data, including ObjectStore collections, indexes, and bound queries, and correctly relocates pointers and all forms of ObjectStore references to compacted data.

ObjectStore os_reference_local references are relocated assuming that they are relative to the database containing them.

The compactor respects ObjectStore clusters, in that compaction ensures that objects allocated in a particular cluster remain in the cluster, although the cluster itself can move as a result of compaction.

Restrictions on Compaction Use

The following data restrictions must be observed in using this compactor:

File Systems and Compaction

ObjectStore supports two file systems for storing databases, and the compactor can run against segments in databases in either file system.

File Databases

In the case of a single database stored as a single host system file, the segments are made up of extents, all of which are allocated in the space provided by the host operating system for that host file. When there are no free extents left in the host file and an ObjectStore segment needs to grow, the ObjectStore Server extends the host file to provide the additional space. When the compactor squeezes holes out of the segments being compacted, it returns that space to the allocation pool for the host file, so other segments in the same database can use it. However, because operating systems provide no mechanism to free disk space allocated to regions internal to a host file, the freed space remains inaccessible to databases stored in other host files.

Rawfs Databases

An ObjectStore rawfs, on the other hand, stores all of its databases in a single region, consisting of either one or more host files or a raw partition. With a rawfs, any space freed by the compaction operation can be reused by any segment in any database stored in that rawfs.

Compaction Utility

In addition to the programming interface described in Compaction API - objectstore::compact(), ObjectStore provides an executable, oscompact, that can be used to compact specified databases and segments. See oscompact: Compacting Databases in Chapter 4, Utilities, of ObjectStore Management for information on the oscompact utility.



Copyright © 1997 Object Design, Inc. All rights reserved.

Updated: 03/31/98 15:29:33