ObjectStore C++ API User Guide

Chapter 3 Transactions

Transactions Overview

A transaction is a logical unit of work, a consistent and reliable portion of the execution of a program. You mark the beginnings and ends of transactions in your code using calls to the ObjectStore API. Access to persistent data must always take place within a transaction.

Transactions in a database system serve two general purposes:

They support fault tolerance.
They support concurrent database access.

Fault Tolerance

In support of fault tolerance, transactions have the following properties:

Either all of a transaction's changes to persistent memory are made successfully, or none are made at all. If a failure occurs in the middle of a transaction, none of its database updates are made.
A transaction is not considered to have completed successfully until all its changes are recorded safely on stable storage. Once a transaction commits, failures such as server crashes or network failures cannot erase the transaction's changes.

Fault tolerance is implemented using a transaction log. For further details, see Logging and Propagation in Chapter 2, Advanced Transactions, of the ObjectStore Advanced C++ API User Guide.

Concurrency Control

Transactions support concurrent database access by preventing one process's updates from interfering with another process's reads or updates. ObjectStore's concurrency control facilities prevent this interference by ensuring that transactions have the following properties:

A transaction's changes to persistent data are private and invisible to other processes until the entire transaction completes successfully.
Other processes' changes to persistent data are invisible to a transaction.

Concurrency control is implemented using strict two-phase locking (see Locking). For abort-only transactions and multiversion concurrency control (MVCC), it also uses special techniques of delaying propagation and intentionally aborting transactions. See Using Dynamic Transactions, as well as the discussion on Multiversion Concurrency Control (MVCC) in Chapter 2, Advanced Transactions, of the ObjectStore Advanced C++ API User Guide.

Transaction Commit and Abort

Transactions can terminate in two ways: successfully or unsuccessfully. When they terminate successfully, they commit, and their changes to persistent memory are made permanent and visible. When they terminate unsuccessfully, they abort. There are several kinds of transaction aborts:

Explicit aborts, which result from calls to transaction member functions. See Rolling Back to Persistent State.
Aborts due to deadlock conditions. See Threads and Thread Locking.
Aborts due to nonlocal transfer of control out of the scope of a lexical transaction. (See Using Transactions.) This can happen when an exception is signaled within a lexical transaction and handled outside it, or not handled at all.
Aborts due to system failure.

If a transaction aborts, its changes to persistent memory are not made permanent or visible to other processes. After an abort, your program sees persistent memory as it was just before the aborted transaction started. Only changes to persistent memory are rolled back, however. Transient memory is not restored to its pretransaction state, and any output that occurred before the abort is not undone.
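The difference between persistent and transient state after an abort can be illustrated with a small model. This is a sketch only and does not use the ObjectStore API: `ToyStore`, its members, and `aborted_update()` are hypothetical names, and a `std::map` stands in for persistent memory.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for persistent memory; not an ObjectStore class.
class ToyStore {
public:
    // Remember the state at transaction start so it can be restored.
    void begin() { snapshot_ = data_; active_ = true; }
    // Keep the changes made since begin().
    void commit() { active_ = false; }
    // Restore the state as of begin(); only the store is rolled back.
    void abort() { data_ = snapshot_; active_ = false; }

    void write(const std::string &key, int value) {
        assert(active_ && "writes must happen inside a transaction");
        data_[key] = value;
    }
    int read(const std::string &key) const {
        std::map<std::string, int>::const_iterator it = data_.find(key);
        return it == data_.end() ? 0 : it->second;
    }

private:
    std::map<std::string, int> data_;
    std::map<std::string, int> snapshot_;
    bool active_ = false;
};

// Runs one aborted transaction and returns the transient counter, which
// keeps its value even though the store is rolled back.
inline int aborted_update(ToyStore &store) {
    int transient_counter = 0;   // ordinary (transient) variable
    store.begin();
    store.write("count", 42);    // "persistent" change
    ++transient_counter;         // transient change
    store.abort();               // undoes only the store
    return transient_counter;
}
```

After `aborted_update()` returns, the store holds whatever was last committed, while the returned counter still reflects the increment made inside the aborted transaction.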

Using Transactions

With ObjectStore, every statement that reads from or writes to persistently allocated memory must be within a transaction. If you attempt to access persistent data outside a transaction, err_no_trans is signaled.

This applies to statements that access data in a database, but not to all statements that operate on a database. Statements that create, open, or close a database can be either inside or outside a transaction, although, generally, it is advisable not to open or close a database within a transaction.

Using Lexical Transactions

You begin and commit lexical transactions with the following macros:

      OS_BEGIN_TXN( identifier,exception**,transaction-type)

and

      OS_END_TXN( identifier)

The macro arguments are used (among other things) to concatenate unique names. The details of macro preprocessing differ from compiler to compiler, and in some cases you must enter these macro arguments without white space to ensure that the argument concatenation will work correctly.

These and other ObjectStore macros are described in Chapter 4, System-Supplied Macros, in the ObjectStore C++ API Reference.

identifier is a transaction tag. The only requirement on the tag is that different transactions in the same function must use different tags. (The tags are used to construct statement labels, and so have the same scope as labels in C++.)

exception** specifies a location in which ObjectStore will store an exception* if the transaction is aborted because of the raising of an exception. Raising an exception will cause a lexical transaction to abort if the exception is handled outside the transaction's dynamic scope, or if there is no handler for the exception. The stored exception* indicates the exception that caused the abort. ObjectStore stores 0 in this location at the beginning of each transaction.
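The rule that an exception handled outside the transaction aborts it can be modeled with a scope guard. The sketch below is an illustration under stated assumptions, not the expansion of the ObjectStore macros: `ToyDb`, `ScopedTxn`, and `update_or_abort()` are hypothetical, and a standard C++ exception stands in for an ObjectStore exception.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical model of a database value; not ObjectStore code.
struct ToyDb {
    int committed = 0;   // value visible after a successful commit
    int working = 0;     // value seen inside the transaction
};

class ScopedTxn {
public:
    explicit ScopedTxn(ToyDb &db) : db_(db), done_(false) {
        db_.working = db_.committed;
    }
    void commit() { db_.committed = db_.working; done_ = true; }
    // Leaving the scope without commit() (for example via an exception)
    // discards the working value, which models an abort.
    ~ScopedTxn() { if (!done_) db_.working = db_.committed; }
private:
    ToyDb &db_;
    bool done_;
};

// Returns true if the update committed, false if an exception that was
// handled outside the transaction scope caused it to abort.
inline bool update_or_abort(ToyDb &db, int new_value, bool fail) {
    try {
        ScopedTxn txn(db);
        db.working = new_value;
        if (fail) throw std::runtime_error("raised inside the transaction");
        txn.commit();
        return true;
    } catch (const std::runtime_error &) {
        return false;   // the handler is outside the transaction's scope
    }
}
```

The model omits the stored exception pointer described above; it only shows that control leaving the scope through an exception leaves the committed value untouched.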

Transaction type enumerators

transaction-type is one of the following enumerators, defined in the scope of os_transaction:

os_transaction::update specifies a transaction in which updates to persistent memory are allowed.
os_transaction::read_only specifies a transaction in which any attempt to update persistent memory signals the exception err_write_permission_denied.
os_transaction::abort_only specifies a transaction in which writes to persistent memory are allowed, but the transaction cannot be committed.

If a lexical transaction is aborted due to deadlock, it is automatically retried. See Threads and Thread Locking.

Example: a lexical transaction

      #include <iostream.h>
      #include <ostore/ostore.hh>
      int main(int, char **argv) {
            os_database *db1 = os_database::open( argv[1] ) ;
            OS_BEGIN_TXN(my_tx_1,0,os_transaction::update)
                  /* fetch the persistent counter stored under the root "count" */
                  int *countp = (int*)( db1->find_root("count")->get_value() ) ;
                  cout << "Hello, world\n" ;
                  /* increment the counter inside the transaction and print it */
                  cout << ++*countp << "\n" ;
            OS_END_TXN(my_tx_1)
            db1->close() ;
            return 0 ;
      }

Using Dynamic Transactions

You start and commit dynamic transactions with the following members of the class os_transaction:

      static os_transaction *begin(
            os_int32 transaction_type = os_transaction::update
      ) ;
      static void commit() ;
      static void commit( os_transaction* ) ;

The statements executed in between the calls are all within the same transaction.

Transaction type enumerators

transaction_type is one of the following enumerators, defined in the scope of os_transaction:

os_transaction::update specifies a transaction in which updates to persistent memory are allowed.
os_transaction::read_only specifies a transaction in which any attempt to update persistent memory signals the exception err_write_permission_denied.
os_transaction::abort_only specifies a transaction in which writes to persistent memory are allowed, but the transaction cannot be committed. An attempt to commit the transaction signals the exception err_commit_abort_only.
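The three transaction types can be summarized as a small decision function. This is a sketch for illustration: `TxnType`, `write_result()`, and `commit_result()` are hypothetical names that mirror the enumerators above, and the functions return exception names as strings rather than signaling anything.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the three transaction types; not ObjectStore code.
enum class TxnType { update, read_only, abort_only };

// Name of the exception the text associates with a write to persistent
// memory, or an empty string if the write is allowed.
inline std::string write_result(TxnType t) {
    return t == TxnType::read_only ? "err_write_permission_denied" : "";
}

// Name of the exception the text associates with a commit, or an empty
// string if the commit is allowed.
inline std::string commit_result(TxnType t) {
    return t == TxnType::abort_only ? "err_commit_abort_only" : "";
}
```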

begin() returns a pointer to a transaction, an instance of the class os_transaction.

The first overloading of commit() commits the current transaction. In the case of nesting, it commits the most nested transaction. The second overloading of commit() commits the specified transaction.

Unlike lexical transactions, if a dynamic transaction is aborted due to deadlock, it is not automatically retried. See Threads and Thread Locking.
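Because dynamic transactions are not retried automatically, an application that wants retry behavior has to supply its own loop. The following is a minimal sketch of such a loop, assuming that a deadlock abort surfaces as a C++ exception. `DeadlockAbort` and `run_with_retry()` are hypothetical names, and ObjectStore's own exception mechanism may need different handling.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical exception type standing in for a deadlock abort.
struct DeadlockAbort : std::runtime_error {
    DeadlockAbort() : std::runtime_error("deadlock") {}
};

// Runs body() until it completes without a DeadlockAbort, or until
// max_attempts is reached. Returns the number of attempts used, or 0 if
// every attempt was aborted.
inline int run_with_retry(const std::function<void()> &body, int max_attempts) {
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        try {
            // In real code, the dynamic transaction would begin here.
            body();
            // In real code, the transaction would be committed here.
            return attempt;
        } catch (const DeadlockAbort &) {
            // The transaction was aborted; loop around and try again.
        }
    }
    return 0;
}
```

Bounding the number of attempts keeps a persistently deadlocking workload from looping forever; the right limit depends on the application.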

Locking

As with most database systems, ObjectStore tries to interleave the operations of different processes' transactions to maximize concurrent usage of resources. When scheduling the operations, ObjectStore conforms to the strict two-phase locking discipline (except in the case of multiversion concurrency control as described in Multiversion Concurrency Control (MVCC) in Chapter 2, Advanced Transactions, of the ObjectStore Advanced C++ API User Guide). This discipline has been proven correct in the sense that it guarantees serializability; that is, it guarantees that the results of the schedule will be just the same as the results of noninterleaved scheduling of the transactions' operations.

Waiting for Locks

Roughly speaking, when you access data in the database, you are given exclusive access to that data for the duration of the transaction in which the access takes place. That is, when you access data, that data is locked. As long as it is locked, no other process can access it. The data is not unlocked until the end of the transaction.
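The behavior described here, where locks accumulate during a transaction and are released only when it ends, can be sketched with a toy lock table. `ToyLockManager` is a hypothetical class for illustration, not ObjectStore code; it uses exclusive locks only and reports a conflict instead of waiting.

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical lock table keyed by page number; not ObjectStore code.
class ToyLockManager {
public:
    // Try to lock `page` for transaction `txn`. Returns false if another
    // transaction already holds it (a real system would wait instead).
    bool acquire(int txn, int page) {
        std::map<int, int>::iterator it = owner_.find(page);
        if (it != owner_.end() && it->second != txn) return false;
        owner_[page] = txn;
        held_[txn].insert(page);
        return true;
    }
    // Strictness: locks are released only here, at commit or abort,
    // never in the middle of the transaction.
    void end_transaction(int txn) {
        const std::set<int> &pages = held_[txn];
        for (std::set<int>::const_iterator p = pages.begin(); p != pages.end(); ++p)
            owner_.erase(*p);
        held_.erase(txn);
    }
private:
    std::map<int, int> owner_;            // page -> owning transaction
    std::map<int, std::set<int> > held_;  // transaction -> pages it holds
};
```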

Database- Compared to Segment-Level Locks

ObjectStore provides locking at both the database level and the segment level. As its name implies, a database-level lock blocks access to the entire database. A segment-level lock blocks access only to the specific segment affected by the transaction.

Read Locks and Write Locks

Locking treats reading data differently from writing data. When your process reads a persistent data item (such as a data member or persistent variable), the page on which the item resides is read locked. This prevents other processes from writing to that page, but they are still allowed read access to it. When your process writes a data item, the page on which it resides is write locked, unless the transaction is abort_only. A write lock prevents other processes from reading or writing to that page. If the transaction is abort_only, the client obtains read locks for all pages read or written but does not get any write locks. (See Transaction Locking Examples in Chapter 2, Advanced Transactions, of the ObjectStore Advanced C++ API User Guide, as well as os_transaction::abort_only in the ObjectStore C++ API Reference.)
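The read and write lock rules amount to a small compatibility check: two read locks can coexist on a page, and any combination involving a write lock conflicts. The sketch below encodes that rule; `LockMode` and `compatible()` are hypothetical names, not part of the ObjectStore API.

```cpp
#include <cassert>

// Hypothetical lock modes for a single page.
enum class LockMode { none, read, write };

// Returns true if a request for `wanted` can be granted while another
// process holds `held` on the same page.
inline bool compatible(LockMode held, LockMode wanted) {
    if (held == LockMode::none || wanted == LockMode::none) return true;
    return held == LockMode::read && wanted == LockMode::read;
}
```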

Lock Timeouts

You can set a timeout for read- or write-lock attempts, to limit the amount of time your application will wait to acquire a lock. When the timeout is exceeded, an exception is signaled. Handling the exception allows you to continue with alternative processing, and make a later attempt to acquire the lock. See the set_readlock_timeout() and set_writelock_timeout() members of the classes objectstore, os_database, and os_segment in the ObjectStore C++ API Reference.

Reducing Wait Time

There are a number of ways to minimize the amount of time your process spends waiting for locks. See Reducing Wait Time for Locks in Chapter 2, Advanced Transactions, of the ObjectStore Advanced C++ API User Guide.

Lock Probes

You can determine whether a specified address is read locked, write locked, or unlocked with objectstore::get_lock_status(). See the ObjectStore C++ API Reference.

Explicit Lock Acquisition

Normally, ObjectStore performs locking automatically and transparently to the user. But you can explicitly lock a specified page range for read or write with objectstore::acquire_lock(). See the ObjectStore C++ API Reference.

Organizing Transaction Code

If you make transactions too short, you might be allowing other processes to interfere in a harmful way with your process. That is, if some chunk of your code is grouped into two or more short transactions when it should really be all within a single longer transaction, your process or others could produce incorrect results. Here are some guidelines about how to organize your code into transactions.

Guidelines for organizing code within a transaction

In general, you should put a given chunk of code inside a single transaction when

You do not want other processes to see intermediate results of this code's execution.
You want the state of the database to be frozen, from the point of view of the code being executed, for the duration of its execution. That is, no changes to the database made by other concurrent processes should be visible for the duration of the code's execution.

Another reason to put a chunk of code in a single transaction is to allow you to undo the code's changes at any point before the end of the chunk. See Rolling Back to Persistent State.

Hiding Intermediate Results

One kind of interference between processes occurs when one process uses some intermediate results of another. Just what constitutes an intermediate result depends on the application. Consider, for example, an imaginary MCAD application.

Suppose each of two processes is replacing two different children of a given part. Suppose further that each process must make some constraint check on the assembly after the replacement has been performed. Perhaps the total cost of the assembly must be checked against some allowable maximum cost.

In replacing a subpart, the first process removes a child part from the set of the assembly's children, and then inserts a different part into this set. But between the remove and insert, the assembly is in an intermediate state that should not be visible to the other process. Suppose, for example, the second process does its part replacement while the assembly is in this intermediate state, and then performs a cost check. The cost will be incorrect (too low), since a subpart is missing from the assembly. If the second process's new part raises the actual cost above the maximum, this will go undetected.

To prevent exposure of such intermediate states, the process should put the remove and the insert into the same transaction. This way, as far as other processes are concerned, the replacement happens all at once. In general, whatever happens within a single transaction looks to other processes as if it happens instantaneously, since the intermediate states are not visible to them.
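The part-replacement scenario can be sketched with a toy assembly whose changes become visible only at commit. The names `ToyAssembly`, `total_cost()`, and `replace_child()` are hypothetical, and each child is reduced to an integer cost; this is a model of the idea, not ObjectStore code.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical assembly: each element is the cost of one child part.
typedef std::vector<int> Children;

struct ToyAssembly {
    Children committed;   // what other processes see
    Children working;     // what the current transaction sees
    void begin()  { working = committed; }
    void commit() { committed = working; }
};

inline int total_cost(const Children &c) {
    return std::accumulate(c.begin(), c.end(), 0);
}

// Replace the child at `index` with a part of cost `new_cost`, doing the
// remove and the insert inside one transaction. Returns the cost that
// another process would observe between the two steps.
inline int replace_child(ToyAssembly &a, int index, int new_cost) {
    a.begin();
    a.working.erase(a.working.begin() + index);    // remove the old child
    int seen_by_others = total_cost(a.committed);  // intermediate state is hidden
    a.working.push_back(new_cost);                 // insert the new child
    a.commit();
    return seen_by_others;
}
```

Between the remove and the insert, an observer still computes the cost of the original assembly, so a concurrent cost check never sees the assembly with a part missing.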

Preventing Other Processes' Changes

Another kind of interference between processes arises when one process relies on the state of persistent memory's being unaffected by other processes for the duration of some operation.

Consider, for example, a routine that involves a recursive descent of a given assembly. Suppose that another process removes a subpart from the assembly, but it does not matter whether the descent is performed before or after the removal. Nevertheless, for this process to produce correct results, the assembly's descendants must not change during the descent itself. If a subpart is removed after being visited, and then, before this removed subpart's children are visited, new children are added to it, these new children might be incorrectly visited as part of the original assembly's descendants. So all the code that performs the descent should be within the same transaction.

Rolling Back to Persistent State

If a transaction aborts, its changes to persistent memory are not made permanent or visible to other processes. After an abort, your program sees persistent memory as it was just before the aborted transaction started. You can abort a specified transaction using members of the class os_transaction. You can also abort a lexical transaction by signaling an exception within the transaction and handling the exception outside the transaction.

Aborting the Current Transaction

You can always roll back to the persistent memory state at the beginning of the current transaction (the most deeply nested transaction within which control currently resides) by calling the following member of the class os_transaction:

      static void abort() ;

For dynamic transactions, control flows to the next statement that follows the abort(). For lexical transactions, control flows to the next statement after the end of the current transaction block.

Persistent data is rolled back to its state as of the beginning of the transaction. In addition, if the aborted transaction is not nested within another transaction, all locks are released, and other processes can access the pages that the aborted transaction accessed.

Aborting the Top-Level Transaction

When you call os_transaction::abort() with no arguments, only the innermost transaction is aborted. But you can abort the outermost transaction with a call to the static member function os_transaction::abort_top_level(), with no arguments.

      static void abort_top_level() ;

Aborting a Specified Transaction

You can also abort a transaction at an intermediate level of nesting by passing it as an argument to os_transaction::abort().

      static void abort(os_transaction*) ;

The argument is a pointer to a transaction, an instance of the system-supplied class os_transaction. A pointer to the current transaction (the innermost transaction in which control currently resides) is returned by the static member function os_transaction::get_current().

      static os_transaction *get_current() ;

A pointer to its parent (the innermost transaction within which it is nested) is returned by the member function get_parent().

      os_transaction *get_parent() const ;

So, for example, to abort a transaction one level up from the current transaction, you might use the following code:

      os_transaction* child_tx = os_transaction::get_current() ;
      if (child_tx) {
            /* the parent is 0 if child_tx is a top-level transaction */
            os_transaction* parent_tx = child_tx->get_parent() ;
            if (parent_tx)
                  os_transaction::abort(parent_tx) ;
      }
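The effect of aborting a transaction one level up can be modeled with a stack of saved values. This sketch is not the os_transaction class: `ToyTxnStack` and its members are hypothetical, and it assumes that aborting a transaction also discards every transaction nested inside it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a nesting stack; not the os_transaction class.
class ToyTxnStack {
public:
    // Start a nested transaction and return its depth (1 = top level).
    int begin() { saved_.push_back(value_); return depth(); }
    int depth() const { return static_cast<int>(saved_.size()); }
    // Depth of the parent of the transaction at depth `d`, or 0 if none.
    static int parent_of(int d) { return d > 1 ? d - 1 : 0; }
    // Abort the transaction at depth `d` together with everything nested
    // inside it (an assumption of this model), restoring the value saved
    // when that transaction began.
    void abort(int d) {
        assert(d >= 1 && d <= depth());
        value_ = saved_[d - 1];
        saved_.resize(d - 1);
    }
    void set(int v) { value_ = v; }
    int get() const { return value_; }
private:
    int value_ = 0;
    std::vector<int> saved_;   // value at the start of each open level
};
```

With three levels open, aborting the parent of the innermost level restores the value that was current when the second level began and leaves only the top-level transaction open.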

Example: abort()

Consider an example involving replacement of an assembly's subparts. A constraint check is required after each replacement. If the constraint check fails, you would like the replacement to be undone. To do so, you can conditionally call os_transaction::abort(), as in the code below:

      int main() {
            os_database *db5 = os_database::open("/user1/db5");
            OS_BEGIN_TXN(tx1,0,os_transaction::update)
                  os_typespec *part_type =  ...;
                  part *a_wheel =  ...;
                  part *a_rim =  ...;
                  a_wheel->children -= a_rim;
                  /* in this intermediate state, the wheel has no rim */
                  /* but this state is not visible to other processes */
                  a_wheel->children |= new(db5, part_type) part(...);
                  if (!check_cost(a_wheel)) {
                        cout << "change aborted: cost check failed\n";
                        /* undo the part replacement */
                        os_transaction::abort(); 
                  } /* end if */
            OS_END_TXN(tx1)
            db5->close();
            return 0;
      }

Since the abort results in control's leaving the scope of the current transaction, the current state of all local transient memory is lost. But transient state that is not local to this scope is unaffected by the abort. You should explicitly roll back or reinitialize such state before the abort, if desired.

Threads and Thread Locking

If your application uses multiple threads, you might need to take advantage of the thread-locking facilities provided by ObjectStore. These facilities ensure that ObjectStore does all interlocking between threads necessary to prevent threads from interfering with one another when within the ObjectStore run time. You are responsible for coding any thread synchronization required by your application while threads are not executing within an ObjectStore library. See Chapter 3, Threads, in the ObjectStore Advanced C++ API User Guide for more information about thread locking in multithreaded applications.

The thread-locking facility works by either serializing the transactions of different threads or serializing access by different threads to the ObjectStore run time. No two threads are ever in the ObjectStore run time at the same time.

Thread Safety

ObjectStore supports thread safety using a global mutex. This is a data structure that is used to synchronize threads. One global mutex coordinates all threads within an application. Thus, access to the ObjectStore API is currently serialized with one global mutex.

ObjectStore Release 5.1 provides a thread-safe version of the ObjectStore API. It does this by protecting the body of each API call with a mutex lock that only one thread can acquire at a time.

When You Need Thread Locking

If the synchronization coded in your application allows two threads to be within the ObjectStore run time at the same time, you need ObjectStore thread locking. A thread can enter the ObjectStore run time under either of the following circumstances:

The thread dereferences a pointer to persistent memory.
The thread calls an ObjectStore API function or macro.

If only one thread at a time ever enters the ObjectStore run time, you should disable ObjectStore thread locking. Do not use thread locking if you do not have to, since there is some extra performance overhead associated with it.

Disabling and Enabling Thread Locking

ObjectStore thread locking is enabled by default. To enable ObjectStore thread locking explicitly, pass a nonzero value to the following member of the class objectstore:

      static void set_thread_locking(os_boolean) ;

To disable ObjectStore thread locking, pass 0 to this function. To determine if ObjectStore thread locking is enabled, use the following member of objectstore:

      static os_boolean get_thread_locking() ;

If nonzero is returned, ObjectStore thread locking is enabled; if 0 is returned, ObjectStore thread locking is disabled.

Local and Global Transactions

For applications that use multiple threads, there are two kinds of transactions: local transactions and global transactions. Transactions started with OS_BEGIN_TXN() are always local. Transactions started with os_transaction::begin() are local by default, but you can also request a global dynamic transaction. See Using Global Transactions.

The two kinds of transactions have the following characteristics:

Local transaction: a thread enters a local transaction by calling os_transaction::begin() or OS_BEGIN_TXN(). When one thread enters a local transaction, this has no effect on whether other threads are within a transaction.
Global transaction: a thread enters a global transaction when it calls os_transaction::begin() or when another thread of the same process calls os_transaction::begin(). When one thread enters a global transaction by calling os_transaction::begin(), all other threads automatically enter the same transaction.

Local transactions synchronize access to the ObjectStore run time by serializing the transactions of the different threads (that is, by making the transactions run one after another without overlapping). After one thread starts a local transaction, if another thread attempts to start a transaction or enter the ObjectStore run time, it is blocked by the mutex lock until the local transaction completes. So two threads cannot be in a local transaction at the same time.

Global transactions allow for a somewhat higher degree of concurrency. After one thread enters the ObjectStore run time, if another thread attempts to enter the ObjectStore run time, it is blocked until control in the first thread exits from the run time. Although two threads cannot be in the ObjectStore run time at the same time, there can be some interleaving of operations of different threads within a transaction. See Chapter 3, Threads, in the ObjectStore Advanced C++ API User Guide for more information on using threads with ObjectStore.

Costs and Benefits of Global Transactions

Advantages of global transactions

Local transactions usually provide better performance, but for some applications, global transactions might be preferable. Here are some of the benefits of using global transactions:

Global transactions allow for a higher degree of concurrency.
With local transactions, if one thread attempts to access persistent memory from outside a transaction while another thread is performing relocation, data corruption can result. No exception is signaled. With global transactions (as in the absence of threads), there is no such possibility. Any attempt to access persistent memory from outside a transaction results in err_no_trans.

Disadvantages of global transactions

Some of the disadvantages of using global transactions are

Global transactions have extra overhead, compared to local transactions, in the form of extra memory management, particularly if there is a lot of cache replacement.
With global transactions, you must synchronize the threads so that no thread attempts to access persistent data while another thread is committing or aborting.

Using Global Transactions

You start a global transaction by passing the enumerator os_transaction::global as the second argument to os_transaction::begin().

      enum os_transaction_scope {
            os_transaction::local = 1,os_transaction::global
      };
      static os_transaction *os_transaction::begin(
            os_int32 type = os_transaction::update,
            os_int32 scope = os_transaction::local
      );

If you use global transactions, be sure to synchronize the threads so that no thread attempts to access persistent data while another thread is committing or aborting. Place a barrier before the end of the transaction so that all participating threads complete work on persistent data before the end-of-transaction operation is allowed to proceed. If you do not, data corruption and program failure can result.

The exception err_deadlock might be signaled asynchronously in any thread using persistent data; the application must be prepared to handle it. Once err_deadlock is handled in the first thread, any other threads that attempt to use the transaction will also get err_deadlock; in particular, any threads that were waiting for the global lock will wake up and immediately get err_deadlock.

Nesting and Global Transactions

You cannot nest a local transaction within a global transaction, nor can you nest a global transaction within a local one. The following table specifies how two transactions can interact.


                                     Thread A runs global transaction    Thread A runs local transaction
Thread A tries global transaction    OK. Nested global transaction.      err_trans_wrong_type is signaled.
Thread A tries local transaction     err_trans_wrong_type is signaled.   OK. Nested local transaction.
Thread B tries global transaction    OK. Nested global transaction.      OK, but block until A completes.
Thread B tries local transaction     err_trans_wrong_type is signaled.   OK, but block until A completes.
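The table can also be read as a decision function. The sketch below encodes the eight cells as written; `Scope` and `nesting_outcome()` are hypothetical names for illustration, and the function returns a description rather than signaling an exception.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the two transaction scopes.
enum class Scope { local, global };

// Outcome when a thread tries to start a transaction of scope `tried`
// while thread A is already running one of scope `running`.
// `same_thread` is true when the trying thread is A itself.
inline std::string nesting_outcome(Scope running, bool same_thread, Scope tried) {
    if (running == Scope::global) {
        // Either thread may nest a global transaction; local is rejected.
        return tried == Scope::global ? "nested global" : "err_trans_wrong_type";
    }
    // Thread A is running a local transaction.
    if (!same_thread) return "blocks until A completes";
    return tried == Scope::local ? "nested local" : "err_trans_wrong_type";
}
```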

Additional information about nested transactions is in Chapter 2, Advanced Transactions, of the ObjectStore Advanced C++ API User Guide. For further discussion of threads, see Chapter 3, Threads, of that publication.

Updated: 03/31/98 16:58:31