ObjectStore C++ Advanced API User Guide

Chapter 2

Advanced Transactions

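This chapter is intended to augment Chapter 3, Transactions, of the ObjectStore C++ API User Guide. It describes several advanced transaction concepts, particularly those pertaining to locks and locking. The information is organized into the following sections: Reducing Wait Time for Locks, Nested Transactions, Deadlock, Multiversion Concurrency Control (MVCC), Logging and Propagation, Checkpoint: Committing and Continuing a Transaction, and Transaction Locking Examples.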

Reducing Wait Time for Locks

What can you do to reduce the overhead of waiting for locks? One application can reduce the waiting overhead for other concurrent applications by avoiding locking data unnecessarily, and by avoiding locking data for unnecessarily long periods of time. This section describes several techniques for minimizing wait time.

Clustering

One way to help avoid locking data unnecessarily involves the use of clustering. Suppose that, during a given transaction, an application requires object-a but not object-b. If the two objects are clustered onto the same page, they will both be locked, preventing other processes from accessing both objects until the end of the transaction. In contrast, by clustering object-b in a different object cluster or segment from object-a, you guarantee that the objects will be on different pages. So, if you use page-level locking granularity, the objects will not be locked together.
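The effect of clustering on page-level locking can be illustrated with a small self-contained model. This is not ObjectStore code: PageLockModel, place, lock_object, and b_available_while_a_locked are hypothetical names, and the model treats every lock as exclusive for simplicity.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of page-granularity locking (not the ObjectStore API).
// Each object lives on one page; locking an object locks its whole page.
class PageLockModel {
public:
    // Record that `object` is stored on `page`.
    void place(const std::string& object, int page) { page_of_[object] = page; }

    // Client `client` locks the page holding `object`.
    // Returns false if another client already holds that page.
    bool lock_object(const std::string& object, int client) {
        int page = page_of_.at(object);
        auto it = owner_.find(page);
        if (it != owner_.end() && it->second != client) return false;
        owner_[page] = client;
        return true;
    }

private:
    std::map<std::string, int> page_of_;  // object name -> page number
    std::map<int, int> owner_;            // page number -> client holding it
};

// Returns true if client 2 can lock object-b while client 1 holds object-a.
bool b_available_while_a_locked(int page_a, int page_b) {
    PageLockModel m;
    m.place("object-a", page_a);
    m.place("object-b", page_b);
    m.lock_object("object-a", 1);
    return m.lock_object("object-b", 2);
}
```

Passing the same page number for both objects models the unclustered case, in which object-b is unavailable to the second client. Passing different page numbers models the clustered case, in which it remains available.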

Locking Granularity

Another way to help avoid locking data unnecessarily is to avoid unnecessary use of segment-level locking granularity; that is, to avoid unnecessary use of lock_segment_read or lock_segment_write as the argument to os_segment::set_lock_whole_segment(). Unnecessary use of lock_segment_write can also increase the amount of data transferred out of the client cache. The benefit of segment granularity locking is that it avoids the overhead of a separate page fault for each page locked, and it can reduce Server communication.

For more information, see os_segment::set_lock_whole_segment() in Chapter 2 of the ObjectStore C++ API Reference.

Transaction Length

One way to avoid locking data for unnecessarily long periods of time is to make (nonnested) transactions as short as possible, while still guaranteeing that persistent data will be in a consistent state between transactions (see Nested Transactions).

The disadvantage of using shorter transactions is that it can mean using a greater number of transactions. This can increase network overhead, because each transaction commit requires the client to send a commit message to the Server. Nevertheless, this extra network overhead is often outweighed by the savings from shorter waits for locks to be released.

It is sometimes particularly important to make transactions that use persistent new or persistent delete as short as possible.

Multiversion Concurrency Control (MVCC)

Single-database, read-only transactions can use multiversion concurrency control, or MVCC. When you use MVCC, you can perform nonblocking reads of a database, allowing another ObjectStore application to update the database concurrently, with no waiting by either the reader or the writer. See Multiversion Concurrency Control (MVCC) for additional information.

abort_only Locking Rules

The locking restrictions are relaxed somewhat when the transaction is abort_only. Under such circumstances, the client does not get write locks for any pages that are written during an abort-only transaction. Thus there can be multiple concurrent abort-only writers to a database. The client does get read locks for all pages it reads or writes. This lock relaxation is another method of reducing wait time.

Lock Timeouts

Lock timeouts provide the ability to limit wait time, and abort if limits are exceeded. You can set a timeout for read or write lock attempts, to limit the amount of time your application will wait. When the timeout is exceeded, an exception is signaled. Handling the exception allows you to continue with alternative processing, and make a later attempt to acquire the lock. The functions set_readlock_timeout() and set_writelock_timeout() are members of the objectstore, os_database, and os_segment classes, which are all described in Chapter 2, Class Library, of the ObjectStore C++ API Reference.
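The pattern of handling a timeout, doing other work, and trying again later can be sketched with a simulated lock and a simulated clock. Nothing here calls ObjectStore: lock_timeout, SimulatedLock, and acquire_with_retries are hypothetical names, and the exception type merely stands in for whatever exception the timeout signals.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical exception standing in for the lock-timeout exception.
struct lock_timeout : std::runtime_error {
    lock_timeout() : std::runtime_error("lock timeout exceeded") {}
};

// Simulated lock held by another client until time `free_at_ms`.
struct SimulatedLock {
    long free_at_ms;

    // Wait at most `timeout_ms` starting at `now_ms`.
    // Returns the time at which the lock is obtained, or throws lock_timeout.
    long acquire(long now_ms, long timeout_ms) const {
        if (free_at_ms <= now_ms) return now_ms;                   // already free
        if (free_at_ms - now_ms <= timeout_ms) return free_at_ms;  // short wait
        throw lock_timeout();
    }
};

// Try to acquire the lock; on timeout, spend `other_work_ms` on alternative
// processing and try again. Returns the number of timeouts handled before
// success, or throws lock_timeout after `max_attempts` failed attempts.
int acquire_with_retries(const SimulatedLock& lock, long timeout_ms,
                         long other_work_ms, int max_attempts) {
    long now = 0;
    for (int timeouts = 0; timeouts < max_attempts; ++timeouts) {
        try {
            lock.acquire(now, timeout_ms);
            return timeouts;
        } catch (const lock_timeout&) {
            now += timeout_ms + other_work_ms;  // time spent waiting, then working
        }
    }
    throw lock_timeout();  // give up
}
```

In a real application the timeout would be set with set_readlock_timeout() or set_writelock_timeout(), and the handler would catch the exception that ObjectStore signals; consult the API Reference for the exact exception and units.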

Nested Transactions

Why use nested transactions?

For a number of reasons, it is useful to allow transactions to be nested. For example, suppose a transaction is required to hide intermediate results. That transaction also allows rollback of persistent data to its state as of the beginning of the transaction. But suppose you would like to be able to roll back persistent data to its state as of some point after the beginning of this transaction. To allow this, you can use a nested transaction that starts at this later point.

In addition, allowing nested transactions means that a routine that initiates a transaction can be called both from inside and outside a transaction.

Nested transactions must be of the same type

Except when you are using os_transaction::abort_only, when you nest one transaction within another, the two transactions must be of the same type (os_transaction::update or os_transaction::read_only). If they have different types, err_trans_wrong_type is signaled.

Nested transactions and abort_only

When you are using os_transaction::abort_only, if the top-level transaction is abort_only, then both abort_only and update transactions can nest within it.

Note that an abort_only transaction does not automatically abort. You must specifically use the os_transaction::abort() function to abort the abort_only transaction; otherwise, an exception is signaled.

You can use abort_top_level(), or for stack transactions use abort(), since you know exactly where the transaction ends.

      OS_BEGIN_TXN(txn, 0, os_transaction::abort_only) {
            . . . 
            os_transaction::abort();
      } OS_END_TXN(txn);

When a nested transaction is aborted, persistent data is rolled back to its state as of the beginning of that transaction. However, no locks are released until the outermost transaction terminates. This means other processes still have to wait to access the pages that the aborted transaction accessed.
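The rollback and lock-retention behavior of nested transactions can be modeled in a few lines of standalone C++. This is a sketch, not the ObjectStore API: NestedTxnModel and its members are hypothetical, and the model keeps a full copy of the data for each open transaction, which a real system would not do.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model of nested transactions (not the ObjectStore API).
class NestedTxnModel {
public:
    // Start a (possibly nested) transaction: remember the current state.
    void begin() { saved_.push_back(data_); }

    // Write a value; the lock taken here is kept until the top level ends.
    void write(const std::string& key, int value) {
        locked_.insert(key);
        data_[key] = value;
    }

    // Commit the innermost transaction.
    void commit() { saved_.pop_back(); release_if_top_level(); }

    // Abort the innermost transaction: roll back to its starting state.
    void abort() {
        data_ = saved_.back();
        saved_.pop_back();
        release_if_top_level();
    }

    int read(const std::string& key) const { return data_.at(key); }
    bool has(const std::string& key) const { return data_.count(key) != 0; }
    bool is_locked(const std::string& key) const { return locked_.count(key) != 0; }

private:
    // Locks are released only when the outermost transaction has ended.
    void release_if_top_level() { if (saved_.empty()) locked_.clear(); }

    std::map<std::string, int> data_;
    std::vector<std::map<std::string, int>> saved_;  // one snapshot per open txn
    std::set<std::string> locked_;
};
```

If an outer transaction writes x, and a nested transaction then overwrites x, writes y, and aborts, the model restores the outer value of x and discards y, yet is_locked("y") stays true until the outer transaction ends.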

Deadlock

ObjectStore sometimes automatically aborts a transaction due to deadlock. A simple deadlock occurs when one transaction holds a lock on a page that another transaction is waiting to access, while at the same time this other transaction holds a lock on a page that the first transaction is waiting to access. Neither process can proceed until the other does. See Simple Deadlock Scenario. There are other, more complicated forms of deadlock that are analogous.

Deadlock Victim

ObjectStore has a deadlock detection facility that breaks deadlocks, when detected, by aborting one of the transactions involved in the deadlock. By aborting one transaction (the victim), ObjectStore causes the victim's locks to be released so other processes can proceed.

You can control how ObjectStore chooses a victim with objectstore::set_transaction_priority() (see Chapter 2, Class Library, of the ObjectStore C++ API Reference) and the Deadlock Victim Server parameter (see Chapter 2, Server Parameters, of ObjectStore Management).

Automatic Retries Within Lexical Transactions

When a lexical transaction (one specified with the transaction statement macros) is aborted due to a deadlock, the system automatically retries the aborted transaction.

In the event that the transaction is repeatedly aborted by the system, the retries continue until the maximum number of retries has occurred. This maximum for any transaction in a given process is determined by the value of the static data member os_transaction::max_retries. You can retrieve the value of this member with os_transaction::get_max_retries().

      static os_int32 get_max_retries() ;

Changing the maximum number of retries

The default value of os_transaction::max_retries is 10. You can change the value at any time with os_transaction::set_max_retries().

      static void set_max_retries(os_int32) ;

The change remains in effect only for the duration of the process, and is invisible to other processes.

Consequences of Automatic Deadlock Abort

When a transaction is aborted by the system, its changes are undone. But only persistent state is rolled back. Transient state is not undone, and any form of output that occurred before the abort is not, of course, undone. For this reason it is sometimes a good idea to perform output outside a transaction, although this is not always the best approach.
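The automatic retry loop, and the fact that transient state survives each retry, can be sketched in standalone C++. This is a model only: deadlock_abort and run_with_retries are hypothetical names, and the reading that the body runs once plus at most max_retries further times is an assumption about how the limit is counted.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for an abort caused by deadlock.
struct deadlock_abort : std::runtime_error {
    deadlock_abort() : std::runtime_error("aborted: deadlock victim") {}
};

// Run `body`; if it is aborted by deadlock, retry up to `max_retries` times.
// Returns the total number of attempts; rethrows once retries are exhausted.
int run_with_retries(const std::function<void()>& body, int max_retries = 10) {
    for (int attempt = 1; ; ++attempt) {
        try {
            body();
            return attempt;
        } catch (const deadlock_abort&) {
            if (attempt > max_retries) throw;  // first try plus all retries used
        }
    }
}
```

Any transient counter that the body increments keeps its value across retries, so a body that is aborted twice and then succeeds leaves the counter at 3. This is the transient-state effect described above.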

Deadlocks in Dynamic Transactions

Dynamic transactions that are aborted because of deadlock are not retried. If you want to retry a dynamic transaction aborted because of deadlock, you must do so explicitly by handling the exception err_deadlock. Call os_transaction::abort_top_level() from within the handler.

See Using Dynamic Transactions in Chapter 3, Transactions, of the ObjectStore C++ API User Guide for more information about dynamic transactions.

Multiversion Concurrency Control (MVCC)

When you use multiversion concurrency control (MVCC), you can perform nonblocking reads of a database, allowing another ObjectStore application to update the database concurrently, with no waiting by either the reader or the writer. If your application contains a transaction that uses a database in a read-only fashion, you might be able to use multiversion concurrency control.

If a transaction

Only performs read access on a database, and
Does not require a view of the database that is completely up to date, but can instead rely on a snapshot of the data,

you should open the database for multiversion concurrency control (MVCC). You can do this with members of the class os_database (see The MVCC API). You can also use MVCC in conjunction with os_transaction::abort_only. This can improve your application's performance, as well as the performance of other concurrent ObjectStore applications.

No Waiting for Locks

If an application has a database opened for MVCC, it never has to wait for locks to be released in order to read the database. Reading a database opened for MVCC also never causes other applications to have to wait to update the database; see the example MVCC and the Simple Waiting Scenario. In addition, an application never causes a deadlock by accessing a database it has opened for MVCC. See the example MVCC and the Simple Deadlock Scenario.

Snapshots

In each transaction in which an application accesses a database opened for MVCC, the application sees what it would see if viewing a snapshot of the database taken sometime during the transaction.

This snapshot has the following characteristics:

It is internally consistent.
It might not contain changes committed during the transaction by other processes.
It does contain all changes committed before the transaction started.
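These snapshot characteristics can be illustrated with a toy multiversion store. It is not the ObjectStore implementation: VersionedStore and its members are hypothetical, and the model keeps every committed version in memory, whereas ObjectStore retains pages in the transaction log only as needed.

```cpp
#include <cassert>
#include <iterator>
#include <map>

// Toy multiversion store (not the ObjectStore implementation).
// Every committed write is kept with its commit time, so a reader can be
// served the state as of an earlier moment without blocking the writer.
class VersionedStore {
public:
    // Commit a new value for `page`; returns the commit time.
    int commit_write(int page, int value) {
        versions_[page][++clock_] = value;
        return clock_;
    }

    // Take a snapshot: the reader sees everything committed up to now.
    int snapshot() const { return clock_; }

    // Read `page` as of snapshot time `snap`.
    int read(int page, int snap) const {
        const std::map<int, int>& history = versions_.at(page);
        auto it = history.upper_bound(snap);  // first version newer than snap
        assert(it != history.begin());        // page must exist in the snapshot
        return std::prev(it)->second;         // newest version not after snap
    }

private:
    int clock_ = 0;                               // logical commit counter
    std::map<int, std::map<int, int>> versions_;  // page -> commit time -> value
};
```

A reader that takes a snapshot and then reads a page continues to see the value committed before the snapshot, even if a writer commits a newer value in the meantime. A later snapshot sees the new value.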

Accessing Multiple Databases in a Transaction

When an application reads a database opened for MVCC, the snapshot it sees is potentially out of date. This means that the snapshot might not be consistent with other databases accessed in the same transaction (although it will always be internally consistent). Even two databases both of which are opened for MVCC might not be consistent with each other, because updates might be performed on one of the databases in between the times of their snapshots.

Serializability

Even though the snapshot might be out of date by the time some of the access is performed, multiversion concurrency control retains serializability, if each transaction that accesses an MVCC database accesses only that one database. Such a transaction sees a database state that would have resulted from some serial execution of all transactions, and all the transactions produce the same effects as would have been produced by the serial execution.

The MVCC API

You open a database for multiversion concurrency control with one of the following members of os_database:

      void open_mvcc() ;
      static os_database *open_mvcc(const char *pathname) ;

It is valid to open MVCC databases by following cross-database pointers.

Once you open a database for MVCC, multiversion concurrency control is used for access to that database until you close it. If the database is already opened, but not for MVCC, err_mvcc_nested is signaled. If you try to perform write access on a database opened for MVCC, err_opened_read_only is signaled.

You can determine if a database is opened for MVCC with the following member of os_database:

      os_boolean is_open_mvcc() const ;

This function returns nonzero if this database is opened for MVCC, and 0 otherwise.

MVCC and the Transaction Log

Although multiversion concurrency control can cause ObjectStore to, in effect, take a snapshot of an entire database, the implementation actually only copies data when needed, on a page-by-page basis. Moreover, making the copy simply amounts to retaining the page in the transaction log. See Logging and Propagation.

In the absence of multiversion concurrency control, updated pages from committed transactions are propagated from the log to the database on a periodic basis. But with MVCC, updated pages are held in the log as long as necessary, so that the corresponding page in the database is not overwritten, and can be used as part of the MVCC snapshot.

Caution

Note that this means that long transactions that use multiversion concurrency control can cause the log to become very large.

Conflict detection

Multiversion concurrency control determines whether a page must be held in the log based on the notion of conflict defined below. From the time a conflict is detected in a given transaction, propagation is delayed for subsequently committed data, until the given transaction ends.

A conflict occurs when one of the following happens:

A process tries to read a page in a database it has opened for MVCC, and another process has the page write locked.
A process tries to write a page in a database, and another process that has the database opened for MVCC has the page read locked.

In both these cases, both processes proceed; no one is blocked. The transaction performed by the MVCC process is placed just before the conflicting update transaction in the serialization order. This is effectively when the snapshot is taken. See MVCC Conflict Scenario.

Under some circumstances, the ObjectStore Server might decide to hold a page in the log in anticipation of a conflict, even if none has actually occurred.

Logging and Propagation

The ObjectStore transaction log, as with the log in any database system, is used to ensure fault tolerance and to support the functionality involved in transaction aborts. The log is stable storage (that is, disk storage) used to keep temporary copies of data en route to the database from the client cache.

Transaction Logging

Data is recorded in the log before being written to the database (with certain exceptions - see below), and is not removed from the log until some time after the transaction sending it has committed. That way, if a failure occurs in the middle of moving a transaction's data to the database (for example, because the network crashes or someone pulls the plug on the Server host), the data is nevertheless safely in the log, and can be moved to the database in its entirety during recovery.

If a failure occurs before or during the recording of a transaction's data in the log, the transaction is considered to have aborted, and the data is never written to the database (and similarly, if the transaction aborts because of deadlock or a call to abort(), the data is never written to the database).

New data whose creation results in the use of new disk sectors is handled differently. This data is sometimes moved directly to the database, and sufficient information is maintained on stable storage to effectively remove the data from the database if the creating transaction aborts. For new sectors and segments, this undo information is kept in the database itself; for new databases the information is stored in the log, as an undo record.

Propagation

During normal operation, the ObjectStore Server moves, or propagates, data from the log to the database on a periodic basis. The Server keeps track of what has been propagated, and always knows whether the latest committed version of any given sector is to be found in the log or in the database. That way, when clients request data from the Server, the Server can send the sector's most up-to-date version.

Controlling propagation

You can control how often propagation occurs with the ObjectStore Server parameter Propagation Sleep Time; the default is every 60 seconds. This determines the time between propagations, except when the Server temporarily deems it necessary to propagate on a more frequent basis. By default, the Server increases the propagation rate when there are more than 8192 sectors waiting to be propagated. You can override the default of 8192 with the Server parameter Max Data Propagation Threshold. The Server also increases the propagation rate in order to empty out a log record segment.

You can control the amount of data propagated each time with the Server parameter Max Data Propagation Per Propagate. For propagates that consist of a single disk write (that is, propagation of data that is contiguous in the database), this specifies the number of sectors to propagate (the default is 512).
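The default values quoted above lend themselves to a rough back-of-the-envelope calculation. The helper functions below are hypothetical and make two simplifying assumptions that the text does not state: every propagate is a single disk write of at most 512 contiguous sectors, and exactly one propagate happens per sleep interval at the normal rate.

```cpp
#include <cassert>

// Defaults quoted in the text above.
const long kSleepSeconds = 60;          // Propagation Sleep Time
const long kThresholdSectors = 8192;    // Max Data Propagation Threshold
const long kSectorsPerPropagate = 512;  // Max Data Propagation Per Propagate

// True when the backlog exceeds the threshold, the point at which the
// Server is described as raising the propagation rate.
bool backlog_triggers_faster_propagation(long pending_sectors,
                                         long threshold = kThresholdSectors) {
    return pending_sectors > threshold;
}

// Number of propagates needed to drain a backlog, assuming each propagate
// is a single disk write of at most `per_propagate` contiguous sectors.
long propagates_to_drain(long pending_sectors,
                         long per_propagate = kSectorsPerPropagate) {
    return (pending_sectors + per_propagate - 1) / per_propagate;  // round up
}

// Rough drain time at the normal rate, assuming one propagate per interval.
long seconds_to_drain(long pending_sectors) {
    return propagates_to_drain(pending_sectors) * kSleepSeconds;
}
```

Under these assumptions a backlog of exactly 8192 sectors needs 16 propagates, or about 960 seconds at the default sleep time, and one more sector would cross the threshold at which the Server speeds up. Actual Server behavior depends on data layout and load.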

For information on these and other Server parameters, see Chapter 2, Server Parameters, in ObjectStore Management.

Checkpoint: Committing and Continuing a Transaction

ObjectStore includes a way to perform a checkpoint within a transaction. The checkpoint commits modified data from a top-level transaction without incurring the overhead of ending a transaction and starting a new transaction. This is done with the os_transaction::checkpoint() interface.

The os_transaction::checkpoint() interface is similar to os_transaction::commit(), with the difference that you get the effect of committing a transaction and then continuing work in a new transaction in which you have read locks on all or most of the persistent objects that were locked in the committed transaction. This is useful when

You are making modifications to a database. You want to periodically commit your changes but continue updating the database without intervention. For example, you might be loading new data into the database.
You want to make your changes available to MVCC readers.
You opened a database for MVCC and you want an updated snapshot.

Note

Checkpoints within a transaction differ from conventional checkpoints. In this checkpoint, an application might not have all the locks after the checkpoint that it had before the checkpoint. The details are explained in the next section.

Caution

Checkpoint allows the application to maintain lock assertion, relocation state, address space assignment, and page protection state for some pages in the cache across what ordinarily would be transaction boundaries. This means that for certain classes of applications that access the same data pages repeatedly in sequential transactions, you can avoid the cost of setting up and tearing down access to those pages repeatedly. Like transaction commit and abort operations, this operation is not thread-safe. Applications must ensure that other threads do not access persistent memory during a checkpoint operation.

In conjunction with MVCC-opened databases, checkpoint can also be used to expose to the current transaction changes that have been committed to the databases since the transaction started (or since the last checkpoint invocation). This brings the transaction up to date with changes that have taken place without its knowledge.

See os_transaction::checkpoint() and os_transaction::checkpoint_in_progress() in Chapter 2, Class Library, of the ObjectStore C++ API Reference for further detail.

Advantages of a Checkpoint

The advantage of a checkpoint is that there is less overhead than when you actually end one transaction and start another. When you checkpoint a transaction, it is as if you committed the transaction and then immediately started a new transaction. But in the new transaction, you already have read locks on most or all of your persistent objects.

If another client is waiting for a write lock on a persistent object that was locked in your transaction, you lose that lock when you checkpoint the transaction. As long as another client is not waiting for a write lock on an object that was associated with your transaction, you reacquire as read locks any locks you had before the checkpoint.

After the checkpoint, you do not have to start from a root object to set up your access to objects. Your application's access to objects can be the same before and after the checkpoint.

After a checkpoint, ObjectStore has read locks on the same objects as before the checkpoint, unless another client was waiting for a write lock on one of these objects. In that case, your transaction loses the lock.

If there were any write locks before the checkpoint, ObjectStore changes them to read locks, or gives them to any clients waiting for those write locks. Consequently, you might have to wait for locks or you might get a deadlock when you try to update the database again.
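The lock rules for a checkpoint can be summarized as a small function. This is a sketch of the rules as described in this section, not ObjectStore code: LockKind and locks_after_checkpoint are hypothetical names, and locks are tracked per named object for readability.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

enum class LockKind { read, write };

// Toy model of lock handling at a checkpoint (not the ObjectStore API).
// `held` maps each locked object to the lock this client holds on it;
// `wanted_by_others` lists objects another client is waiting to write-lock.
std::map<std::string, LockKind> locks_after_checkpoint(
        const std::map<std::string, LockKind>& held,
        const std::set<std::string>& wanted_by_others) {
    std::map<std::string, LockKind> after;
    for (const auto& entry : held) {
        if (wanted_by_others.count(entry.first)) continue;  // lock is given up
        after[entry.first] = LockKind::read;  // kept, but only as a read lock
    }
    return after;
}
```

A client holding a write lock on a and c and a read lock on b, with another client waiting to write c, ends up with read locks on a and b only. To update a again after the checkpoint, it must reacquire a write lock, which is where waiting or deadlock can occur.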

Calling the os_transaction::checkpoint() Function

To checkpoint a transaction, call the os_transaction::checkpoint() function. The function's overloadings are

static void os_transaction::checkpoint();

Invokes checkpoint on the current transaction.

static void os_transaction::checkpoint(os_transaction*);

Invokes checkpoint on the specified transaction.

Caution

Before you checkpoint a transaction, you must ensure that the database is in a consistent state.

During the checkpoint, you must ensure that no other thread tries to access the database.

For related information, see os_transaction::checkpoint_in_progress().

Transaction Locking Examples

The following examples illustrate some of the locking situations described in this chapter.

Simple Waiting Scenario

If one transaction reads a page, and then another transaction reads the same page, it is not blocked. But if the latter transaction tries to write to the page, it must wait until the first transaction commits.


Transaction 1	Transaction 2
Read P
	Read P
	Write P: BLOCKED
Commit

So the actual schedule of operations looks like this:

Transaction 1	Transaction 2
Read P
	Read P
Commit
	Write P (succeeds)
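The waiting scenario follows from the usual compatibility rule for read and write locks, which can be modeled as follows. PageLock and its members are hypothetical names for this sketch; a false return value stands for "blocked".

```cpp
#include <cassert>
#include <set>

// Toy page lock with shared readers and one writer (not the ObjectStore API).
class PageLock {
public:
    // Read lock: granted unless another transaction holds the write lock.
    bool try_read(int txn) {
        if (writer_ != 0 && writer_ != txn) return false;
        readers_.insert(txn);
        return true;
    }

    // Write lock: granted only if no other transaction holds any lock.
    bool try_write(int txn) {
        if (writer_ != 0 && writer_ != txn) return false;
        for (int r : readers_) {
            if (r != txn) return false;  // another reader blocks the writer
        }
        writer_ = txn;
        return true;
    }

    // Commit or abort of `txn` releases its locks on this page.
    void release(int txn) {
        readers_.erase(txn);
        if (writer_ == txn) writer_ = 0;
    }

private:
    std::set<int> readers_;
    int writer_ = 0;  // 0 means no writer; transaction ids start at 1
};
```

Replaying the schedule: try_read(1) and try_read(2) both succeed, try_write(2) fails while transaction 1 still holds its read lock, and after release(1) the same try_write(2) succeeds.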

Simple Deadlock Scenario

In the schedule below, Transaction 2 attempts to write P1, but cannot proceed until Transaction 1 completes and releases its read lock on P1. But Transaction 1 cannot proceed until Transaction 2 completes and releases its write lock on P2. Since neither transaction can proceed until the other does, the result is a classic deadlock scenario. ObjectStore chooses Transaction 1 as victim and aborts it, whereupon Transaction 2 can proceed.


Transaction 1	Transaction 2
Read P1
	Read P1
	Read P2
	Write P2
	Write P1: BLOCKED
Read P2: BLOCKED - DEADLOCK
Abort
	Write P1 (succeeds)
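One standard way to recognize such a deadlock is to look for a cycle in a wait-for graph. This guide does not say how ObjectStore detects deadlocks, so the following is only an illustration of the general technique; WaitForGraph, reaches, and is_deadlocked are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <set>

// Wait-for graph: graph[t] holds the transactions that t is waiting on.
using WaitForGraph = std::map<int, std::set<int>>;

// Depth-first search: can `target` be reached from `from`?
bool reaches(const WaitForGraph& g, int from, int target, std::set<int>& seen) {
    auto it = g.find(from);
    if (it == g.end()) return false;  // `from` is not waiting on anyone
    for (int next : it->second) {
        if (next == target) return true;
        if (seen.insert(next).second && reaches(g, next, target, seen)) return true;
    }
    return false;
}

// A transaction is deadlocked if it (transitively) waits for itself.
bool is_deadlocked(const WaitForGraph& g, int txn) {
    std::set<int> seen;
    return reaches(g, txn, txn, seen);
}
```

In the schedule above, the blocked write of P1 adds the edge 2 -> 1, which alone is not a deadlock. The blocked read of P2 adds the edge 1 -> 2 and closes the cycle. Removing the victim's edges, as an abort of Transaction 1 would, breaks the cycle.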

MVCC and the Simple Waiting Scenario

If one transaction reads a page of a database it has opened for MVCC, and then another transaction attempts to update the same page, the second transaction is not blocked. Compare this with the Simple Waiting Scenario.


MVCC Transaction 1	Update Transaction 2
Read P
	Read P
	Write P: NOT BLOCKED

MVCC and the Simple Deadlock Scenario

In the schedule below, Transaction 2 writes P1 without waiting; it can proceed before Transaction 1 completes and releases its read lock on P1, because Transaction 1 has the database containing the page opened for MVCC. Similarly Transaction 1 can proceed before Transaction 2 completes and releases its lock on P2. Without multiversion concurrency control, deadlock would have resulted. See the Simple Deadlock Scenario.


MVCC Transaction 1	Update Transaction 2
Read P1
	Read P1
	Read P2
	Write P2
	Write P1: NOT BLOCKED
Read P2: NOT BLOCKED

MVCC Conflict Scenario

The MVCC transaction and the update transaction conflict, because the update transaction writes a page (A) that the MVCC transaction is reading. Therefore all pages updated by the update transaction must be retained in the log, so that the MVCC transaction can see the old copies of these pages.


MVCC Transaction 1	Update Transaction 2
Read A
	Read A, B, C, D, E
	Write A, B, C, D, E
	Commit
Read B (old)


Copyright © 1997 Object Design, Inc. All rights reserved.

Updated: 03/31/98 15:27:46