Python Database API Specification 1.1a8

This interface specification consists of several items:

Module Interface
Connection Objects
Cursor Objects
DBI Helper Objects and Exceptions

Comments and questions about this specification may be directed to the SIG on Tabular Databases in Python.

See the main index for information about modules that use this interface.

This specification document was last updated on: Mar 03, 1999.
It is as such not yet approved by the db-sig.
The original version of this document can be found on www.python.org.

Module Interface

Access to the database is made available through connection objects. The module must provide the following constructor for these:

connect(parameters...)

Constructor for creating a connection to the database. Returns a Connection Object. It takes a number of parameters which are database dependent and should be implemented as keyword parameters for more intuitive use. The order of parameters is defined as:

dsn = Data source name as string
user = User name as string
password = Password as string
host = Hostname (optional)
database = Database name (optional)

Additional parameters are allowed if the database connection requires these. E.g. a connect could look like this:

connect(dsn='myhost:MYDB',user='guido',password='234$ś')

These module globals must be defined:

threadsafety

Integer constant stating the level of thread safety the interface supports. Possible values are:

`0`	= Threads may not share the module.
`1`	= Threads may share the module, but not connections.
`2`	= Threads may share the module and connections.
`3`	= Threads may share the module, connections and cursors.

If not defined, level 0 should be assumed.

Sharing in the above context means that two threads may use a resource without wrapping it using a mutex semaphore to implement resource locking. Note that you cannot always make external resources thread safe by managing access using a mutex: the resource may rely on global variables or other external sources that are beyond your control.

apilevel

String constant stating the supported DB API level. Currently only the strings '1.0' and '1.1' are allowed.

If not defined, '1.0' should be assumed.

The module should make all error information available through these exceptions or subclasses thereof:

Warning: Exception raised for important warnings like data truncations while inserting, etc.
Error: Exception that is the base class of all other error exceptions. You can use this to catch all errors with one single 'except' statement. Warnings are not considered errors and thus should not use this class as base.
InterfaceError: Exception raised for errors that are specific to the interface rather than the database itself.
DataError: Exception raised for errors that are due to problems with the processed data like division by zero, numeric out of range, etc.
OperationalError: Exception raised when the an unexpected disconnect occurs, the data source name is not found, etc.
IntegrityError: Exception raised when the relational integrity of the database is affected, e.g. a foreign key check fails.
InternalError: Exception raised when the database encounters an internal error, e.g. the cursor is not valid anymore, the transaction is out of sync, etc.
ProgrammingError: Exception raised for programming erros, e.g. table not found or already exists, etc.

Note: The values of these exceptions are not defined. They should give the user a fairly good idea of what went wrong though.

Connection Objects

Connections Objects should respond to the following methods:

close(): Close the connection now (rather than whenever __del__ is called). The connection will be unusable from this point forward; an exception will be raised if any operation is attempted with the connection. The same applies to all cursor objects trying to use the connection.
commit(): Commit any pending transaction to the database. Note that if the database supports an auto-commit feature, this must be initially off. An interface method may be provided to turn it back on.
rollback(): Roll the database back to the start of any pending transaction. Note that closing a connection without committing the changes first will cause an implicit rollback to be performed.
cursor(): Return a new Cursor Object using the connection. An exception may be thrown if the database does not support a cursor concept.
A database interface may choose to support named cursors by allowing a string argument to the method. This feature is not part of the specification, since it complicates semantics of the .fetchXXX() methods.

Cursor Objects

These objects represent a database cursor, which is used to manage the context of a fetch operation.

Cursor Objects should respond to the following methods and attributes:

description

This read-only attribute is a sequence of 7-item sequences. Each of these sequences contains information describing one result column:

(name, type_code, display_size,
	      internal_size, precision, scale, null_ok)

. This attribute will be None for operations that do not return rows or if the cursor has not had an operation invoked via the execute() method yet.

The type_code is equal to one of the type objects specified in the section below.

rowcount

This read-only attribute specifies the number of rows that the last execute() produced (for DQL statements like select) or affected (for DML statements like update or insert). The attribute is -1 in case no execute() has been performed on the cursor or the rowcount of the last operation was not determinable by the interface.

callproc(procname[,parameters])

This method is optional since not all databases provide stored procedures.

Call a stored database procedure with the given name. The sequence of parameters must contain one entry for each argument that the procedure expects. The result of the call is returned as modified copy of the input sequence. Input parameters are left untouched, output and input/output parameters replaced with the new values.

The procedure may also provide a result set as output. This must then be made available through the standard fetch-methods.

close()

Close the cursor now (rather than whenever __del__ is called). The cursor will be unusable from this point forward; an exception will be raised if any operation is attempted with the cursor.

execute(operation[,parameters])

Prepare and execute a database operation (query or command). Parameters may be provided (as tuple) and will be bound to variables in the operation. Variables are specified in a database-specific notation (some DBs use ?,?,? to indicate parameters, others :1,:2,:3) that is based on the index in the parameter tuple (position-based rather than name-based).

The parameters may also be specified as list of tuples to e.g. insert multiple rows in a single operation.

A reference to the operation will be retained by the cursor. If the same operation object is passed in again, then the cursor can optimize its behavior. This is most effective for algorithms where the same operation is used, but different parameters are bound to it (many times).

For maximum efficiency when reusing an operation, it is best to use the setinputsizes() method to specify the parameter types and sizes ahead of time. It is legal for a parameter to not match the predefined information; the implementation should compensate, possibly with a loss of efficiency.

Return values are not defined.

fetchone()

Fetch the next row of a query result set, returning a single tuple, or None when no more data is available.

An exception is raised if the previous call to execute() did not produce any result set or no call was issued yet.

Note that the interface may implement row fetching using arrays and othe optimizations. It is not guaranteed that a call to this method will only move the associated cursor forward by one row.

fetchmany([size])

Fetch the next set of rows of a query result, returning as a list of tuples. An empty list is returned when no more rows are available. The number of rows to fetch is specified by the parameter. If it is None, then the cursor's arraysize determines the number of rows to be fetched.

Note there are performance considerations involved with the size parameter. For optimal performance, it is usually best to use the arraysize attribute. If the size parameter is used, then it is best for it to retain the same value from one fetchmany() call to the next.

An exception is raised if the previous call to execute() did not produce any result set or no call was issued yet.

fetchall()

Fetch all (remaining) rows of a query result, returning them as a list of tuples. Note that the cursor's arraysize attribute can affect the performance of this operation.

An exception is raised if the previous call to execute() did not produce any result set or no call was issued yet.

nextset()

If the database supports returning multiple result sets, this method will make the cursor skip to the next available set. If there are no more sets, the method returns None. Otherwise, it returns a true value and subsequent calls to the fetch methods will return rows from the next result set.

Database interface modules that don't support this feature should always return None.

An exception is raised if the previous call to execute() did not produce any result set or no call was issued yet.

arraysize

This read/write attribute specifies the number of rows to fetch at a time with fetchmany(). This value is also used when inserting multiple rows at a time (passing a list of tuples as the params value to execute()). This attribute will default to a single row.

Note: The arraysize is optional and is merely provided for higher performance database interactions. Implementations should observe it with respect to the fetchmany() method, but are free to interact with the database a single row at a time.

setinputsizes(sizes)

This can be used before a call to execute() to predefine memory areas for the operation's parameters. sizes is specified as a tuple -- one item for each input parameter. The item should be a Type object that corresponds to the input that will be used, or it should be an integer specifying the maximum length of a string parameter. If the item is None, then no predefined memory area will be reserved for that column (this is useful to avoid predefined areas for large inputs).

This method would be used before the execute() method is invoked.

Note: This method is optional and is merely provided for higher performance database interaction. Implementations are free to do nothing and users are free to not use it.

setoutputsize(size [,col])

Set a column buffer size for fetches of large columns (e.g. LONG). The column is specified as an index into the result tuple. Using a column of None will set the default size for all large columns in the cursor.

This method would be used before the execute() method is invoked.

Note: This method is optional and is merely provided for higher performance database interaction. Implementations are free to do nothing and users are free to not use it.

Type Objects and Identifiers

Many databases need to have the input in a particular format for binding to an operation's input parameters. For example, if an input is destined for a DATE column, then it must be bound to the database in a particular string format. Similar problems exist for "Row ID" columns or large binary items (e.g. blobs or RAW columns). This presents problems for Python since the parameters to the execute() method are untyped. When the database module sees a Python string object, it doesn't know if it should be bound as a simple CHAR column, as a raw binary item, or as a DATE.

To overcome this problem, a module must provide the constructors defined below to create objects that can hold date/time and raw values. When passed to the cursor methods, the module can then detect the proper type of the input parameter and bind it accordingly.

A Cursor Object's description attribute returns information about each of the result columns of a query. The type_code must be equal to one of type codes defined below.

Note: The values returned in the description tuple may not necessarily be identical to the defined type codes, i.e. while coltype == STRING should always work, coltype is STRING may fail. This is necessary because the database interface may want to return more specific type codes than the ones defined below.

The module exports the following functions and names:

Date(year,month,day): This function constructs an object holding a date value.
Time(hour,minute,second): This function constructs an object holding a time value.
Timestamp(year,month,day,hour,minute,second): This function constructs an object holding a time stamp value.
Raw(string): This function constructs an object capable of holding a binary string value.
STRING: This object is used to describe columns in a database that are string-based (e.g. CHAR).
RAW: This object is used to describe (large) binary columns in a database (e.g. LONG RAW, blobs).
NUMBER: This object is used to describe numeric columns in a database.
DATE: This object is used to describe date columns in a database.
TIME: This object is used to describe time columns in a database.
TIMESTAMP: This object is used to describe timestamp columns in a database.
ROWID: This object is used to describe the "Row ID" column in a database.

The preferred object types for the date/time objects are those defined in the mxDateTime package. It provides all necessary constructors and methods both at Python and C level.

`dsn`	= Data source name as string
`user`	= User name as string
`password`	= Password as string
`host`	= Hostname	(optional)
`database`	= Database name	(optional)