Databases¶
Django attempts to support as many features as possible on all database backends. However, not all database backends are alike, and we’ve had to make design decisions on which features to support and which assumptions we can make safely.
This file describes some of the features that might be relevant to Django usage. Of course, it is not intended as a replacement for server-specific documentation or reference manuals.
PostgreSQL notes¶
Django supports PostgreSQL 8.2 and higher.
PostgreSQL 8.2 to 8.2.4¶
The implementations of the population statistics aggregates STDDEV_POP and VAR_POP that shipped with PostgreSQL 8.2 to 8.2.4 are known to be faulty. Users of these releases of PostgreSQL are advised to upgrade to Release 8.2.5 or later. Django will raise a NotImplementedError if you attempt to use the StdDev(sample=False) or Variance(sample=False) aggregate with a database backend that falls within the affected release range.
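As a rough illustration of the affected range, the check below compares a version tuple against the releases named above. The function name is hypothetical and this is only a sketch, not the check Django itself performs.

```python
def has_faulty_population_aggregates(version):
    """Return True if `version`, e.g. (8, 2, 3), lies in the 8.2 to 8.2.4 range.

    Hypothetical helper for illustration; tuples compare element by element,
    so (8, 2) <= (8, 2, 3) < (8, 2, 5) holds.
    """
    return (8, 2) <= tuple(version) < (8, 2, 5)
```

A server reporting 8.2.5 or anything newer falls outside the range, so the population aggregates can be used there.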
Optimizing PostgreSQL’s configuration¶
Django needs the following parameters for its database connections:
- client_encoding: 'UTF8',
- default_transaction_isolation: 'read committed',
- timezone: 'UTC' when USE_TZ is True, value of TIME_ZONE otherwise.
If these parameters already have the correct values, Django won’t set them for
every new connection, which improves performance slightly. You can configure
them directly in postgresql.conf
or more conveniently per database
user with ALTER ROLE.
Django will work just fine without this optimization, but each new connection will do some additional queries to set these parameters.
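One way to apply the per-user approach is to generate the ALTER ROLE statements and run them once as a database administrator. The sketch below only builds the SQL strings; the helper name and the role name "django" are assumptions made for illustration.

```python
def alter_role_statements(role, use_tz=True, time_zone="UTC"):
    """Build ALTER ROLE statements for the three parameters listed above.

    Hypothetical helper: `role` is the PostgreSQL user Django connects as,
    `use_tz` and `time_zone` mirror the USE_TZ and TIME_ZONE settings.
    """
    timezone = "UTC" if use_tz else time_zone
    params = [
        ("client_encoding", "UTF8"),
        ("default_transaction_isolation", "read committed"),
        ("timezone", timezone),
    ]

    def quote_ident(name):
        # Double any embedded double quotes, then wrap in double quotes.
        return '"' + name.replace('"', '""') + '"'

    def quote_literal(value):
        # Double any embedded single quotes, then wrap in single quotes.
        return "'" + value.replace("'", "''") + "'"

    return [
        "ALTER ROLE %s SET %s TO %s;" % (quote_ident(role), name, quote_literal(value))
        for name, value in params
    ]


statements = alter_role_statements("django")
```

The resulting statements can be pasted into psql; new connections for that role then start with the expected values.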
Transaction handling¶
By default, Django runs with an open transaction which it commits automatically when any built-in, data-altering model function is called. The PostgreSQL backends normally operate the same as any other Django backend in this respect.
Autocommit mode¶
If your application is particularly read-heavy and doesn’t make many
database writes, the overhead of a constantly open transaction can
sometimes be noticeable. For those situations, you can configure Django
to use “autocommit” behavior for the connection, meaning that each database
operation will normally be in its own transaction, rather than having
the transaction extend over multiple operations. In this case, you can
still manually start a transaction if you’re doing something that
requires consistency across multiple database operations. The
autocommit behavior is enabled by setting the autocommit
key in
the OPTIONS
part of your database configuration in
DATABASES
:
'OPTIONS': {
    'autocommit': True,
}
In this configuration, Django still ensures that delete() and update() queries run inside a single transaction, so that either all the affected objects are changed or none of them are.
This is database-level autocommit
This functionality is not the same as the autocommit decorator. That decorator is
a Django-level implementation that commits automatically after
data changing operations. The feature enabled using the
OPTIONS
option provides autocommit behavior at the
database adapter level. It commits after every operation.
If you are using this feature and performing an operation akin to deleting or updating that requires multiple operations, you are strongly recommended to wrap your operations in manual transaction handling to ensure data consistency. You should also audit your existing code for any instances of this behavior before enabling this feature. It’s faster, but it provides less automatic protection for multi-call operations.
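The idea of wrapping a multi-step operation in a manual transaction on an autocommitting connection can be sketched without Django. The example below uses Python's sqlite3 module as a stand-in for a database-level autocommit connection (it is not the PostgreSQL backend), and the table and function names are made up for illustration.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Apply two dependent UPDATEs atomically on an autocommit connection."""
    conn.execute("BEGIN")  # start the manual transaction
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        cursor = conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        if cursor.rowcount != 1:
            raise LookupError("unknown destination account: %r" % dst)
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # undo the first UPDATE as well
        raise


# isolation_level=None puts the sqlite3 connection in autocommit mode.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('alice', 100)")
conn.execute("INSERT INTO account VALUES ('bob', 0)")
```

Without the explicit BEGIN, the first UPDATE would be committed on its own and a failure in the second step would leave the data inconsistent.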
Indexes for varchar and text columns¶
When specifying db_index=True
on your model fields, Django typically
outputs a single CREATE INDEX
statement. However, if the database type
for the field is either varchar
or text
(e.g., used by CharField
,
FileField
, and TextField
), then Django will create
an additional index that uses an appropriate PostgreSQL operator class
for the column. The extra index is necessary to correctly perform
lookups that use the LIKE
operator in their SQL, as is done with the
contains
and startswith
lookup types.
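To make the two-index behavior concrete, the sketch below builds the kind of statements involved. The operator classes varchar_pattern_ops and text_pattern_ops are PostgreSQL's; the helper name and the index naming scheme (including the "_like" suffix) are assumptions for illustration and may not match what Django emits exactly.

```python
# PostgreSQL operator classes that let an index serve LIKE prefix matches.
PATTERN_OPS = {"varchar": "varchar_pattern_ops", "text": "text_pattern_ops"}


def index_statements(table, column, db_type):
    """Return the CREATE INDEX statements for a column with db_index=True."""
    statements = [
        'CREATE INDEX "%s_%s" ON "%s" ("%s");' % (table, column, table, column)
    ]
    # "varchar(50)" -> "varchar"
    base_type = db_type.split("(")[0].strip().lower()
    opclass = PATTERN_OPS.get(base_type)
    if opclass:
        statements.append(
            'CREATE INDEX "%s_%s_like" ON "%s" ("%s" %s);'
            % (table, column, table, column, opclass)
        )
    return statements
```

For an integer column only the first statement is produced; for varchar and text columns the second index is added.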
MySQL notes¶
Version support¶
Django supports MySQL 5.0.3 and higher.
MySQL 5.0 adds the information_schema
database, which contains detailed
data on all database schema. Django’s inspectdb
feature uses it.
Django expects the database to support Unicode (UTF-8 encoding) and delegates to it the task of enforcing transactions and referential integrity. Be aware that the latter two aren’t actually enforced by MySQL when using the MyISAM storage engine; see the next section.
Storage engines¶
MySQL has several storage engines (previously called table types). You can change the default storage engine in the server configuration.
Until MySQL 5.5.4, the default engine was MyISAM [1]. The main drawbacks of MyISAM are that it doesn’t support transactions or enforce foreign-key constraints. On the plus side, it’s currently the only engine that supports full-text indexing and searching.
Since MySQL 5.5.5, the default storage engine is InnoDB. This engine is fully transactional and supports foreign key references. It’s probably the best choice at this point.
If you upgrade an existing project to MySQL 5.5.5 and subsequently add some
tables, ensure that your tables are using the same storage engine (i.e. MyISAM
vs. InnoDB). Specifically, if tables that have a ForeignKey
between them
use different storage engines, you may see an error like the following when
running syncdb
:
_mysql_exceptions.OperationalError: (
1005, "Can't create table '\\db_name\\.#sql-4a8_ab' (errno: 150)"
)
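Before running syncdb it can help to look for foreign keys that cross storage engines. The sketch below assumes you have already collected a table-to-engine mapping (for instance from the ENGINE column of information_schema.tables) and a list of foreign-key pairs; the helper and the sample table names are hypothetical.

```python
def mismatched_engines(engines, foreign_keys):
    """Return the (source, target) table pairs whose storage engines differ.

    `engines` maps table name -> engine name; `foreign_keys` is an iterable
    of (source_table, target_table) pairs.
    """
    return [
        (source, target)
        for source, target in foreign_keys
        if engines[source].lower() != engines[target].lower()
    ]


engines = {"auth_user": "MyISAM", "blog_entry": "InnoDB", "blog_tag": "InnoDB"}
foreign_keys = [("blog_entry", "auth_user"), ("blog_tag", "blog_entry")]
problems = mismatched_engines(engines, foreign_keys)
```

Any pair reported here is a candidate for the errno 150 failure shown above.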
In previous versions of Django, fixtures with forward references (i.e. relations to rows that have not yet been inserted into the database) would fail to load when using the InnoDB storage engine. This was due to the fact that InnoDB deviates from the SQL standard by checking foreign key constraints immediately instead of deferring the check until the transaction is committed. This problem has been resolved in Django 1.4. Fixture data is now loaded with foreign key checks turned off; foreign key checks are then re-enabled when the data has finished loading, at which point the entire table is checked for invalid foreign key references and an IntegrityError is raised if any are found.
[1] Unless this was changed by the packager of your MySQL package. We’ve had reports that the Windows Community Server installer sets up InnoDB as the default storage engine, for example.
MySQLdb¶
MySQLdb is the Python interface to MySQL. Version 1.2.1p2 or later is required for full MySQL support in Django.
Note
If you see ImportError: cannot import name ImmutableSet
when trying to
use Django, your MySQLdb installation may contain an outdated sets.py
file that conflicts with the built-in module of the same name from Python
2.4 and later. To fix this, verify that you have installed MySQLdb version
1.2.1p2 or newer, then delete the sets.py
file in the MySQLdb
directory that was left by an earlier version.
Creating your database¶
You can create your database using the command-line tools and this SQL:
CREATE DATABASE <dbname> CHARACTER SET utf8;
This ensures all tables and columns will use UTF-8 by default.
Collation settings¶
The collation setting for a column controls the order in which data is sorted as well as what strings compare as equal. It can be set on a database-wide level and also per-table and per-column. This is documented thoroughly in the MySQL documentation. In all cases, you set the collation by directly manipulating the database tables; Django doesn’t provide a way to set this on the model definition.
By default, with a UTF-8 database, MySQL will use the utf8_general_ci collation. This results in all string equality comparisons being done in a case-insensitive manner. That is, "Fred" and "freD" are considered equal at the database level. If you have a unique constraint on a field, it would be illegal to try to insert both "aa" and "AA" into the same column, since they compare as equal (and, hence, non-unique) with the default collation.
In many cases, this default will not be a problem. However, if you really want
case-sensitive comparisons on a particular column or table, you would change
the column or table to use the utf8_bin
collation. The main thing to be
aware of in this case is that if you are using MySQLdb 1.2.2, the database
backend in Django will then return bytestrings (instead of unicode strings) for
any character fields it receives from the database. This is a strong variation
from Django’s normal practice of always returning unicode strings. It is up
to you, the developer, to handle the fact that you will receive bytestrings if
you configure your table(s) to use utf8_bin
collation. Django itself should
mostly work smoothly with such columns (except for the contrib.sessions
Session
and contrib.admin
LogEntry
tables described below), but
your code must be prepared to call django.utils.encoding.smart_text()
at
times if it really wants to work with consistent data – Django will not do
this for you (the database backend layer and the model population layer are
separated internally so the database layer doesn’t know it needs to make this
conversion in this one particular case).
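The conversion your code may need to perform can be sketched in plain Python. The helper below is a simplified stand-in for django.utils.encoding.smart_text(), not its implementation, and it assumes the column data is UTF-8 encoded.

```python
def to_text(value, encoding="utf-8"):
    """Decode bytestrings to text; return any other value unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value
```

Calling such a helper at the point where values leave the model keeps the rest of the code working with one string type, whichever collation the column uses.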
If you’re using MySQLdb 1.2.1p2, Django’s standard
CharField
class will return unicode strings even
with utf8_bin
collation. However, TextField
fields will be returned as an array.array
instance (from Python’s standard
array
module). There isn’t a lot Django can do about that, since, again,
the information needed to make the necessary conversions isn’t available when
the data is read in from the database. This problem was fixed in MySQLdb
1.2.2, so if you want to use TextField
with
utf8_bin
collation, upgrading to version 1.2.2 and then dealing with the
bytestrings (which shouldn’t be too difficult) as described above is the
recommended solution.
Should you decide to use utf8_bin
collation for some of your tables with
MySQLdb 1.2.1p2 or 1.2.2, you should still use utf8_general_ci
(the default) collation for the django.contrib.sessions.models.Session
table (usually called django_session
) and the
django.contrib.admin.models.LogEntry
table (usually called
django_admin_log
). Those are the two standard tables that use
TextField
internally.
Connecting to the database¶
Refer to the settings documentation.
Connection settings are used in this order:
1. OPTIONS.
2. NAME, USER, PASSWORD, HOST, PORT.
3. MySQL option files.
In other words, if you set the name of the database in OPTIONS, this will take precedence over NAME, which would override anything in a MySQL option file.
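The precedence described above can be pictured as layered dictionaries, where later layers override earlier ones. This is only a sketch of the idea: the function name and the normalized keys ("database", "user") are assumptions, and the real backend passes these values to MySQLdb rather than merging dictionaries this way.

```python
def effective_connection_settings(option_file, settings, options):
    """Merge three layers of connection values, lowest precedence first.

    option_file: values read from a MySQL option file
    settings:    values derived from NAME, USER, PASSWORD, HOST, PORT
    options:     values given in the OPTIONS dictionary
    Empty values are skipped so they do not mask a lower layer.
    """
    merged = {}
    for layer in (option_file, settings, options):
        merged.update((key, value) for key, value in layer.items() if value)
    return merged


resolved = effective_connection_settings(
    option_file={"database": "from_cnf", "user": "cnf_user"},
    settings={"database": "from_name", "user": ""},
    options={"database": "from_options"},
)
```

Here the database name comes from OPTIONS, while the user falls back to the option file because the settings layer leaves it empty.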
Here’s a sample configuration which uses a MySQL option file:
# settings.py
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'OPTIONS': {
            'read_default_file': '/path/to/my.cnf',
        },
    }
}

# my.cnf
[client]
database = NAME
user = USER
password = PASSWORD
default-character-set = utf8
Several other MySQLdb connection options may be useful, such as ssl
,
use_unicode
, init_command
, and sql_mode
. Consult the
MySQLdb documentation for more details.
Creating your tables¶
When Django generates the schema, it doesn’t specify a storage engine, so tables will be created with whatever default storage engine your database server is configured for. The easiest solution is to set your database server’s default storage engine to the desired engine.
If you’re using a hosting service and can’t change your server’s default storage engine, you have a couple of options.
- After the tables are created, execute an ALTER TABLE statement to convert a table to a new storage engine (such as InnoDB):
  ALTER TABLE <tablename> ENGINE=INNODB;
  This can be tedious if you have a lot of tables.
- Another option is to use the init_command option for MySQLdb prior to creating your tables:
  'OPTIONS': {
      'init_command': 'SET storage_engine=INNODB',
  }
  This sets the default storage engine upon connecting to the database. After your tables have been created, you should remove this option as it adds a query that is only needed during table creation to each database connection.
Another method for changing the storage engine is described in AlterModelOnSyncDB.
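If you take the ALTER TABLE route and have many tables, generating the statements can reduce the tedium. The helper below is a hypothetical sketch that only builds SQL text; how you obtain the list of table names is left to you.

```python
def alter_engine_statements(table_names, engine="INNODB"):
    """Build one ALTER TABLE ... ENGINE statement per table name."""
    return [
        # Backticks quote MySQL identifiers; embedded backticks are doubled.
        "ALTER TABLE `%s` ENGINE=%s;" % (name.replace("`", "``"), engine)
        for name in table_names
    ]


script = "\n".join(alter_engine_statements(["auth_user", "django_session"]))
```

The resulting script can be reviewed and then fed to the mysql command-line client.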
Table names¶
There are known issues in even the latest versions of MySQL that can cause the
case of a table name to be altered when certain SQL statements are executed
under certain conditions. It is recommended that you use lowercase table
names, if possible, to avoid any problems that might arise from this behavior.
Django uses lowercase table names when it auto-generates table names from
models, so this is mainly a consideration if you are overriding the table name
via the db_table
parameter.
Savepoints¶
Both the Django ORM and MySQL (when using the InnoDB storage engine) support database savepoints, but this feature wasn’t available in Django until version 1.4, when such support was added.
If you use the MyISAM storage engine, be aware that you will receive database-generated errors if you try to use the savepoint-related methods of the transactions API. The reason is that detecting the storage engine of a MySQL database or table is an expensive operation, so it was decided that it isn’t worth dynamically converting these methods into no-ops based on the results of such detection.
Notes on specific fields¶
Character fields¶
Any fields that are stored with VARCHAR
column types have their
max_length
restricted to 255 characters if you are using unique=True
for the field. This affects CharField
,
SlugField
and
CommaSeparatedIntegerField
.
DateTime fields¶
MySQL does not have a timezone-aware column type. If an attempt is made to
store a timezone-aware time
or datetime
to a
TimeField
or DateTimeField
respectively, a ValueError
is raised rather than truncating data.
MySQL does not store fractions of seconds. Fractions of seconds are truncated to zero when the time is stored.
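The two rules above (reject timezone-aware values, drop fractions of seconds) can be mimicked in a few lines. This is an illustration of the described behavior with a made-up function name, not the backend's actual code.

```python
import datetime


def prepare_for_mysql(value):
    """Mimic the described handling of a time or datetime value."""
    # utcoffset() is None for naive values on both time and datetime objects.
    if value.utcoffset() is not None:
        raise ValueError("timezone-aware values cannot be stored")
    # Fractions of seconds are not stored, so drop them.
    return value.replace(microsecond=0)


stored = prepare_for_mysql(datetime.datetime(2012, 1, 1, 12, 30, 15, 123456))
```

If sub-second precision matters to your application, it has to be kept in a separate column or encoded some other way.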
Row locking with QuerySet.select_for_update()¶
MySQL does not support the NOWAIT
option to the SELECT ... FOR UPDATE
statement. If select_for_update()
is used with nowait=True
then a
DatabaseError
will be raised.
SQLite notes¶
SQLite provides an excellent development alternative for applications that are predominantly read-only or require a smaller installation footprint. As with all database servers, though, there are some differences that are specific to SQLite that you should be aware of.
Substring matching and case sensitivity¶
For all SQLite versions, there is some slightly counter-intuitive behavior when
attempting to match some types of strings. These are triggered when using the
iexact
or contains
filters in Querysets. The behavior
splits into two cases:
1. For substring matching, all matches are done case-insensitively. That is a
filter such as filter(name__contains="aa")
will match a name of "Aabb"
.
2. For strings containing characters outside the ASCII range, all exact string
matches are performed case-sensitively, even when the case-insensitive options
are passed into the query. So the iexact
filter will behave exactly
the same as the exact
filter in these cases.
Some possible workarounds for this are documented at sqlite.org, but they aren’t utilised by the default SQLite backend in Django, as incorporating them would be fairly difficult to do robustly. Thus, Django exposes the default SQLite behavior and you should be aware of this when doing case-insensitive or substring filtering.
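Both cases can be observed directly with Python's sqlite3 module, using SQLite's LIKE operator on literal values. The helper name is hypothetical, and the non-ASCII result depends on how your SQLite library was built (for example, whether an ICU extension is loaded), so no particular output is claimed for it here.

```python
import sqlite3


def sqlite_like(value, pattern):
    """Return whether `value LIKE pattern` holds in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()
        return bool(row[0])
    finally:
        conn.close()


# Case 1: ASCII substring matching ignores case.
ascii_substring = sqlite_like("Aabb", "%aa%")

# Case 2: the same comparison with non-ASCII letters that differ only in case.
non_ascii_exact = sqlite_like("\u00c9COLE", "\u00e9cole")
```

Printing both values on your own installation shows which of the two behaviors you need to account for.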
SQLite 3.3.6 or newer strongly recommended¶
Versions of SQLite 3.3.5 and older contain the following bugs:
- A bug when handling ORDER BY parameters. This can cause problems when you use the select parameter for the extra() QuerySet method. The bug can be identified by the error message OperationalError: ORDER BY terms must not be non-integer constants.
- A bug when handling aggregation together with DateFields and DecimalFields.
SQLite 3.3.6 was released in April 2006, so most current binary distributions
for different platforms include a newer version of SQLite usable from Python
through either the pysqlite2
or the sqlite3
modules.
Version 3.5.9¶
The Ubuntu “Intrepid Ibex” (8.10) SQLite 3.5.9-3 package contains a bug that causes problems with the evaluation of query expressions. If you are using Ubuntu “Intrepid Ibex”, you will need to update the package to version 3.5.9-3ubuntu1 or newer (recommended) or find an alternate source for SQLite packages, or install SQLite from source.
At one time, Debian Lenny shipped with the same malfunctioning SQLite 3.5.9-3 package. However the Debian project has subsequently issued updated versions of the SQLite package that correct these bugs. If you find you are getting unexpected results under Debian, ensure you have updated your SQLite package to 3.5.9-5 or later.
The problem does not appear to exist with other versions of SQLite packaged with other operating systems.
Version 3.6.2¶
SQLite version 3.6.2 (released August 30, 2008) introduced a bug into SELECT
DISTINCT
handling that is triggered by, amongst other things, Django’s
DateQuerySet
(returned by the dates()
method on a queryset).
You should avoid using this version of SQLite with Django. Either upgrade to 3.6.3 (released September 22, 2008) or later, or downgrade to an earlier version of SQLite.
Using newer versions of the SQLite DB-API 2.0 driver¶
For versions of Python 2.5 or newer that include sqlite3
in the standard
library Django will now use a pysqlite2
interface in preference to
sqlite3
if it finds one is available.
This provides the ability to upgrade either the DB-API 2.0 interface or SQLite 3 itself to versions newer than the ones included with your particular Python binary distribution, if needed.
“Database is locked” errors¶
SQLite is meant to be a lightweight database, and thus can’t support a high
level of concurrency. OperationalError: database is locked
errors indicate
that your application is experiencing more concurrency than sqlite can handle in its default configuration. This error means that one thread or process has an exclusive lock on the database connection and another thread timed out waiting for the lock to be released.
Python’s SQLite wrapper has
a default timeout value that determines how long the second thread is allowed to
wait on the lock before it times out and raises the OperationalError: database
is locked
error.
If you’re getting this error, you can solve it by:
- Switching to another database backend. At a certain point SQLite becomes too “lite” for real-world applications, and these sorts of concurrency errors indicate you’ve reached that point.
- Rewriting your code to reduce concurrency and ensure that database transactions are short-lived.
- Increasing the default timeout value by setting the timeout database option:
  'OPTIONS': {
      # ...
      'timeout': 20,
      # ...
  }
  This will simply make SQLite wait a bit longer before throwing “database is locked” errors; it won’t really do anything to solve them.
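The error and the role of the timeout can be reproduced with Python's sqlite3 module alone. In this sketch one connection holds an exclusive lock while a second one, opened with a short timeout, tries to write; the function name and the temporary database file are assumptions made for the demonstration.

```python
import os
import sqlite3
import tempfile


def write_while_locked(timeout):
    """Return the error message raised when a write times out on a locked database."""
    fd, path = tempfile.mkstemp(suffix=".sqlite3")
    os.close(fd)
    holder = sqlite3.connect(path, isolation_level=None)
    waiter = sqlite3.connect(path, timeout=timeout)  # seconds to wait for the lock
    try:
        holder.execute("CREATE TABLE t (x INTEGER)")
        holder.execute("BEGIN EXCLUSIVE")  # take and keep the exclusive lock
        try:
            waiter.execute("INSERT INTO t VALUES (1)")
        except sqlite3.OperationalError as exc:
            return str(exc)
        return None
    finally:
        holder.close()
        waiter.close()
        os.remove(path)


message = write_while_locked(0.1)
```

A larger timeout only lengthens the wait; the write still fails for as long as the other connection keeps its lock.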
QuerySet.select_for_update() not supported¶
SQLite does not support the SELECT ... FOR UPDATE
syntax. Calling it will
have no effect.
Parameters not quoted in connection.queries¶
sqlite3
does not provide a way to retrieve the SQL after quoting and
substituting the parameters. Instead, the SQL in connection.queries
is
rebuilt with a simple string interpolation. It may be incorrect. Make sure
you add quotes where necessary before copying a query into a SQLite shell.
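The effect of simple string interpolation is easy to see in isolation. The sketch below approximates the rebuilding step with Python's % operator; the function name, table and values are made up for illustration.

```python
def rebuild_sql(sql, params):
    """Substitute parameters into the SQL text without any quoting."""
    return sql % tuple(params)


logged = rebuild_sql(
    "SELECT * FROM app_person WHERE name = %s AND age > %s",
    ["O'Brien", 30],
)
```

The string value ends up unquoted in the logged text, so the statement is not valid SQL until you add (and escape) the quotes by hand.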
Oracle notes¶
Django supports Oracle Database Server versions 9i and
higher. Oracle version 10g or later is required to use Django’s
regex
and iregex
query operators. You will also need at least
version 4.3.1 of the cx_Oracle Python driver.
Note that due to a Unicode-corruption bug in cx_Oracle
5.0, that
version of the driver should not be used with Django;
cx_Oracle
5.0.1 resolved this issue, so if you’d like to use a
more recent cx_Oracle
, use version 5.0.1.
cx_Oracle
5.0.1 or greater can optionally be compiled with the
WITH_UNICODE
environment variable. This is recommended but not
required.
In order for the python manage.py syncdb
command to work, your Oracle
database user must have privileges to run the following commands:
- CREATE TABLE
- CREATE SEQUENCE
- CREATE PROCEDURE
- CREATE TRIGGER
To run Django’s test suite, the user needs these additional privileges:
- CREATE USER
- DROP USER
- CREATE TABLESPACE
- DROP TABLESPACE
- CONNECT WITH ADMIN OPTION
- RESOURCE WITH ADMIN OPTION
Connecting to the database¶
Your Django settings.py file should look something like this for Oracle:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.oracle',
        'NAME': 'xe',
        'USER': 'a_user',
        'PASSWORD': 'a_password',
        'HOST': '',
        'PORT': '',
    }
}
If you don’t use a tnsnames.ora
file or a similar naming method that
recognizes the SID (“xe” in this example), then fill in both
HOST
and PORT
like so:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.oracle',
        'NAME': 'xe',
        'USER': 'a_user',
        'PASSWORD': 'a_password',
        'HOST': 'dbprod01ned.mycompany.com',
        'PORT': '1540',
    }
}
You should supply both HOST
and PORT
, or leave both
as empty strings.
Threaded option¶
If you plan to run Django in a multithreaded environment (e.g. Apache in Windows
using the default MPM module), then you must set the threaded
option of
your Oracle database configuration to True:
'OPTIONS': {
    'threaded': True,
},
Failure to do this may result in crashes and other odd behavior.
INSERT ... RETURNING INTO¶
By default, the Oracle backend uses a RETURNING INTO
clause to efficiently
retrieve the value of an AutoField
when inserting new rows. This behavior
may result in a DatabaseError
in certain unusual setups, such as when
inserting into a remote table, or into a view with an INSTEAD OF
trigger.
The RETURNING INTO
clause can be disabled by setting the
use_returning_into
option of the database configuration to False:
'OPTIONS': {
    'use_returning_into': False,
},
In this case, the Oracle backend will use a separate SELECT
query to
retrieve AutoField values.
Naming issues¶
Oracle imposes a name length limit of 30 characters. To accommodate this, the backend truncates database identifiers to fit, replacing the final four characters of the truncated name with a repeatable MD5 hash value.
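The truncation scheme can be sketched as follows. This is a minimal illustration of the idea under stated assumptions (UTF-8 encoding of the name, hexadecimal digest characters), not necessarily the backend's exact code.

```python
import hashlib


def truncate_name(name, length=30, hash_len=4):
    """Shorten `name` to `length` characters, ending in a repeatable MD5 suffix."""
    if len(name) <= length:
        return name
    # The digest is computed from the full name, so the same input always
    # yields the same suffix.
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()[:hash_len]
    return name[:length - hash_len] + digest


short = truncate_name("blog_entry")
long_name = truncate_name("a_very_long_application_name_and_model_name")
```

Because the suffix is derived from the full name, two long names that share their first 26 characters are still likely to map to different identifiers.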
When running syncdb, an ORA-06552
error may be encountered if
certain Oracle keywords are used as the name of a model field or the
value of a db_column
option. Django quotes all identifiers used
in queries to prevent most such problems, but this error can still
occur when an Oracle datatype is used as a column name. In
particular, take care to avoid using the names date
,
timestamp
, number
or float
as a field name.
NULL and empty strings¶
Django generally prefers to use the empty string (‘’) rather than
NULL, but Oracle treats both identically. To get around this, the
Oracle backend ignores an explicit null
option on fields that
have the empty string as a possible value and generates DDL as if
null=True
. When fetching from the database, it is assumed that
a NULL
value in one of these fields really means the empty
string, and the data is silently converted to reflect this assumption.
TextField limitations¶
The Oracle backend stores TextFields
as NCLOB
columns. Oracle imposes
some limitations on the usage of such LOB columns in general:
- LOB columns may not be used as primary keys.
- LOB columns may not be used in indexes.
- LOB columns may not be used in a SELECT DISTINCT list. This means that attempting to use the QuerySet.distinct method on a model that includes TextField columns will result in an error when run against Oracle. As a workaround, use the QuerySet.defer method in conjunction with distinct() to prevent TextField columns from being included in the SELECT DISTINCT list.
Using a 3rd-party database backend¶
In addition to the officially supported databases, there are backends provided by 3rd parties that allow you to use other databases with Django:
The Django versions and ORM features supported by these unofficial backends vary considerably. Queries regarding the specific capabilities of these unofficial backends, along with any support queries, should be directed to the support channels provided by each 3rd party project.