In relational databases, the log trigger or history trigger is a mechanism for automatically recording information about changes (inserts, updates and deletes) made to the rows of a database table.
It is a particular technique for change data capture and, in data warehousing, for dealing with slowly changing dimensions.
Definition
Suppose there is a table which we want to audit. This table contains the following columns:
Column1, Column2, ..., Columnn
The column Column1 is assumed to be the primary key.
These columns are defined to have the following types:
Type1, Type2, ..., Typen
The log trigger works by writing the changes (INSERT, UPDATE and DELETE operations) made to the table into another table, the history table.
The history table contains the same columns as the original table, plus two additional columns of type DATETIME: StartDate and EndDate. This is known as tuple versioning. These two additional columns define the period of "validity" of the data associated with a specified entity (the entity identified by the primary key). In other words, each row stores how the data were in the period between StartDate (included) and EndDate (not included).
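As a minimal sketch of the two tables described above, the following Python snippet uses the standard sqlite3 module to create an original table and its history table. The table names (OriginalTable, HistoryTable), the three concrete columns and their types are illustrative assumptions, as is the choice of SQLite. Leaving EndDate nullable is also an assumption, used here so that a version that is still valid can be left open.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Table to be audited (hypothetical columns and types).
CREATE TABLE OriginalTable (
    Column1 INTEGER PRIMARY KEY,
    Column2 TEXT,
    Column3 REAL
);

-- History table: the same columns plus the validity period.
-- Column1 is not unique here, since each entity has one row per version.
CREATE TABLE HistoryTable (
    Column1   INTEGER NOT NULL,
    Column2   TEXT,
    Column3   REAL,
    StartDate DATETIME NOT NULL,  -- included in the period
    EndDate   DATETIME            -- not included; NULL is assumed to mean "still valid"
);
""")

# Column names of each table, in declaration order.
original_columns = [row[1] for row in conn.execute("PRAGMA table_info(OriginalTable)")]
history_columns = [row[1] for row in conn.execute("PRAGMA table_info(HistoryTable)")]
```

The history table deliberately has no primary key on Column1 alone; a real design might add a key or index over the entity key and StartDate.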
For each entity (distinct primary key) in the original table, a sequence of rows is created in the history table, one row per version of the entity.
Notice that when these rows are listed chronologically, the EndDate of each row is exactly the StartDate of its successor (if any). This does not mean that both rows are valid at that point in time since, by definition, the value of EndDate is not included in the period.
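The half-open periods can be illustrated with a small Python sketch. The three versions below are invented sample data for a single entity, and the helper names versions_at and is_contiguous are hypothetical. An EndDate of None is assumed to mark the version that is still valid.

```python
from datetime import datetime

# Hypothetical versions of one entity: (value, StartDate, EndDate).
versions = [
    ("v1", datetime(2020, 1, 1), datetime(2020, 6, 1)),
    ("v2", datetime(2020, 6, 1), datetime(2021, 1, 1)),
    ("v3", datetime(2021, 1, 1), None),  # assumed marker for "still valid"
]


def versions_at(rows, moment):
    """Return the values whose period [StartDate, EndDate) contains moment."""
    return [
        value
        for value, start, end in rows
        if start <= moment and (end is None or moment < end)
    ]


def is_contiguous(rows):
    """Check that each EndDate equals the StartDate of the next row."""
    return all(prev[2] == nxt[1] for prev, nxt in zip(rows, rows[1:]))


# At the shared boundary only the later version is valid.
at_boundary = versions_at(versions, datetime(2020, 6, 1))
```

Because EndDate is excluded, a lookup at the exact boundary between two versions matches only the successor, so no point in time ever belongs to two versions of the same entity.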
There are two variants of the log trigger, depending on how the old values (DELETE, UPDATE) and new values (INSERT, UPDATE) are exposed to the trigger (this is RDBMS dependent):
- Old and new values as fields of a record data structure
- Old and new values as rows of virtual tables
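The first variant can be sketched with SQLite, driven from Python's standard sqlite3 module. This is an illustration under stated assumptions, not a reference implementation: SQLite exposes the old and new values as the records OLD and NEW and attaches a trigger to a single operation, so three triggers are created. The table, column and trigger names are hypothetical, the current time is read with SQLite's strftime function, and a NULL EndDate is assumed to mark the version that is still valid.

```python
import sqlite3

# SQL expression for the current timestamp with millisecond precision.
NOW = "strftime('%Y-%m-%d %H:%M:%f', 'now')"

conn = sqlite3.connect(":memory:")
conn.executescript(f"""
CREATE TABLE OriginalTable (Column1 INTEGER PRIMARY KEY, Column2 TEXT);
CREATE TABLE HistoryTable (
    Column1 INTEGER, Column2 TEXT, StartDate TEXT NOT NULL, EndDate TEXT
);

-- INSERT: inserting section only (open a new version).
CREATE TRIGGER LogInsert AFTER INSERT ON OriginalTable
BEGIN
    INSERT INTO HistoryTable (Column1, Column2, StartDate, EndDate)
    VALUES (NEW.Column1, NEW.Column2, {NOW}, NULL);
END;

-- DELETE: deleting section only (close the open version).
CREATE TRIGGER LogDelete AFTER DELETE ON OriginalTable
BEGIN
    UPDATE HistoryTable SET EndDate = {NOW}
    WHERE Column1 = OLD.Column1 AND EndDate IS NULL;
END;

-- UPDATE: deleting section first, then inserting section.
CREATE TRIGGER LogUpdate AFTER UPDATE ON OriginalTable
BEGIN
    UPDATE HistoryTable SET EndDate = {NOW}
    WHERE Column1 = OLD.Column1 AND EndDate IS NULL;
    INSERT INTO HistoryTable (Column1, Column2, StartDate, EndDate)
    VALUES (NEW.Column1, NEW.Column2, {NOW}, NULL);
END;
""")

# Exercise the three operations on one entity.
conn.execute("INSERT INTO OriginalTable VALUES (1, 'first')")
conn.execute("UPDATE OriginalTable SET Column2 = 'second' WHERE Column1 = 1")
conn.execute("DELETE FROM OriginalTable WHERE Column1 = 1")

history = conn.execute(
    "SELECT Column2, StartDate, EndDate FROM HistoryTable ORDER BY rowid"
).fetchall()
```

After the insert, update and delete, the history table holds two closed versions of the entity, and the EndDate of the first equals the StartDate of the second because both are written while the same UPDATE statement is being processed.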
Compatibility notes
- The function GetDate() is used to get the system date and time; a specific RDBMS could use another function name or obtain this information in another way.
- Several RDBMS (DB2, MySQL) do not allow the same trigger to be attached to more than one operation (INSERT, DELETE, UPDATE). In such a case, a trigger must be created for each operation. For an INSERT operation only the inserting section must be specified, for a DELETE operation only the deleting section must be specified, and for an UPDATE operation both sections must be present (the deleting section first, then the inserting section), because an UPDATE operation is logically represented as a DELETE operation followed by an INSERT operation.
- The record data structures containing the old and new values are referred to as OLD and NEW. On a specific RDBMS they could have different names.
- The virtual tables are referred to as DELETED and INSERTED. On a specific RDBMS they could have different names. Some RDBMS (DB2) even let the names of these logical tables be specified.
- Comments written in C/C++ style might not be supported by a specific RDBMS, or a different syntax might be required.
- Several RDBMS require that the body of the trigger be enclosed between the BEGIN and END keywords.
Data warehousing
In terms of the slowly changing dimension management methodologies, the log trigger falls into the following types:
- Type 2 (tuple versioning variant)
- Type 4 (use of history tables)
Implementation in common RDBMS
IBM DB2
- A trigger cannot be attached to more than one operation (INSERT, DELETE, UPDATE), so a trigger must be created for each operation.
- The old and new values are exposed as fields of record data structures. The names of these records can be defined, for example O for old values and N for new values.
Microsoft SQL Server
- The same trigger can be attached to all the INSERT, DELETE, and UPDATE operations.
- The old and new values are exposed as rows of virtual tables named DELETED and INSERTED.
MySQL
- A trigger cannot be attached to more than one operation (INSERT, DELETE, UPDATE), so a trigger must be created for each operation.
- The old and new values are exposed as fields of record data structures called Old and New.
Oracle
- The same trigger can be attached to all the INSERT, DELETE, and UPDATE operations.
- The old and new values are exposed as fields of record data structures called :OLD and :NEW.
- It is necessary to test the nullity of the fields of the :NEW record that define the primary key (when a DELETE operation is performed), in order to avoid inserting a new row with null values in all columns.
Historic information
Typically, database backups are used to store and retrieve historic information. A database backup is a security mechanism, more than an effective way to retrieve ready-to-use historic information.
A (full) database backup is only a snapshot of the data at specific points in time, so we can know the information in each snapshot, but nothing about what happened between them. Information in database backups is discrete in time.
With the log trigger, the information we can know is not discrete but continuous: we can know the exact state of the information at any point in time, limited only by the granularity of time provided by the DATETIME data type of the RDBMS used.
Advantages
- It is simple.
- It is not a commercial product; it works with features available in common RDBMS.
- It is automatic; once it is created, it works with no further human intervention.
- It is not required to have good knowledge about the tables of the database, or the data model.
- Changes in current programming are not required.
- Changes in the current tables are not required, because log data of any table is stored in a different one.
- It works for both programmed and ad hoc statements.
- Only changes (INSERT, UPDATE and DELETE operations) are registered, so the growth rate of the history tables is proportional to the number of changes.
- It is not necessary to apply the trigger to all the tables in the database; it can be applied to certain tables, or to certain columns of a table.
Disadvantages
- It does not automatically store information about the user producing the changes (information system user, not database user). This information might be provided explicitly. It could be enforced in information systems, but not in ad hoc queries.
Examples of use
Getting the current version of a table
Selecting the current version of every entity should return the same result set as the whole original table.
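A possible form of this query, sketched with Python's sqlite3 module. The sample rows are invented, and the sketch assumes that the current version of an entity is the row whose EndDate is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE HistoryTable (Column1 INTEGER, Column2 TEXT, StartDate TEXT, EndDate TEXT)"
)
# Hypothetical history: entity 1 was updated once, entity 2 was deleted.
conn.executemany(
    "INSERT INTO HistoryTable VALUES (?, ?, ?, ?)",
    [
        (1, "a", "2020-01-01 00:00:00", "2020-06-01 00:00:00"),
        (1, "b", "2020-06-01 00:00:00", None),
        (2, "x", "2020-03-01 00:00:00", "2020-09-01 00:00:00"),
    ],
)

# Current version: only the rows whose validity period is still open.
current = conn.execute(
    "SELECT Column1, Column2 FROM HistoryTable WHERE EndDate IS NULL ORDER BY Column1"
).fetchall()
```

Entity 2 does not appear in the result because its last version was closed by a DELETE.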
Getting the version of a table in a certain point of time
Suppose the @DATE variable contains the point in time of interest.
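A sketch of such a query with Python's sqlite3 module, where the named parameter :date plays the role of the @DATE variable. The sample rows are invented, timestamps are stored as ISO 8601 text so that they compare chronologically, and a NULL EndDate is assumed to mean that the version is still valid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE HistoryTable (Column1 INTEGER, Column2 TEXT, StartDate TEXT, EndDate TEXT)"
)
# Hypothetical history: entity 1 was updated once, entity 2 was deleted.
conn.executemany(
    "INSERT INTO HistoryTable VALUES (?, ?, ?, ?)",
    [
        (1, "a", "2020-01-01 00:00:00", "2020-06-01 00:00:00"),
        (1, "b", "2020-06-01 00:00:00", None),
        (2, "x", "2020-03-01 00:00:00", "2020-09-01 00:00:00"),
    ],
)


def table_at(connection, date):
    """Return the rows valid at the given moment: StartDate <= date < EndDate."""
    return connection.execute(
        """
        SELECT Column1, Column2
        FROM HistoryTable
        WHERE StartDate <= :date
          AND (EndDate IS NULL OR :date < EndDate)
        ORDER BY Column1
        """,
        {"date": date},
    ).fetchall()


snapshot = table_at(conn, "2020-04-01 00:00:00")
```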
Getting the information of an entity in a certain point of time
Suppose the @DATE variable contains the point in time of interest, and the @KEY variable contains the primary key of the entity of interest.
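The same period test can be restricted to a single entity. In this sketch the named parameters :date and :key stand in for @DATE and @KEY; the data and the helper name entity_at are hypothetical, and a NULL EndDate is assumed to mean that the version is still valid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE HistoryTable (Column1 INTEGER, Column2 TEXT, StartDate TEXT, EndDate TEXT)"
)
# Hypothetical history: entity 1 was updated once, entity 2 was deleted.
conn.executemany(
    "INSERT INTO HistoryTable VALUES (?, ?, ?, ?)",
    [
        (1, "a", "2020-01-01 00:00:00", "2020-06-01 00:00:00"),
        (1, "b", "2020-06-01 00:00:00", None),
        (2, "x", "2020-03-01 00:00:00", "2020-09-01 00:00:00"),
    ],
)


def entity_at(connection, key, date):
    """Return the version of one entity valid at the given moment, or None."""
    return connection.execute(
        """
        SELECT Column1, Column2
        FROM HistoryTable
        WHERE Column1 = :key
          AND StartDate <= :date
          AND (EndDate IS NULL OR :date < EndDate)
        """,
        {"key": key, "date": date},
    ).fetchone()


# Exactly at the boundary between the two versions of entity 1.
at_boundary = entity_at(conn, 1, "2020-06-01 00:00:00")
```

A result of None means that the entity did not exist at that moment, either because it had not yet been created or because it had already been deleted.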
Getting the history of an entity
Suppose the @KEY variable contains the primary key of the entity of interest.
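A sketch of this query with Python's sqlite3 module, using the named parameter :key in place of @KEY and invented sample rows. Ordering by StartDate lists the versions chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE HistoryTable (Column1 INTEGER, Column2 TEXT, StartDate TEXT, EndDate TEXT)"
)
# Hypothetical history: entity 1 was updated once, entity 2 was deleted.
conn.executemany(
    "INSERT INTO HistoryTable VALUES (?, ?, ?, ?)",
    [
        (1, "b", "2020-06-01 00:00:00", None),
        (1, "a", "2020-01-01 00:00:00", "2020-06-01 00:00:00"),
        (2, "x", "2020-03-01 00:00:00", "2020-09-01 00:00:00"),
    ],
)

# All versions of one entity, oldest first.
entity_history = conn.execute(
    """
    SELECT Column2, StartDate, EndDate
    FROM HistoryTable
    WHERE Column1 = :key
    ORDER BY StartDate
    """,
    {"key": 1},
).fetchall()
```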
Getting when and how an entity was created
Suppose the @KEY variable contains the primary key of the entity of interest.
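One way to express this, sketched with Python's sqlite3 module, is to take the oldest version of the entity: its StartDate tells when the entity was created and its remaining columns tell how. This reading of "creation" is an assumption, and it would report only the first creation of a key that was deleted and later inserted again. The data and the named parameter :key (standing in for @KEY) are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE HistoryTable (Column1 INTEGER, Column2 TEXT, StartDate TEXT, EndDate TEXT)"
)
# Hypothetical history: entity 1 was updated once, entity 2 was deleted.
conn.executemany(
    "INSERT INTO HistoryTable VALUES (?, ?, ?, ?)",
    [
        (1, "a", "2020-01-01 00:00:00", "2020-06-01 00:00:00"),
        (1, "b", "2020-06-01 00:00:00", None),
        (2, "x", "2020-03-01 00:00:00", "2020-09-01 00:00:00"),
    ],
)

# Oldest version of the entity: the row with the earliest StartDate.
created = conn.execute(
    """
    SELECT Column1, Column2, StartDate
    FROM HistoryTable
    WHERE Column1 = :key
    ORDER BY StartDate
    LIMIT 1
    """,
    {"key": 1},
).fetchone()
```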
Immutability of primary keys
Since the trigger requires the primary key to remain the same over time, it is desirable to either ensure or maximize its immutability. If a primary key changed its value, the entity it represents would break its own history.
There are several options to achieve or maximize the primary key immutability:
- Use of a surrogate key as a primary key. Since there is no reason to change a value with no meaning other than identity and uniqueness, it would never change.
- Use of an immutable natural key as a primary key. In a good database design, a natural key which can change should not be considered as a "real" primary key.
- Use of a mutable natural key as a primary key (which is widely discouraged), where changes are propagated to every place where it is a foreign key. In such a case, the history table should also be updated.
Alternatives
Sometimes a slowly changing dimension is used as an alternative method.
See also
- Relational database
- Primary key
- Natural key
- Surrogate key
- Change data capture
- Slowly changing dimension
- Tuple versioning
Notes
The Log trigger was written by Laurence R. Ugalde to automatically generate history of transactional databases.
References
Source of article : Wikipedia