Skip navigation

Tag Archives: architecture

Two simple guiding principles should be kept in mind when implementing dimensional database design techniques: accuracy and performance.

The accuracy of a design is important for several reasons. While operational questions asked of the database can be determined in advance with some level of certainty, analytic questions cannot. Analytic questions often change with time and questions lead to even more questions.

Designers need to keep track of how the facts in a database are represented. Is there a chance that the data can be misinterpreted or misapplied through either a false query assumption or understanding of the data? What kinds of design methods can be implemented to eliminate or reduce ambiguity or confusion of the use of the design’s facts and data?

Also of importance is performance. The results from queries against a dimensional design are usually optimized due to the structure of its tables and objects, but since the nature of questions asked of the database changes with time, constant effort needs to be invested in optimizing the response time of these evolving queries. Performance considerations may lead designers to consider offering multiple formats of the same data to support these variations in query scope and type.

There are three different types of storage used by Oracle databases.

  1. Data Files
  2. Control Files
  3. Redo Logs

Each are recommended to be redundant and also to have separate physical storage drives and controllers to maximize the likeliness of recovery of data in the event of a hardware failure.

Data Files

These are the physical representation of the data written from the database engine to the disk storage drive. The smallest unit of storage on a data file is a data block, which is usually 8 kb in size. This can be variably set, but the recommendation is to choose an Oracle data block to be a size that is a multiple of the host operating system’s block size.

Data files are represented on the database as a tablespace. A tablespace is a logical grouping of data files that store database information. In a tablespace, four major types of information is stored:

  • Data (information from tables)
  • Indexes
  • Rollback/Undo
  • Temporary

The acronym D.I.R.T makes this easy to remember!

Control Files

These contain information about the database name, when it was created, and the full path of the data files represented in the database. There are exactly three control files in a database and each is identical in content to one another for redundancy.

Redo Logs

A redo log contains one of two different types of commands that are issued against the database:

  1. DML: Data Manipulation Language, or commands such as INSERT, UPDATE and DELETE which change the data in database tables.
  2. DDL: Data Definition Language, or commands such as CREATE and ALTER. These commands create and change database objects within the database.

Redo logs are usually multiplexed across different physical drives and controllers for reduncancy. There are three different groups that are rotated as each group of storage gets filled.

When all three groups are filled, one of two different things happen depending on the mode set for the database.

Archive log mode forces the database engine to copy the contents of each redo log group to a separate location (archive log file) before the database overwrites the contents of the redo log group.

Non-Archive mode just skips the copy step and automatically overwrites the old redo log file group.

That’s it from a storage perspective, which is one of the three major components of an Oracle Database (the other two being: MEMORY and PROCESSES).

Oracle Storage Types

The three major roles of physical storage for an Oracle database

Stumbled across a great four part video series on You Tube with a lecture on Oracle database architecture, and specifically a discussion on the Oracle SGA. For anyone interested in understanding the internals and the pure definition of an Oracle database, this lecture series is worth the time to watch. If you follow his work on the board, you can create your own cool diagram of Oracle database architecture.

Oracle SGA Architecture Lecture

Oracle Server Architecture

High Level Diagram/Definition of an Oracle Server

It is useful to know the architectural details of the Oracle Relational Database Management System (RDBMS) such as its concurrency model, how it manages in its memory and disk, etc.

Database vs. Instance

One common point of confusion with Oracle databases is the distinction between databases and instances.

A database is a collection of data files found on disk or some sort of storage device.

An instance is a set of Oracle processes (also known as background processes) are what operate on the database files, such as storing and fetching data from the database.

Schemas

A collection of objects under one database user (owner). A database user can own one schema, and that same user can still have access to different, multiple schemas.

Two important schemas belong to the SYS and SYSTEM users. SYS schemas contain the data dictionary for the database. The data dictionary is a collection of read-only (query only) database tables and views containing metadata related to the objects in the database. Metadata includes definitions for all of a schema’s objects.

Incicdentally, the SYS user has the highest privileges in an Oracle database and is the equivallent to an “Administrator” or “root” account on the operating system.

Tablespaces

A schema objects that requires physical storage must belong to tablespace, which is a logical group of functionally related schema objects.

Data Blocks

A data block is the smallest unit of data used by ORacle during its I/O operations. Each operating system has a block size. However, Oracle works exclusively in units of Oracle data blocks… not operating system blocks.

Note that one data block may contain more than one row of a table.

Most discussions about performance measurement in Oracle almost always includes the concept of data blocks.

What Makes Oracle Different

Come to understand two concepts of Oracle: its locking mechanism and its multiversion read consistency model. These two characteristics set it apart from other RDBMS’s. Facts about Oracle’s locking behavior:

Oracle uses row-level locking. It makes sure only one transaction can write or modify a piece of data at any given time. Read operations (query or select) do not initiate locking behavior.

Overhead for locking is really low. The lock status is stored within the data block of the row, and not through some lock row status table.

Even if you modify every row of a very large table, Oracle will put a row lock on each row.

Oracle also keeps track of how to change back any blocks of data that are changed. This is stored in an undo (or rollback) segment. This undo segment is marked with a internal timestamp or SCN (system change number) so that if for example, you are querying data while it is being updated, Oracle will look through the undo segments to know what the data looked like at the time your query initiated. This is called multiversion read consistency.

Follow

Get every new post delivered to your Inbox.