Data warehouse
A data warehouse is a computing system that stores information about an organization's activities in a database. The database design favours reporting on and analysing the data in order to gain strategic information and to facilitate decision making.
Data warehouses may hold large amounts of information, sometimes divided into smaller logical units called data marts. Data mart schemas often follow a "star schema" (dimensional modeling) design; however, no industry standard requires that they take any particular form. There is, in fact, some controversy about which form of data mart schema is most useful.
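As an illustration of the star schema form mentioned above, the following is a minimal sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_date, dim_product), the columns, and the sample rows are all hypothetical, chosen only to show one fact table joined to its surrounding dimension tables.

```python
import sqlite3

# Hypothetical star schema: one central fact table that references
# small descriptive dimension tables by key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE fact_sales (
    date_id INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    amount REAL
);
INSERT INTO dim_date VALUES (1, 2023, 1), (2, 2023, 2);
INSERT INTO dim_product VALUES (10, 'Widget', 'Hardware'), (11, 'Manual', 'Books');
INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 11, 20.0), (2, 10, 150.0);
""")

def sales_by_category(connection):
    """Total sales per product category: the fact table joined to one dimension."""
    rows = connection.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
    return dict(rows)

totals = sales_by_category(conn)
```

A report along a different dimension (for example, by month) would join fact_sales to dim_date in the same way.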
Conventional database systems use highly normalized data formats so that transactions execute as fast as possible and data is stored with minimal redundancy. Data warehouses often use a more de-normalized (relaxed) format. De-normalization is usually encouraged because it makes the schema more intuitive to non-administrative users who are exploring it. For example, rather than storing a customer's information once, in a single record of one table, a warehouse may replicate that information across a whole series of tables to simplify querying for users.
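The customer example can be sketched as follows, again with sqlite3. The tables customers, orders, and orders_wide and their contents are assumptions made for illustration. The sketch shows the same report written against a normalized layout (which needs a join) and a de-normalized one (which does not).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: customer details are stored once and referenced by key.
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Acme', 'North'), (2, 'Globex', 'South');
INSERT INTO orders VALUES (100, 1, 50.0), (101, 1, 75.0), (102, 2, 30.0);

-- De-normalized: customer details are copied onto every order row.
CREATE TABLE orders_wide AS
SELECT o.order_id, o.total, c.name AS customer_name, c.region AS customer_region
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id;
""")

# Report on the normalized tables: requires a join.
normalized = conn.execute("""
    SELECT c.region, SUM(o.total)
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()

# Same report on the de-normalized table: a single-table query.
denormalized = conn.execute("""
    SELECT customer_region, SUM(total)
    FROM orders_wide
    GROUP BY customer_region
    ORDER BY customer_region
""").fetchall()
```

Both queries return the same totals. The de-normalized version trades extra storage and duplicated customer data for a simpler query.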
OLAP (online analytical processing) tools are generally designed to work with de-normalized databases, although there are tools that work with special data warehouse schemas stored in third normal form (normalized).
Data being pushed into a warehouse is usually "staged". Data staging occurs when a periodic process reads data from sources (often a business's primary OLTP databases), scrubs this information for quality, de-normalizes it, and writes it into the warehouse. This process is usually carried out with an ETL (extract, transform, load) tool.
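The staging steps described above (read, scrub, de-normalize, write) might look roughly like the sketch below. It is not how any particular ETL tool works. The two in-memory SQLite databases, the table names, the function names, and the quality rule (dropping missing or negative amounts) are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical OLTP source containing some low-quality rows.
source = sqlite3.connect(":memory:")
source.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, ' Acme '), (2, 'Globex');
INSERT INTO orders VALUES (100, 1, 50.0), (101, 2, NULL), (102, 2, -5.0), (103, 2, 30.0);
""")

# Hypothetical warehouse table that receives the staged rows.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales_staged (order_id INTEGER, customer_name TEXT, amount REAL)"
)

def extract(src):
    """Read the raw order rows from the source database."""
    return src.execute(
        "SELECT order_id, customer_id, amount FROM orders ORDER BY order_id"
    ).fetchall()

def scrub(rows):
    """Assumed quality rule: drop rows whose amount is missing or negative."""
    return [row for row in rows if row[2] is not None and row[2] >= 0]

def denormalize(rows, src):
    """Replace each customer key with the (whitespace-trimmed) customer name."""
    names = dict(src.execute("SELECT customer_id, name FROM customers").fetchall())
    return [(oid, names[cid].strip(), amount) for oid, cid, amount in rows]

def load(rows, dest):
    """Write the staged rows into the warehouse and report how many were loaded."""
    dest.executemany("INSERT INTO sales_staged VALUES (?, ?, ?)", rows)
    dest.commit()
    return len(rows)

def run_staging(src, dest):
    """One periodic staging run: extract, scrub, de-normalize, load."""
    return load(denormalize(scrub(extract(src)), src), dest)

loaded = run_staging(source, warehouse)
```

In practice, a scheduler would call something like run_staging at regular intervals. Real ETL tools add concerns this sketch leaves out, such as incremental loads and error reporting.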
Data warehouses are usually accessed (queried) via "data marts", which are purpose-specific access points to or sub-sets of the warehouse. Data marts are designed to answer the probable queries of a given kind of user.
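One simple way to picture a data mart as a sub-set of the warehouse is a database view. The sketch below assumes a hypothetical sales table and a hypothetical mart named mart_north_sales. Real data marts are often separate physical stores, so this shows only the idea of a purpose-specific access point.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")
warehouse.executescript("""
CREATE TABLE sales (region TEXT, product TEXT, amount REAL);
INSERT INTO sales VALUES
    ('North', 'Widget', 100.0),
    ('North', 'Gadget', 40.0),
    ('South', 'Widget', 70.0);

-- Hypothetical data mart for the northern sales team:
-- exposes only the rows and columns that team is likely to query.
CREATE VIEW mart_north_sales AS
SELECT product, amount FROM sales WHERE region = 'North';
""")

# Users of the mart query it like an ordinary table.
north_total = warehouse.execute(
    "SELECT SUM(amount) FROM mart_north_sales"
).fetchone()[0]
```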
Normally a data warehouse does not store current information on an individual business activity. Instead, it is often used for collective processing across all business units of a corporation.
Computing in data warehouses is often referred to as Online Analytical Processing (OLAP), in contrast to Online Transaction Processing (OLTP), which is used for normal business activities. Data from enterprise resource planning (ERP) systems and other related business software systems is imported into data warehouses periodically for further processing.