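Schema Modeling Techniques

The following topics provide information about schemas in a data warehouse: schemas in data warehouses, third normal form, star schemas, snowflake schemas, and optimizing star queries.

Schemas in Data Warehouses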
A schema is a collection of database objects, including tables, views, indexes, and synonyms. There are various ways of arranging schema objects in the schema models designed for data warehousing. One data warehouse schema model is the star schema. The Sales History sample schema (the basis for most of the examples in this book) uses a star schema. However, other schema models are also commonly used for data warehouses. The most prevalent of these is the third normal form (3NF) schema. Additionally, some data warehouse schemas are neither star schemas nor 3NF schemas, but instead share characteristics of both; these are referred to as hybrid schema models.

The Oracle9i database is designed to support all data warehouse schemas. Some features may be specific to one schema model (such as the star transformation feature, which applies to star schemas). However, the vast majority of Oracle's data warehousing features are equally applicable to star schemas, 3NF schemas, and hybrid schemas. Key data warehousing capabilities such as partitioning (including the rolling window load technique), parallelism, materialized views, and analytic SQL are implemented in all schema models.

The choice of schema model for a data warehouse should be based on the requirements and preferences of the data warehouse project team. Comparing the merits of the alternative schema models is outside the scope of this book; instead, this chapter briefly introduces each schema model and suggests how Oracle can be optimized for those environments.
Third Normal Form

Although this guide primarily uses star schemas in its examples, you can also use the third normal form for your data warehouse implementation.

Third normal form modeling is a classical relational-database modeling technique that minimizes data redundancy through normalization. When compared to a star schema, a 3NF schema typically has a larger number of tables due to this normalization process. For example, compare the third normal form schema in Figure 17-1 with the star schema in Figure 17-2.

3NF schemas are typically chosen for large data warehouses, especially environments with significant data-loading requirements that are used to feed data marts and execute long-running queries.

The main advantages of 3NF schemas are that they provide a neutral schema design, independent of any application or data-usage considerations, and that they may require less data transformation than more denormalized schemas such as star schemas.

Figure 17-1 presents a graphical representation of a third normal form schema.

Figure 17-1 Third Normal Form Schema

Optimizing Third Normal Form Queries

Queries on 3NF schemas are often very complex and involve a large number of tables. The performance of joins between large tables is thus a primary consideration when using 3NF schemas.

One particularly important feature for 3NF schemas is partition-wise joins. The largest tables in a 3NF schema should be partitioned to enable partition-wise joins. The most common partitioning technique in these environments is composite range-hash partitioning for the largest tables, with the most common join key chosen as the hash-partitioning key.

Parallelism is often heavily utilized in 3NF environments, and it should typically be enabled there.

Star Schemas

The star schema is perhaps the simplest data warehouse schema. It is called a star schema because the entity-relationship diagram of this schema resembles a star, with points radiating from a central table. The center of the star consists of a large fact table, and the points of the star are the dimension tables.

A star schema is characterized by one or more very large fact tables that contain the primary information in the data warehouse, and a number of much smaller dimension tables (or lookup tables), each of which contains information about the entries for a particular attribute in the fact table.

A star query is a join between a fact table and a number of dimension tables. Each dimension table is joined to the fact table using a primary key to foreign key join, but the dimension tables are not joined to each other. The cost-based optimizer recognizes star queries and generates efficient execution plans for them.

A typical fact table contains keys and measures. For example, in the sh sample schema, the fact table, sales, contains measures such as quantity, together with the keys that link it to the dimensions. The dimension tables are customers, times, products, channels, and promotions. The products dimension table, for example, contains information about each product number that appears in the fact table.

A star join is a primary key to foreign key join of the dimension tables to a fact table.

The main advantages of star schemas are that they provide a direct and intuitive mapping between the business entities being analyzed by end users and the schema design; that they provide highly optimized performance for typical star queries; and that they are widely supported by a large number of business intelligence tools, which may anticipate or even require that the data warehouse schema contain dimension tables.

Star schemas are used for both simple data marts and very large data warehouses.

Figure 17-2 presents a graphical representation of a star schema.

Figure 17-2 Star Schema

Snowflake Schemas

The snowflake schema is a more complex data warehouse model than a star schema, and is a type of star schema. It is called a snowflake schema because the diagram of the schema resembles a snowflake.

Snowflake schemas normalize dimensions to eliminate redundancy. That is, the dimension data is split into multiple tables instead of being held in one large table. For example, a product dimension table in a star schema might be normalized into a products table plus further product-related tables in a snowflake schema. While this saves space, it increases the number of dimension tables and requires more foreign key joins. The result is more complex queries and reduced query performance.

Figure 17-3 presents a graphical representation of a snowflake schema.

Figure 17-3 Snowflake Schema

Note: Oracle Corporation recommends you choose a star schema over a snowflake schema unless you have a clear reason not to.

Optimizing Star Queries

You should consider the following when using star queries: tuning star queries and using the star transformation.

Tuning Star Queries

To get the best possible performance for star queries, it is important to follow some basic guidelines. First, a bitmap index should be built on each of the foreign key columns of the fact table or tables. Second, the initialization parameter STAR… should be enabled. This enables an important optimizer feature for star queries. It is set to false by default for backward compatibility. Third, the cost-based optimizer should be used. This does not apply solely to star schemas: all data warehouses should always use the cost-based optimizer.

When a data warehouse satisfies these conditions, the majority of the star queries running in the data warehouse will use a query execution strategy known as the star transformation. The star transformation provides very efficient query performance for star queries.

Using Star Transformation
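As a concrete reference point, the sketch below builds a tiny star schema and runs a star query against it, using Python's built-in sqlite3 module rather than Oracle. The table layout, column names, and data are illustrative assumptions, loosely modeled on the sales fact table and its dimensions. SQLite does not perform Oracle's star transformation, so this shows only the shape of a star query, not the optimization.

```python
import sqlite3

# Hypothetical miniature star schema. Table and column names are
# illustrative assumptions, not the real sh sample schema definitions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE times     (time_id INTEGER PRIMARY KEY, quarter  TEXT);
CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, state    TEXT);
CREATE TABLE products  (prod_id INTEGER PRIMARY KEY, category TEXT);

-- Fact table: one foreign key per dimension, plus a measure.
CREATE TABLE sales (
    time_id INTEGER REFERENCES times(time_id),
    cust_id INTEGER REFERENCES customers(cust_id),
    prod_id INTEGER REFERENCES products(prod_id),
    amount  REAL
);

INSERT INTO times     VALUES (1, 'Q1'), (2, 'Q2'), (3, 'Q3');
INSERT INTO customers VALUES (10, 'CA'), (11, 'NY');
INSERT INTO products  VALUES (100, 'Books'), (101, 'Music');
INSERT INTO sales VALUES
    (1, 10, 100, 20.0),
    (2, 10, 100, 30.0),
    (3, 10, 100, 40.0),
    (1, 11, 100, 50.0),
    (2, 10, 101, 60.0);
""")

# Star query: the fact table is joined to every dimension on its key,
# and the dimension tables are never joined to each other.
STAR_QUERY = """
SELECT t.quarter, SUM(s.amount) AS total
FROM sales s
JOIN times t     ON s.time_id = t.time_id
JOIN customers c ON s.cust_id = c.cust_id
JOIN products p  ON s.prod_id = p.prod_id
WHERE t.quarter IN ('Q1', 'Q2')
  AND c.state = 'CA'
  AND p.category = 'Books'
GROUP BY t.quarter
ORDER BY t.quarter
"""

star_rows = conn.execute(STAR_QUERY).fetchall()
print(star_rows)  # [('Q1', 20.0), ('Q2', 30.0)]
```

Only two of the five fact rows satisfy all three dimension constraints, which is the situation the star transformation is designed to exploit: most of the fact table never needs to be read.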
The star transformation is a powerful optimization technique that relies upon implicitly rewriting (or transforming) the SQL of the original star query. The end user never needs to know any of the details about the star transformation. Oracle's cost-based optimizer automatically chooses the star transformation where appropriate.

The star transformation is a cost-based query transformation aimed at executing star queries efficiently. Oracle processes a star query in two basic phases. The first phase retrieves exactly the necessary rows from the fact table (the result set). Because this retrieval utilizes bitmap indexes, it is very efficient. The second phase joins this result set to the dimension tables. The example end-user query discussed in this section constrains three dimensions: time (sales in Q1 and Q2), customer, and product.

Note: In Oracle9i Standard Edition, bitmap indexes and star transformation are not available.

Star Transformation with a Bitmap Index

A prerequisite of the star transformation is that there be a single-column bitmap index on every join column of the fact table. These join columns include all foreign key columns.

For example, the sales table of the sh sample schema has bitmap indexes on its foreign key columns, including the time key.

In the first phase, Oracle uses the bitmap indexes on the foreign key columns of the fact table to identify and retrieve only the necessary rows from the fact table. That is, Oracle will retrieve the result set from the fact table using essentially the following query: SELECT ... FROM sales WHERE time…

This method of accessing the fact table leverages the strengths of Oracle's bitmap indexes. Intuitively, bitmap indexes provide a set-based processing scheme within a relational database. Oracle has implemented very fast methods for doing set operations such as AND (an intersection in standard set-based terminology), OR (a set-based union), MINUS, and COUNT.

In this star query, a bitmap index on the time key is used to identify the set of fact table rows corresponding to sales in Q1. This set is represented as a bitmap (a string of 1's and 0's that indicates which rows of the fact table are members of the set).
A similar bitmap is retrieved for the fact table rows corresponding to sales in Q2. The bitmap OR operation is used to combine this set of Q1 sales with the set of Q2 sales.

Additional set operations will be done for the customer dimension and the product dimension. At this point in the star query processing, there are three bitmaps. Each bitmap corresponds to a separate dimension table, and each bitmap represents the set of rows of the fact table that satisfy that individual dimension's constraints.

These three bitmaps are combined into a single bitmap using the bitmap AND operation. This final bitmap represents the set of rows in the fact table that satisfy all of the constraints on the dimension tables.
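The bitmap combination described above can be sketched in a few lines of Python, using integers as bitmaps (bit i stands for fact table row i). This is a toy model under stated assumptions, not Oracle's implementation: the rows, the key values, and the helper names build_bitmap_index and rows_from_bitmap are all hypothetical.

```python
# Toy fact table: each row is (time_key, cust_key, prod_key).
# All values are hypothetical and exist only for illustration.
FACT_ROWS = [
    ("Q1", "CA", "Books"),  # row 0
    ("Q2", "CA", "Books"),  # row 1
    ("Q3", "CA", "Books"),  # row 2
    ("Q1", "NY", "Books"),  # row 3
    ("Q2", "CA", "Music"),  # row 4
]


def build_bitmap_index(rows, column):
    """Map each distinct value in a column to an int whose bit i is set
    when row i holds that value (a single-column bitmap index)."""
    index = {}
    for row_number, row in enumerate(rows):
        value = row[column]
        index[value] = index.get(value, 0) | (1 << row_number)
    return index


def rows_from_bitmap(bitmap):
    """Return the row numbers whose bits are set in the bitmap."""
    row_numbers = []
    position = 0
    while bitmap:
        if bitmap & 1:
            row_numbers.append(position)
        bitmap >>= 1
        position += 1
    return row_numbers


# One bitmap index per foreign key column of the fact table.
time_index = build_bitmap_index(FACT_ROWS, 0)
cust_index = build_bitmap_index(FACT_ROWS, 1)
prod_index = build_bitmap_index(FACT_ROWS, 2)

# Within the time dimension, OR merges the Q1 rows with the Q2 rows.
time_bitmap = time_index["Q1"] | time_index["Q2"]
cust_bitmap = cust_index["CA"]
prod_bitmap = prod_index["Books"]

# Across dimensions, AND keeps only rows that satisfy every constraint.
final_bitmap = time_bitmap & cust_bitmap & prod_bitmap
matching_rows = rows_from_bitmap(final_bitmap)

# The printed bit string reads right to left: the rightmost bit is row 0.
print(format(final_bitmap, "05b"), matching_rows)  # 00011 [0, 1]
```

The final bitmap selects rows 0 and 1, the only rows that fall in Q1 or Q2 and also match the customer and product constraints. Only those rows would then be fetched from the fact table and joined back to the dimension tables in the second phase.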