Data Modeling in Multiproviders and Infocubes
SAP BW Modeling:
Multiprovider:
Multiproviders must always be used as a base for the query.
Every MultiProvider includes a characteristic called InfoProvider (0INFOPROV) in the data package dimension.
Queries can filter on this characteristic to restrict access to one or more specific underlying InfoProviders.
If your MultiProvider definition includes a large number of underlying InfoProviders (such as one per year, one per division, etc.) you can improve query performance by dynamically filtering on the 0INFOPROV characteristic.
A customer exit variable provides a flexible way to dynamically filter on the 0INFOPROV characteristic. For example, if your data model includes one physical InfoCube per division, you can use a customer exit variable to specify the correct filter values for 0INFOPROV, based on the division(s) specified in the query.
You can use an external table to store the relationship between a division value and its corresponding 0INFOPROV value (technical name of an InfoProvider) When the query executes, the BW OLAP engine will query only the underlying InfoProviders that contain data for the specified divisions.
Infocube:
Infocube Design: Fact table and key figures
- fact table can be partitioned based in Fiscal month or calendar month.
- Infocube should not be used for archiving or storing historical or detailed data
- Infocube should always have only summarized data needed by user
- Old data from the infocube can be periodically removed (retention period of the cube) or else the size of the cube can grow very large with time
- More the dimensions in the cube, more is the granularity
- SAP recommends that Calculated key figures should be calculated at front end level instead of storing them in the cube unless required. This will improve the data load performance.
Infocube Design: Dimensions and Master data
Dimension Standards:
- If there are less than 13 characteristics in the cube, put each of them in a separate dimension and flag as line item dimension, this will eliminate the dimension tables and improve the performance
- If there are more than 13 characteristics in a cube, group low cardinality characteristics together in one dimension (here cardinality refers to number of unique values of the characteristic)
- Do not group high cardinality and low cardinality objects together in one dimension like document number and document type. Here, Document number can hold thousands of unique values whereas document type have few unique values.Hence the query which has filtering on document type would require system to retrieve and join thousands of dimension ids
- Characteristics having many to many relationships should be put in separate dimensions
- Characteristics having a single value can be grouped together in one dimension
- Multiprovider can be utilized to again logically group the characteristics. This will not have any effect on the data modeling. Hence cube dimension modeling should be done taking query performance in mind and multiprovider dimension modeling should be done in regard to user perspective.
Standards: Time Characteristic
Include all the aggregation of time in the cube like month, quarter, half year and year if there is calday in the data model. This will provide flexibility to the query users.
This will not increase number of rows in the fact table.
Master data table and Attributes
Dimension characteristics vs Navigational Attribute
- Query performance is degraded if navigational attributes are used
- Instead include the characteristic in the infocube dimension and use if the attribute is used frequently in the query
- Thus, attributes most frequently used should reside in the dimension table
- Attributes with many to many relationships within infocube must occur as characteristic in a dimension table
- For 1:N relationship attribute (the one at the N side) which may or may not occur in the data, model these as characteristic in the dimension or as leaf node in a hierarchy
- SAP BW only track changes between values of related characteristics within a dimension table indirectly (through fact table). To track these changes precisely, either store the parent in the child master data or store the parent attribute in a hierarchy.
Use of Aggregates and Navigational Attributes
- If navigational attribute had to be used, consider creating an aggregate where this attribute can be added as characteristic in aggregate dimension. BW allows aggregates with time dependent navigational attributes
Many to Many (n:m) Relationships
- These should not be in the same dimension. If one of such characteristics have few distinct values, you can keep them in the same dimension e.g. Customer and shipping type
One to Many (1:n) dependent relationships
An example of such relation is Product and its status: whether on sale or not on sale.
Here a status could be of any product, whereas a product can have only one status at a time.
You must know the following business requirements to model these:
- How does the relationship changes with time: frequently,. at fixed intervals or never.
- Should business need to report fact based on the relationship at the time of transaction
- Does business need to preserve history of the relationship changes
Modeling guideline table for 1:N attribute
Business Scenario | Model as Dimension Characteristics | Model as Master data attributes | Model as time dependent Master data attributes |
Report based on the state of the relationship at the time of the transaction | * | ||
Report of historical facts (key figures) using the current state of the relationship | * | ||
Reporting of data using relationship values of specific intervals (from/to dates) in the past, as well as the current state | * | ||
Reporting of key figures using both the values in effect at the time of transaction and the current value | * | * | |
The source system sometimes omit the lower level attribute value (at the N side of the relationship) | * |
Modeling Hierarchies
You can model the hierarchy in three different ways in the BW system:
- As an external hierarchy
- As a hierarchy of dimension characteristics
- As a hierarchy of master data attributes
External Hierarchies
SAP BW system has this feature of built in hierarchy management for each characteristic. You can use external hierarchies if the hierarchy changes frequently or it does not have a fixed number of levels.
External Hierarchy Benefits:
- A particular infoobject can have several such hierarchies and the design can use all of these within a single infocube
- Aggregates can be defined on such hierarchies
- There is only one hierarchy table which can be shared across multiple infocubes
External hierarchy drawbacks:
- These hierarchies usually perform worse than those modeled within dimensions
- Problems can arise if the hierarchy becomes big with thousands of nodes and leaves. In such cases, consider the below options:
Modeling a hierarchy as dimension characteristic
Using this option, you can model the hierarchy by adding set of independent characteristics in a dimension, for example calday, calmonth, calquarter, calyear.
You can follow below general criteria while using this option:
- Your hierarchy has fixed number of levels
- It rarely changes
- Past history is not required
Benefits:
- Queries perform faster on such hierarchies
Drawbacks:
- BW does not explicitly know about the hierarchical relationship between the characteristics in the dimension table. Thus any aggregate having the child node must also include all the parent nodes in the hierarchy
- This method does not support building of multiple independent hierarchies in the same characteristic
- The hierarchy must have fixed number of levels, any changes in that may require change in the underlying cube structure and possibly, reloading the data.
Modeling a Hierarchy as Master Data Attribute
Here with the use of navigational attributes, the hierarchy can be modeled. The method can however only be used if the number of levels are unchanged. The adjustments/realignments to this hierarchy can be automatically seen in the cube fact table. This feature is hence more flexible than dimension hierarchy but less effective than external hierarchy.
For example, consider the hierarchy sales group – sales person. This hierarchy has fixed number of levels but changes frequently.
From performance perspective, this method is the least attractive compared to other options.
Design choices and their performance effects
Below table summarized the options:
Dimension Characteristics vs. Navigational Master Data Attributes: | |||
Data Load Performance | Query / Reporting Performance | Data Integration | |
Dimension Characteristic | Worse | Better | Worse |
Navigational Master Data Attribute | Better | Worse | Better |
Other, Ad Hoc Design Choices: | |||
Data Load Performance | Query / Reporting Performance | Data Integration | |
Aggregate | Worse | Better | N/A |
Stored Key Figures for Calculated Values | Worse | Better | Worse |
Compound Key in a Dimension Characteristic | N/A | Worse | Worse |
Tag:multiprovider, techniques