Introducing Microsoft Analysis Services

  Windows IT Pro
Windows IT Library
  - Advertise        
Windows IT Pro Logo

  Home  |   Books  |   Chapters  |   Topics  |   Authors  |   Book Reviews  |   Whitepapers  |   About Us  |   Contact Us

search for  on    power search   help
 






Introducing Microsoft Analysis Services
View the book table of contents
Author: Teo Lachev
Published: December 2005
Copyright: 2005
Publisher: Prologika
 


Applied Microsoft Analysis Services 2005 shows database administrators and developers how to build complete OLAP solutions with Microsoft Analysis Services 2005 and Microsoft Business Intelligence Platform. The book gives developers the necessary background to extend UDM with custom programming logic, in the form of MDX expressions, scripts and .NET code.


 

Albert Einstein once said that information alone is not knowledge. This adage has never been more relevant than in today’s information age where organizations are looking for ways to quickly make sense of mountains of data generated every day. Only by carefully screening and analyzing that data can an organization unlock its power and fully understand its customers, its markets, and its business. I am not aware of an official slogan for Analysis Services but the one that may best serve this purpose could be "Knowledge is power".

As its name suggests, the promise of Microsoft SQL Server Analysis Services 2005 is to promote better data analytics by giving information workers powerful ways to analyze consistent, timely, and reliable data. Empowered with Analysis Services, you are well positioned to solve the perennial problem with data — that there is too much of it and finding the right information is often difficult, if not impossible.

This introductory chapter gives you a panoramic view of Microsoft SQL Server Analysis Services 2005 (SSAS) and the Microsoft Business Intelligence platform. Throughout the rest of this book, I will use the terms Analysis Services and SSAS interchangeably to refer to Microsoft SQL Server Analysis Services 2005. In this chapter, we will discuss:
  • What is SSAS?
  • The services SSAS provides
  • The role of SSAS in the Microsoft Business Intelligence Platform
  • How SSAS unifies the relational and dimensional reporting models
  • SSAS architecture
  • SSAS in action hands-on lab

1.1 WHAT IS ANALYSIS SERVICES 2005?

The processes of collecting and analyzing information assets to derive knowledge from data are typically referred to as Business Intelligence, or BI, for short. Simply put, SSAS can be viewed as a sophisticated software platform for delivering business intelligence by providing rich and efficient ways to "get out" what was "put in". To be more specific, we can describe SSAS as a server-based platform that provides two core services — On-Line Analytical Processing (OLAP) and data mining. Let’s cover these two terms in more detail.

Definition: Microsoft Analysis Services is a server-based platform for on-line analytical processing (OLAP) and data mining.


1.1.1 Introducing OLAP

There are two types of database-driven systems that serve orthogonal requirements. On-Line Transactional Processing (OLTP) systems are designed for fast transactional input to support business systems. On the other side, OLAP systems are optimized for fast data output to support data analytics and reporting.

Definition: On-Line Analytical Processing (OLAP) applications are optimized for fast data querying and reporting.

Let’s consider a popular BI scenario that can be implemented using the OLAP technology and SSAS.

Fast and intuitive reporting
Suppose that your company’s CEO is asking you for a report showing last-year sales. You run an SQL query that produces the magic number, e.g. the company made one million dollars in sales. While this figure may be great (or not so great depending on the company situation), it is unlikely to provide enough information to your CEO and marketing department.

There are a myriad of questions your business users may ask you. How do the sales of this year compare to last year? What are the top selling products? How do products sale by region, resale channels, etc? You may opt to address some of these questions by authoring standard reports but, at some point, this process may become counterproductive, especially when large datasets need to be processed.

One elegant and efficient solution is to implement an OLAP application that allows business users to produce reports interactively, as shown in Figure 1.1. It is implemented as a .NET Windows Form client application that leverages a "smart" OLAP browser connected to SSAS 2005. In this case, the browser is Microsoft Office Web Components (OWC) which is part of the Microsoft Office suite. Empowered with this application, end users could view data from different angles (called dimensions in the OLAP terminology) by creating dynamic views interactively. Moreover, report queries will be satisfied almost instantaneously because OLAP servers are optimized for fast retriaval.

Definition: The OLAP terminology uses the term dimension to refer to an analytical perspective that can be used to browse the data. Common examples of dimensions include Product, Customer, and Time.

In Figure 1.1, the user has decided to see sales data broken by Product by Category dimension on rows and Time dimension on columns. The user has expanded ("drill down") the Product by Category dimension to see data broken further down by Product Category, Subcategory, and Model Name. The Time dimension is also expanded to show sales figures by Calendar Year, Semester, and Quarter levels. Data is further filtered by Country to show sales figures for United States only.

I dubbed this type of reporting "interactive" because with a few mouse clicks the user can change the report to view data from different angles. For example, assuming the SSAS model supports this, the user can opt to view the sales data by Customers on rows and Fiscal Year on columns. Optionally, this application allows the user to save the report view to a database and retrieve it on as-needed basis. I will show you how you can build this type of applications in chapter 19.

As you could see, interactive reporting is much more powerful and flexible than standard "canned" reporting. The tradeoff is that you need to spend some extra effort to design and implement your OLAP database in such a way that it conforms to the OLAP dimensional model, as we will discuss in section 1.2. Don’t despair, though. SSAS definitely goes a long way in making this endeavor easier.

SSAS as OLAP server
Can you produce interactive reports from a relational database instead of going through all the trouble to implement an OLAP solution? Most likely the answer will be yes. For example, you can connect OWC directly to a relational database and achieve similar reporting results. There is really no magic that OLAP performs behind the scenes as a database engine. The same aggregation results could be achieved by just sending SQL statements. So, why should you use OLAP and SSAS for that matter?

To answer this question, consider the Fast Analysis of Shared Multidimensional Information (FASMI). This test was proposed by The OLAP Report, an independent organization that studies the OLAP technology and market (see the Resources section). It outlines five easy to remember criteria that each OLAP server should adhere to.

The Fast rule means that the OLAP server has to be optimized for fast (almost instantaneous) data retrieval. The OLAP Report suggests that most queries should be answered in five seconds. Of course, performance depends on a variety of factors, including hardware and software configuration, level of optimization, etc., but OLAP servers have to be FAST.

The Analysis rule means that the OLAP calculation engines should be superior in supporting advanced business calculations compared with RDBMS and these calculations shouldn’t mandate the use of a professional programming language.

Each OLAP server should also provide Shared access to data. This criterion has two aspects. First, it mandates that every OLAP server should protect sensitive data preferably at the most granular (cell) level. In addition, an OLAP server should support writeback, i.e. allowing the users not only to read, but also to change data.

The Multidimensional characteristic is the most important requirement. Every OLAP system should provide a multidimensional view of data that extends beyond the two-dimensional analysis of RDBMS. OLAP users should be able to see data from different angles called dimensions, as we’ve just seen in the example shown in Figure 1.1.

Finally, the Information criterion measures the ability of the OLAP system to store and aggregate vast volumes of data without performance degradation. So, to sum up, the strongest motivation factors to favor the OLAP technology instead of RDBMS for reporting is its superior performance, user-friendly reporting model, and rich calculations. I hope that in this chapter and throughout this book, I will convince you to consider OLAP for implementing efficient and rich reporting solutions.

1.1.2 Introducing Data Mining

The second main service that SSAS provides is data mining. Data mining is a science in itself. Generally speaking, data mining is concerned with the process used to predict the unknown based on known statistical facts. Instead of asking us to look at a crystal ball, the SSAS team has implemented sophisticated mathematical models that can analyze large volumes of data, discover patterns and trends, and produce prediction results.

Typical examples where data mining can be used efficiently include sales forecasting and basket analysis. For example, by examining the historical sale figures, data mining can answer the following questions:
  • What are the forecasted sales numbers for the next few months?
  • What products may this customer buy together with the chosen product?
  • What type of customers (gender, age groups, income, etc.) is likely to buy this product?
A common practical example of using data mining is shown in Figure 1.2. Imagine that your company is selling products and the marketing department is asking you to estimate the sales in North America for the next five months. Instead of dusting off your college math book, you prudently decide to use SSAS. Once you build the data mining model, with a few mouse clicks you can produce a report as the one shown in Figure 1.2. We will discuss data mining in more detail in chapters 7 and 8.

1.1.3 Overview of SSAS

Shortly after acquiring the Panorama OLAP technology in 1997, Microsoft introduced the first release of SSAS. It shipped as an add-on to the SQL Server 7.0 and was named Microsoft OLAP Services. The second version coincided with the SQL Server 2000 release. The product name was changed to Analysis Services 2000 to reflect the fact that now SSAS provided not only OLAP, but also data mining capabilities. Today, SSAS 2000 is a leading OLAP platform according to The OLAP Report (see the Resources section). After five years of gestation effort, Microsoft released SQL Server Analysis Services 2005 in November 2005.

SSAS editions and licensing
As its predecessors, SSAS 2005 ships as an add-on to SQL Server 2005. However, note that, although it is bundled with SQL Server, SSAS is not necessary dependent on the SQL Server relational engine.

Note: There are some features in SSAS 2005, such as using multiple data sources in a single data source view and proactive cache notifications, that work only when SQL Server is used as a data source.

For step-by step instructions on how to install SSAS 2005, refer to Appendix A at the end of this book. Let’s now briefly discuss how SSAS is packaged and its licensing requirements. To address different user needs, SQL Server 2005 is available in five editions — Express, Workgroup, Standard, Enterprise, and Developer editions. However, SSAS is available in the last three editions only (see Table 1.1). The Developer edition has the same feature set as the Enterprise edition but it is licensed for one machine only. For more information about how SSAS editions and other SQL Server 2005 products compare to each other, read the document SQL Server 2005 Features Comparison (see Resources section). SSAS 2005 licensing model is simple. Basically, you need a SQL Server license on the machine where SSAS is installed and there are no special "middleware" exceptions. For example, suppose your operational requirements call for installing SSAS on a separate server than your SQL Server RDBMS box. In this case, you will need two SQL Server licenses — one for the SSAS server and another one for the SQL Server instance.

Why use SSAS?
Traditionally, Analysis Services and OLAP in general have been used in conjunction with data warehousing. Of course, this scenario is still applicable. Many organizations would use Analysis Services to build an OLAP layer on top of a relational data warehouse in order to take advantage of the superior query performance of SSAS. However, thanks to new enhancements in SSAS, I believe you will find new scenarios for using it, including:
  • Rich data analytics — For many organizations, SSAS can become the logical next step for advanced data analysis and interactive reporting.
  • Data mining — An organization could find many uses for the predictive power of data mining.
  • Corporate performance management — With the introduction of KPIs, SSAS can be used to capture vital company performance metrics. More on this in chapter 12 and 19.
  • Centralized repository for business metrics — SSAS supports advanced calculations and is best suited for storing business metrics and calculations.
  • Ad hoc reporting — Besides interactive reports, end-users can create ad hoc reports from SSAS. We will see how this could be done in chapter 18.
At the same time, as there is no such a thing as a free lunch, SSAS may be overkill for small organizations because it requires an additional design and maintenance effort. As a rule of thumb, if standard reporting meets your data analytics and performance needs, it may not be time to "graduate" to OLAP yet.

1.1.4 SSAS and Microsoft Business Intelligence Platform

SSAS is not the only Business Intelligence product that Microsoft provides. It is an integral part of the Microsoft Business Intelligence Platform that was initiated in early 2004 with the powerful promise to "bring BI to the masses". The Microsoft Business Intelligence Platform is a multiproduct offering that addresses the most pressing data analytics and management needs that many organizations encounter every day.

To understand how SSAS fits into this initiative, it may be helpful to depict the Microsoft Business Intelligence Platform in the context of a typical three-tier architectural view that most of the readers are probably familiar with, as shown in Figure 1.3. Let’s explain briefly the building blocks of the Microsoft Business Intelligence Platform.

SQL Server
The SQL Server relational database engine forms the foundation of the BI Platform. In my opinion, SQL Server is one of the best products that Microsoft has ever invented. Its relational database has been holding the top TPC (Transaction Processing Council) benchmarks in the price/performance category, as you could see online at www.tpc.org. Now that SQL Server comes bundled with so many valuable add-on services, it is indeed "do more with less", as the popular Microsoft slogan goes. While discussing the SQL Server 2005 enhancements and new features may easily fill a whole book, I would like to bring your attention to a couple of SSAS-related enhancements that we will be using throughout this book

SQL Server Management Studio
SQL Server 2000 Enterprise Manager is gone and it is replaced by the new SQL Server Management Studio (see Figure 1.4). The most prominent feature of the SQL Server Management Studio is that it can be used to manage all SQL Server services. Figure 1.4 shows that I’ve connected to an Analysis Services server called Prologika and I’ve executed an MDX query against the Adventure Works cube.

The central pane shows the metadata of the Adventure Works cube. You can drag and drop objects from the Metadata tab, or MDX standard functions, from the Functions tab. Yes, the query editors support IntelliSense so you could check the function syntax easily! SQL Server Management Studio comes with a slew of editors, templates, designers, and other tools to meet the full spectrum of your query authoring, performance optimization and management needs.

SQL Profiler
Veteran SSAS developers know that, in the past, it was almost impossible to "peek under the hood" of the SSAS server. This has all changed now since the SQL Profiler has been enhanced to support capturing and displaying events raised by SSAS 2005. For example, you can use the SQL Profiler to intercept an MDX query to see how long it takes to execute. We will meet the SQL Profiler in chapter 13 when discussing SSAS management.

Services layer
On top of the relational database, SQL Server provides various services. The three main BI pillars are Reporting Services (SSRS), Integration Services (SSIS), and, of course, Analysis Services (SSAS) which is the subject of this book.

Reporting Services
SSRS is a server-based platform for authoring, managing, and distributing standard reports. SSRS reports can source data from virtually any data source that exposes its data in a tabular format, including SSAS. A new SSAS-related feature of SSRS 2005 is the excellent MDX Query Builder that you can use to create reports from SSAS cubes easily (see Figure 1.5). Figure 1.5 shows that I’ve authored a Product Sales report by dragging Product dimension and a few measures from the Metadata pane and dropping them onto the Results pane. In addition, I’ve parameterized this report by allowing the user to filter the report data for a given calendar year.

I’ve covered SSRS 2000 in details in my book Microsoft Reporting Services in Action (see the Resources section). In chapters 18 and 19 of this book, I will show you the most exciting new features of SSRS 2005 that relate to authoring SSAS-based reports, including the MDX and DMX query builders, the new Windows Forms and ASP.NET report viewer controls, ad-hoc reporting, and SharePoint integration.

Integration Services
Today’s enterprise IT shop would typically maintain a hodge-podge of data sources and technologies. These include desktop databases, legacy mainframe systems (that no one dares to touch), RDBMS, etc.

Note: One of my projects involved building a data warehouse for a call center of a major financial institution. The data integration requirements called for extracting data from six databases and consolidating it into a central data warehouse repository. Most of the project effort was spent on implementing the ETL data integration processes.

For example, the order tracking data could reside in a SQL Server database, the HR data could be stored in an Oracle database, while the manufacturing data could be located in a mainframe database. Integrating disparate and heterogeneous data sources presents a major challenge for many organizations. This is where SSIS (formerly known as DTS) could be useful. It is typically used for Extracting, Transforming, and Loading (ETL) processes for data integration.

SSIS has been completely revamped in SQL Server 2005. There are a few exciting features in SSIS that specifically target SSAS, including dealing with slowly changing dimensions, implementing low-latency OLAP, and processing partitions. One of the most common OLAP requirements that could benefit from SSIS is data warehousing. We will see how this could be done in chapter 6. There are other SQL Server add-on services that you may find more or less relevant to BI applications. These may include Replication Services to clone data, SQL Server Broker to raise event notifications, and Notification Services to build sophisticated notification application.

Presentation layer
The OLAP technology will be useless if users cannot browse the data. SSAS itself doesn’t provide an OLAP browser. The BI platform delegates this role to the Microsoft Office suite, SharePoint, or third-party products.

Microsoft Office
In section 1.1.1 of this chapter, we had a glimpse of how Microsoft Office Web Components (part of the Microsoft Office suite) can be used to build "smart" OLAP clients. Besides OWC, in the last chapter of this book I will show you how to integrate Microsoft Excel and the Office Business Scorecard Manager 2005 with SSAS 2005 to implement interactive reporting and performance management.

SharePoint
Use SharePoint to build enterprise-level portal sites. SSAS doesn’t include web parts to browse cube data. However, the Reporting Services team has built two SharePoint web parts, Report Explorer and Report Viewer, which can be used to integrate SSRS reports with a SharePoint portal. We will have a glimpse of how SharePoint Portal Services can be used to disseminate scorecards in chapter 19.

Other applications
Developers can utilize industry-standard connectivity protocols to integrate SSAS and SSRS easily with their applications. In chapter 17, I will show you how you can develop custom OLAP-based applications.

Visual Studio.NET

Finally, developers can use Visual Studio.NET to glue the components of the BI Platform together. Developers can use the excellent Visual Studio.NET IDE to custom solutions or work with BI projects. If you don’t have the full-blown version Visual Studio.NET (or you are not willing to purchase a license), the SQL Server 2005 setup program gives you an option to install a scaled-down version of Visual Studio.NET, called Business Intelligence Development Studio (BI Studio).

BI Studio supports Analysis Services, Reporting Services, and Integration Services projects. It gives you the power of the Visual Studio.NET Integrated Development Environment at no additional cost. Using BI studio, you can centralize the design and management of your BI projects. Now that we have reviewed the components of the Microsoft BI Platform, let’s find out what’s so innovative about SSAS 2005.


1.2 UNDERSTANDING OLAP

SSAS 2005 goes beyond just being an OLAP and Data Mining server. The bold mission of SSAS is to break out of the OLAP space by unifying the relational and dimensional reporting models. To understand the new changes in SSAS 2005, it may be useful to take a short trip back in time and discuss the challenges that enterprise business intelligence has been facing for the past two decades. Let’s use Figure 1.6 as a roadmap for our tour.

I will be quick to point out that as it currently stands, UDM shouldn’t be viewed as a replacement of the relational reporting model or as a competing technology to standard reporting. Considering the unified vision of UDM however, one would expect that eventually both models will be merged in an integrated platform that provides both standard and OLAP reporting services.

Figure 1.6 depicts the evolution of both reporting models, relational and OLAP, and the "great divide" between them that is a common reality for most organizations today. On the one side of the dividing line is relational reporting where reporting processes are performed against relational models. A relational model could represent both a relational OLTP schema (normalized in the 3rd Normal Form) and a layer built on top of it (e.g. to serve ad-hoc reporting needs).

Note: Strictly speaking, the dividing line between relational and OLAP reporting processes could be somewhat blurred. For example, standard reports can be generated from a dimensional data source (e.g. data mart) if this doesn’t lead to performance issues. For this reason, Figure 1.6 shows a data mart in the relational reporting section. Chapter 2 discusses various reporting scenarios in respect to the data source type in more details.

On the other side are OLAP reporting processes that interact with dimensional models. We will use the term dimensional model to represent a data source that is specifically structured and optimized to address reporting and data analytics requirements, such as data marts, data warehouses, and OLAP cubes.

1.2.1 Relational Model

In the early 1980s, reporting needs were addressed by sourcing data directly from RDBMS. This model is still popular and widely used today. For example, if your preferred tool of choice for standard reporting is Microsoft Reporting Services, you can source the report data directly from RDBMS. As popular as it is, the relational reporting model has well-known deficiencies.

Not user-oriented
The relational model is designed with the system, not the end user in mind. Consequently, to create a report, the end user has to understand the database relational schema and know SQL. Isn’t it strange that one of the job requirements for hiring a business analyst is to know SQL?

Performance challenges
Relational reporting could lead to performance issues. The report performance depends, to a large extent, on the data volume the report needs to process. What’s more, running reports directly against RDBMS may very well slow down the performance of the OLTP system itself as a result of locking large number of rows. The reason for this is that pending transactions may be blocked while waiting for the report query to finish executing and releasing the read locks placed on the qualifying rows.

Note: Once upon a time, I was called upon to troubleshoot mysterious query timeout errors that a clientserver application was experiencing at random. After some troubleshooting, I pinpointed the culprit to be a popular ad-hoc reporting tool. Not only was the tool placing read locks on the SQL Server tables, but it wasn’t releasing the locks even after the report was generated.

Lack of conformity
While the ad-hoc reporting models could abstract the underlying data schema to some degree, relational reporting is characterized by a lack of conformity. Business calculations and logic are not centralized in one place. For example, the database developer could define the net dollar amount calculation of a line item in the database itself, while the report designer could re-define it in the ad-hoc reporting model. This may be confusing to end users. Often, users are left to make their own interpretation of the data.

1.2.2 Dimensional Model

To solve some of the challenges of relational reporting, organizations started moving data from OLTP databases to data marts and warehouses. OLAP servers, such as Analysis Services, emerged in the late 1990s to provide the necessary CPU power to process the increased data volumes. A new dimensional model was born to make reporting more intuitive for less technically savvy users. SSAS embraces and extends the dimensional model, so let’s spend some time explaining its terminology.

Note: The terms data warehousing and OLAP are often used interchangeably but an important distinction exists. As its name suggests, a data warehouse can simply be described as a relational database that stores vast volumes of data. The term OLAP, on the other hand, represents the service layer introduced between the warehouse and users to make data available for fast retrieval and analysis. A data warehouse solution may not feature an OLAP server. Similarly, an OLAP server, such as SSAS 2005, may draw its data directly from the OLTP system, instead of from a data warehouse. That said, both data warehousing and OLAP use dimensional modeling as a core technique to organize data and make it more suitable for data analytics and reporting. There are some differences in their terminology though. For example, the data warehouse model refers to business metrics as facts, while OLAP uses the term measures.

Measures

Measures represent the numerical values (facts) that are used to measure business activity. Let’s have a look again at our interactive report shown in Figure 1.7. This report displays the company sales performance. The only measure used in this report is Sales Amount. Other measures may include tax amount, discount, profit, order count, etc.

Measures are physically stored in relational tables called fact tables. These tables are usually narrow (don’t have many columns) but can have thousands to millions rows of historical data. In addition, fact tables have foreign keys that link them to dimension tables.

Dimensions

As its name suggests, the main goal of the dimensional model is to allow users to slice and dice data using different perspectives called dimensions. Dimensions reflect the natural way end users would prefer to view and query data. For example, our report allows users to browse data by two common dimensions: Product and Time.

Dimension hierarchies
To facilitate drilling through data, dimensions may have hierarchies. For example, in our sample report, the time dimension hierarchy consists of the following dimension levels: Year, Seminar, and Quarter. The quarters can be further broken down into more granular levels, e.g. Month and Day. Similarly, the Product dimension hierarchy includes the Category, Subcategory, and Model Name levels. A dimension level summarizes ("aggregates") data at that level. For example, since the user hasn’t expanded the 2003 quarter, 2003 sales data on this report are aggregated at the Quarter level.

Dimension members
The actual dimension entities that belong to each level are called dimension members. Thus, the members of the Calendar Year level are 2003 and 2004, while the members of the Category level are Accessories, Bikes, Clothing, and Components. Another way to depict the dimension hierarchy is to use an organizational chart, as the one shown in Figure 1.8.

The top level of a dimension is depicted as All level, which is how SSAS terminology refers to it. It is a handy way to retrieve the total aggregated value for the whole dimension. The All level usually serves as the default dimension member. For example, if I am to remove the Time dimension from the report, the report will show the product sales for all time periods. The members of the lowest level of a dimension hierarchy are called leaf members. For example, if the Quarter level is the lowest level in our Time dimension, the quarters are the leaf members.

Dimension tables
Dimension data are stored in relational tables called dimension tables. Unlike fact tables, dimension tables are usually wide (have many columns) but don’t have many rows. The large number of columns is required to accommodate various dimension-related attributes that could be of interest to the end users. For example, a product dimension could have attributes such as product description, color, model, list price, listed date, discontinued date, etc.

The classic dimensional model defines two types of relational schemas (see Figure 1.9) that describe the relationship between the dimension and fact tables: star and snowflake. A star schema requires that a dimension hierarchy be contained within a single table. This requires the dimensional table to be denormalized. For example, going back to the sales report, if a star schema is chosen for the product dimension, the product data may look like this:
ProductID ProductCategory ProductSubCategory ModelName
1     Bikes   Montain    Bikes Montain-200
2     Bikes   Montain    Bikes Montain-300
If the dimension hierarchy is left normalized, then the schema is of a snowflake type. Over the past decade, dimensional model scholars have staged fierce battles in a quest to find out which schema type reigns supreme. The "classic" dimensional model promotes the star schema. Indeed, if the user queries the relational database (e.g. data warehouse) directly, a snowflake schema will require the user to link the dimension tables together. This assumes that the user has the necessary technical skills to do so, but wasn’t this the problem that the dimensional model was trying to avoid in the first place? Moreover, in comparison with snowflake schemas, stars schemas are easier to maintain and update.

On the other hand, snowflake schemas could support more flexible relationships, such as referenced and many-to-many relationships (discussed in chapter 5). In addition, they could save storage space with large dimensions. SSAS takes a nonchalant view of this schema debate and supports both schema types. A noticeable exception is the dimension writeback feature which is only supported with star dimensions. In real life you should carefully weigh out the pros and cons of both approaches and choose the schema that best meets your requirements. In general, I would recommend you gravitate toward star dimensions whenever possible and consider "upgrading" them to snowflake dimensions if needed. Dimension terminology has some additional classifications but, for time being, this is all you need to know about dimensions.

Cubes
The Sales by Product Category report (see again Figure 1.7) is an example of a two-dimensional report. SSAS is not limited to storing and displaying information in a two-dimensional format. As I’ve mentioned, one of the FASMI requirements is that every OLAP system must be multidimensional, so users can view data from as many dimensions as they want. In addition, the OLAP multidimensionality shouldn’t sacrifice performance.

Definition: The cube is the logical storage object in SSAS. It combines dimensions and measures to provide fast multidimensional access to the cube data.

How are cubes implemented?
To achieve these demanding requirements, SSAS employs the logical concept of a cube as a main storage object. The term logical in our definition means that, unlike relational objects (tables, views, etc.), the cube doesn’t have a physical realization. Thus, if you browse the SSAS storage folder (the default is C:\Program Files\MicrosoftSQL Server\MSSQL.2\OLAP\Data), you won’t find any multidimensional or other exotic structures. Instead, you will find a large number of files that store data in binary format. During runtime, SSAS performs its "magic" and exposes the content of these files to clients as a multidimensional cube. Figure 1.10 shows how you can visualize a cube that has three dimensions. Suppose that we connect OWC to this cube. We can now browse by three dimensions — Year, Product, and Territory (for the sake of simplicity, let’s assume that the three dimensions don’t have hierarchies). The intersection of the cube dimensions is called a cube cell. For example, the shaded cell in the figure is found at the intersection of the following dimension members — 2003 (Year dimension), Bikes (Product Category dimension), and North America (Territory dimension). Each cell of the cube holds a single value.

Tip: When visualizing the cube, it may be helpful to think of the cube measures as separate dimensions. In our case, we have two measures (sales amount and order count). Each intersection between the cube dimensions and measures will result in a single cell. Since it is difficult to show the cube in four perspectives, I split the cell in two — the upper cell in the figure shows the sales amount; the lower shows the order count.

We live in a three-dimensional space, so it is natural for us to visualize a three-dimensional cube. However, SSAS 2005 cubes can and usually have more than three dimensions. In fact, SSAS 2005 cubes support more dimensions than you will ever need (in the range of billions). Since many of the cube limitations in the past have been removed, an SSAS 2005 cube could really be viewed as a "super-cube". In fact, you are encouraged to build your entire OLAP layer on top of an enterprise-wide data warehouse with a single SSAS 2005 cube.

Cube storage
From a user perspective, a cube appears to store all fact rows and have aggregated values for each cell. For example, if the user hasn’t expanded the Calendar Year hierarchy, the cube will show the annual totals. Similarly, if the user drills down to the Quarter level, the cube will readily return the quarter aggregated values.

In reality, the cube may have neither the fact rows, nor the aggregated values stored in it. In this case, the cube will aggregate the values on the fly. As an UDM designer, you can tell SSAS where the fact data and aggregations will be kept — in the relational database (ROLAP storage), in the cube (MOLAP storage), or both (HOLAP storage). SSAS cubes usually perform best when all data (details and aggregations) are stored in the multidimensional store on the server, i.e. when MOLAP storage model is used. In this way, the cube can answer all queries without querying (and impacting the performance) of the underlying data source. On the downside, since the server keeps a copy of dimension and fact data (MOLAP storage option), it has to be updated (processed) when the dimensional structure or source data changes.

Dimensional model challenges
By now, you are probably convinced that the dimensional model has many advantages over its counterpart, the relational model. It is the cornerstone of the OLAP technology and it is very popular today. At the same time, the dimensional model is not without its shortcomings. Let’s mention a couple of them.

Data fidelity lost
While dimensions reflect the natural way end-users prefer to analyze data, important dimension characteristics could be lost when transforming OLTP data to fit into the "classic" dimensional model. For example, the Product dimension we’ve discussed has a hierarchy that reflects the natural way end users would prefer to browse the product sales data — by category, subcategory and product name.

However, the original Product relational schema may have included columns for product color, size, model, etc. With the classic dimensional model these attributes could be simply "lost" when the Product dimension hierarchy is created or require additional dimensions to be implemented. The end result is that the user may not be able browse or filter data using these attributes.

Data latency
The second well-known issue surrounding the OLAP model is data latency, since the same data extracts would exist in all the OLAP repositories — data marts, data warehouse, and OLAP cubes. This is further aggravated by latency issues. It is not uncommon for ETL processes to take hours, if not days to complete. By the time data arrives in the OLAP cubes and it is available for reporting, it may be significantly outdated.


1.3 UNDERSTANDING THE UNIFIED DIMENSIONAL MODEL

In summary, most of today’s organizations have accumulated a mixture of two distinct storage models, relational and dimensional, and each of them has its own pros and cons. What is exciting about SSAS 2005 is that is starts a novel journey to unite both relational and dimensional models by combining the best aspects from both. This model is called Unified Dimensional Model or UDM for short, as shown in Figure 1.11. UDM is unified because its goal is to unite the relational and dimensional models. It is dimensional because it has its roots in the dimensional model.

Definition: The SSAS Unified Dimensional Model (UDM) converges the relational and dimensional models. The physical manifestation of UDM is the SSAS 2005 cube.




Page: 1, 2

next page



ADS BY GOOGLE SPONSORED LINKS FEATURED LINKS

Critical Challenges of ESI & Email Retention
Are you storing too much electronic information? Get expert legal advice and better understanding of what you are required to do as an IT professional.

Become a fan of Windows IT Pro on Facebook!
Join us on Facebook and be a fan of Windows IT Pro!

Sustainable Compliance: Are You Having a Resource Crisis?
Read this white paper to examine trends in compliance and security management and review approaches to reducing the cost and operational burden of compliance.

Rev Up Your IT Know-How with Our Recharged Magazine!
The improved Windows IT Pro provides trusted IT content with an enhanced new look and functionality! Get comprehensive coverage of industry topics, expert advice, and real-world solutions—PLUS access to over 10,000 articles online. Order today!

Get It All with Windows IT Pro VIP
Stock your IT toolbox with every solution ever printed in Windows IT Pro and SQL Server Magazine plus bonus Web-exclusive content on hot topics. Subscribe to receive the VIP CD and a subscription to your choice of Windows IT Pro or SQL Server Magazine!



Order Your Fundamentals CD Today!
Gain an introduction to Exchange, learn server security requirements, and understand how unified communications can play a role in your messaging strategies with this free Exchange CD.
Windows IT Pro Home Register About Us Affiliates / Licensing Media Kit Contact Us/Customer Service  
SQL Connected Home IT Library SuperSite FAQ Wininfo News
Europe Edition Office & SharePoint Pro Windows Dev Pro Windows Excavator 
 
 Windows IT Pro is a Division of Penton Media Inc.
 Copyright © 2008 Penton Media, Inc., All rights reserved. Terms and Use | Privacy Statement | Reprints and Licensing