Tuesday, July 16, 2013

Microsoft SQL Server 2012 Data Quality Services

Microsoft SQL Server 2012 Data Quality Services
 
Purpose: The purpose of this document is to provide a walkthrough of how to leverage Microsoft SQL Server 2012 Data Quality Services in conjunction with Master Data Services for Microsoft Dynamics Master Data Management scenario.
 
Challenge: The pain that organizations are experiencing around data accuracy, consistent reporting, regulatory compliance, etc. generated a lot of interest in Master Data Management. Master Data Management becomes extremely important when you have multiple different systems and you have to manage the data in different systems. In fact what we would also love to do is to control the quality of the data and be able to perform data cleanse, data matching and other data quality related activities.
 
Solution: The data-quality solution provided by Data Quality Services (DQS) enables a data steward or IT professional to maintain the quality of their data and ensure that the data is suited for its business usage. DQS is a knowledge-driven solution that provides both computer-assisted and interactive ways to manage the integrity and quality of your data sources. DQS enables you to discover, build, and manage knowledge about your data. You can then use that knowledge to perform data cleansing, matching, and profiling. You can also leverage the cloud-based services of reference data providers in a DQS data-quality project.  
 
Scenario
 
By now we have already implemented Master Data Management scenario using Microsoft SQL Server 2012 Master Data Services. The next step is to integrate Master Data Services with Data Quality Services to deliver end-to-end Master Data Management solution. For the sake of simplicity in this document I’ll consider usual Master Data Management entities such as Customers and Products.
 
Walkthrough
 
In this walkthrough I’m using Microsoft Dynamics AX 2012 R2 Demo VM (AX2012R2A) which has Microsoft SQL Server 2012 installed and Microsoft Dynamics CRM Online (free trial)
 
After you install Microsoft SQL Server 2012, Master Data Services and Data Quality Services (Server and Client components) you have to run Data Quality Server Installer to perform necessary post installation tasks by using dedicated installer which will create appropriate DQS databases, set up security, etc. In case you have multiple instances of Microsoft SQL Server installed please locate appropriate DQSInstaller.exe file in corresponding Microsoft SQL Server Binn folder
 
 
Please note that the default Microsoft SQL Server instance may be different from the desired (MSSQLSERVER) which is the case for Microsoft Dynamics AX 2012 R2 Demo VM (AX2012R2A), that’s why it is critical to run appropriate DQSInstaller.exe during DQS installation  
 
Data Quality Server Installer 
 
 
Once DQS installation is complete you can launch DQS Client
 
SQL Server Data Quality Services – Client
 
 
The next step will be to create Knowledge base for future data quality related activities. In particular I’ll be doing data matching trying to detect duplicates
 
Knowledge base management
 
 
Please note that number of domains are available for Knowledge base “DQS Data”. In particular I’ll be using “US – Last name” domain to detect duplicates based on last name of customers
 
Actual data matching we’ll perform in Microsoft Excel
 
 
That’s why first off we have to establish connection to MDS Server
 
Excel Add-in – Add new connection
 
 
Manage connections
 
 
Once connection has been established
 
 
Now we can use Master Data Explorer to open up appropriate Model (“Dynamics”)
 
 
In this scenario I’ll work with customers and at the moment I have one customer record in Customers table in MDS database
 
 
In Microsoft Excel I can introduce new (External) data as shown below
 
 
But before I publish this data to MDS I can perform data matching and find out if I have any duplicates or similar data based on criteria defined. For these purposes I’ll use Combine data feature which allows me to combine MDS data and External data in order to perform data matching activities
 
Combine data
 
 
I’ll select what data I want to combine and map the fields appropriately. Once this is done the system shows “who is who” in SOURCE field
 
 
When I have the data for analysis I can use Match data feature to perform data matching
 
Match data
 
 
Please note that for Name field I specify Domain = US – Last name, so the system will apply matching rules for the analysis based on Last name. The data will be clustered in accordance to similarity as shown below in CLUSTER_ID field  
 
 
Finally I’ll Show Details to see the results of extended analysis
 
 
Please note that for External data the system assigned a score in SCORE field which shows degree of similarity within the cluster. It appears that “Alex Anikiiev” is 90% similar to “Alex Anikiev” which allows me to define and eliminate duplicate data before publishing this to MDS
 
You can also perform periodic data cleanse, data matching and other data quality related activities using MDS and DQS to ensure the quality of your master data
 
Summary: This document describes how to leverage integration of Master Data Services and Data Quality Services when implementing Master Data Management scenario for Microsoft Dynamics AX 2012 and Microsoft Dynamics CRM. In particular I explained how to use Data Quality services in order to perform data matching based on certain criteria. Microsoft SQL Server 2012 provides a robust platform for Master Data Management which not only includes Master Data Services, but also Data Quality Services, SQL Server Integration Services, etc. Master Data Services provides Web front-end as well as Excel Add-in which allows you to choose your favorite tool to perform Master Data Management activities.
 
Tags: SQL Server 2012, Dynamics AX 2012, Dynamics CRM Online, Master Data Management, Master Data Services, Data Quality Services, SQL Server Integration services, Customers, Products.
 
Note: This document is intended for information purposes only, presented as it is with no warranties from the author. This document may be updated with more content to better outline the issues and describe the solutions.
 
Author: Alex Anikiev, PhD, MCP

5 comments: