Data Quality: Advanced Techniques

Training Overview

Learn advanced techniques for using Informatica Developer to profile, cleanse, standardize, de-duplicate & consolidate data in an enterprise. The training focuses on creating & applying custom-built Classifier & Probabilistic Models, using advanced Parsing & Matching methods, refining Human Tasks & workflows, automatically Associating & Consolidating matched records, applying Parameters in mappings, & more. This training applies to all version 10 releases.

Prerequisites

Data Quality: Data Quality Management for Developers (Instructor Led)

Audience Profile

Developer

Learning Objectives

After successfully completing this training, professionals should be able to:

  • Perform Join Profiling
  • Create & apply Classification Models
  • Parse data using advanced techniques
  • Create & apply Probabilistic Models
  • Apply sophisticated Grouping & Matching techniques
  • Automatically Associate & Consolidate matched records
  • Refine Exception & Duplicate Record Workflows used to populate Analyst inboxes
  • Design, Implement & Test processes to manage updated exception/duplicate records
  • Apply DQ Parameters
  • Examine Performance considerations
  • Review CRM & Dashboard & Reporting Templates
  • Optionally, time allowing:
    • Leverage Web Services to apply DQ mappings in Excel
    • Perform Identity Matching
    • Use the Universal ID store to match against master data

Content Outline

Training Introduction, Agenda & Overview

  • A quick review of Informatica Developer
  • Use Enterprise Discovery to create Join Profiles

Lab: 

· Perform Join Profiling using an Enterprise Discovery Profile

  • Review Standardization Techniques
  • Build, refine & apply a Classifier Model

Labs:

· Create, refine & apply Classifier Model

  • What is Probabilistic Labeling & Parsing?
  • Build, refine & apply a Probabilistic Model
  • Additional Parsing Techniques:
    • Build regular expressions

Labs: 

· Build, refine & apply a Probabilistic Model

· Review an example of Advanced Parsing 

· Generate & test Regular Expressions
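
To make the regular-expression lab above more concrete, here is a minimal sketch in Python of building & testing a pattern. The phone-number pattern, the sample strings & the `parse_phone` helper are illustrative assumptions; they are not taken from the course material or from Informatica's own tooling.

```python
import re

# Hypothetical pattern: a phone number written as "(555) 123-4567",
# "555-123-4567" or "555.123.4567".
PHONE_RE = re.compile(
    r"""
    \(?(?P<area>\d{3})\)?      # area code, optionally in parentheses
    [\s.-]?                    # optional separator
    (?P<exchange>\d{3})        # exchange
    [\s.-]?                    # optional separator
    (?P<line>\d{4})            # line number
    """,
    re.VERBOSE,
)


def parse_phone(text):
    """Return (area, exchange, line) for the first phone number in text, or None."""
    match = PHONE_RE.search(text)
    if match is None:
        return None
    return match.group("area"), match.group("exchange"), match.group("line")


if __name__ == "__main__":
    # Test the pattern against a few sample inputs.
    for sample in ["Call (555) 123-4567 today", "555.123.4567", "no number here"]:
        print(sample, "->", parse_phone(sample))
```

Named groups keep the parsed parts separate, which is the same idea as splitting one input field into several output fields during parsing.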

  • Additional Grouping Techniques
    • Using Composite keys
  • Advanced Matching Techniques
    • Matched pairs outputs
    • Working with Match Mapplets
    • Manipulating the matched data using the Driver ID
    • Perform Dual Matching

Lab:

· Create a Match mapping using Matched Pairs 

· Create & update a Match Mapplet

· Manipulating Matched Data using the Driver ID

· Perform Dual Matching using a Master Dataset
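
The grouping & matched-pairs topics above can be sketched in plain Python. This is only an illustration under stated assumptions: the sample records, the composite key (postcode plus first letter of the name), the `0.85` threshold & the use of `difflib.SequenceMatcher` as the similarity measure are all hypothetical choices, not the algorithms or outputs of the Informatica Match transformation.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample data.
records = [
    {"id": 1, "name": "Jon Smith", "postcode": "10001"},
    {"id": 2, "name": "John Smith", "postcode": "10001"},
    {"id": 3, "name": "Jane Doe", "postcode": "10001"},
    {"id": 4, "name": "John Smith", "postcode": "94105"},
]


def group_key(rec):
    # Composite key (an assumption): postcode plus first letter of the name.
    return (rec["postcode"], rec["name"][0].upper())


def matched_pairs(recs, threshold=0.85):
    """Compare records only within a group; return (driver_id, matched_id, score)."""
    groups = {}
    for rec in recs:
        groups.setdefault(group_key(rec), []).append(rec)

    pairs = []
    for members in groups.values():
        # The first record of each pair plays a role loosely similar to a driver record.
        for driver, other in combinations(members, 2):
            score = SequenceMatcher(
                None, driver["name"].lower(), other["name"].lower()
            ).ratio()
            if score >= threshold:
                pairs.append((driver["id"], other["id"], round(score, 2)))
    return pairs


if __name__ == "__main__":
    for driver_id, matched_id, score in matched_pairs(records):
        print(driver_id, matched_id, score)
```

Because comparisons happen only inside a group, record 4 is never compared with records 1 to 3 even though its name is identical to record 2. That trade-off between fewer comparisons & possibly missed matches is why the choice of group key matters.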

  • Overview of the Consolidation Process
  • Use the Consolidation Transformation to consolidate matched data.
  • Use the Association Transformation to link matched data ahead of Consolidation

Lab: 

· Automatically Consolidate matched data

· Perform multi-criteria Matching, Association & Consolidation.

  • Additional Task & Workflow functionality:
    • Permission settings for data access & editing
    • Notifications, including Human Task Notification Variables
    • Setting Timeouts
    • Reviewing Tasks
    • Configuring Workflow Recovery

Lab:

· Update the Exception Workflow

· Review the Consolidation Workflow

  • How to process updated exception records
  • How to process consolidated records
  • Fields of Interest

Lab:

· Create a mapping to process updated exception data

· Create a mapping to process consolidated data

· Update & deploy Exception & Cluster Workflows

  • Update exception & duplicate records in Informatica Analyst

Lab:

· Update records & push the Tasks through the Exception Process

· Update records & push the Tasks through the Consolidation Process

  • Explain the difference between System & User defined parameters
  • Use Parameters in Data Quality mappings.

Lab:

· Create a parameterized mapping

· Build & deploy an Application

· Create & execute parameter files

  • General Installation & Memory Information
  • DQ Component Configuration
  • Service Settings
  • DQ Transformations
    • Configuration Settings

Learn how Data Quality has been implemented in different projects

  • Review the CRM & Dashboard & Reporting Templates that are available

Lab: 

· Review the CRM Template

FAQs

Q: What causes poor data quality?

A: There are several potential reasons for poor data quality, for example:

  • Large collection volumes: when a great deal of data must be collected, less time is available for each item & shortcuts are taken to finish reporting.
  • Manual steps: moving figures, summing up, etc.
  • Vague definitions: the fields to be filled out are interpreted wrongly.

Q: What are the traits of good data quality?

A: Accuracy, reliability, timeliness, relevance & completeness are the five traits of good data quality.

Q: How can data quality be improved?

A: The following steps can help improve the quality of data:

  • Determine what you want from your data & how to evaluate quality. Data quality means different things in different organizations.
  • Assess where your efforts stand today.
  • Hire the right people & centralize ownership.
  • Implement proactive processes.
  • Take advantage of technology.

Q: What is the difference between data quality & data integrity?

A: Data quality pertains to the completeness, accuracy, timeliness & consistency of the information managed in a company’s data warehouse. Data integrity, on the other hand, refers to the validity of stored data, although the term can also cover its accuracy & consistency.

Q: How does Radiant select its trainers & deliver its training?

A: Radiant has stringent selection criteria for the Technology Trainers & Consultants who deliver its training programs. Our trainers & consultants undergo rigorous technical & behavioral interviews & assessment processes before they are onboarded.

Our technology experts, trainers & consultants have in-depth knowledge of their technical subject & are certified by the OEM.

Our training programs are practice-oriented, with 70% – 80% hands-on work with technology tools. They focus on one-on-one interaction with each participant, up-to-date curriculum content, & real-time projects & case studies.

Our faculty teach each topic from the fundamentals in an accessible way, & you are free to ask them questions at any time.

Our trainers have the patience & the depth & breadth of knowledge to explain difficult concepts simply.

To ensure quality learning, we provide support sessions even after the training program.

Q: What training schedules are available?

A: Radiant Techlearning offers training programs on weekdays, on weekends, or as a combination of both. You can choose the schedule that best suits your needs.

Q: What if I miss a class?

A: We recommend attending the live session so that you can practice, clarify doubts instantly & get more value from your investment. If some contingency forces you to skip a class, Radiant Techlearning will provide the recorded session for that day. These recorded sessions are meant only for personal consumption & NOT for distribution or any commercial use.

Q: How do participants access the practice labs?

A: Radiant Techlearning has a data center that hosts the Virtual Training environment for participants' hands-on practice.

Participants can access these labs over the cloud through a remote desktop connection.

Radiant virtual labs give you the flexibility to learn from anywhere in the world & in any time zone.



  • Enroll
    • Learning Format: ILT
      • Duration: 80 Hours
      • Training Level: Beginner
      • Jan 29th: 8:00 - 10:00 AM (Weekend Batch)
      • Price: INR 25000
    • Learning Format: VILT
      • Duration: 50 Hours
      • Training Level: Beginner
      • Validity Period: 3 Months
      • Price: INR 6000
    • Learning Format: Blended Learning (Highly Interactive Self-Paced Courses + Practice Lab + VILT + Career Assistance)
      • Duration: 160 Hours (50 Hours Self-paced courses + 80 Hours of Boot Camp + 20 Hours of Interview Assistance)
      • Training Level: Beginner
      • Validity Period: 6 Months
      • Jan 29th: 8:00 - 10:00 AM (Weekend Batch)
      • Price: INR 6000
