Friday, May 9, 2008

Data Profiling - The First Step

Profiling the existing data is the first step in any data-driven initiative. Take a look at the data quality (DQ) management process available at Informatica's website.



When to profile the data?

Dr. Claudia Imhoff, President & Founder, Intelligent Solutions, and Ed Lindsey, National Product Specialist, Informatica, answer: profile early and profile often.

Some of the key points she makes are:

"...Data Stewards should come from the business. Generally they are the few people who demonstrate a true interest in the data and information you are generating.."

Forgetting the business that IT serves, and not involving it, is a sure-fire way for an IT project to fail.

"The DQ process is different from customer to customer. The best place to fix a data quality problem is at the source. However, many customers will not allow modifications of the data at the source for fear of breaking the original system. Also, a lot of the data is not under the control of the department using the feeds because it comes from outside the company or business unit. Most of the time the data is corrected as it enters the business unit as part of an operational data store, data warehouse or enterprise application. Over time, as DQ issues are corrected downstream, the data customer gives their feedback to the provider and hopefully they initiate their own DQ process so that over time quality ultimately finds its way back to the source system."

This is a much more realistic approach for the real world. Companies usually already have some form of data that they use for reporting or analysis, and the data quality problems are right there. That is why profiling the data and identifying its problems is usually the first step in a DQ initiative.
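To make "profiling" a little more concrete, here is a minimal sketch in Python of what a first profiling pass might look like. It is not taken from the article or from any Informatica tool; the `profile` function, the sample rows and the column names are all hypothetical, and it assumes the data is already loaded as a list of dictionaries with string values.

```python
from collections import Counter


def profile(rows):
    """Summarise each column: row count, missing values, distinct values, most common value.

    Hypothetical helper for illustration; treats None and "" as missing.
    """
    # Collect every column name that appears in any row.
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v not in (None, "")]
        counts = Counter(present)
        report[col] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(counts),
            "most_common": counts.most_common(1)[0][0] if counts else None,
        }
    return report


# Made-up sample data: one blank country and one repeated customer_id.
sample = [
    {"customer_id": "1", "country": "US"},
    {"customer_id": "2", "country": ""},
    {"customer_id": "2", "country": "US"},
]

for column, stats in profile(sample).items():
    print(column, stats)
```

Even a summary this small surfaces the kind of problems worth raising with a data steward: a missing country and a customer_id that is not unique.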

Read the complete article here. A download link is also available there if you want to view the webinar.
