How often do you see project timelines suffer because of data quality issues discovered late in the game?
On my most recent ETL engagement, I noticed a multiplicity issue with the value of one of the dimension attributes early in the effort, prior to development. Quite a bit of time was spent addressing the problem, and I did not want to commence any coding until the issue was resolved because of the potential impact on the target model and downstream processes.
Soon enough, the week for code delivery came and the PM grew anxious about the perceived lack of progress, to which I explained that ETL is 50% analysis, 30% design and 20% development. The fact of the matter is that in a data integration effort, analysis and design normally dovetail. Design ideas pop up in the course of analyzing the source data. More often than not, what is required at the end of analysis is only a formalization of the design, since its components are already present by then.
I ended up completing the design the same day the PM voiced his "bewilderment" and finished the majority of the coding two days later, all within the planned delivery week. Granted, it was a small project, but there were most definitely challenges in sourcing the data.
Whereas the evidence was primarily anecdotal on prior occasions, this experience ingrained in me the intuition that the bulk of data integration work is analysis and design. Properly conducted, analysis and design can greatly facilitate the entire coding process, i.e. development itself is not as burdensome a phase as most professionals are led to believe, especially when employing highly productive data integration tools.
Do as complete a data analysis upfront as possible, resist the urge to start coding early, and you will have increased your odds of on-time delivery.
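As an illustration of the kind of upfront check that can surface a multiplicity issue before any ETL code is written, here is a minimal Python sketch. It is not the analysis used on the engagement described above; the function name `find_multivalued_keys` and the sample columns `product_id` and `category` are made up for the example. It simply reports every key that carries more than one distinct value for a given dimension attribute.

```python
from collections import defaultdict


def find_multivalued_keys(rows, key, attr):
    """Return {key value: sorted distinct attr values} for keys with more than one value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[key]].add(row[attr])
    # Keep only the keys whose attribute is not single-valued.
    return {k: sorted(values) for k, values in seen.items() if len(values) > 1}


# Hypothetical source extract: product 1 appears under two categories.
source_rows = [
    {"product_id": 1, "category": "Tools"},
    {"product_id": 1, "category": "Hardware"},
    {"product_id": 2, "category": "Garden"},
    {"product_id": 2, "category": "Garden"},
]

conflicts = find_multivalued_keys(source_rows, "product_id", "category")
print(conflicts)
```

Any key reported here would need a decision (pick one value, model the attribute as multi-valued, or fix the source) before the target model is finalized.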
Thursday, October 30, 2008
Saturday, October 18, 2008
Fractal Kaizen Methodology - Part 3
So now that we know what Hybrid Kaizen Methodology (HKM) stands for and have seen what fractals are all about, it is time to link the two. When you apply the principles of the HKM at every scale, from the smallest step to the entire process, that is the Fractal Kaizen Methodology.
Everything starts with the methodology. We decide the overall strategy, the general business directive and the expected end results, just as we would with a BPM. From then on, we let the Kaizen Methodology take over: we implement improvement cycles based on the Kaizen principles.
In the end, this methodology is applied to every step in the process independently.
The result is a Hybrid Kaizen appearing like a fractal in every process, making the overall process more efficient and energising every small detail of it.
The Fractal Kaizen Methodology provides a significant advantage to a Data Quality initiative. By nature, DQ projects tend to require improvements across the board, as analysis and investigation open new avenues for further improvement in the business processes and for new technical initiatives down the road.
Wednesday, October 8, 2008
Fractal - Kaizen Methodology - Part 2
Fractals! What are fractals? For those of you who are unfamiliar with the word, have a look at the picture below.

A fractal is a rough or fragmented geometric shape that can be split into parts, each of which is (at least approximately) a reduced-size copy of the whole.
Still not clear? Consider the Koch snowflake, one of the most famous and simplest fractals around. (OK, almost simple.)

You start with an equilateral triangle. Then you replace the middle third of every line segment with a pair of lines so as to form a bump. Repeating this process endlessly results in a fractal.
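The construction just described can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `koch_step` and `perimeter` are made up here, the triangle has side length 1, and the process is stopped after three rounds rather than repeated endlessly.

```python
import math


def koch_step(points):
    """Replace the middle third of every edge of a closed polygon with a bump."""
    s = math.sqrt(3) / 2
    new_points = []
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        # One third of the edge, as a vector.
        dx, dy = (x2 - x1) / 3, (y2 - y1) / 3
        a = (x1 + dx, y1 + dy)          # end of the first third
        b = (x1 + 2 * dx, y1 + 2 * dy)  # start of the last third
        # Tip of the bump: the middle third rotated by -60 degrees about a,
        # which points outward when the polygon is listed counter-clockwise.
        peak = (a[0] + 0.5 * dx + s * dy, a[1] - s * dx + 0.5 * dy)
        new_points.extend([(x1, y1), a, peak, b])
    return new_points


def perimeter(points):
    """Total edge length of a closed polygon."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


# Equilateral triangle with side 1, vertices in counter-clockwise order.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]

snowflake = triangle
for _ in range(3):
    snowflake = koch_step(snowflake)

print(len(snowflake), round(perimeter(snowflake), 4))
```

Each round turns one edge into four edges that are a third as long, so after three rounds the outline has 3 × 4³ = 192 edges and its perimeter has grown from 3 to 3 × (4/3)³. Every edge is treated exactly like the original ones, which is the self-similarity that makes the shape a fractal.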

So how is this related to a project development methodology? And what does it have to do with Data Quality? Let's wait and see!

Thursday, October 2, 2008
ETL- Is it an art or a science? And which should it be?
Science when we start building a rigid framework for it, in which it has to be done one way and one way only. The funny thing is that exceptions to that rule invariably occur, since a company's data is irregular by nature, and soon enough the purported advantages, such as consistency of approach, no longer compel.
Art when executed with a certain flair; otherwise it's simply the developer's whim disguised as art. As for the flair itself, let's just call it a set of one or more guiding principles. If you dig industry parlance, the phrase "best practice" may come to mind, although best practices are collective in nature whereas a guiding principle tends to be specific to the individual. In my case, I go by what I have termed the principle of non-repetitive reads (NR²), i.e. frequently consumed data is to be read only once. Strict adherence to it automatically results in a “lean” ETL process with a compact form factor.
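To make the read-once idea concrete, here is a minimal Python sketch of one possible interpretation, not a prescribed implementation. The class `CountingSource`, the function `load_targets` and the sample columns are all made up for the example. A single pass over the source feeds two targets at once, and a scan counter confirms the data was read only once.

```python
from collections import defaultdict


class CountingSource:
    """Hypothetical source that counts how many times it is scanned."""

    def __init__(self, rows):
        self._rows = rows
        self.scans = 0

    def read(self):
        # The counter is bumped when iteration over the source begins.
        self.scans += 1
        yield from self._rows


def load_targets(source):
    """Feed two targets from a single pass over the source."""
    sales_by_region = defaultdict(float)
    orders_by_customer = defaultdict(int)
    for row in source.read():
        sales_by_region[row["region"]] += row["amount"]
        orders_by_customer[row["customer"]] += 1
    return dict(sales_by_region), dict(orders_by_customer)


source = CountingSource([
    {"customer": "A", "region": "East", "amount": 10.0},
    {"customer": "B", "region": "West", "amount": 5.0},
    {"customer": "A", "region": "East", "amount": 2.5},
])

sales, orders = load_targets(source)
print(sales, orders, source.scans)
```

The alternative would be one job per target, each scanning the same source again. Folding both into one pass is what keeps the process compact.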
Apart from building a "mean" ETL engine, why else should organizations pursue ETL as an art form? For the designer, ETL is suddenly "fun" as creativity permeates the picture, resulting in additional avenues for downstream implementation, i.e. the developer can now interpret the design in more ways than before, much like a connoisseur fathoming an art piece. Intangible benefits, to be sure, but nonetheless powerful employee motivators and retention factors.
The meaning of life is to attain happiness, and the route to it is by producing competitive values for others. A value is most desired when it is unique; just look at the prices for one-of-a-kind pieces of art. Put the two together and you have staff happily producing non-bloated custom solutions without their managers' coercion and with no thoughts of greener pastures. What could be better than that?