Data Quality Scorecard by client - and beyond...

In a recent post on a Computer Weekly blog, David Lacey states "We need to encourage greater attention and priority to the issue of poor data quality. If the data management community cannot raise the subject higher up the management agenda, then perhaps it's time for security managers to add their weight to the issue." Mr. Lacey also provides us with two very interesting quotes:
- "the estimated cost of poor quality data is around 20-40% of sales turnover." (Joseph Juran, famous quality expert)
- "8% or more of non-verified bank account details" (George Barron of BankVal)

Here at my shop, the team has adopted the concept of a "Data Quality Score" which can be assigned to each client. Our version of the score itself is derived from a somewhat complicated algorithm, but here is an example of a simple score:

"People" at your "Client" purchase "Orders" that contain "Line Items" of "Widgets".
If this is your business model, I'd anticipate you have database tables which represent these "entities", tables such as:

- CONTACT (listed as "People" above)
- CUSTOMER (listed as "Client" above)
- SALES ORDER (listed as "Orders" above)
- SALES ORDER LINE ITEM (listed as "Line Items" above)
- PRODUCT (listed as "Widgets" above)

Your Data Quality Score for each client could be the average of the following:
1) the percentage of records in your CONTACT table which you have verified as "complete" and "fit for purpose"
2) the percentage of records in your SALES ORDER table which you have verified as "complete" and "fit for purpose"
3) the percentage of records in your SALES ORDER LINE ITEM table which you have verified as "complete" and "fit for purpose"
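The simple score above can be sketched in a few lines of code. Everything below is hypothetical: the table names mirror the entity list, and the "verified" counts stand in for whatever completeness and fit-for-purpose checks your shop actually runs.

```python
def table_score(verified_rows: int, total_rows: int) -> float:
    """Percentage of records verified as "complete" and "fit for purpose"."""
    return 100.0 * verified_rows / total_rows if total_rows else 0.0

def client_score(counts: dict[str, tuple[int, int]]) -> float:
    """Average the per-table percentages into one Data Quality Score."""
    scores = [table_score(verified, total) for verified, total in counts.values()]
    return sum(scores) / len(scores)

# Hypothetical (verified, total) record counts for one client:
acme = {
    "CONTACT": (90, 100),               # 90% verified
    "SALES_ORDER": (40, 50),            # 80% verified
    "SALES_ORDER_LINE_ITEM": (140, 200) # 70% verified
}
print(round(client_score(acme), 1))  # averages to 80.0
```

A weighted average (e.g. weighting SALES ORDER more heavily than CONTACT) drops in just as easily, which is roughly where the "somewhat complicated algorithm" territory begins.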

I'd anticipate each organization will need to spend some quality time determining an algorithm which is right for them. Some folks might steer clear of numeric values and instead give a rating like the letter grades schools here in the United States give students (e.g. "A", "B", "C", "D", "F"). Other shops might adopt a rating like the ones the financial services industry uses for bonds (e.g. "AAA"). Others might adopt a score modeled on the Fair Isaac (FICO) score the credit industry uses to rate individuals applying for credit. One lesson we've learned: beware of "calling the baby ugly". Our first shot at an algorithm made certain clients' data look very bad, and we had to go back to the drawing board once or twice.
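As one illustration of the non-numeric option, here is a minimal sketch that maps a 0-100 score onto US-style letter grades. The cutoffs are my own assumption for illustration, not part of our algorithm:

```python
def letter_grade(score: float) -> str:
    """Map a 0-100 Data Quality Score to a US-style letter grade.

    Cutoffs are illustrative placeholders; pick thresholds that won't
    make the baby look uglier than it needs to on day one.
    """
    for cutoff, grade in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if score >= cutoff:
            return grade
    return "F"

print(letter_grade(80.0))  # prints "B"
```

The same shape works for a bond-style scale ("AAA", "AA", ...) by swapping the grade list.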

Here is a look at a simple scorecard by client:

Let's take it one step further!
Since we keep track of any and all changes our team has made to our data, we can take our Data Quality Scorecard one step further and illustrate what a client's data quality score would have been had we not taken action on their data. This image shows the client, their score had we "not done anything", their current score, and finally the "difference", or "value", added by our program.
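That comparison can be sketched the same way: for each client, hold the current score up against the baseline score recomputed as if none of our changes had been applied. The client names and scores below are hypothetical placeholders:

```python
# Hypothetical per-client scores: "baseline" is the score had we
# "not done anything"; "current" is the score after our cleanup work.
clients = {
    "Client A": {"baseline": 62.0, "current": 88.0},
    "Client B": {"baseline": 75.0, "current": 91.0},
}

for name, scores in clients.items():
    value_added = scores["current"] - scores["baseline"]
    print(f"{name}: baseline={scores['baseline']:.1f} "
          f"current={scores['current']:.1f} value added={value_added:.1f}")
```

In practice the baseline comes from replaying the change log in reverse, not from a stored number, but the scorecard math is just this subtraction.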

In summary, I'm confident our organization now has a framework in place which allows us to communicate to our senior managers, clients, and partners the "quality" of our (their) data. The framework also allows us to illustrate the added value our Data Governance (Management) group has brought to the table.

You ask, "fit for purpose"? You bet, and now we have the numbers to prove it.

Until next time...Rich