I thought I would share an experience I had while implementing a data quality firewall, one which I think illustrates the difficulties faced when trying to gain traction for any sort of data quality initiative.
I was responsible for developing an enterprise Single Customer View, or Entity Resolution, solution. This was a substantial undertaking at the core of a major data management upgrade, and it would provide the “glue” enabling the organisation’s multitude of data sources to be integrated and combined quickly and flexibly with third-party supplier data. At the heart of the solution was a central repository that recorded the contact details of every person processed through the software. It was big!
As the solution grew and uptake increased, it became clear that poor quality data was leaking into the repository and having an adverse effect on the match decisions made. The source of this data was quickly identified as the third-party data processed by the client-facing data processing teams. These teams processed the data on behalf of paying customers to deliver marketing campaigns, data cleansing activities, marketing databases and the like.
I held a meeting with the team managers during which I explained what was happening, what the causes were and what impact the problem was having on their clients. I proposed the development of an automated data quality firewall which would test the accuracy of the data being processed and raise an alarm should a suspected issue be discovered. Crucially, the workstream would be stopped until the issue was resolved. Everyone agreed that this was a good idea. Data issues could be fed back to the data suppliers. Data quality would improve. Processing errors would reduce. Clients would benefit. Everyone’s a winner!
I issued a word of warning: assessing the data quality of a data source is not an exact science, so there was a risk of false positives. But that was alright – better safe than sorry.
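To make the idea concrete, here is a minimal sketch of how such a firewall might be wired up. Everything in it is hypothetical – the original system is not described at this level of detail – including the check names, the record format, and the 5% tolerance threshold, which is exactly the kind of tuning knob that determines the false-positive rate mentioned above.

```python
from dataclasses import dataclass

@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str

class FirewallHalt(Exception):
    """Raised to stop the workstream when a suspected issue is found."""

def run_firewall(records, checks):
    """Run every check against the batch; halt on the first failure."""
    results = []
    for check in checks:
        result = check(records)
        results.append(result)
        if not result.passed:
            # Crucially, the workstream stops here until a person
            # investigates and resolves (or waives) the issue.
            raise FirewallHalt(f"{result.name}: {result.detail}")
    return results

# Hypothetical example check: flag a batch whose postcode field is blank
# more often than a tolerance threshold. The 5% default is an assumption;
# a real firewall would calibrate it to keep false positives tolerable.
def postcode_completeness(records, tolerance=0.05):
    blanks = sum(1 for r in records if not r.get("postcode", "").strip())
    rate = blanks / len(records) if records else 0.0
    return CheckResult(
        name="postcode_completeness",
        passed=rate <= tolerance,
        detail=f"{rate:.1%} of records have a blank postcode",
    )
```

The design point is the `raise`: a firewall that merely logs a warning gets ignored, whereas one that halts the job forces the conversation described below.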
So the development began and I was soon in a position to implement the firewall. It was then that the trouble started. Some of the team managers began to realise the impact the firewall could have on them and their teams. If they were processing data through a scheduled overnight job and a problem was encountered, the process would stop. One of their people might be called out in the middle of the night to sort the issue. Or the process might simply not complete on time and the client would be upset. It seemed that some people would rather deliver wrong results on time than get the job done properly.
I was surreptitiously approached by team managers who wanted to know if their data could be “excused” the quality checks. Of course the checking must be applied to everyone else, but since they took such care and tested everything so thoroughly, there would surely be no danger in allowing their jobs to bypass the firewall. This was patently not the case: nobody could guarantee that data provided by third parties would be free from errors.
I won’t bore you with the machinations of the weeks and months that followed; suffice it to say that the firewall was implemented and went on to successfully catch many data processing errors that had eluded the teams’ thorough manual testing.
I think this little story illustrates how data quality may be viewed by many throughout an organisation. It’s obviously a good idea. It obviously benefits everyone. Who could argue against it? But if it forces people to change, if they have to do something, if they have to go out of their way, and if they have to face up to the fact that they are responsible for data quality, then it’s a different story. It’s easier to go into denial and leave things as they are.
What does this say about us? I think it says that we are perfectly normal. Nobody likes to have their applecart upset. But sometimes things need to be shaken up.
Complacency is a dangerous thing.