Step 1 – Data Preparation
Successful matching depends on having the most accurate and complete data possible. Take every step you can to ensure that the data you are working with is clean, accurate, standardised and correctly parsed.
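As a rough illustration of what "clean and standardised" can mean in practice, here is a minimal Python sketch. The function names, the tiny abbreviation table and the UK-style postcode layout are all assumptions made for the example, not features of any particular product.

```python
import re

# Hypothetical abbreviation table for illustration; a real one would be far
# larger and context-aware ("st" can also mean "saint").
ABBREVIATIONS = {"rd": "road", "st": "street", "ave": "avenue"}


def standardise(text):
    """Lower-case, strip punctuation, collapse whitespace, expand abbreviations."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    words = [ABBREVIATIONS.get(word, word) for word in cleaned.split()]
    return " ".join(words)


def standardise_postcode(postcode):
    """Upper-case a postcode and put one space before the last three characters.

    Assumes a UK-style layout; anything too short to split is returned compacted.
    """
    compact = re.sub(r"[^A-Za-z0-9]", "", postcode).upper()
    if len(compact) < 5:
        return compact
    return compact[:-3] + " " + compact[-3:]
```

With this sketch, `standardise("12,  High St.")` gives `"12 high street"` and `standardise_postcode("sw1a1aa")` gives `"SW1A 1AA"`, so records that differ only in case, punctuation or spacing end up looking the same before any matching is attempted.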
Step 2 – Identify Candidate Matches
Given a database of any reasonable size, it is not practical to compare your target record against every entry in the database; it would simply take too long. Almost all matching software addresses this problem by using an index or preliminary match key to reduce the number of candidate matches. The key is designed to retrieve the database entries that might match your target record. This can be a tricky business. You don’t want to retrieve excessive numbers of candidates, but neither do you want to risk missing the record you are looking for. And on what criteria do you retrieve the candidates? Postcode is an obvious and common choice, but the postcode on your target record could be missing or incorrect. Indeed, any piece of data within your target record could contain inaccuracies that compromise the identification of candidates. The result is a failure to retrieve, then a failure to match, and ultimately duplicates within your database.
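One common way to reduce the risk of a single bad field hiding the right record is to index each record under more than one preliminary key. The Python sketch below illustrates the idea only; the field names (`surname`, `postcode`, `town`) and the choice of keys are hypothetical.

```python
from collections import defaultdict


def match_keys(record):
    """Return several preliminary keys so one bad field does not hide a record.

    The fields and key formats are illustrative assumptions.
    """
    postcode = record.get("postcode", "").replace(" ", "").upper()
    surname = record.get("surname", "").strip().upper()
    town = record.get("town", "").strip().upper()
    keys = []
    if postcode:
        keys.append("PC:" + postcode)
    if surname and town:
        # Fallback key for records whose postcode is missing or wrong.
        keys.append("SN:" + surname[:4] + ":" + town)
    return keys


def build_index(records):
    """Map every key to the set of record ids filed under it."""
    index = defaultdict(set)
    for record_id, record in records.items():
        for key in match_keys(record):
            index[key].add(record_id)
    return index


def find_candidates(index, target):
    """Collect the ids of all records sharing at least one key with the target."""
    found = set()
    for key in match_keys(target):
        found |= index.get(key, set())
    return found
```

In this sketch a target record with no postcode can still retrieve its counterpart through the surname and town key. The trade-off is the one described above: every extra key widens the net and increases the number of candidates the decision step has to examine.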
Step 3 – Decision
You have your target record. You have a list of possible candidates retrieved from your database. One or more of those candidates may match your target record. This is where the decision-making algorithm takes over, comparing your target record to each candidate in turn and deciding whether the business rules that define a match have been met.
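To make the decision step concrete, here is a minimal Python sketch in which the "business rule" is simply a weighted similarity score that must reach a threshold. The weights, the threshold and the field names are hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for whatever comparison a real product would use.

```python
from difflib import SequenceMatcher

# Hypothetical business rules: field weights and the score needed for a match.
WEIGHTS = {"surname": 0.4, "address": 0.4, "postcode": 0.2}
THRESHOLD = 0.85


def similarity(a, b):
    """Return a 0..1 similarity between two strings; empty values score 0."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score(target, candidate):
    """Weighted sum of per-field similarities."""
    return sum(
        weight * similarity(target.get(field, ""), candidate.get(field, ""))
        for field, weight in WEIGHTS.items()
    )


def decide(target, candidates):
    """Return (candidate_id, score) for the best candidate at or above the
    threshold, or None if no candidate meets the rule."""
    best = None
    for candidate_id, candidate in candidates.items():
        current = score(target, candidate)
        if current >= THRESHOLD and (best is None or current > best[1]):
            best = (candidate_id, current)
    return best
```

Returning `None` when nothing reaches the threshold matters as much as finding the best score: it is what lets the caller treat the target as a new record rather than forcing a match onto the nearest candidate.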
No matter what the sales material tells you (if it tells you anything at all!), all name and address matching software adopts these three underlying steps. The secret to successful matching is to get the three steps to work in harmony. All too often, deficiencies in one step are compensated for by additional processing in the others, and this can lead to a vicious circle of compromise and complexity. A salesman will paint a picture of super-efficient code, class-leading mathematical scoring modules, intelligent decisioning software and so on. The developers will most likely paint a different picture of workarounds, quick fixes, special processing and exceptions. This is not a criticism of the developers. Few pieces of software are written from scratch. All software goes through the normal product lifecycle, from inception through maturity, and finally reaches the end of its life when it becomes impossible to cram any more fixes into it. And name and address matching is very, very complicated.