There are always two aspects to data quality improvement.
Data cleansing is the one-off process of tackling the errors already in the
database, so that historical anomalies are automatically located and
removed. The other aspect, data maintenance, describes ongoing correction and
verification – the process of continual improvement and regular checks.
Often, businesses ask us: which process is more
important? In the long term, which one should we focus on? Unfortunately, there
is no simple answer, but there is an easy way to understand the differences
between them.
An Apple A Day…
When we think about data, we can compare it to caring for
our health. In particular, data maintenance is a lot like brushing your teeth.
We brush our teeth at least twice a day to stop decay from taking hold. If we
didn’t, the sugar that we consume would gnaw away at the enamel and cause rot
to set in.
The longer we leave it between brushings, the more
vulnerable our teeth become. Similarly, our database must be continually cared
for and maintained.
Why?
Data in a database rots and decays in much the same way
as teeth do: records go out of date as the details they describe change.
Frequent data maintenance is required to keep the data in good
health, ensuring that the rot cannot progress to a catastrophic stage. That’s
one good argument for data maintenance, and it shows why it is an unavoidable
task that all businesses must commit to.
But what about cleansing data?
Facing Facts
Brushing our teeth helps to stop them from crumbling
and decaying, but we also need to organise regular visits to the dentist. At
these essential appointments, our teeth are thoroughly checked and professionally
cleaned, and any damage is repaired before it escalates. Brushing alone
does not mean these visits can be skipped.
We might not find the dentist’s chair pleasant, and there
are certainly more enjoyable things to spend time and money on. But these
regular appointments are essential if we want our teeth to last.
In the same way, data needs to be checked and validated by
an expert. In our analogy, data quality software plays that role. This is
your database’s ‘dentist’s appointment’ – the chance to catch and fix errors
that have built up over time. Using sophisticated matching techniques,
automated processes can pick out likely duplicates and find data that breaks
the rules it is supposed to follow.
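To make that last point more concrete, here is a minimal sketch in Python of the two kinds of check mentioned above: spotting likely duplicates and finding values that break a rule. It is an illustration only, not how any particular data quality product works. The sample records, the field names, the 0.9 similarity threshold and the simple email pattern are all assumptions made for the example, and the matching uses the standard library's difflib rather than anything more sophisticated.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical customer records; the field names are assumptions for this sketch.
RECORDS = [
    {"id": 1, "name": "Jonathan Smith", "email": "jon.smith@example.com"},
    {"id": 2, "name": "Jonathon Smith", "email": "jon.smith@example.com"},
    {"id": 3, "name": "Maria Garcia", "email": "maria.garcia@example"},
    {"id": 4, "name": "Wei Chen", "email": "wei.chen@example.org"},
]

# A deliberately simple rule: something@something.something, no spaces.
# Real email validation is more involved; this is an assumed stand-in.
EMAIL_RULE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalise(text):
    """Lower-case the text and collapse runs of whitespace before comparing."""
    return " ".join(text.lower().split())


def likely_duplicates(records, threshold=0.9):
    """Return pairs of record ids whose names are at least `threshold` similar."""
    pairs = []
    for a, b in combinations(records, 2):
        score = SequenceMatcher(
            None, normalise(a["name"]), normalise(b["name"])
        ).ratio()
        if score >= threshold:
            pairs.append((a["id"], b["id"]))
    return pairs


def rule_violations(records):
    """Return the ids of records whose email does not match EMAIL_RULE."""
    return [r["id"] for r in records if not EMAIL_RULE.match(r["email"])]


if __name__ == "__main__":
    print("Likely duplicates:", likely_duplicates(RECORDS))
    print("Rule violations:", rule_violations(RECORDS))
```

Comparing every record with every other record is fine for a handful of rows but grows quickly with the size of the database, so a real cleansing run would normally narrow down the candidates first, for example by only comparing records that share a postcode or surname.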
Don’t Depend on Dentures
If you don’t look after your teeth, you’ll end up with
nothing – at best, you might get a set of false ones for your old age. If you
don’t care for your data, much of the effort and money that was invested in
collecting it will turn out to be wasted, and it will be very hard to build
meaningful reports from the scraps of accurate data that you have left. The
way forward may then be to start from scratch, buying a new set of data from
someone else.
Aside from that, even a successful business with no reliable data
is facing a perilous future. Deprived of its most important asset – the
information it needs for sensible decisions – it must navigate without knowing
who its customers are.
Article From: datafloq.com