14.06.2022 | News
Improve data quality: Companies need to take action now
Increasing amounts of data and ever more powerful analytics tools are unlocking new, data-driven business models for companies. Their success, however, depends critically on the quality of the data and therefore on the processes and solutions available for data cleansing, quality assurance, and data enrichment. These are needed to avoid redundancies, detect and correct errors, and provide comprehensive access to the data.
Low-quality data, in contrast – and this can’t be repeated often enough – quickly leads to a loss of market share, because business decisions end up being made on the basis of false assumptions. This happens when old or incomplete data is used, when data isn’t validated, or simply when it is incorrectly tagged or classified. Potential savings from improved processes remain untapped. In addition, serious compliance risks can arise from data that is missing or cannot be located. If a company receives a request for information under Article 15 of the GDPR, for example, or needs to delete certain personal data, it must respond quickly. What many companies overestimate is how quickly and, above all, how thoroughly they can locate that data. The reason is that the data usually isn’t stored only in structured databases, but also in unstructured form on file servers and in emails, text files, spreadsheets, and other documents. Data silos of this kind are a fundamental challenge for companies.
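To make the GDPR scenario concrete, the core difficulty is simply searching unstructured files for traces of one data subject. The following is a minimal, hedged sketch in Python – the directory layout, file extensions, and the fictitious person "Jane Doe" are illustrative assumptions, not anything from the article, and a real discovery tool would also need to handle binary formats such as PDF or Office files:

```python
import os
import re
import tempfile

def find_personal_data(root, patterns):
    """Walk a directory tree of text-like files and return every line
    matching one of the given regex patterns (e.g. a data subject's
    name or e-mail address) as (path, line_number, line) tuples."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Illustrative subset of plain-text formats only.
            if not name.lower().endswith((".txt", ".csv", ".log", ".md")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    if any(rx.search(line) for rx in compiled):
                        hits.append((path, lineno, line.strip()))
    return hits

# Demo on throwaway files; all names and addresses are fictitious.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "notes.txt"), "w", encoding="utf-8") as fh:
        fh.write("Meeting with Jane Doe about contract renewal\n")
    with open(os.path.join(root, "contacts.csv"), "w", encoding="utf-8") as fh:
        fh.write("name,email\nJane Doe,jane.doe@example.com\n")
    results = find_personal_data(root, [r"jane\.doe@example\.com", r"Jane Doe"])
    for path, lineno, line in results:
        print(os.path.basename(path), lineno, line)
```

Even this toy scan makes the point: without an index or automated tooling, every file server and mailbox has to be crawled in full for every single request.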
So what constitutes “good” data? And is there actually such a thing as “bad” data? First of all, no – there is no such thing as bad data, only poorly maintained data. Of course, information with little relevance to the task at hand can be sorted out. Most data sets, however, are a valuable asset that companies need to capitalize on. After all, high-quality data lets you get the right answer to every question and make the right decision based on it. Above all, this assumes that the required information is available to users and not gathering dust in deeply hidden subdirectories and file folders. Good data is therefore accessible and thus usable. If the data is also meaningfully interconnected and, where necessary, enriched with metadata, employees have all the information they need.

Good data is further characterized by complete meta information: the creator, keywords, tags, and validity period, for example, are recorded and updated whenever necessary. Because adding keywords is tedious for users without automatic tools and is often not mandatory, the reality unfortunately looks different – metadata is either not added at all or, where it is, created very subjectively. And yet state-of-the-art solutions based on artificial intelligence have long been able to generate this additional information automatically by analyzing a document’s content. That makes it possible to analyze documents seamlessly against desired or legally specified criteria.
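The metadata criteria mentioned above – creator, keywords, validity period – can be checked mechanically. As a minimal sketch in Python (the field names, the required-field list, and the sample documents are assumptions for illustration; real metadata schemas vary by organization and system):

```python
from datetime import date

# Hypothetical required metadata fields; real schemas differ.
REQUIRED_FIELDS = ("creator", "keywords", "valid_until")

def metadata_issues(doc_meta):
    """Return a list of quality problems for one document's metadata:
    required fields that are missing or empty, and an expired validity
    period. An empty list means the metadata passes this basic check."""
    issues = []
    for field in REQUIRED_FIELDS:
        if not doc_meta.get(field):
            issues.append(f"missing {field}")
    valid_until = doc_meta.get("valid_until")
    if valid_until and valid_until < date.today():
        issues.append("validity period expired")
    return issues

# Two fictitious documents: one well maintained, one neglected.
docs = {
    "report.docx": {
        "creator": "a.smith",
        "keywords": ["audit", "2022"],
        "valid_until": date(2030, 1, 1),
    },
    "old_policy.pdf": {
        "creator": "",
        "keywords": [],
        "valid_until": date(2020, 1, 1),
    },
}
for name, meta in docs.items():
    print(name, metadata_issues(meta))
```

A check like this only flags gaps; filling them is the tedious part that, as noted above, AI-based solutions increasingly automate by deriving keywords and tags from the document content itself.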
Companies should certainly ask themselves whether they are truly leveraging their data’s full potential. If they aren’t satisfied with the answer, they’re taking a major risk.