
07.05.2020 | Blog Enterprise Search can do more than searching and finding

The benefits of enterprise search software go far beyond the classic company-wide search. Thanks to innovative AI technologies, it can also intelligently interconnect a wide variety of information from different data sources in order to gain insights.


Data integration and information networking with real added value for companies and authorities

Enterprise search software such as the IntraFind iFinder integrates structured and unstructured data from different sources and makes it findable. To realize the full potential of this smart, easy-to-implement form of data integration, however, the individual pieces of information have to be related to one another. Connectors bring all data together in the iFinder Elasticsearch index, a scalable NoSQL database, where it is enriched by AI methods in various ways or classified by topic. Content can then be analyzed seamlessly according to criteria that companies or authorities need or that are legally required, for example by the EU General Data Protection Regulation (GDPR, known in Germany as the DSGVO). Structured data can be combined with unstructured data, and search-based access with smart filters enables virtually any kind of ad-hoc analysis, presented either as a classic hit list or, increasingly, in flexible dashboards.

There are many possible applications, from detecting personal data to identifying data that is subject to export control regulations. Another scenario is providing all relevant information in automotive development across the different development phases of a vehicle. The ultimate goal is to overcome the limitations of data silos without resource-intensive IT projects and to create real added value by bringing previously isolated data islands together. In this way, companies can keep an overview of their data, exploit their data treasures and cultivate their knowledge management with manageable effort. Even in the current pandemic, authorities could combine information from different sources in a meaningful way and base their decisions on it.


Data integration as the basis
Data has to be related to other data before it can be understood and interpreted. The means to this end is data integration, supported by a broad range of connectors and flexible options for evaluation.

Regardless of where the data is stored and whether it is unstructured, semi-structured or structured, the enterprise search software iFinder by IntraFind combines it in a single search index. Technically speaking, this is a search index based on NoSQL; functionally speaking, it is a powerful knowledge infrastructure. A wide variety of user interfaces for search, analysis or dashboards can be placed on top of it in a completely flexible way. Classic search, with a search input field and a hit list with navigation and knowledge graph, is the basic form of use. However, correlations can also be evaluated in dashboards or in any other application or company process. The possibilities are limited only by the imagination.
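To make the idea of one index over heterogeneous sources more tangible, here is a minimal Python sketch. It is not how iFinder or Elasticsearch is implemented; the names `Document` and `TinyIndex`, the source names and the sample records are all hypothetical. It only illustrates the principle that free text and structured fields can land in the same index and be queried together with filters.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    source: str                                  # hypothetical source name, e.g. "crm"
    text: str = ""                               # unstructured content
    fields: dict = field(default_factory=dict)   # structured attributes


class TinyIndex:
    """Toy inverted index: maps each lowercase token to the ids of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.docs = {}

    def add(self, doc):
        self.docs[doc.doc_id] = doc
        # Index free text and structured field values alike, so both become findable.
        parts = [doc.text] + [str(value) for value in doc.fields.values()]
        for token in re.findall(r"\w+", " ".join(parts).lower()):
            self.postings[token].add(doc.doc_id)

    def search(self, query, **filters):
        """Return ids of documents that contain all query tokens and match all field filters."""
        tokens = re.findall(r"\w+", query.lower())
        if not tokens:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted(
            doc_id for doc_id in hits
            if all(self.docs[doc_id].fields.get(k) == v for k, v in filters.items())
        )


# Hypothetical sample data from two made-up sources.
index = TinyIndex()
index.add(Document("1", "crm", fields={"customer": "Acme", "country": "DE"}))
index.add(Document("2", "fileshare", text="Meeting notes with Acme about the new contract",
                   fields={"country": "DE"}))
index.add(Document("3", "fileshare", text="Acme contract draft", fields={"country": "US"}))

all_acme = index.search("acme")
german_contracts = index.search("acme contract", country="DE")
```

In this sketch, the query for "acme" finds the CRM record as well as both files, while the second query combines full-text search with a structured filter, which is the combination the hit list and dashboards described above would build on.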

Semi-structured Data

In computer science, data is called semi-structured when it does not follow a general, fixed structure but carries part of its structural information with it. This structure can nevertheless be recognized programmatically. Example: XML data
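As a small illustration of what "carrying part of the structure with it" means, the following Python sketch reads an XML fragment with the standard library. The element names and values are invented for this example. Note that the two records do not share the same fields, yet the structure of each can still be recovered from its tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second record has a <note> but no <city>.
XML_SAMPLE = """
<records>
  <record id="1"><name>Example GmbH</name><city>Munich</city></record>
  <record id="2"><name>Sample AG</name><note>No city given</note></record>
</records>
"""


def records_to_dicts(xml_text):
    """Turn each <record> element into a dict built from its attributes and child elements."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for record in root.findall("record"):
        row = dict(record.attrib)
        for child in record:
            # The tag name is the structural information the data carries itself.
            row[child.tag] = (child.text or "").strip()
        rows.append(row)
    return rows


rows = records_to_dicts(XML_SAMPLE)
```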

The search specialist IntraFind draws on the methods of enterprise search technology as well as those of classical Business Intelligence (BI). It provides an AI-driven Cognitive Search and Analytics Engine, built on the Elasticsearch stack, that helps users understand their data. Today, this costs only a fraction of what earlier data integration approaches did.


Data aggregation as the icing on the cake
Once the different kinds of data, structured and unstructured, are stored in a central index, the AI of the software comes into its own.

Aggregation can be carried out flexibly in many different ways, thanks to an AI technology stack that combines machine learning with rule-based, linguistic and semantic methods. The search-driven approach enables ad-hoc analyses: unlike with BI systems, the questions do not have to be defined in advance, and the system does not have to be extensively pre-trained for them.
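The difference from predefined reports can be sketched in a few lines of Python. This is a simplified illustration, not the product's aggregation engine; the function `facet_counts` and the sample documents are hypothetical. The point is that the field to aggregate on and the filters are chosen only at query time.

```python
from collections import Counter

# Hypothetical indexed documents with a few metadata fields.
DOCS = [
    {"type": "contract", "department": "sales", "year": 2019},
    {"type": "email", "department": "sales", "year": 2020},
    {"type": "contract", "department": "legal", "year": 2020},
    {"type": "email", "department": "legal", "year": 2020},
]


def facet_counts(docs, facet, **filters):
    """Count the values of `facet` over all documents that match the given filters."""
    selected = (d for d in docs if all(d.get(k) == v for k, v in filters.items()))
    return Counter(d[facet] for d in selected if facet in d)


# Two different questions asked ad hoc against the same data.
by_department = facet_counts(DOCS, "department")
types_in_2020 = facet_counts(DOCS, "type", year=2020)
```

Neither question had to be modeled beforehand; any other field could be used as the facet or as a filter in the same way.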

The software extracts important data points from any kind of document, such as personal data for a data protection analysis, and merges the relevant data. By flexibly aggregating the aspects relevant to a particular question, it delivers the consolidated information the user needs.
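A heavily simplified, purely rule-based version of such an extraction step might look like the following Python sketch. The two regular expressions, the function names and the sample texts are assumptions made for illustration; real detection of personal data needs far more robust patterns plus linguistic and machine-learning methods, and this is not the vendor's implementation.

```python
import re

# Deliberately simple, hypothetical patterns for two kinds of personal data.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\+?\d[\d /-]{7,}\d"),
}


def find_personal_data(text):
    """Return the distinct matches per pattern name, omitting patterns without matches."""
    findings = {}
    for kind, pattern in PATTERNS.items():
        matches = sorted(set(pattern.findall(text)))
        if matches:
            findings[kind] = matches
    return findings


def scan(documents):
    """Merge the findings per document into one report, skipping documents without findings."""
    report = {}
    for doc_id, text in documents.items():
        findings = find_personal_data(text)
        if findings:
            report[doc_id] = findings
    return report


# Hypothetical documents.
report = scan({
    "memo.txt": "Contact Jane Doe at jane.doe@example.com or +49 89 1234567.",
    "specs.txt": "The housing is 120 mm wide.",
})
```

The resulting report lists only `memo.txt`, with one e-mail address and one phone number, which is the kind of merged view a data protection analysis would start from.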

The tool normalizes the data into machine-readable, usable units (dates, measurements, GIS coordinates, etc.). It translates relevant content and analyzes image and video files to obtain machine-readable text that can be searched. In addition, all data in the company can be automatically enriched with relevant metadata: labels are attached to the data to enable further processing and sorting, so that the data is classified according to specific subject areas.
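Normalization and labeling can be pictured with a short Python sketch. The accepted date formats, the topic names and the keyword lists are assumptions chosen for this example, and keyword matching stands in for the much richer classification methods the article refers to.

```python
import re
from datetime import datetime

# Assumed input formats: German (07.05.2020), ISO (2020-05-07) and US (05/07/2020).
DATE_FORMATS = ("%d.%m.%Y", "%Y-%m-%d", "%m/%d/%Y")

# Hypothetical topics with hypothetical trigger words.
TOPIC_KEYWORDS = {
    "data_protection": {"gdpr", "dsgvo", "personal"},
    "export_control": {"export", "embargo"},
}


def normalize_date(raw):
    """Return the date as an ISO 8601 string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def label_topics(text):
    """Attach every topic label whose keyword set overlaps with the words in the text."""
    words = set(re.findall(r"\w+", text.lower()))
    return sorted(topic for topic, keywords in TOPIC_KEYWORDS.items() if words & keywords)


iso_from_german = normalize_date("07.05.2020")
labels = label_topics("Export of personal data under DSGVO")
```

Once dates share one representation and documents carry labels, both can serve as filters and facets in the kind of ad-hoc analysis described earlier.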

It remains up to us as human beings to define these topics and contexts, to interpret the data with a view to the next action, and to set up new analyses.

Using powerful AI methods, the software merges data and shows the user connections, whether there are several hundred documents or billions. This provides companies and their employees with the relevant information to make decisions or to identify risks and general issues.


Implementation of comprehensive analyses - application scenarios in practice
The application scenarios of enterprise search as a data integration tool are diverse and flexible, no matter where the information is located.

In connection with the current Covid-19 pandemic, state and federal authorities could use publicly available information from various sources to assess the current situation and derive recommendations for action or concrete measures. The intelligent use and linking of existing data would be of great benefit. The many different databases held by public authorities, e.g. on intensive care bed capacities, available protective equipment, or information from the Federal Institute for Drugs and Medical Devices (Bundesinstitut für Arzneimittel und Medizinprodukte, BfArM), could finally be interconnected and enriched with expert information or freely accessible sources from the Web, such as the Robert Koch Institute or Johns Hopkins University.

Regardless of the application scenario, the task is always to intelligently link a wide variety of information from a wide variety of data pools and to conduct comprehensive analyses. In this way, correlations can be identified that people would not be able to spot in a short time.

The author

Franz Kögl
CEO
Franz Kögl is co-founder and co-owner of IntraFind Software AG and has more than 20 years of experience in Enterprise Search and Content Analytics.