The Scarus Data Quality Server (SDQ) offers a comprehensive solution for ensuring high-quality master data. The software brings services such as duplicate checking, address validation and error-tolerant search, powered by intelliSearch technology, into heterogeneous system landscapes, including seamless SAP integration.
Thanks to the intelliSearch API, SDQ integrates these services seamlessly into your existing applications. Integration takes place via SOAP or REST web services and is therefore universally applicable.
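As a first impression of the REST integration path, the following minimal Python sketch calls a duplicate check. The endpoint path, payload fields and credentials are illustrative assumptions for this sketch, not SDQ's documented interface; the actual API is defined by your SDQ deployment.

```python
import requests  # plain HTTP client; a SOAP integration would use a library such as zeep

# Hypothetical SDQ endpoint -- adjust host, path and credentials to your deployment.
SDQ_URL = "https://sdq.example.com/api/duplicate-check"

record = {
    "firstName": "Hans",
    "lastName": "Maier",
    "city": "Nürnberg",
}

response = requests.post(SDQ_URL, json=record, auth=("sdq_user", "secret"), timeout=10)
response.raise_for_status()

# Assumed response shape: a list of potential duplicates with similarity scores.
for hit in response.json().get("hits", []):
    print(hit["id"], hit["score"])
```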
Our solution includes a variety of check modules that can be customized and integrated into any system; they are described in detail below.
The services can be integrated either by the customer or by the ISO-Gruppe. With 15 years of experience in the field of master data quality, we offer you comprehensive conceptual support.
For SAP integration, we offer tried-and-tested in-house products for the various check modules and services. These not only integrate fully into the SAP standard transactions but also enable you to process the check results of the various modules.
All modules are based on ISO's own search technology intelliSearch® for Enterprise Search & Matching. The technology is memory-efficient, scales both vertically and horizontally, and provides the following core functionalities:
Our solution offers a powerful near-real-time search that can be flexibly adapted to your requirements. Various search methods are available, including fuzzy, phonetic, wildcard, phrase, time-period, geo-distance and numerical-value searches as well as auto-complete and auto-suggest. This allows you to find relevant results quickly and precisely.
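As a rough illustration of how several of these methods could be combined in one query, here is a hypothetical request body; the parameter names are assumptions made for this sketch, not SDQ's actual query syntax:

```python
# Hypothetical search request combining several methods in one query.
# Field names and operators are illustrative, not SDQ's documented syntax.
query = {
    "term": "Meier Maschinenbau",   # matched fuzzily and phonetically
    "fuzziness": 2,                 # maximum edit distance for fuzzy matching
    "wildcard": "Mei*",             # wildcard variant of the search term
    "phrase": "Maschinenbau GmbH",  # exact phrase match
    "filters": {
        "founded": {"from": "1990-01-01", "to": "2000-12-31"},        # time period
        "geoDistance": {"lat": 49.45, "lon": 11.08, "radiusKm": 50},  # around Nuremberg
        "employees": {"gte": 10, "lte": 500},                         # numerical range
    },
    "suggest": True,                # enable auto-complete / auto-suggest
}
```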
The Data Ingestion Pipeline ensures smooth, optimized data intake. Your data is prepared efficiently during pre-processing and the subsequent processing steps in order to ensure high data quality and processing speed.
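What such a pre-processing step can look like is sketched below with a simple normalization function of the kind typically applied before indexing; the specific rules are illustrative and not SDQ's actual pipeline logic:

```python
import unicodedata

def normalize(value: str) -> str:
    """Illustrative pre-processing: trim and collapse whitespace, case-fold,
    and strip diacritics ('Müller' -> 'muller') so that spelling variants
    become easier to index and compare."""
    value = " ".join(value.split())                    # collapse whitespace
    value = value.casefold()                           # case-insensitive form
    decomposed = unicodedata.normalize("NFKD", value)  # split base char + accent
    return "".join(c for c in decomposed if not unicodedata.combining(c))

print(normalize("  Müller,   Hans "))  # -> "muller, hans"
```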
The Duplicate Matching Engine allows you to define precise criteria for detecting similarities between data records. The engine enables batch-based mass processing and ensures reliable identification and handling of duplicates in large data sets.
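A rule set for such matching criteria might be expressed along the following lines; the structure, algorithm names and weights are assumptions for illustration, not SDQ's configuration format:

```python
# Hypothetical duplicate-matching rules: which fields are compared,
# with which algorithm, and how the per-field scores are weighted.
matching_rules = {
    "threshold": 0.85,  # overall score above which two records count as duplicates
    "fields": [
        {"name": "lastName",  "algorithm": "jaro_winkler",        "weight": 0.4},
        {"name": "firstName", "algorithm": "jaro_winkler",        "weight": 0.2},
        {"name": "street",    "algorithm": "damerau_levenshtein", "weight": 0.2},
        {"name": "city",      "algorithm": "exact",               "weight": 0.2},
    ],
}
```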
The modules extend these functionalities and make them available as web services.
The duplicate check module primarily consists of three components:
The first component creates the index for the duplicate check, either record by record or in batches. Index creation is usually run once when the target system is connected; during operation, an update function keeps the index up to date according to your requirements.
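A minimal sketch of these two operations, again against hypothetical endpoints (the paths and payload shape are assumptions, not SDQ's documented API):

```python
import requests

BASE = "https://sdq.example.com/api"  # hypothetical SDQ base URL

# Initial batch indexing, typically run once when the system is connected.
records = [
    {"id": "1001", "lastName": "Maier", "city": "Nürnberg"},
    {"id": "1002", "lastName": "Meyer", "city": "Nuernberg"},
]
requests.post(f"{BASE}/index/build", json={"records": records}, timeout=60).raise_for_status()

# Delta update during operation to keep the index current.
changed = {"id": "1001", "lastName": "Maier-Schmidt", "city": "Nürnberg"}
requests.post(f"{BASE}/index/update", json={"records": [changed]}, timeout=10).raise_for_status()
```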
The second component is the freely configurable individual check, with which you can run an error-tolerant search or a duplicate check against all indexed data. Various algorithms are available for this purpose: classic comparison methods such as Jaro-Winkler and Damerau-Levenshtein as well as the ISO-Gruppe's own algorithms, each offering different strengths depending on the requirements.
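To make edit-distance-based comparison concrete, here is a compact reference implementation of the restricted Damerau-Levenshtein distance (the optimal string alignment variant); this is the textbook algorithm itself, not SDQ's internal code:

```python
def damerau_levenshtein(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance:
    the minimum number of insertions, deletions, substitutions and adjacent
    transpositions needed to turn a into b."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]

print(damerau_levenshtein("Maier", "Miaer"))  # 1 -- one adjacent transposition
print(damerau_levenshtein("Maier", "Meyer"))  # 2 -- two substitutions
```

The distance of 1 between "Maier" and "Miaer" shows why transposition-aware measures suit typing errors well: plain Levenshtein would count the same swap as two edits.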
The third component is an inventory check that examines an entire index for duplicates. Here you benefit in particular from the scalability and high performance of the in-memory technology we use to process the check: even with large data volumes, high throughput can be achieved by scaling the system landscape accordingly.
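Since an inventory run over a large index is typically a long-running batch job, integration often follows a start-and-poll pattern, sketched here with hypothetical endpoints and response fields:

```python
import time
import requests

BASE = "https://sdq.example.com/api"  # hypothetical SDQ base URL

# Start a full inventory duplicate check over an entire data pool.
job = requests.post(f"{BASE}/inventory-check/start",
                    json={"pool": "customers"}, timeout=10).json()

# Poll until the batch job has finished, then read the duplicate groups.
while True:
    status = requests.get(f"{BASE}/inventory-check/{job['jobId']}", timeout=10).json()
    if status["state"] in ("FINISHED", "FAILED"):
        break
    time.sleep(30)

for group in status.get("duplicateGroups", []):
    print(group["recordIds"])
```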
To check the postal correctness of addresses, we rely on reference data from our partners Deutsche Telekom, Arvato Bertelsmann or Informatica (AddressDoctor). At an interval you define, we deliver current reference data, which you store in the SDQ directory; the next time the server instance is restarted, the new data is available for checking.
Here too, we offer an easy-to-integrate, universally usable web service that can be connected to any application. Search algorithms specially optimized for address validation find potential hits in the reference database as an address is entered and return corrected spellings in a hit list.
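A call to such a service could look roughly like this; the endpoint and field names are illustrative assumptions, not the documented interface:

```python
import requests

# Hypothetical address-validation call with a deliberately misspelled address.
response = requests.post(
    "https://sdq.example.com/api/address-check",
    json={"street": "Mittelstrase 12", "postalCode": "90402", "city": "Nurnberg"},
    timeout=10,
)
response.raise_for_status()

# Assumed response shape: a ranked hit list with corrected spellings.
for hit in response.json().get("hits", []):
    print(hit["score"], hit["street"], hit["postalCode"], hit["city"])
    # e.g. 0.97 Mittelstraße 12 90402 Nürnberg
```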
The error-tolerant search makes it easy to find entries within an SDQ data pool. For each attribute, error tolerance can be switched on or off, threshold values can be defined, and the comparison algorithms to be used can be parameterized. In addition to the field-specific search, it is also possible to find search strings in long texts and to define cross-field searches. Wildcards can also be used to narrow down the search results. The search is invoked via a web service.
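The per-attribute configuration described above could be expressed along these lines (a hypothetical structure, not SDQ's actual parameter format):

```python
# Hypothetical error-tolerant search request: per attribute, tolerance can be
# toggled, thresholds set and comparison algorithms chosen.
search_request = {
    "pool": "customers",
    "attributes": {
        "lastName":   {"errorTolerant": True,  "threshold": 0.8, "algorithm": "jaro_winkler"},
        "city":       {"errorTolerant": True,  "threshold": 0.9, "algorithm": "phonetic"},
        "customerNo": {"errorTolerant": False},  # exact match only
    },
    "longTextSearch": {"field": "notes", "term": "complaint"},
    "crossFieldSearch": {"term": "Maier", "fields": ["firstName", "lastName"]},
    "wildcard": "Mai*",  # narrow down the result set
}
```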
With this module, you can feed one of your data sources into an SDQ data pool as a reference source. This makes it easy for you to enrich existing data with valuable additional information. The data records are automatically matched using the duplicate check module; if a match is found, the module transfers field data from the reference source to the assigned master record.
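A hypothetical enrichment configuration might capture this field transfer as follows (names and structure are illustrative only):

```python
# Hypothetical enrichment configuration: after a successful match against the
# reference pool, the listed fields are copied into the master record.
enrichment_config = {
    "referencePool": "supplier_directory",  # illustrative reference source
    "targetPool": "customers",
    "matchVia": "duplicate_check",          # matching is done by the duplicate check module
    "transferFields": ["industryCode", "vatId", "website"],
    "overwriteExisting": False,             # only fill fields that are still empty
}
```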
The additional filter package provides further ready-made filters containing business logic for processing personal and company master data. The module extends the generic field-processing and thesaurus functions of the basic module with domain-specific business rules.
Find out how the Scarus Data Quality Server can optimize your data quality. Contact us for individual advice and customized solutions.