Release of Optimaize PII Platform, version 10.6.0.
This update brings several improvements designed to enhance the accuracy and performance of our services.
1. NameParser: Removed Near-Duplicate Results
Our NameParser service no longer returns results that look like duplicates to the caller. Previously, an API call could return parsing result options that were nearly identical, differing only in their likeliness and confidence scores.
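To illustrate the idea, here is a minimal sketch of collapsing result options that differ only in their scores. It is not the NameParser implementation: the class NearDuplicateFilter, the ParseOption type, its fields, and the rule of keeping the option with the highest confidence are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NearDuplicateFilter {

    /** Hypothetical parse option; the field names are illustrative, not the platform's API. */
    public static final class ParseOption {
        public final String givenName;
        public final String surname;
        public final double likeliness;
        public final double confidence;

        public ParseOption(String givenName, String surname, double likeliness, double confidence) {
            this.givenName = givenName;
            this.surname = surname;
            this.likeliness = likeliness;
            this.confidence = confidence;
        }

        /** Key built from the parsed content only; both scores are deliberately ignored. */
        String contentKey() {
            return givenName + "\u0000" + surname;
        }
    }

    /**
     * Keeps one option per distinct parsed content. When several options share
     * the same content, the one with the highest confidence wins (an assumed rule).
     * The first-seen order of distinct contents is preserved.
     */
    public static List<ParseOption> dedupe(List<ParseOption> options) {
        Map<String, ParseOption> best = new LinkedHashMap<>();
        for (ParseOption option : options) {
            best.merge(option.contentKey(), option,
                    (kept, candidate) -> candidate.confidence > kept.confidence ? candidate : kept);
        }
        return new ArrayList<>(best.values());
    }
}
```

Keying on the parsed content alone is what makes two options that differ only in score collapse into one entry.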
2. Extended Dictionaries
These dictionaries are used by the NameParser and NameMatcher components, which are also part of SearchCluster and RiskDetector.
Professions
Our data team has added more professions in Portuguese, Icelandic, English, and Finnish. This expanded vocabulary will allow for greater coverage in parsing and understanding professional titles across different languages.
Given Names and Surnames
Our data team has researched and added more given names and family names from a diverse range of cultures: Danish, Romanian, Czech, Kyrgyz, and Aymara.
3. Classifying Domain Names
The 1000 most frequently occurring unclassified domain names have been assigned to the usual classes: disposable, organization, freemail, etc.
This data is used by our EmailClassifier, as well as the RiskDetector.
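As a rough sketch of how such a classification table can be consulted, the example below looks up the class of an email address by its domain. The class DomainClassLookup, the enum values, and the UNCLASSIFIED fallback are assumptions for illustration only and do not reflect the EmailClassifier API; any domains passed in are placeholders.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class DomainClassLookup {

    /** Illustrative classes; the platform's actual class list may be longer. */
    public enum DomainClass { DISPOSABLE, ORGANIZATION, FREEMAIL, UNCLASSIFIED }

    private final Map<String, DomainClass> table = new HashMap<>();

    /** Registers a domain under a class; domains are stored in lower case. */
    public void put(String domain, DomainClass domainClass) {
        table.put(domain.toLowerCase(Locale.ROOT), domainClass);
    }

    /** Returns the class of the address's domain, or UNCLASSIFIED if unknown or malformed. */
    public DomainClass classify(String email) {
        int at = email.lastIndexOf('@');
        if (at < 0 || at == email.length() - 1) {
            return DomainClass.UNCLASSIFIED; // no domain part present
        }
        String domain = email.substring(at + 1).toLowerCase(Locale.ROOT);
        return table.getOrDefault(domain, DomainClass.UNCLASSIFIED);
    }
}
```

Each newly categorized domain moves one more lookup from the fallback to a concrete class.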
4. Switched to HikariCP
Previously, the c3p0 database connection pool library was used to connect to SQL data stores for some dictionary data sources. It has been replaced by HikariCP, which offers faster connection setup, a smaller memory footprint, and better overall performance.
5. Updated Java Libraries
Java dependencies have been updated to their latest stable versions.