Welcome to Rosette Match Studio (RMS), an interactive tool for evaluating and configuring Rosette Name Indexer (RNI) for record matching. Rosette Match Studio uses RNI for fuzzy retrieval and matching, while storing the records and search keys in the Elasticsearch full-text search engine.
Rosette Match Studio includes the following options:
Search: Perform searches or batch searches, returning matches from an index. Configure search parameters, import search data, and switch between multiple indices.
Compare: View the details of a pairwise match, including the algorithms used to calculate the match scores. Modify the values of match parameters and see the impact on the match score. Use these values to optimize the RNI parameters for your data and use case.
Evaluate: Calculate the accuracy of RNI using your gold data and determine the best match threshold.
Configure: A match configuration includes the parameters that control how a match is scored. In this section, you can create, save, edit, import, and export match configurations.
Help: View this help file and version information.
Server: Use the Server dropdown menu to change which server RMS is connected to, or to access the Manage Servers page, where you can add and remove external servers.
Your business determines your specific use case and priorities. Search can be optimized for your use case by managing two trade-offs: accuracy versus speed, and precision (the percentage of returned results that are relevant) versus recall (the percentage of relevant results that are returned). Optimizing for recall can increase false positives (irrelevant results returned); optimizing for precision can increase false negatives (missed matches).
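The precision and recall definitions above can be sketched in a few lines of Python. This is an illustration only, not part of RMS or RNI: the function name precision_recall, the sample scores, and the thresholds are all hypothetical, and scores are written as fractions between 0 and 1 rather than percentages.

```python
def precision_recall(scored_pairs, threshold):
    """Compute precision and recall for a list of (score, is_true_match) pairs.

    A pair counts as "returned" when its score is at or above the threshold.
    All names and data here are hypothetical, for illustration only.
    """
    tp = sum(1 for score, is_match in scored_pairs if score >= threshold and is_match)
    fp = sum(1 for score, is_match in scored_pairs if score >= threshold and not is_match)
    fn = sum(1 for score, is_match in scored_pairs if score < threshold and is_match)
    # Precision: share of returned results that are relevant.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    # Recall: share of relevant results that were returned.
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


# Hypothetical gold data: (match score, whether the pair is a true match).
gold = [
    (0.95, True),
    (0.88, True),
    (0.82, False),
    (0.74, True),
    (0.61, False),
    (0.40, False),
]

for threshold in (0.6, 0.8, 0.9):
    p, r = precision_recall(gold, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

With this sample data, raising the threshold moves precision up and recall down, which is the trade-off described above.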
System requirements depend on the size of your index, the required throughput, and your target accuracy levels.
Matching refers to the process of comparing identifying information about an individual, such as their name, company, address, and/or age, between two records. With Rosette Match Studio, you can enter one or more pieces of identifying information, and RMS returns a list of potential matches from your loaded index. Each match has a score between 0% and 100% indicating the match strength.
Name matching is the core of multi-field entity matching. Names are complex to match because of the large number of variations that occur within a language and across languages. These include, but are not limited to, typographical errors, phonetic spelling variations, transliteration differences, initials, and nicknames.
Rosette Match Studio also matches other data types, such as organization names, location names, dates, and addresses.
You can investigate why particular fields matched and how scores were calculated using the Compare option. You can also change how the scores are calculated by modifying the match parameters in real time, to better understand the process and tune it for your specific application.
RMS currently has two levels of language support: complete and limited. Complete support uses the full set of algorithms to calculate match scores and match parameters. The table below lists the languages and scripts with complete support.
For all other languages, RMS has limited support:
Types of Token and Name Matches