The Compare function allows you to see more details about the match score of two values with the same field type. In this case, we will compare a search value with a match value.
-
Select the Compare icon in the Action column of the Name row for Phillip Ward. Match Studio displays the Compare tab.
The Match Score Computation table separates each name into tokens and calculates how much each token from one name matches each token from the other name.
The tokens from Name 1 are listed down the first column, while the tokens from Name 2 are along the top row.
The shaded boxes highlight the token pairs selected during matching that produce the best score. A token pair is a token from Name 1 and its matching token from Name 2. Each shaded box contains a match reason and a match value.
-
Match Reason: Hover over the question mark icon next to the match reason for more details. In this case, the match reason indicates the tokens are an exact match.
-
Match Value: This is a number between 0 and 1, with 1 indicating a perfect match. Since we are currently comparing a name against itself, each token pair has a match score of 1.
The match value also takes the position of each token into account: a penalty is applied if the tokens are out of order. When the shaded boxes line up on the diagonal, the tokens are all in order.
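The table layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Match Studio computes its scores: `similarity` is a stand-in built on the standard library's `difflib`, `best_pairing` is a hypothetical helper name, and the sketch only handles names with the same number of tokens.

```python
from difflib import SequenceMatcher
from itertools import permutations


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; NOT the scorer Match Studio uses."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_pairing(name1: str, name2: str):
    """Return (pairs, in_order) for two names with the same token count."""
    rows = name1.split()  # Name 1 tokens: one per table row
    cols = name2.split()  # Name 2 tokens: one per table column
    if len(rows) != len(cols):
        raise ValueError("this sketch only handles equal token counts")

    # Full grid of scores, laid out like the Match Score Computation table.
    grid = [[similarity(r, c) for c in cols] for r in rows]

    # Try every one-to-one assignment and keep the one with the highest total.
    best = max(
        permutations(range(len(cols))),
        key=lambda perm: sum(grid[i][j] for i, j in enumerate(perm)),
    )
    pairs = [(rows[i], cols[j], round(grid[i][j], 3)) for i, j in enumerate(best)]
    # The selected pairs sit on the diagonal only when the columns are ascending.
    in_order = list(best) == sorted(best)
    return pairs, in_order
```

Calling `best_pairing("Phillip Ward", "Ward Phillip")` pairs each token with its exact counterpart but reports `in_order` as `False`, which corresponds to the off-diagonal case where a penalty applies.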
Under each token is a weight. The weights determine how important each token pair match is in calculating the final score. Unusual tokens are weighted more heavily than common names because a match between them is more significant. Initials are weighted less than full names.
Let’s see what happens when we misspell one of the tokens.
-
Change the full name for Person 2 to “Fillip Ward”.
-
Select Compare.
Now let’s examine how the following have changed:
-
Weight: The weight for Fillip increased to 74%. Fillip is a less common name, so a match on it is more significant.
-
Match Reason: Fillip and Phillip have “HMM_MATCH,” also known as a fuzzy match, which means they are similar strings.
-
Match Score: RMS still matches Fillip with Phillip, but their match score has dropped to 0.661.
The final score for these names is 87%. While not an exact match, this is still considered a match in the current configuration, as indicated by the green text of the match score.
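To get a feel for how weights and match values interact, here is a rough sketch of a plain weighted average. It is an assumption made for illustration, not the formula RMS uses. The function name `weighted_average` is hypothetical, Ward's weight of 26% is assumed to be the remainder after Fillip's 74%, and Ward's match value of 1.0 assumes the unchanged token still matches exactly.

```python
def weighted_average(pairs):
    """pairs: iterable of (weight, match_value); weights need not sum to 1."""
    pairs = list(pairs)
    total_weight = sum(w for w, _ in pairs)
    if total_weight == 0:
        raise ValueError("weights must not all be zero")
    return sum(w * v for w, v in pairs) / total_weight


# Values from the Fillip/Phillip comparison above.
# ASSUMPTION: Ward carries the remaining 26% and still scores 1.0.
estimate = weighted_average([(0.74, 0.661), (0.26, 1.0)])
print(round(estimate, 3))
```

Under these assumptions the estimate comes out near 0.75, below the 87% shown in the Compare tab. This suggests the final score involves more than a simple weighted average, so treat the sketch as an aid to intuition only.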
What happens if one of the names is out of order?
-
Change the full name for Person 2 to “Ward Phillip”.
-
Select Compare.
Now let’s examine how the following have changed:
-
Weight: The weight has returned to the initial 60/40 split for both names.
-
Match Reason: Both token pairs have “MATCH,” indicating an exact match.
-
Match Score: A penalty was applied, lowering the score, because the names were not in the same order.
The final score for these names is 92%.
Next, let’s give Rosette a bigger challenge by misspelling the first and last names, adding an initial, and placing the names out of order.
-
Change the full name for Person 2 to “Wand J. Fillip”.
-
Select Compare.
Now let’s examine how the following have changed:
-
Weight: Fillip has a slightly higher weight at 61% because it is less common than Phillip. The initial J has a low weight of 8%, indicating it might not be an important part of the name. Note how RMS has ignored all punctuation when separating the name into tokens.
-
Match Reason: Both token pairs have “HMM_MATCH,” indicating a fuzzy match. The initial J has “DELETION,” indicating that it does not have a match.
-
Match Score: The first and last name token pairs have middling match scores of 0.506 and 0.413, respectively. The unmatched J has a much lower match score, but since it is only weighted at 8%, it does not bring the final score down too far.
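The note above about punctuation being ignored during tokenization can be illustrated with a minimal sketch. The `tokenize` function is a hypothetical stand-in built on a simple regular expression; the tokenizer RMS actually uses is not described here and is likely more sophisticated.

```python
import re
from typing import List


def tokenize(name: str) -> List[str]:
    """Split a name into tokens, discarding punctuation and whitespace."""
    # \w+ matches runs of letters, digits, and underscores, so the period
    # after an initial is dropped and acts only as a separator.
    return re.findall(r"\w+", name)


tokens = tokenize("Wand J. Fillip")
print(tokens)
```

For “Wand J. Fillip” this yields the three tokens Wand, J, and Fillip. Note that this simple pattern would also split names containing apostrophes or hyphens, which a production tokenizer may handle differently.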
The final score for these names is 67%. This is not high enough to be considered a match in our current configuration (note how it is no longer displayed in green text), but it is close. What if we want RMS to consider these two names a match? We will look at a few ways to accomplish this next.