How should I interpret confidence scores returned from language detection?

The confidence value returned by the SDK, or in the "confidence" JSON value from the Rosette API, is not a traditional confidence measure. It is a score that compares the strength of the candidate results on the same document. One way to quantify the certainty of the top result is to take the ratio of its score to that of the second result. On a strong result, the ratio will be roughly an order of magnitude (about 10).

For example, assume the results of a particular analysis are:

name        iso   score
English     eng   0.04165
French      fra   0.00448
Norwegian   nor   0.00380
Romanian    ron   0.00365
Dutch       nld   0.00362

The ratio of the first score (English) to the second score (French) is 0.04165 / 0.00448 ≈ 9.3. At nearly an order of magnitude, this ratio suggests high confidence in the result.

Contrast an analysis that returns the results:

name         iso   score
Spanish      spa   0.02812
Catalan      cat   0.01524
Portuguese   por   0.01389
Romanian     ron   0.01032
French       fra   0.00934

The ratio of the first score (Spanish) to the second score (Catalan) is 0.02812 / 0.01524 ≈ 1.85. A ratio this close to 1 means the top two candidates are nearly tied, which suggests a much less confident result.
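
The ratio check can be sketched in a few lines of Python. This is only an illustration: the function name `top_score_ratio` is hypothetical and not part of the SDK or the Rosette API, and the input is assumed to be a plain list of the scores from one analysis.

```python
def top_score_ratio(scores):
    """Return the ratio of the highest score to the second-highest score.

    Hypothetical helper for illustration; not part of any SDK.
    """
    ranked = sorted(scores, reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two scores to compute a ratio")
    if ranked[1] <= 0:
        raise ValueError("second score must be positive")
    return ranked[0] / ranked[1]


# Scores copied from the two example analyses above.
english_example = [0.04165, 0.00448, 0.00380, 0.00365, 0.00362]
spanish_example = [0.02812, 0.01524, 0.01389, 0.01032, 0.00934]

print(round(top_score_ratio(english_example), 2))  # roughly 9.3
print(round(top_score_ratio(spanish_example), 2))  # roughly 1.85
```

A ratio near 10 corresponds to the strong result described above, while a ratio near 1 means the top two candidates are close. Any cutoff you choose between the two is an application decision, not something the scores define.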

Normalized confidence scores

For some applications, you may want normalized confidence scores that can be compared across analyses. As a best practice for this case, we recommend taking the first 5 results, summing their scores, and dividing each score by that sum so that the five values add up to 1.0. The resulting figures can be interpreted as confidence scores. Note that this is still not the same as a probability or likelihood -- the likelihood that a result is correct is often much higher than its confidence score.

Take the first example above. If you normalize the results to add up to 1.0, you get:

name        iso   score     confidence
English     eng   0.04165   0.7281
French      fra   0.00448   0.0784
Norwegian   nor   0.00380   0.0664
Romanian    ron   0.00365   0.0638
Dutch       nld   0.00362   0.0633

While 72.8% doesn't sound all that confident, it's much higher than the 7.8% confidence of the second highest result. In practice, this indicates a near certainty that the answer is correct.

Contrast the second example:

name         iso   score     confidence
Spanish      spa   0.02812   0.3657
Catalan      cat   0.01524   0.1981
Portuguese   por   0.01389   0.1806
Romanian     ron   0.01032   0.1341
French       fra   0.00934   0.1215

Here the confidence of the first result, 36.6%, correctly conveys that the result is far from certain.
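
The normalization described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: `normalize_scores` is a hypothetical helper, not an SDK or Rosette API function, and the results are assumed to be available as (ISO code, score) pairs.

```python
def normalize_scores(results, top_n=5):
    """Normalize the top_n scores so that they add up to 1.0.

    `results` is a list of (iso_code, score) pairs. Hypothetical helper
    for illustration; not part of any SDK.
    """
    # Keep only the top_n results, highest score first.
    top = sorted(results, key=lambda pair: pair[1], reverse=True)[:top_n]
    total = sum(score for _, score in top)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    # Divide each score by the sum of the retained scores.
    return [(iso, score / total) for iso, score in top]


# Scores copied from the first example analysis.
english_example = [
    ("eng", 0.04165),
    ("fra", 0.00448),
    ("nor", 0.00380),
    ("ron", 0.00365),
    ("nld", 0.00362),
]

for iso, confidence in normalize_scores(english_example):
    print(f"{iso} {confidence:.4f}")
```

The printed values may differ from the tables in the last digit, presumably because the scores listed in this article are themselves rounded.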

