Does the Rosette Base Linguistics SDK do term expansion?

Term expansion is the opposite of lemmatization: you start with a lemma such as "get" and produce all the possible forms that reduce to it, such as "got", "getting", etc.

No, we do not do this.

The number of possible forms that any given lemma could expand into is large, especially when you consider obscure forms such as "ungotten" in "The mail remained ungotten." Any attempt to produce an exhaustive list of possibilities leads to huge bloat and lots of red herrings, and it still inevitably misses some cases.

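We generally find that problems for which you hope to use term expansion can be readily adapted to use lemmatization instead, which is far more reliable and efficient. For example, rather than expanding a query term into every form it might take, you can lemmatize both the text and the query term and compare the lemmas. Contact Basis if you would like to work through the problem you are trying to address with an engineer.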
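
As a rough illustration of adapting a search-style problem to lemmatization, the sketch below indexes documents by lemma and lemmatizes the query term before lookup, so no list of expanded forms is ever generated. It is a minimal sketch under stated assumptions: `TOY_LEMMAS`, `lemmatize`, `build_index`, and `search` are hypothetical names invented for this example, and the tiny lookup table stands in for a real lemmatizer. None of it is the Rosette Base Linguistics API.

```python
from collections import defaultdict

# Hypothetical toy lemma table, for illustration only.
# A real system would call a lemmatizer here instead of a dictionary.
TOY_LEMMAS = {
    "got": "get",
    "gets": "get",
    "getting": "get",
    "gotten": "get",
}


def lemmatize(token):
    """Return the lemma for a token, or the lowercased token if unknown."""
    token = token.lower()
    return TOY_LEMMAS.get(token, token)


def build_index(docs):
    """Map each lemma to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            # Strip surrounding punctuation before lemmatizing.
            index[lemmatize(token.strip('.,;:!?"'))].add(doc_id)
    return index


def search(index, query):
    """Lemmatize the query term and look it up; no expansion needed."""
    return sorted(index.get(lemmatize(query), set()))


docs = {
    1: "She got the mail.",
    2: "He is getting better.",
    3: "Nothing to see here.",
}
index = build_index(docs)

print(search(index, "get"))     # [1, 2]
print(search(index, "gotten"))  # [1, 2]
```

Because both sides are reduced to the same lemma, a query for any form of "get" finds documents containing any other form, and the index stores one entry per lemma rather than one per surface form.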
