The event schema defines the event types you are extracting. It includes key phrases, roles, role types, and extractors.
Each key phrase and each role has a role type. A role type is made up of one or more extractors. Extractors are reusable components that define the rules and techniques used to identify roles and key phrases.
The supported extractor types are:
- Entity: A list of REX entity types. You can use the standard, pre-defined REX entity types or train a custom model to extract other entity types. To define an entity extractor with custom entity types, the custom model must be loaded in Rosette Server.
- Exact: A list of words or phrases. An exact extractor matches any word or phrase on the list, whether or not it is identified as an entity type. For example, you could have a list of common modes of transportation, including armored personnel carrier and specific types of tanks.
- Morphological: A list of words. When a word is added to this list, it is immediately converted to and stored as its lemma. Any word with the same lemma will match. For example, a morphological extractor for go will match going, went, goes, and gone. This is the only extractor type valid for key phrases.
- Semantic: A list of words or phrases. Any word whose meaning is similar to one of these words will match. For example, an extractor for meeting will match assembly, gathering, and conclave. Rosette uses word vector similarity to identify similar words. While a semantic extractor can be defined by a phrase, it will only identify single words as candidate roles.
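To make the difference between the extractor types concrete, here is a minimal sketch of exact, morphological, and semantic matching. It is an illustration only and not Rosette's implementation: the lemma table, the word vectors, the 0.8 similarity threshold, the case-insensitive comparison, and all function names are hypothetical stand-ins chosen for the example.

```python
import math

# Toy lemma table standing in for a real lemmatizer (hypothetical data).
LEMMAS = {"going": "go", "went": "go", "goes": "go", "gone": "go"}

# Toy word vectors standing in for a real embedding model (hypothetical data).
VECTORS = {
    "meeting": (0.9, 0.1, 0.0),
    "gathering": (0.8, 0.2, 0.1),
    "tank": (0.0, 0.1, 0.9),
}


def lemma(word):
    """Return the lemma of a word, falling back to the lowercased word."""
    w = word.lower()
    return LEMMAS.get(w, w)


def exact_match(token, phrases):
    """Exact: the token must appear on the list (case-insensitive here, by assumption)."""
    return token.lower() in {p.lower() for p in phrases}


def morphological_match(token, words):
    """Morphological: list entries are stored as lemmas, and lemmas are compared."""
    stored = {lemma(w) for w in words}
    return lemma(token) in stored


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_match(token, words, threshold=0.8):
    """Semantic: a single word matches if its vector is close to a listed word's vector."""
    tv = VECTORS.get(token.lower())
    if tv is None:
        return False
    return any(cosine(tv, VECTORS[w]) >= threshold for w in words if w in VECTORS)
```

With this toy data, `morphological_match("went", ["go"])` succeeds because both words share the lemma go, while `exact_match("went", ["go"])` fails because the surface forms differ.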
You cannot modify the schema for a trained model. You can view it through the /events/info endpoint.
GET /events/info
Returns the list of all models currently installed in the system along with the schemas used to create the models.
GET /events/info?workspaceId={wid}
Returns the schema used to create the model, where wid is the workspace identifier for the particular events model.
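The two calls above could be issued from Python roughly as follows. This is a sketch under stated assumptions: the base URL is a placeholder for wherever your Rosette Server is reachable, the helper names `info_url` and `get_info` are hypothetical, and the response is assumed to be JSON.

```python
import json
import urllib.parse
import urllib.request

# Assumption: replace with the address of your own Rosette Server.
BASE_URL = "http://localhost:8181/rest/v1"


def info_url(base_url, workspace_id=None):
    """Build the /events/info URL, optionally scoped to one workspace."""
    url = base_url.rstrip("/") + "/events/info"
    if workspace_id is not None:
        url += "?" + urllib.parse.urlencode({"workspaceId": workspace_id})
    return url


def get_info(base_url, workspace_id=None):
    """Send the GET request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(info_url(base_url, workspace_id)) as resp:
        return json.load(resp)
```

Calling `get_info(BASE_URL)` would request the schemas for all installed models, and `get_info(BASE_URL, "my-workspace")` would request the schema for a single events model, where `my-workspace` is a made-up identifier.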