The type of annotation depends on the project type:
-
Named entity recognition (NER) identifies entities and tags them with labels.
-
Event extraction identifies event mentions, tagging key phrases and roles within the event mention.
Annotation is the process of applying labels to text. Annotated text is used to train the model; you can think of it as the "supervision" portion of supervised learning. Rosette Entity Extractor suggests annotations for each sample, but it is up to human annotators to correct and supplement them. Annotators annotate one sample at a time. The size of each sample is determined by the Sample type on Ingest configuration setting chosen during project creation.
Note
Once the project has been created, the Sample type on Ingest setting cannot be changed.
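As an illustration of what "applying labels to text" produces, the following minimal sketch models an annotated sample as a list of labeled character spans. The Span class, the labeled_text helper, and the PERSON and LOCATION labels are hypothetical names chosen for this example; they are not the format Adaptation Studio or Rosette Entity Extractor actually stores.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One annotation: a label applied to a character range (hypothetical format)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


def labeled_text(sample: str, spans: list[Span]) -> list[tuple[str, str]]:
    """Return (text, label) pairs for every span, in document order."""
    ordered = sorted(spans, key=lambda s: s.start)
    return [(sample[s.start:s.end], s.label) for s in ordered]


sample = "Alice flew to Boston."

# A suggested annotation set, as a tool might propose it (assumed values).
suggested = [Span(0, 5, "LOCATION"), Span(14, 20, "LOCATION")]

# The annotator corrects the first label and keeps the second.
corrected = [Span(0, 5, "PERSON"), suggested[1]]

print(labeled_text(sample, corrected))
```

Storing offsets rather than the selected text keeps repeated words in one sample unambiguous.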
-
Select Annotate in the project navigation bar to start annotating. The Rosette Entity Extractor suggests annotations from the start. As soon as you annotate a sample, the system starts training a new model based on your annotations.
Note
You can ask the engine to suggest annotations only for samples in the training set and not for those in the evaluation set. To disable suggested annotations for the evaluation set, enable the project configuration option Hide Eval Predictions.
-
Select words or phrases in the sentence and then select the correct label for the selected words.
-
To select a word, click anywhere on the word.
-
To select a phrase, highlight the desired text.
Note
To remove a label, select it and then select the x icon to the right of the label name. You can then add a new label.
-
Select Annotate when you are satisfied that the sample is correctly annotated.
Note
If the sample does not contain anything that should be annotated, you must still select Annotate so that the sample status is changed to annotated.
Other options are:
-
Clear Annotations: Erase all tags and start afresh.
Note
If you clear annotations using this button, the sample status will remain as annotated. You must still select Annotate to save the sample with no annotations.
-
Undo: Undo changes made in the sample.
-
Skip for now: Skip this sample and annotate a different sample.
-
Previous: Go back to a previously annotated sample from this session.
To move a sample between the validation and training sets, select the appropriate radio button.
Once you've created your data corpus and loaded the documents, you can review the corpus and its annotations at any point in the annotation process. Select View Annotations to see a detailed view of your work.
Event mentions in text have multiple components with nuanced relationships to each other, making annotating events much more complex than annotating for named entity recognition and extraction. The schema must be defined before starting annotation.
Event annotation is a two-step process:
-
Identify and label the key phrase. This identifies the event type.
-
Identify and label the roles. The set of potential roles depends on the event type.
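The dependency between the two steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the SCHEMA dictionary, the event type and role names in it, and the annotate_event helper are hypothetical and do not reflect how Adaptation Studio stores a schema.

```python
from dataclasses import dataclass, field

# Hypothetical schema: each event type maps to the set of roles it allows.
SCHEMA = {
    "flight_booking": {"origin", "destination"},
    "hotel_booking": {"location", "guest"},
}


@dataclass
class EventMention:
    key_phrase: str
    event_type: str
    roles: dict = field(default_factory=dict)  # role name -> mention text


def annotate_event(key_phrase, event_type, roles, schema=SCHEMA):
    """Build an event mention, rejecting roles the event type does not define."""
    # Step 1: labeling the key phrase fixes the event type.
    if event_type not in schema:
        raise ValueError(f"unknown event type: {event_type}")
    mention = EventMention(key_phrase, event_type)

    # Step 2: only roles defined for that event type may be applied.
    for role, text in roles.items():
        if role not in schema[event_type]:
            raise ValueError(f"role {role!r} is not valid for {event_type}")
        mention.roles[role] = text
    return mention


mention = annotate_event("flight", "flight_booking", {"destination": "Boston"})
print(mention)
```

Passing a role such as guest with the flight_booking type raises ValueError, which mirrors why the schema has to exist before annotation starts.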
Annotators will annotate one sample at a time. The size of each sample is determined by the Sample type on Ingest configuration setting during project creation.
Note
Once the project has been created, the Sample type on Ingest setting cannot be changed.
Select Annotate in the project navigation bar to start annotating. From the start, candidate annotations are displayed based on the extractors defined in the system. Each time you annotate a sample, the model is retrained. If the model can identify an event, candidate annotations are shown. At a minimum, the key phrase is labeled. Roles, if identified, are also labeled.
To annotate a sample where a candidate event type is identified:
-
The key phrase and any candidate role mentions are identified. Mouse over a key phrase or candidate role to see its labels. This also displays the model certainty, which indicates how confident the model is that the candidates are correct.
Note
If you are using a custom entity model to extract roles and Adaptation Studio is not identifying any candidate role mentions, make sure the model is loaded in Rosette Server.
-
Select Annotate to accept the candidates as presented. Otherwise, continue with the next step.
-
Clear a tag or select a different tag by selecting a role mention.
-
Select Annotate once you are satisfied that the event mention is correctly annotated.
If no event mention was identified, you can create one:
-
Select the key phrase. Select the event type from the displayed list.
-
Potential role mentions may be marked with their entity type, shown under the word. Select the role mention and select the role type from the list. You can also select role mentions that don't have a listed entity type.
-
Select Annotate.
Note
To remove a role, select it and then select the x icon to the right of the role name. You can then add a new role.
Note
If the sample does not contain anything that should be annotated, you must still select Annotate so that the sample status is changed to annotated.
Note
When a key phrase or role is not identified as a candidate, but is used as part of the annotation, the extractor is tentatively updated. A manager will review the tentative modifications and choose whether to make these changes a permanent part of the schema.
Other options are:
-
Unfocus event: Removes focus from all events. A sample may contain multiple events, but only one can be in focus at a time.
-
Clear Annotations: Erase all tags and start afresh.
Note
If you clear annotations using this button, the sample status will remain as annotated. You must still select Annotate to save the sample with no annotations.
-
Undo: Undo changes made in the sample.
-
Skip for now: Skip this sample and annotate a different sample.
-
Previous: Go back to a previously annotated sample from this session.
To move a sample between the validation and training sets, select the appropriate radio button.
Once you've created your data corpus and loaded the documents, you can review the corpus and its annotations at any point in the annotation process. Select View Annotations to see a detailed view of your work.
Multiple Event Mentions in a Single Sample
A sample may contain multiple event mentions, but each mention must be annotated individually. Let's look at an example:
My flight to San Diego took 6 hours, but my flight back to Boston only took 5.
In this example, we're extracting flight events. The simplified schema includes a flight_booking event type with origin and destination roles.
There are 2 event mentions in the above sample:
-
flight from Boston to San Diego
-
flight from San Diego to Boston
Each event mention gets its own annotation.
-
Select the first flight. It gets labeled as key. San Diego and Boston have location listed under them.
-
Select San Diego and label it as destination.
-
Unfocus event.
-
Select the second flight and select the flight_booking event type.
-
Select Boston and label it as destination.
-
Select San Diego and label it as origin.
-
Select Annotate.
Both event mention annotations will be saved.
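The walkthrough above can be mirrored in a short Python sketch in which only one event is in focus at a time and role labels always attach to the focused event. The Sample and Event classes and their methods are hypothetical stand-ins for the UI actions, not an Adaptation Studio API, and mentions are stored as plain text for brevity (a real tool would need character offsets to tell the two occurrences of "flight" apart).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    event_type: str
    key_phrase: str
    roles: dict = field(default_factory=dict)  # role name -> mention text


@dataclass
class Sample:
    text: str
    events: list = field(default_factory=list)
    focus: Optional[int] = None  # index of the event in focus, if any

    def select_key(self, key_phrase: str, event_type: str) -> None:
        """Start a new event mention and put it in focus."""
        self.events.append(Event(event_type, key_phrase))
        self.focus = len(self.events) - 1

    def label_role(self, mention: str, role: str) -> None:
        """Attach a role to whichever event is currently in focus."""
        if self.focus is None:
            raise ValueError("no event is in focus")
        self.events[self.focus].roles[role] = mention

    def unfocus(self) -> None:
        """Remove focus so the next key phrase starts a separate event."""
        self.focus = None


sample = Sample(
    "My flight to San Diego took 6 hours, but my flight back to Boston only took 5."
)
sample.select_key("flight", "flight_booking")   # first flight
sample.label_role("San Diego", "destination")
sample.unfocus()
sample.select_key("flight", "flight_booking")   # second flight
sample.label_role("Boston", "destination")
sample.label_role("San Diego", "origin")

for event in sample.events:
    print(event.event_type, event.roles)
```

Because San Diego is a destination in one mention and an origin in the other, keeping roles per event rather than per sample is what lets both annotations coexist.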
Add Comments to Annotations
Comments allow you to pose questions or provide clarifications about the sample being edited to adjudicators and managers. Adjudicators and managers can view the comments for each sample via View Annotations. Comments do not affect model training in any way. To add a comment while annotating, follow these steps:
-
Select any annotation.
-
Select Add Comment.
-
Type desired text in the Comment field.
-
Select the green checkmark icon.
You can edit comments you have made by selecting the commented annotation and selecting Edit Comment. From here you can also delete the comment by selecting the trash can icon.