The Studio is a web-based tool. Your site administrator can provide you with the web address of your installation.
- All Projects: Lists the projects you have access to. Superusers will see all projects on the server. Managers see only the projects they have created. Annotators and adjudicators see the projects they are assigned to.
- New Project: If you are a manager or superuser, you can create new projects.
- Admin Settings: Add, modify, or delete users. Only enabled for superusers.
- Help: Displays this help file in a new tab.
Click on a project to access the project dashboard. The dashboard displays the current state of the project and provides access to project-related tasks. Your user type will determine the information displayed and the tasks you can perform.
Common tasks are accessed from the navigation menu below the project status bar. Less-frequent tasks are accessed from the project (hamburger) menu in the upper right-hand corner of the project dashboard. Some tasks, such as Delete, are only available in the All Projects view.
When you log into Rosette Adaptation Studio, you will be one of the following types of users:
- Superuser: creates users, in addition to performing all other tasks
- Manager: creates and manages new projects
- Adjudicator: reviews annotations by multiple users and settles conflicts when samples are annotated differently
- Annotator: annotates samples from documents in one or more projects
All user types can be annotators.
To add and manage users, select Admin Settings (superuser only).
Once Rosette Adaptation Studio is installed and running, go to Admin Settings to create users.
- Select Login
The initial user is:
- Username: admin
- Password: admin
This user has superuser (administrator) privileges.
- Select Admin Settings
- Select Add user to create an id for each user. Assign a role to each user. You can also change the default superuser id and password.
Each user can edit their personal information, including changing their password.
- Click on your user name in the upper right corner of the Studio.
- Edit any of the fields except for the role. Only the superuser can update an individual's role.
Active learning is a form of supervised machine learning in which the learning algorithm guides the order of samples presented to the annotator. This can lead to significant reductions in annotation time and expense.
Active Learning accelerates the annotation process in two ways:
- It presents the most uncertain sample for annotation, ensuring that the next sample you annotate will provide the most useful annotations.
- The engine provides its best guess or suggestion for an annotation, where possible.
When you upload a document, it is divided into samples for annotation. A document is made up of multiple samples, where each sample is a sentence or paragraph in the document. The active learning module selects the "best" document for annotation. The best document is the one which has the lowest certainty for predicting annotations.
Active learning ensures that the samples containing unique or uncommon terms are annotated early in the process. Each time a sample is completed, the engine looks for the sample with the lowest confidence, that is, the sample that is most unfamiliar to the system. Selecting the unfamiliar samples prevents the results from being skewed toward more common, well-represented terms.
Annotators must carefully review suggested annotations, especially early in the annotation process, when the suggestions are more error-prone. You may want to suppress these suggestions to avoid biasing the human annotators. Set the configuration option Hide Eval Predictions to suppress suggested annotations for samples in the evaluation set. Suggestions will still be displayed for the training set.
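The selection step described above can be illustrated with a least-confidence sketch. This is not the Studio's implementation: the function name `pick_next_sample`, the sample ids, and the confidence scores are all hypothetical, and how the engine actually scores samples is not described in this guide.

```python
from typing import Dict, Optional, Set


def pick_next_sample(confidence: Dict[str, float], annotated: Set[str]) -> Optional[str]:
    """Return the id of the unannotated sample with the lowest model confidence.

    `confidence` maps a (hypothetical) sample id to the model's confidence
    in its own prediction for that sample, between 0.0 and 1.0.
    """
    # Only samples that have not been annotated yet are candidates.
    candidates = {sid: score for sid, score in confidence.items() if sid not in annotated}
    if not candidates:
        return None  # nothing left to annotate
    # The least confident sample is the most unfamiliar one.
    return min(candidates, key=candidates.get)


# Illustrative values only.
confidence = {"s1": 0.93, "s2": 0.41, "s3": 0.67}
annotated = {"s2"}
print(pick_next_sample(confidence, annotated))
```

After each completed sample, the scores would be refreshed and the function called again, so the order of presentation follows the model's current uncertainty.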
Project Types and Templates
Rosette Adaptation Studio can support different types of projects. Each project type has a predefined template, customized for the type of information required for its tasks. Your installed system may not contain all templates. Currently, only the NER-Rosette template is available.
- NER-Rosette: This is the standard template for Named Entity Recognition (NER) projects. It uses active learning to select samples to annotate and predict labels.
Entities are the key actors in your free-form text data: the people, locations, products, and organizations mentioned in your documents. Named entity recognition (NER) evaluates text and extracts entities.
Each entity has a label, indicating the type of entity. The standard labels in the NER-Rosette template are:
Not all supported languages include all standard labels.
You can add additional labels when you are training a new model, as long as you did not select Use Basis Training Data when the model was created.
The project dashboard provides a quick view of the project status. It displays several key metrics to give a high-level view of the status of the annotation effort. The metrics displayed are determined by the user's assigned tasks. Managers and superusers see all fields; annotators and adjudicators see a subset.
- Completed: (manager and superuser) This field displays how many samples have been annotated out of the total samples requiring annotation. It counts sample/annotator pairs, so if five samples have each been assigned to two annotators, the total work is ten.
- Adjudicated: (manager and superuser) This field displays how many samples have been adjudicated out of all the samples. This includes auto-adjudicated samples in addition to the samples requiring manual adjudication.
- Annotators: The total number of samples assigned to you in the project. That is, the total count of all the sentences found in all the documents that have been assigned to you.
- Managers and superusers: The total number of samples
- Labeled: (Visible to users with annotation assignments) The number of samples assigned to you that you have completed.
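The sample/annotator pair counting behind the Completed field can be worked through in a few lines. This is a sketch of the arithmetic only; the data structures, ids, and the function name `completion_progress` are hypothetical and do not reflect how the Studio stores assignments.

```python
from typing import Dict, List, Set, Tuple


def completion_progress(
    assignments: Dict[str, List[str]],
    completed: Set[Tuple[str, str]],
) -> Tuple[int, int]:
    """Return (completed pairs, total pairs) for sample/annotator assignments."""
    # Each (sample, annotator) assignment is one unit of work.
    pairs = {(sid, ann) for sid, annotators in assignments.items() for ann in annotators}
    return len(pairs & completed), len(pairs)


# Five samples, each assigned to the same two annotators: ten units of work.
assignments = {f"sample-{i}": ["ann_a", "ann_b"] for i in range(1, 6)}
completed = {("sample-1", "ann_a"), ("sample-1", "ann_b"), ("sample-2", "ann_a")}

done, total = completion_progress(assignments, completed)
print(f"Completed: {done} / {total}")  # Completed: 3 / 10
```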
Precision, Recall, F1 Measure: These are only displayed once the model has started to learn. They are typically ready once about twenty samples in the validation set have been annotated.
The precision, recall, and F1 measures are based only on the samples in the validation set. The values are not calculated for the training set samples.
The values displayed on the project dashboard are calculated using the annotated validation data as the gold data. As the model is trained, it generates new suggestions and the scores are recalculated. The suggestions generated by the model for the validation samples are compared with the annotated values in the samples.
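The comparison of model suggestions against the annotated gold data can be sketched with the standard definitions of precision, recall, and F1. The guide does not state how the Studio matches entities, so this example assumes an exact match on start offset, end offset, and label; the label names and the function name `precision_recall_f1` are illustrative.

```python
from typing import Set, Tuple

# Assumed representation of an entity: (start offset, end offset, label).
Span = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Score predicted entities against gold entities using exact matching."""
    true_positives = len(gold & predicted)
    # Precision: the share of predicted entities that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: the share of gold entities that were found.
    recall = true_positives / len(gold) if gold else 0.0
    # F1: the harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative data: one prediction is correct, one has the wrong label,
# and one gold entity is missed entirely.
gold = {(0, 5, "PERSON"), (10, 16, "LOCATION"), (20, 27, "ORGANIZATION")}
predicted = {(0, 5, "PERSON"), (10, 16, "ORGANIZATION")}

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")  # precision=0.50 recall=0.33 f1=0.40
```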
View Annotations provides a detailed view of annotations and adjudication.
Once you've created your data corpus and loaded the documents, you can review the corpus and the annotations at any point in the annotation process. Select View Annotations to see a detailed view of your work.
All samples in the project are listed. The Filters panel allows you to filter the list by:
- Annotator, selecting specific annotator(s) (manager only)
- Annotation status
- Adjudication status (manager only)
- Text, filtering for specific terms within the document text. The text field supports regular expression searches.
- Document id
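As an illustration of what a regular expression search in the Text filter does, the sketch below filters a list of samples with Python's `re` module. The guide does not say which regular expression dialect the Studio uses, so treat the pattern syntax as an assumption; the sample records and the function name `filter_samples` are hypothetical.

```python
import re
from typing import Dict, List


def filter_samples(samples: List[Dict[str, str]], pattern: str) -> List[Dict[str, str]]:
    """Keep the samples whose text contains a match for the pattern."""
    regex = re.compile(pattern)
    return [sample for sample in samples if regex.search(sample["text"])]


# Illustrative samples only.
samples = [
    {"doc_id": "doc-1", "text": "Acme Corp. opened an office in Boston."},
    {"doc_id": "doc-2", "text": "The meeting was moved to Tuesday."},
    {"doc_id": "doc-3", "text": "Acme Inc. hired a new director."},
]

# Match "Acme Corp." or "Acme Inc." anywhere in the sample text.
matches = filter_samples(samples, r"Acme (Corp|Inc)\.")
print([sample["doc_id"] for sample in matches])  # ['doc-1', 'doc-3']
```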
Select a column header to sort the results by that column. Shift-Select another column header to add it to the sort.
For each sample, you will see:
- The original text, marked up with its annotations
- The id of the source document
- The annotators who have worked on this sample, and those who were assigned but have not yet completed the task (manager only)
- The sample state (annotated or not annotated)
- The sample’s dataset (training or evaluation)
If multiple annotators have been assigned to a sample’s document, you can see their status in the Annotators and Pending columns of the table. If annotators have created differing results, their work will be listed below the main result. You can click on any of these annotated texts to accept or adjudicate a result.
If there is a prediction for the sample, that will also be displayed.
Select Test Document on the action bar to enter short text and see the current model's predictions for the text.