Note
You must be registered as a manager.
The New Project option on the global navigation bar allows managers to either create a new project from scratch or import an existing project to the server.
-
Select New Project in the global navigation bar.
-
Select Create.
-
Enter a name for the project. Choose a name that will be meaningful to you and your annotation team.
-
Select a Template type from the drop-down menu. The templates are predefined and customized for each type of project.
-
Add annotators and adjudicators to the project. Select Add/edit roles and permissions to select from the users in the system. Users must be added to the system before you can add them to a project.
-
Configure the project. Each template type has its own set of configuration options.
-
Select Create. Your new project appears in the project list.
Before starting annotation, complete the project configuration, if necessary.
-
For entity extraction projects, add new labels.
-
For event extraction projects, modify the project schema.
Each template type has its own set of configuration options. In many cases the defaults are acceptable.
Select Reset to set all options back to their default values.
Once you've set the options for your project, select Create to create the project.
The NER-Rosette template is for annotating documents to train models for named entity recognition with Rosette Entity Extractor and Rosette Server.
-
Model language: The language of the samples.
-
Compute IAA only during overnight processing: Inter-annotator agreement (IAA) is a measure of the reliability of the annotators. Calculating IAA can be resource-intensive, so when this option is enabled, IAA is calculated only during overnight processing. This option is disabled by default.
-
Show token boundaries: When enabled, each unannotated span is underlined to make it easier to see the token spans. This option only affects the presentation of the text samples. This option is disabled by default.
-
Mouse full token selection: When enabled, selecting part of a token causes the entire token to be selected. This option is enabled by default.
-
Use Basis training data: When enabled, while training the model, Adaptation Studio includes the Basis-provided training data that was used to train the statistical model shipped with Rosette Entity Extractor. When this option is enabled, new labels cannot be added to the project. This option is enabled by default.
Note
If you are creating a project to train a model to extract new entity types (defining new labels), do not select this option.
Note
The time to train the model when Use Basis training data is enabled may be a few minutes longer than without the extra training data. The time is determined by the number of annotated documents as well as the language.
-
Train case sensitive model: When enabled, the trained model is case-sensitive. This option is enabled by default.
-
Sample type on Ingest: Determines how documents will be divided into samples for annotation after being uploaded.
-
Sentence: Each sentence in a document becomes a sample. This is a good choice for NER projects because it increases the effectiveness of active learning. By dividing documents into many small samples, Adaptation Studio can more easily select the most uncertain sample for annotation, which allows the model to train on "difficult" samples faster.
-
Paragraph: When a section of text in a document is separated from surrounding text by empty rows, that section becomes its own sample. This is a good choice for events projects, since events can span multiple sentences. It also ensures annotators have a reasonable number of natural stopping points.
-
Document: Each document is its own sample. This is a good choice for events projects, since events can span multiple sentences. It also ensures annotators have the complete context for each document.
-
Hide eval predictions: When enabled, Adaptation Studio suppresses annotation suggestions on the samples in the evaluation set to avoid biasing the human annotators. This option only affects suggestions for samples in the evaluation set; Adaptation Studio still displays suggestions for samples in the training set even when this option is enabled. This option is disabled by default.
-
Prioritize partially annotated docs: When enabled, Adaptation Studio prioritizes samples from partially annotated documents, so that each document is completely annotated. This option is disabled by default.
-
Auto project backup limit: The limit on the number of automatic backup versions kept for the project. Each version saved for a project requires resources. This option is set to 5 by default.
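The three Sample type on Ingest choices can be illustrated with a short sketch. This is not how Adaptation Studio itself segments text: the function name `split_into_samples`, the lowercase type names, and the naive punctuation-based sentence rule are all assumptions made for illustration only.

```python
import re

def split_into_samples(text: str, sample_type: str) -> list[str]:
    """Divide one document into samples (illustrative sketch only)."""
    if sample_type == "document":
        # The whole document is a single sample.
        samples = [text]
    elif sample_type == "paragraph":
        # Sections separated by one or more blank lines become samples.
        samples = re.split(r"\n\s*\n", text)
    elif sample_type == "sentence":
        # Naive assumption: a sentence ends at ., !, or ? followed by whitespace.
        samples = re.split(r"(?<=[.!?])\s+", text)
    else:
        raise ValueError(f"unknown sample type: {sample_type}")
    # Drop empty pieces and surrounding whitespace.
    return [s.strip() for s in samples if s.strip()]

doc = "Acme hired Jane Doe. She starts Monday.\n\nThe deal closed in May."
for kind in ("sentence", "paragraph", "document"):
    print(kind, len(split_into_samples(doc, kind)))
```

For this small document, the Sentence type produces the most samples and the Document type the fewest, which is why sentence-level samples give active learning more candidates to choose from.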
Most project configuration options can be changed after the project is created. Open the project menu in the upper-right corner of the project and select Configure to change configuration options. Some options, such as Model language and Use Basis training data, cannot be changed once the project has been created.
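As a quick reference, the NER-Rosette defaults described above can be written down as a plain data structure. The key names, the `update_option` helper, and the language value `"eng"` below are hypothetical; Adaptation Studio does not necessarily store or expose its configuration this way. The sketch also models the rule that some options are locked once the project exists.

```python
# Hypothetical key names; the values mirror the defaults stated in this section.
NER_DEFAULTS = {
    "compute_iaa_overnight_only": False,
    "show_token_boundaries": False,
    "mouse_full_token_selection": True,
    "use_basis_training_data": True,
    "train_case_sensitive_model": True,
    "hide_eval_predictions": False,
    "prioritize_partially_annotated_docs": False,
    "auto_project_backup_limit": 5,
}

# Options named in the text as unchangeable after creation.
# The text says "such as", so this set may not be complete.
LOCKED_AFTER_CREATE = {"model_language", "use_basis_training_data"}

def update_option(config: dict, key: str, value) -> dict:
    """Return a copy of config with one option changed, if that is allowed."""
    if key in LOCKED_AFTER_CREATE:
        raise ValueError(f"{key} cannot be changed after the project is created")
    if key not in config:
        raise KeyError(f"unknown option: {key}")
    return {**config, key: value}

# "eng" is a placeholder language value chosen for this example.
project = {**NER_DEFAULTS, "model_language": "eng"}
project = update_option(project, "show_token_boundaries", True)
```

Because Use Basis training data is both enabled by default and locked after creation, a project that needs new labels has to disable it at creation time.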
The Events-Rosette template is for annotating documents to train models for event extraction with Rosette Server.
-
Model language: The language of the samples.
-
Initial schema template: Select a schema template from the drop-down list box. Once the project has been created, you can modify the schema by selecting Project Schema from the project dashboard menu.
-
Semantic extractor match threshold: The higher the value, the fewer candidates will be identified by a semantic extractor. You may want to experiment with different values to find the right value for your data.
-
Show token boundaries: When enabled, each unannotated span is underlined to make it easier to see the token spans. This option only affects the presentation of the text samples. This option is disabled by default.
-
Mouse full token selection: When enabled, selecting part of a token causes the entire token to be selected. This option is enabled by default.
-
Sample type on Ingest: Determines how documents will be divided into samples for annotation after being uploaded.
-
Sentence: Each sentence in a document becomes a sample. This is a good choice for NER projects because it increases the effectiveness of active learning. By dividing documents into many small samples, Adaptation Studio can more easily select the most uncertain sample for annotation, which allows the model to train on "difficult" samples faster.
-
Paragraph: When a section of text in a document is separated from surrounding text by empty rows, that section becomes its own sample. This is a good choice for events projects, since events can span multiple sentences. It also ensures annotators have a reasonable number of natural stopping points.
-
Document: Each document is its own sample. This is a good choice for events projects, since events can span multiple sentences. It also ensures annotators have the complete context for each document.
Tip
If any events training document exceeds 2,000 characters, select the Sentence or Paragraph sample type. Large documents will impact performance.
-
Hide eval predictions: When enabled, Adaptation Studio suppresses annotation suggestions on the samples in the evaluation set to avoid biasing the human annotators. This option only affects suggestions for samples in the evaluation set; Adaptation Studio still displays suggestions for samples in the training set even when this option is enabled. This option is disabled by default.
-
Prioritize partially annotated docs: When enabled, Adaptation Studio prioritizes samples from partially annotated documents, so that each document is completely annotated. This option is disabled by default.
-
Auto project backup limit: The limit on the number of automatic backup versions kept for the project. Each version saved for a project requires resources. This option is set to 5 by default.
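Before uploading documents to an events project, it can help to check them against the 2000-character guideline from the tip above. The following is a minimal sketch that runs outside Adaptation Studio; the function name `oversized_documents` and the sample file names are made up for the example, and it assumes that Python's character count is a close enough match for how the product measures document size.

```python
EVENTS_DOC_CHAR_LIMIT = 2000  # guideline from the tip in this section

def oversized_documents(documents: dict[str, str],
                        limit: int = EVENTS_DOC_CHAR_LIMIT) -> list[str]:
    """Return the names of documents whose text is longer than the limit."""
    return [name for name, text in documents.items() if len(text) > limit]

# Hypothetical corpus: document name -> document text.
docs = {"short.txt": "A brief report.", "long.txt": "x" * 2500}

too_long = oversized_documents(docs)
if too_long:
    print("Consider the Sentence or Paragraph sample type for:", too_long)
```

If the list is empty, the Document sample type remains a reasonable choice for the corpus.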
Note
You must be registered as a manager.
To import a project, you must have an exported project file.
-
Select New Project in the global navigation bar.
-
Select Import.
-
Drop the project file into the field that appears, or select Browse, choose the file, and select Open.
It may take a few minutes for an imported project to appear in the Project list, especially if it is a large project. The imported project will only be visible in the Project list to superusers and users assigned to the project.
Note
If a user on an exported project does not exist on the destination server, that user is created on the new server when the project is imported.
-
The project owner is created on the new server as a manager.
-
Any adjudicators are created on the new server as adjudicators.
-
Any other users are created on the new server as annotators.
Users created in this way cannot log into the new server until an administrator manually sets their password via the User Management option on the global navigation bar.
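The role assignment rules for imported users can be summarized in a short sketch. The function name `plan_user_creation`, the role strings, and the input shapes are assumptions for illustration; they do not describe an actual Adaptation Studio API.

```python
def plan_user_creation(project_users: dict[str, str],
                       existing_users: set[str]) -> dict[str, str]:
    """Map each user missing from the destination server to the role created.

    project_users maps a username to its role on the exported project:
    "owner", "adjudicator", or anything else (treated as annotator).
    """
    created = {}
    for username, project_role in project_users.items():
        if username in existing_users:
            continue  # already on the destination server; nothing is created
        if project_role == "owner":
            created[username] = "manager"
        elif project_role == "adjudicator":
            created[username] = "adjudicator"
        else:
            created[username] = "annotator"
    return created

# Hypothetical exported project and destination server.
exported = {"ana": "owner", "ben": "adjudicator", "cy": "annotator", "dee": "annotator"}
on_server = {"dee"}
print(plan_user_creation(exported, on_server))
```

Every user returned by this sketch corresponds to an account that still needs a password set before that person can log in.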