This guide is the system administrator guide for both the training and production environments of the Rosette Model Training Suite.
The training section contains installation instructions for the complete Rosette Model Training Suite. Included components are Rosette Server, Adaptation Studio, REX Training Server, and Events Training Server. Your installation may include one or both training servers.
The production section contains installation instructions for a production environment, as well as how to perform event and entity extraction. Included are instructions for moving trained models from the training environment into the production environment.
This document is one component of the complete Rosette Model Training Suite documentation set. The full set includes:
System Administrator Guide: A guide for installing and maintaining both the training and production environments of the Rosette Model Training Suite. Included are instructions for moving trained models from the training environment into the production environment, as well as the documentation for the API calls for entity and event extraction.
Developing Models: A guide for the system architects and model administrators to aid in defining the modeling strategy and understanding the theory of model training. It includes an explanation of event modeling and how to design an event schema in preparation for training event extraction models, as well as guidelines for gathering and preparing data for model training.
Adaptation Studio User Guide: A guide for the managers, adjudicators, and annotators using Rosette Adaptation Studio describing how to use the tool to create and maintain projects, annotate and train entity and event extraction models, and create event schemas.
Rosette Model Training Suite
Rosette Adaptation Studio is an interactive tool for annotating data. It is part of the Rosette Model Training Suite for training models for Rosette entity and event extraction. The tool uses active learning to guide the annotation process, providing suggestions and choosing samples that will ensure the model converges as rapidly as possible towards the highest quality results. As data is annotated, the model is trained. The trained models are uploaded into your production instance of Rosette Server to perform custom entity and event extraction.
Features of Rosette Model Training Suite include:
Reduced training data requirements
Optimized annotator and project manager experiences
Modular templates supporting different types of projects
Integration with the Rosette linguistic framework
A robust data store capable of managing multiple simultaneous multi-user annotation efforts
Display and search features providing both high-level and deep-dive views of each project’s progress
Accuracy metrics
Automatic model training
Trained custom models for deployment in production installations of Rosette Server
Templates currently available:
The following languages are supported by Model Training Suite for model training and extraction.
Table 1. Supported Languages by Task
| Language | Model Type: NER | Model Type: Events |
|---|---|---|
| Arabic (ara) | ✓ | ✓ |
| Chinese (zho) | ✓ | ✓ |
| Dutch (nld) | ✓ | |
| English (eng) | ✓ | ✓ |
| French (fra) | ✓ | |
| German (deu) | ✓ | ✓ |
| Hebrew (heb) | ✓ | |
| Hungarian (hun) | ✓ | ✓ |
| Indonesian (ind) | ✓ | |
| Italian (ita) | ✓ | |
| Japanese (jpn) | ✓ | ✓ |
| Korean (kor) | ✓ | ✓ |
| Malay, Standard (zsm) | ✓ | |
| Persian (fas) | ✓ | |
| Portuguese (por) | ✓ | |
| Pashto (pus) | ✓ | |
| Russian (rus) | ✓ | ✓ |
| Spanish (spa) | ✓ | |
| Swedish (swe) | ✓ | |
| Urdu (urd) | ✓ | |
| Vietnamese (vie) | ✓ | |
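A language-by-task table like the one above is easy to mirror in a small lookup when scripting project setup. The following is a minimal sketch, not part of the product: the names `SUPPORTED_TASKS`, `supports`, and `languages_for` are hypothetical, and the dictionary holds only a few rows transcribed from Table 1, so extend it with the remaining rows as needed.

```python
from __future__ import annotations

# Hypothetical lookup: an excerpt of Table 1, keyed by the language codes shown there.
SUPPORTED_TASKS: dict[str, set[str]] = {
    "ara": {"ner", "events"},
    "eng": {"ner", "events"},
    "jpn": {"ner", "events"},
    "nld": {"ner"},
    "fra": {"ner"},
    "spa": {"ner"},
}


def supports(language_code: str, task: str) -> bool:
    """Return True if the language is listed for the given task ("ner" or "events")."""
    return task.lower() in SUPPORTED_TASKS.get(language_code.lower(), set())


def languages_for(task: str) -> list[str]:
    """Return the sorted language codes listed for the given task."""
    wanted = task.lower()
    return sorted(code for code, tasks in SUPPORTED_TASKS.items() if wanted in tasks)
```

A call such as `supports("nld", "events")` returns `False` for this excerpt, which can serve as an early check before creating an events project in a language that supports only NER.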
The following languages are supported for NER and event extraction:
Arabic (ara)
Chinese (zho)
English (eng)
Japanese (jpn)
Korean (kor)
Russian (rus)
Model Training Architecture
A complete Rosette Adaptation Studio system installation includes the following major components. All installations must include Rosette Adaptation Studio and Rosette Server. An installation may include one or both of the training servers: REX Training Server and Events Training Server.
Rosette Adaptation Studio: Provides annotation and project management features, as well as user and role management and the project database.
Rosette Server: An on-premises package that provides access to the Rosette text analytics endpoints. Your license determines which endpoints and languages are active in your installation. The entities endpoint is part of the Rosette Entity Extractor (REX), which is deployed through Rosette Server. The suggestions provided for annotation labels are generated by the entities and morphology endpoints.
REX Training Server: Trains entity extraction models and stores the models while training.
Events Training Server: Trains event extraction models and stores event models, both during training and for event extraction in production.
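The composition rules for an installation can be expressed as a short check, which may help when reviewing a deployment plan. This is a minimal sketch under stated assumptions: `check_installation` and the two constant sets are hypothetical names, not part of the product, and treating "one or both" training servers as "at least one" is an interpretation of the text above.

```python
# Components every installation must include, per the description above.
REQUIRED = {"Rosette Adaptation Studio", "Rosette Server"}
# Training servers; assumed here that at least one must be present.
TRAINING_SERVERS = {"REX Training Server", "Events Training Server"}


def check_installation(components):
    """Return a list of problems; an empty list means the plan follows the rules."""
    selected = set(components)
    problems = []
    for name in sorted(REQUIRED - selected):
        problems.append(f"missing required component: {name}")
    if not selected & TRAINING_SERVERS:
        problems.append("no training server selected (assumed: at least one is needed)")
    for name in sorted(selected - REQUIRED - TRAINING_SERVERS):
        problems.append(f"unknown component: {name}")
    return problems
```

For example, a plan listing Rosette Adaptation Studio, Rosette Server, and REX Training Server produces no problems, while a plan listing only Rosette Server is reported as missing Adaptation Studio and a training server.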
The components of the Rosette Model Training Suite are used for two different purposes, training and production.
Training: Annotation and training of entity and event models. The training environment includes:
Production: Using previously trained models to perform entity and event extraction. The production environment includes:
Rosette Server
Events Training Server
The training and production environments can use the same instance of Rosette Server or the two environments can be completely separate. You determine how many physical machines are required based on the size of your models and your organization's requirements. The following diagram shows two possible implementations.