Model Training Suite Release 1.0.7
July 2023
Versions included in this release:
Rosette Adaptation Studio: 1.0.7
Events Training Server: 1.0.7
REX Training Server: 2.0.9
Rosette Server: 1.26.0
Rosette Adaptation Studio Release 1.0.7
July 2023
New
The task to resolve tentative roles has been moved from the Project Schema page to the Adjudication page. Tentative roles now mark an annotation as needing adjudication. (RAS-1511)
When you resolve a tentative role, a pop-up window now appears in which you can choose an extractor type or reject the tentative role. (RAS-1513)
Adjudication is not complete until all tentative roles have been resolved. (RAS-1514)
We've added a new filter option to the annotation page to view all samples that need adjudication. (RAS-1516)
When you add a document to the project from the Test Document page, it is now assigned to all annotators. (RAS-1506)
The reset_admin.sh script now supports python3. (RAS-1510)
We've made it easier to assign all documents for annotation or adjudication to a user. There is now a checkbox next to the user's name. Check the box to select all documents. (RAS-1562)
Bug Fixes
We fixed a bug where all projects were not loading on the All Projects page. (RAS-1484)
Accuracy scores are now updated when annotated documents are deleted. (RAS-1439)
Roles which appear in multiple events in a single sample are now properly displayed in all events. (RAS-1541)
The samples counter is now displayed properly for users who only have adjudication tasks assigned. (RAS-1536)
You can now annotate multiple events in a single sample. (RAS-1556)
Events Training Server Release 1.0.7
July 2023
This release is for compatibility with Rosette Server. There are no new features or bug fixes.
Known Issues
REX Training Server Release 2.0.9
July 2023
Bug Fixes
Model Training Suite Release 1.0.6.2
June 2023
Versions included in this release:
Rosette Adaptation Studio: 1.0.6
Events Training Server: 1.0.6
REX Training Server: 2.0.0
Rosette Server: 1.25.1
New
Bug Fixes
Model Training Suite Release 1.0.6.1
June 2023
Versions included in this release:
Rosette Adaptation Studio: 1.0.6
Events Training Server: 1.0.6
REX Training Server: 2.0.0
Rosette Server: 1.25.1
Bug Fixes
The script update-rs-for-ets.sh and the associated .properties file have been updated to support the headless installer. (WS-2804)
-
The script update-rs-for-rts.sh and the associated .properties file have been updated to support the headless installer. (WS-2804)
This script only needs to be run when using an on-premise version of Rosette Server or a containerized Rosette Server that was not shipped with Model Training Suite.
Model Training Suite Release 1.0.6
May 2023
Versions included in this release:
Rosette Adaptation Studio: 1.0.6
Events Training Server: 1.0.6
REX Training Server: 2.0.0
Rosette Server: 1.25.1
New
Rosette Adaptation Studio Release 1.0.6
May 2023
New
The view annotations page has been improved for events projects. (RAS-1370)
You can now add comments to events annotations. (RAS-1393)
Multiple annotators can be assigned to each sample for events annotations. (RAS-1412)
Adjudication is now supported for event annotations. (RAS-1476)
You can now annotate link IDs for entities, which enables training of custom knowledge base linking.
We added the force admin to the headless installer, along with a --dry-run option. Running ./install-ras-headless.sh --dry-run validates the properties file, outputs the values that will be used, and exits.
Bug Fixes
Events Training Server Release 1.0.6
May 2023
Bug Fixes
Entity type Temporal:Time is now mapped to Time for backwards compatibility with older models. (WS-2757)
Entity type Identifier:Money is now mapped to Money for backwards compatibility with older models. (WS-2757)
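The two mappings above amount to a small lookup. The sketch below is only an illustration: the function name legacy_entity_type is hypothetical, and passing every other type through unchanged is an assumption, not documented ETS behavior.

```shell
#!/usr/bin/env bash
# Map a namespaced entity type to the legacy name used by older models.
# Only the two mappings named in these notes are included; every other
# type is passed through unchanged (an assumption for illustration).
# legacy_entity_type is a hypothetical helper, not part of ETS.
legacy_entity_type() {
  case "$1" in
    Temporal:Time)    echo "Time" ;;
    Identifier:Money) echo "Money" ;;
    *)                echo "$1" ;;
  esac
}

# Usage:
#   legacy_entity_type Temporal:Time      # prints "Time"
#   legacy_entity_type Identifier:Money   # prints "Money"
```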
Known Issues
REX Training Server Release 2.0.0
May 2023
This release is for compatibility with Rosette Server. There are no new features or bug fixes.
Model Training Suite Release 1.0.5.1
March 2023
Versions included in this release:
Rosette Adaptation Studio: 1.0.5.1
Events Training Server: 1.0.5.1
REX Training Server: 1.0.5.1
Rosette Server: 1.24.1
New
Known Issues
Adjudication is not supported for events. We recommend only having a single annotator for each sample.
You cannot upload ETS model files that were created with older versions of MTS and contain invalid or unknown entity types. If your event model contains invalid or unknown entity types, export the project from Rosette Adaptation Studio and then import the project back into the Studio. A new events project will be created which corrects the entity types. Invalid entity types include TIME and MONEY, which were part of some sample schemas.
Rosette Adaptation Studio Release 1.0.5.1
March 2023
New
Events Training Server Release 1.0.5.1
March 2023
New
REX Training Server Release 1.0.5.1
March 2023
New
Model Training Suite Release 1.0.5
January 2023
Versions included in this release:
Rosette Adaptation Studio: 1.0.5
Events Training Server: 1.0.5
REX Training Server: 1.0.5
Rosette Server: 1.24.1
New
Docker support: Scripts have been updated to determine if docker compose is installed and to use that rather than the deprecated docker-compose command.
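A check of this kind can be sketched as follows. This is not the logic used by the MTS scripts, which these notes do not show; the helper name compose_cmd is hypothetical.

```shell
#!/usr/bin/env bash
# Print the Compose command available on this host.
# Prefers the "docker compose" plugin and falls back to the
# deprecated standalone "docker-compose" binary.
# compose_cmd is a hypothetical helper name, not part of MTS.
compose_cmd() {
  if docker compose version >/dev/null 2>&1; then
    echo "docker compose"
  elif command -v docker-compose >/dev/null 2>&1; then
    echo "docker-compose"
  else
    echo "neither 'docker compose' nor 'docker-compose' is installed" >&2
    return 1
  fi
}

# Usage (commented out so that sourcing this file has no side effects):
#   compose="$(compose_cmd)" || exit 1
#   $compose down && $compose up -d
```

Probing with docker compose version, rather than only checking that docker is on the PATH, also covers hosts where Docker is installed without the Compose plugin.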
Rosette Server install: The Rosette Server conf directory is now exposed outside the docker container so that settings can be customized using instructions from the MTS System Administrator Guide and the Server User Guide.
SSL: Scripts to enable and disable SSL now check the expiration dates of the given keystores and certificates. In addition, the scripts no longer require the user to concatenate the key and certificate; they instead ask for them separately with --cert and --key.
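To run a similar expiration check by hand on a PEM certificate, something like the following should work. It is a sketch, not the check the MTS scripts perform; cert_valid_for_days is a hypothetical helper name, and Java keystores would need a different tool such as keytool.

```shell
#!/usr/bin/env bash
# Succeeds if the PEM certificate in $1 is still valid $2 days from now
# (default 30). cert_valid_for_days is a hypothetical helper name.
cert_valid_for_days() {
  local cert="$1" days="${2:-30}"
  # -checkend N exits non-zero if the certificate expires within N seconds.
  openssl x509 -in "$cert" -noout -checkend "$((days * 86400))" >/dev/null
}

# Usage (server.crt is a placeholder file name):
#   cert_valid_for_days server.crt 30 || echo "server.crt expires within 30 days" >&2
#   openssl x509 -in server.crt -noout -enddate   # print the expiry date
```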
Installation verification: Added a new script verify-rs-configuration-for-ets.sh in ETS that will verify that the Rosette Server installation is set up correctly by validating the rex-factory-config.cfg file.
-
Resource management: We've exposed 3 properties in /rts/rts-docker/.env to control RTS resource consumption:
RTS_CONCURRENT_TRAIN_THREADS=2: The number of training threads that can be running in the system at any one time.
RTS_CONCURRENT_SERIALIZE_THREADS=1: The number of model serialization threads that can be running in the system at any one time.
RTS_CONCURRENT_WORDCLASS_THREADS=2: The number of wordclass generation threads that can be running in the system at any one time.
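These properties are plain KEY=VALUE lines, so they can be changed with any text editor. For scripted changes, a helper along these lines may be useful; set_env_value is a hypothetical name, and the sketch assumes simple values such as integers (no "|" or "&" characters). Docker Compose generally reads .env when containers are created, so the RTS containers would likely need to be recreated for a change to take effect.

```shell
#!/usr/bin/env bash
# Set KEY=VALUE in an env file, replacing an existing assignment or
# appending one if the key is absent.
# set_env_value is a hypothetical helper, not part of MTS.
set_env_value() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)" || return 1
  if grep -q "^${key}=" "$file"; then
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp"
  else
    cat "$file" > "$tmp"
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
  fi
  # Write back through the original file so its permissions are kept.
  cat "$tmp" > "$file" && rm -f "$tmp"
}

# Usage: allow four concurrent training threads instead of the shipped value of 2.
#   set_env_value /rts/rts-docker/.env RTS_CONCURRENT_TRAIN_THREADS 4
```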
Bug Fixes
Known Issues
Adjudication is not supported for events. We recommend only having a single annotator for each sample.
You cannot upload ETS model files that were created with older versions of MTS and contain invalid or unknown entity types. If your event model contains invalid or unknown entity types, export the project from Rosette Adaptation Studio and then import the project back into the Studio. A new events project will be created which corrects the entity types. Invalid entity types include TIME and MONEY, which were part of some sample schemas.
Rosette Adaptation Studio Release 1.0.5
January 2023
New
Browser support: Firefox is now supported. (RAS-460)
-
New annotation filters:
Version information: The Help About window now contains version information for all installed components. (RAS-1259)
Event schema modification: You can now modify the Required flag for roles in existing project schema. (RAS-1284)
Schema categories: When creating schemas, you can now assign them to a category. Categories organize schemas and can make it easier to locate a particular schema. (RAS-1169, RAS-1175)
Link annotation: NER models now support annotating links. (RAS-1202)
Bug Fixes
Exported versions are now sorted correctly when sorted by date. (RAS-1201)
Undo now correctly reverts states when annotating events. (RAS-1190)
Exported schemas now contain the version number. (RAS-817)
Errors are now caught when importing annotated ADM files. (RAS-1266)
Errors are now caught when importing schemas. (RAS-1226)
Samples are now correctly added for validation. Previously, they were always added to training, even when validation was selected. (RAS-1289)
A user can now be added to a project as an adjudicator only; they don't automatically get added as an annotator. (RAS-1263)
Known Issues
Events Training Server Release 1.0.5
January 2023
New
Bug Fixes
Role dataspans are now correct when the document contains emojis. (EDA-231)
Model retrieval with concurrent mutation no longer causes an error. (EDA-205)
Worker info links are now correct in the eureka dashboard. (EDA-230)
REX Training Server Release 1.0.5
January 2023
New
New endpoint: To enable linking annotations in RAS, we added a new endpoint for knowledge base search. Given a search string, the server retrieves a list of Wikidata QIDs that have matching entries in our alias dictionaries and returns their Wikidata titles and descriptions. (TEJ-1851)
Model Training Suite Release 1.0.4
October 2022
Versions included in this release:
Rosette Adaptation Studio: 1.0.4
Events Training Server: 1.0.4
REX Training Server: 1.0.4
Rosette Server: 1.23.0
New
Rosette Adaptation Studio Release 1.0.4
October 2022
New
Duplicate documents: When a document is loaded that is identical to a previously loaded document, an error will now be displayed. (ANST-1082)
Schema validation: Schemas are now validated to verify that all referenced entity types are deployed in the attached deployment of Rosette Server. If an entity type does not exist, an error is returned. (WS-2566)
UI improvements: Many improvements have been made to the user interface, including improvements for working with large documents.
Bug Fixes
The DNN extractor is no longer used in calls to the /entities endpoint; the statistical model is always used. (ANST-1139)
Documents deleted from events projects are no longer used in training. (ANST-1027)
Comments are now downloaded correctly in ADM files. Previously, the ADM files wouldn't download if the annotations contained comments. (ANST-1103)
Known Issues
Events Training Server 1.0.4
October 2022
New
Schema validation: Schemas are now validated to verify that all referenced entity types are deployed in the attached deployment of Rosette Server. If an entity type does not exist, an error is returned. (WS-2566)
Improved accuracy using POS: Part of speech information is now used to help differentiate between event type roles. (EDA-178)
Improved accuracy - negative events: We've improved identification of samples which do not contain an event. (EDA-163)
Improved performance and decreased memory usage: We've improved performance, especially in validation. (EDA-143, EDA-184)
Bug Fixes
Documents deleted from events projects are no longer used in training. (ANST-1027)
Key candidates are no longer removed by overlapping candidate mentions. (EDA-169)
Semantic extractors now also do an exact comparison. (EDA-166)
Concurrent queries are now handled properly to ensure data consistency. (EDA-197)
REX Training Server Release 1.0.4
October 2022
New
Model Training Suite Release 1.0.3.1
September 2022
Versions included in this release:
Rosette Adaptation Studio: 1.0.4.2
Events Training Server: 1.0.3.1
REX Training Server: 1.0.3.1
Rosette Server: 1.22.0
New
Security updates: Components re-released to remove known high or critical vulnerabilities.
Model Training Suite Release 1.0.3
July 2022
Versions included in this release:
Rosette Adaptation Studio: 1.0.3
Events Training Server: 1.0.3
REX Training Server: 1.0.3
Rosette Server: 1.22.0
New
Rosette Adaptation Studio Release 1.0.3
July 2022
New
Initial password: The initial admin password is now entered twice to verify the correct password has been entered. (ANST-920)
Version information: The Help menu now contains an About option which lists the version of the Studio. (ANST-963)
New Project menu: The New Project menu now has two options: Create and Import.
Adjudication reports: The Adjudication reports have been removed from events models. (ANST-1019)
Case sensitive parameter set for events: When using entity extractors for events models the text is always sent as case insensitive. This improves the identification of role candidates. (ANST-1039)
Scripts added: The enable-browser-ras-ssl.sh and disable-browser-ras-ssl.sh scripts have been added to enable and disable SSL on the browser-facing proxy interface of RAS. (ANST-1026)
Bug Fixes
The Help menu is no longer disabled for Annotators and Adjudicators. (ANST-1042)
The View Document button on the Manage page now works properly. (ANST-1051)
The first document added to the system can now be for validation. Previously, the first document would upload if it was for training, but not if it was for validation. (ANST-989)
Correct word class status is now displayed. Previously it displayed Not Ready, even when the word classes were available. (ANST-990)
Only supported languages are available when creating a schema template. (ANST-935)
NER models can now be successfully exported. (ANST-978)
The reset_admin password script now works when SSL is enabled. (ANST-939)
ETS, if enabled, is checked by the ras_healthcheck script. (RAS-983)
Known Issues
Events Training Server 1.0.3
July 2022
New
Improved document processing: Long documents are now processed by sentence boundaries. (EDA-147)
Improved performance: All event analyses, including identifying candidate roles, event extraction, and validation, are faster.
Improved tokenization: Tokens are now taken from the /morphology endpoint. This has improved tokenization, especially in languages such as Korean. Samples may be tokenized differently than in previous releases. (EDA-129)
Java 17 supported: Java 8 is no longer supported.
-
Security updates:
Bug Fixes
REX Training Server Release 1.0.3
July 2022
Bug Fixes
New workspace names can no longer corrupt the workspaces directory. (RQA-387)
A corrupt data file has been repaired enabling Japanese model training.
You can now train entity models in Chinese using the language code zho.
Model Training Suite Release 1.0.1
April 2022
Versions included in this release:
Rosette Adaptation Studio: 1.0.1
Events Training Server: 1.0.2
REX Training Server: 1.0.2
Rosette Server: 1.21.0
New
Rosette Adaptation Studio Release 1.0.1
April 2022
Bug Fix
Events Training Server 1.0.2
April 2022
New
Bug Fixes
Swagger's try me feature now works when running over HTTPS. (WS-2473)
All role mentions in Chinese, Japanese, and Korean are now considered. Previously, if there were two mentions next to each other, one would not be considered.
Minor event extraction improvements have been made for Chinese, Japanese, and Korean.
Model Training Suite Release 1.0.0
April 2022
Versions included in this release:
Rosette Adaptation Studio: 1.0.0
Events Training Server: (Java/Python Servers) 1.0.0
REX Training Server: 1.0.2
Rosette Server: 1.21.0
New
-
Documentation update
The System Administrator Guide has been updated. The Rosette Server install instructions are in the Rosette Server User Guide shipped with the Rosette Server package.
The Adaptation Studio User Guide has been updated to reflect enhancements in the product.
The Developing Models guide has been updated to include guidance on events schemas.
Release notes: The release notes have been restructured. Each server now has its own section.
Rosette Server installation: You can now use an existing, stand-alone installation of Rosette Server, instead of the Docker container. Rosette Server is shipped separately.
Rosette Adaptation Studio Release 1.0.0
April 2022
New
-
Project Management Reports: We've added the following reports to help track the progress of the project:
Annotation progress report
Adjudication progress report
Inter-annotator agreement
IAA history
Import project: You can now import a previously exported project in the Studio. Previously, you had to run a command line script to import a project.
Document Level Annotations: This is a new setting that can be enabled only during initial project configuration. When enabled, annotation is performed on one full document at a time, as opposed to one sample at a time.
Annotation Comments: Annotators can now add a comment to each annotation they make. Adjudicators and managers can see these comments via View Annotations.
Annotator Assignment: A new table of annotators and documents allows managers to edit annotator assignments on a project level. Previously, annotator assignment could only be edited on a document level.
Adjudicator Assignment: A new table of adjudicators and documents allows managers to edit adjudicator assignments on a project level. Previously, adjudicator assignment could only be edited on a document level.
Sample Context Text: When annotating, Rosette Adaptation Studio now displays the full document surrounding each sample in light gray.
Adjudication UI: When adjudicating, Rosette Adaptation Studio now displays a table of agreeing and disagreeing annotations.
View Annotations UI: We've updated the layout of the samples list by separating the sample text, annotations, and further information/actions.
Filters Panel: We've added filters for annotations containing comments and assigned annotator(s).
Semantic extractor match threshold: You can now set the value for the semantic extractor match threshold when creating a new events project. This value can also be changed after the project is created.
User management: Admin Settings has been renamed User Management.
Extract ADM improvements: You can now export ADMs for all uploaded samples, even if they have no annotations.
Bug Fixes
Tentative extractors no longer show up as tentative once resolved.
Undo on the Adjudicate page returns to the last saved state instead of clearing all annotations.
Export ADM no longer overwrites records with the same name.
The key phrase is now deleted when an event type is deleted.
Morphological extractors now correctly extract values that were previously not being extracted.
Multi-token candidates are now captured correctly.
Exported NER model info now contains the correct precision numbers.
When clearing annotations, all selected annotations are now cleared.
Known Issues
Events Training Server Release 1.0.0
April 2022
Bug Fixes
Model file upload now checks that the custom profile is installed.
Endpoints now check for an installed language license.
Multiple events with the same key phrase are no longer extracted.
Events lacking required roles are no longer extracted.
March 2022
This release contains all components of Rosette Model Training Suite.
Versions included in this release:
Rosette Server: 1.21.0
Events Training Server: 0.0.27.0/0.8.12 (Java/Python Servers)
REX Training Server: 1.0.2
Rosette Adaptation Studio: 0.9.5.4
Installation Instructions
This release should be installed in an empty directory. Complete installation instructions are in the System Administrator Guide.
To transfer existing projects and models from a previous release, you will need the old release installation directory and the new release installation directory.
Adaptation Studio project data is stored in the ras/mongo_data_db directory. To transfer existing projects, copy the ras/mongo_data_db directory from the old installation directory to the new ras/mongo_data_db directory.
REX Training Server (RTS) data is stored in the rts/workspaces directory. To transfer existing RTS data, copy the rts/workspaces directory from the old installation directory to the new rts/workspaces directory.
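The two copy steps above can be sketched as a small script. The helper name transfer_mts_data and the example paths are hypothetical, and stopping the servers before copying is an assumption rather than something these notes state; the System Administrator Guide remains the authoritative procedure.

```shell
#!/usr/bin/env bash
# Copy Adaptation Studio and RTS data from an old install to a new one.
# transfer_mts_data is a hypothetical helper; stop the servers first
# (an assumption, not something these notes state).
transfer_mts_data() {
  local old="$1" new="$2" dir
  for dir in ras/mongo_data_db rts/workspaces; do
    if [ ! -d "$old/$dir" ]; then
      echo "skipping $dir: not found under $old" >&2
      continue
    fi
    mkdir -p "$new/$dir"
    # -a preserves file attributes where possible; the trailing "/."
    # copies the directory contents, including hidden files.
    cp -a "$old/$dir/." "$new/$dir/"
  done
}

# Usage (both paths are placeholders):
#   transfer_mts_data /opt/mts-old /opt/mts-new
```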
New
Bug Fixes
The ETS swagger port is now correct in the [Try Me] calls.
The timeout for server-sent events from ETS to RAS was extended to prevent timeouts in long-lived sessions.
The update-rs-configuration.sh script now honors the HTTP scheme of the eventTrainingServerUrl setting in the event-extractor-factory-config.yaml configuration file.
Exact extractors and tentative extractors are now configured correctly when given multiple tokens (words). This includes tentative extractors configured during annotation of documents. Previously created extractors should be recreated if they include multiple tokens.
Duplicate events are no longer extracted from the same key phrase when there are multiple matching extractors.
Accuracy has been improved when evaluating text with uncommon words.
Known Issues
February 2022
This release contains all components of Rosette Model Training Suite. Only Rosette Adaptation Studio has been updated from the 0.9.5.2 release.
Versions included in this release:
Rosette Server: 1.20.4
Events Training Server: 0.0.26.1/0.8.8 (Java/Python Servers)
REX Training Server: 1.0.1
Rosette Adaptation Studio: 0.9.5.3
Installation Instructions
This release should be installed in an empty directory. Complete installation instructions are in the System Administrator Guide.
To transfer existing projects and models from a previous release, you will need the old release installation directory and the new release installation directory.
Adaptation Studio project data is stored in the ras/mongo_data_db directory. To transfer existing projects, copy the ras/mongo_data_db directory from the old installation directory to the new ras/mongo_data_db directory.
REX Training Server (RTS) data is stored in the rts/workspaces directory. To transfer existing RTS data, copy the rts/workspaces directory from the old installation directory to the new rts/workspaces directory.
New
The Adaptation Studio User Guide has been updated to include a note about disabling popup blocking on Chrome to allow multiple files to download.
Named entity extraction (NER) model training is now supported for Hebrew.
Bug Fixes
Known Issues
February 2022
This release is a complete reinstall to upgrade from 0.9.5.0.
New
-
Versions included in this release:
Rosette Server: 1.20.4
Events Training Server: 0.0.26.1/0.8.8 (Java/Python Servers)
REX Training Server: 1.0.1
Rosette Adaptation Studio: 0.9.5.2
Log4j updates: Updated log4j to version 2.17.1 to implement fixes for the vulnerabilities identified in CVE-2021-44832.
-
Enhanced logging for Events Training Server:
-
Application properties now use the standard logger instead of standard out.
2022-01-06 14:04:19.387 ...[omitted]... : *********** ETS Properties ***********
2022-01-06 14:04:19.387 ...[omitted]... : version: 0.0.26.1
2022-01-06 14:04:19.387 ...[omitted]... : build: 2022-01-05 19:18:45
2022-01-06 14:04:19.387 ...[omitted]... : ets.mode: training
2022-01-06 14:04:19.387 ...[omitted]... : ets.ssl.enable-outgoing-ssl: false
2022-01-06 14:04:19.387 ...[omitted]... : ets.ssl.key-store: /ets/certs/mbp-thubb-2915.basistech.net.jks
2022-01-06 14:04:19.387 ...[omitted]... : ets.ssl.key-store file found
2022-01-06 14:04:19.387 ...[omitted]... : ets.ssl.key-store-password: *******
2022-01-06 14:04:19.387 ...[omitted]... : ets.trust-store: /etc/certs/basisca/basistruststore.jks
2022-01-06 14:04:19.387 ...[omitted]... : ets.trust-store file found
2022-01-06 14:04:19.387 ...[omitted]... : ets.trust-store-password: *******
2022-01-06 14:04:19.387 ...[omitted]... : ets.rsUrl (Rosette Server URL): http://memento.basistech.net:8181/rest/v1
2022-01-06 14:04:19.388 ...[omitted]... : ets.pets.connectionTimeoutMS: 60000
2022-01-06 14:04:19.388 ...[omitted]... : ets.pets.readTimeoutMS: 60000
2022-01-06 14:04:19.388 ...[omitted]... : ets.pets.writeBufferSizeKB: 200
2022-01-06 14:04:19.388 ...[omitted]... : ets.pets.minimumVersion: v0.8.7
2022-01-06 14:04:19.388 ...[omitted]... : *********** Done ETS Properties **********************
We've reduced the logging verbosity around worker management.
Configuration information is now sent to the logs at startup.
-
Improved performance:
Event extraction runtime speed and memory usage have been significantly improved for documents containing multiple events.
The number of network calls between components due to eureka registrations has been reduced.
Improved upgrades: The version information for the Events Training Server has been internalized, allowing most upgrades to consist of re-releasing container images only.
-
Events guidance updates: The documentation now provides guidance for events modeling.
For event extraction, you may only need a few hundred training samples. For performance reasons, we recommend a maximum of 1000 training samples. We also recommend a mix of positive and negative samples. A negative sample is one where the key phrase is not an example of an event you'd like to extract. The exact number will depend on how ambiguous the key phrase is; a more ambiguous key phrase will require more negative examples. At the most, 10% of the samples should be negative examples.
Input documents for event extraction should be no larger than 4K characters.
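The limits in this guidance (at most 1000 training samples, at most 10% negative samples, documents of at most 4K characters) are easy to check before training. The sketch below uses hypothetical helper names, and reading "4K" as 4096 rather than 4000 is an assumption.

```shell
#!/usr/bin/env bash
# Check a planned events training set against the guidance above.
# check_sample_mix is a hypothetical helper; the limits come from the text.
check_sample_mix() {
  local positive="$1" negative="$2"
  local total=$((positive + negative))
  if [ "$total" -gt 1000 ]; then
    echo "too many samples: $total (recommended maximum is 1000)"
    return 1
  fi
  # negative/total <= 10% is equivalent to negative*10 <= total (integer-safe).
  if [ $((negative * 10)) -gt "$total" ]; then
    echo "too many negative samples: $negative of $total (more than 10%)"
    return 1
  fi
  echo "ok: $total samples, $negative negative"
}

# Succeeds if the file has no more than 4096 characters. Treating "4K" as
# 4096 is an assumption. wc -m counts characters in the current locale
# (bytes in the C locale).
doc_within_limit() {
  [ "$(($(wc -m < "$1")))" -le 4096 ]
}

# Usage:
#   check_sample_mix 450 50    # prints "ok: 500 samples, 50 negative"
#   doc_within_limit input.txt || echo "input.txt is too long" >&2
```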
Bug Fixes
Chinese, Japanese, and Korean are now working correctly for event extraction.
The unused sanity_check.sh script has been removed from the release. It was no longer relevant.
Spurious errors no longer appear in Adaptation Studio logs.
Events Training Server no longer has occasional NullPointerException errors during worker registration.
The debug log for Events Training Server now contains object descriptions instead of object addresses.
January 2022
New
-
Versions included in this release:
Rosette Server: 1.20.3
Events Training Server: 0.0.26.1/0.8.8 (Java/Python Servers)
REX Training Server: 1.0.1
Rosette Adaptation Studio: 0.9.5.1
-
Enhanced logging for Events Training Server:
-
Application properties now use the standard logger instead of standard out.
2022-01-06 14:04:19.387 ...[omitted]... : *********** ETS Properties ***********
2022-01-06 14:04:19.387 ...[omitted]... : version: 0.0.26.1
2022-01-06 14:04:19.387 ...[omitted]... : build: 2022-01-05 19:18:45
2022-01-06 14:04:19.387 ...[omitted]... : ets.mode: training
2022-01-06 14:04:19.387 ...[omitted]... : ets.ssl.enable-outgoing-ssl: false
2022-01-06 14:04:19.387 ...[omitted]... : ets.ssl.key-store: /ets/certs/mbp-thubb-2915.basistech.net.jks
2022-01-06 14:04:19.387 ...[omitted]... : ets.ssl.key-store file found
2022-01-06 14:04:19.387 ...[omitted]... : ets.ssl.key-store-password: *******
2022-01-06 14:04:19.387 ...[omitted]... : ets.trust-store: /etc/certs/basisca/basistruststore.jks
2022-01-06 14:04:19.387 ...[omitted]... : ets.trust-store file found
2022-01-06 14:04:19.387 ...[omitted]... : ets.trust-store-password: *******
2022-01-06 14:04:19.387 ...[omitted]... : ets.rsUrl (Rosette Server URL): http://memento.basistech.net:8181/rest/v1
2022-01-06 14:04:19.388 ...[omitted]... : ets.pets.connectionTimeoutMS: 60000
2022-01-06 14:04:19.388 ...[omitted]... : ets.pets.readTimeoutMS: 60000
2022-01-06 14:04:19.388 ...[omitted]... : ets.pets.writeBufferSizeKB: 200
2022-01-06 14:04:19.388 ...[omitted]... : ets.pets.minimumVersion: v0.8.7
2022-01-06 14:04:19.388 ...[omitted]... : *********** Done ETS Properties **********************
We've reduced the logging verbosity around worker management.
Configuration information is now sent to the logs at startup.
-
Improved performance:
Event extraction runtime speed and memory usage have been significantly improved for documents containing multiple events.
The number of network calls between components due to eureka registrations has been reduced.
Improved upgrades: The version information for the Events Training Server has been internalized, allowing most upgrades to consist of re-releasing container images only.
-
Events guidance updates: The documentation now provides guidance for events modeling.
For event extraction, you may only need a few hundred training samples. For performance reasons, we recommend a maximum of 1000 training samples. We also recommend a mix of positive and negative samples. A negative sample is one where the key phrase is not an example of an event you'd like to extract. The exact number will depend on how ambiguous the key phrase is; a more ambiguous key phrase will require more negative examples. At the most, 10% of the samples should be negative examples.
Input documents for event extraction should be no larger than 4K characters.
Bug Fixes
Chinese, Japanese, and Korean are now working correctly for event extraction.
The unused sanity_check.sh script has been removed from the release. It was no longer relevant.
Spurious errors no longer appear in Adaptation Studio logs.
Events Training Server no longer has occasional NullPointerException errors during worker registration.
The debug log for Events Training Server now contains object descriptions instead of object addresses.
Upgrade Instructions
Use the following instructions to upgrade from release 0.9.5 to 0.9.5.1.
-
Load the new Rosette Server image. On each machine running Rosette Server, run:
docker load < rosette-server-enterprise-cp-1.20.3.tar.gz
-
Edit the Rosette Server configuration. On each machine running Rosette Server, perform the following steps:
-
Edit the file <install_dir>/rs/rs-docker/.env, updating the value of ROSETTE_SERVER_IMAGE:
# Rosette Server information
ROSETTE_SERVER_IMAGE=rosette/server-enterprise-cp:1.20.3
-
Edit the file <install_dir>/rs/config/com.basistech.ws.transport.embedded.cfg, setting the value of workerThreadCount to 4. This will improve system performance:
# workerThreadCount is the number of threads that are created to do the actual work
# in the embedded local transport, for worker residing in the same machine. Default is 2.
# It is probably best to not go above 2-3x the number of physical cores on the host machine.
workerThreadCount=4
-
Restart the server. From the <install_dir>/rs/rs-docker directory:
docker-compose down
docker-compose up
-
Load the new ETS images. On each machine running ETS, run:
docker load < python-events-training-server-v0.8.8.tar.gz
docker load < events-training-server-0.0.26.1.tar.gz
-
Edit the ETS configuration. On each machine running ETS, perform the following steps:
-
Edit the file <install_dir>/ets/ets-docker/.env, updating the values of ETS_IMAGE and PETS_IMAGE:
ETS_IMAGE=events-training-server:0.0.26.1
PETS_IMAGE=python-events-training-server:v0.8.8
-
Edit the file <install_dir>/ets/config/application.yml to remove the version and build properties from the info.app section. The version information is now contained in the container itself.
info:
  app:
    name: "Rosette Events Training Server"
    description: "Rosette Event Extraction and Training Server"
    version: "x.y.x" ← remove this line
    build: "1234" ← remove this line
-
To always restart the proxy image, edit the file <install_dir>/ets/ets-docker/docker-compose.yml, adding the line restart: always to the proxy service:
proxy:
  restart: always
Note: you must match the indentation of the yml file and use spaces, not tabs.
-
Restart the server. From the <install_dir>/ets/ets-docker directory:
docker-compose down
docker-compose up
-
Verify the new ETS images. In the <install_dir>/ets/ets-docker directory, run:
docker-compose images
The output should be similar to the following (note the updated tags):
Container       Repository                      Tag        Image Id       Size
--------------------------------------------------------------------------------
ets-server_1    events-training-server          0.0.26.1   e02dda278f04   485 MB
pets-worker_1   python-events-training-server   v0.8.8     3fc2c8685380   1.786 GB
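The application.yml edit in the ETS configuration step above can also be scripted. The following is a sketch only: strip_app_version is a hypothetical helper, it assumes block-style YAML indented with spaces, and it removes version and build keys anywhere under the top-level info section rather than parsing the YAML properly.

```shell
#!/usr/bin/env bash
# Print an application.yml with the version and build keys removed from
# the top-level "info" section. strip_app_version is a hypothetical helper
# and assumes block-style YAML indented with spaces.
strip_app_version() {
  awk '
    # A line starting in column 1 opens a new top-level section.
    /^[^ \t#]/ { in_info = ($0 ~ /^info:/) }
    # Inside "info", drop the indented version and build keys.
    in_info && /^[ \t]+(version|build):/ { next }
    { print }
  ' "$1"
}

# Usage: write to a new file, review the diff, then replace the original.
#   strip_app_version application.yml > application.yml.new
#   diff application.yml application.yml.new
```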
December 2021
New
SSL support: SSL is supported for all servers.
Inference renamed to extraction: Inference mode for events has been renamed to extraction mode for clarity, based on feedback from users.
Improved install: The install scripts have been improved, including the addition of installation log files.
Improved model upload script: The script to upload an events model for extraction (ets-upload-model.sh) now checks if the model already exists on the server. If it does, the user can choose whether to replace the model.
New export model info file: An information file is downloaded along with the model when a model is exported. This is for both event and entity models.
Superuser password: We improved the dialogue around setting the RAS superuser password. You are prompted to reset it on the first login.
RAS Password security: Setting of passwords is now more secure.
Single ETS install file: The same ETS install file (install-ets.sh) supports both the training and extraction modes of ETS. The script prompts the user for the mode (training or extraction).
IAA: Inter-Annotator Agreement (IAA) is supported for event annotation.
Help: The RAS help file has been updated.
Versions included in this release:
Rosette Server: 1.20.0
REX Training Server: 1.0.1
Events Training Server: 0.0.25.5
Rosette Adaptation Studio: 0.9.5
Bug Fixes
Samples of 6 words or fewer now return role mentions from entity extractors.
Tentative role extractors are now created as exact extractors. Previously, they were created as morphological extractors. Tentative extractors created for key phrases are still morphological.
Exact extractors now support multiple tokens.
Required roles are enforced when extracting event mentions.
Clone project is now supported for events projects.
When using the plan option to query multiple event models, only the models listed in the plan are queried.
Calls to the /events endpoint on Rosette Server now extract events that use entity extractors from custom profiles.
The event server healthcheck endpoint (/ets/health) reports correct status.
ETS no longer runs out of memory with large extraction documents.
Numerous enhancements and bug fixes have been completed.
Known Issues
Adjudication is not supported for event annotation.
Event models from previous releases cannot be opened in this release. Contact Support for assistance if you have old schema or events models you would like to convert.
The headless install capability of ETS, RTS, RS, and RAS is not yet supported.
The RAS healthcheck script reports a false error message of unknown container on network.
When a sample contains multiple event mentions, the order of the mentions in the sample may affect which mentions are extracted.
November 2021
New
Multiple event models in a single call: The /events endpoint in Rosette Server supports event extraction from a single model, multiple specified models, or all loaded models. This is documented in section 7.3 of the System Administrator Guide.
Custom profile schema support: When creating an events schema template, a custom profile can be selected. This allows a project schema based on the template to use custom entity extractors.
Events metrics: The Project dashboard now displays event metrics (precision, recall, and F1) for events projects.
Documentation enhanced and reorganized: The Model Training Suite documentation set now consists of 3 documents:
Adaptation Studio User Guide
A guide for the managers and annotators using Rosette Adaptation Studio describing how to use the tool to create and maintain projects, annotate and train entity and event extraction models, and create event schemas.
Developing Models
A guide for the system architects and model administrators to aid in defining the modeling strategy and understanding the theory of model training. It includes an explanation of event modeling and how to design an event schema in preparation for training event extraction models, as well as guidelines for gathering and preparing data for model training.
System Administrator Guide
A guide for installing and maintaining both the training and production environments of the Rosette Model Training Suite. Included are instructions for moving trained models from the training environment into the production environment, as well as the documentation for the API calls for entity and event extraction. This includes the content previously in the Deploying models guide.
Versions included in this release:
Rosette Server: 1.19.5
REX Training Server: 1.0.1
Events Training Server: 0.0.25.4
Rosette Adaptation Studio: 0.9.4
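The event metrics introduced in this release (precision, recall, and F1) follow the standard definitions. The sketch below shows how the three values relate to each other. It assumes you already have counts of correct, spurious, and missed event mentions; how Adaptation Studio decides that a predicted mention matches an annotated one is not described in these notes, and the function name is illustrative.

```python
def event_metrics(true_positives, false_positives, false_negatives):
    """Compute precision, recall, and F1 from raw counts.

    true_positives:  predicted mentions that match an annotation
    false_positives: predicted mentions with no matching annotation
    false_negatives: annotated mentions the model did not predict
    """
    predicted = true_positives + false_positives
    annotated = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / annotated if annotated else 0.0
    denominator = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


precision, recall, f1 = event_metrics(true_positives=8, false_positives=2, false_negatives=8)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.3f}")
```

With 8 correct, 2 spurious, and 8 missed mentions, precision is 8/10, recall is 8/16, and F1 is 8/13.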
Bug Fixes
Known Issues
Event models from previous releases cannot be opened in this release. Contact Support for assistance if you have old schema or events models you would like to convert.
Samples of 6 words or fewer do not return role mentions from entity extractors.
In this release, tentative role extractors are created as morphological extractors. In the next release, they will be exact extractors. Tentative extractors created for key phrases will remain morphological.
In this release, exact extractors do not support multiple tokens. Multiple tokens will be supported in the next release.
Event mentions may be extracted when a required role is missing.
Clone project is not currently supported for events projects. To copy an events project, export and import the project.
When using the plan option to query multiple event models, all event models are queried, not just the models listed in the plan.
When a sample contains multiple event mentions, the order of the mentions in the sample may affect which mentions are extracted.
Calls to the /events endpoint on Rosette Server will not extract any events that use entity extractors when using a custom profile.
SSL is not supported in this release.
The headless install capability of ETS, RTS, RS, and RAS is not yet supported.
Inter-Annotator Agreement (IAA) is not supported for event annotation.
Adjudication is not supported for event annotation.
The help file has not been updated.
When extracting entity mentions from a trained model on the production server, the events server healthcheck endpoint (/ets/health) will return a status of DOWN. This does not occur if a single Rosette Server is used for both training and production.
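The /ets/health status mentioned above can be read from a script. The following is a minimal sketch, assuming the endpoint returns a JSON body with a top-level status field such as UP or DOWN. The response shape, the base URL you pass in, and the function names are assumptions, not taken from the product documentation.

```python
import json
from urllib.error import HTTPError
from urllib.request import urlopen


def parse_health(body):
    """Return the upper-cased status from a health response body, or UNKNOWN.

    Assumes a JSON object with a top-level "status" field (an assumption).
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return "UNKNOWN"
    if not isinstance(payload, dict):
        return "UNKNOWN"
    return str(payload.get("status", "UNKNOWN")).upper()


def check_health(base_url, timeout=5.0):
    """Fetch <base_url>/ets/health and return its status string."""
    try:
        with urlopen(f"{base_url}/ets/health", timeout=timeout) as response:
            body = response.read()
    except HTTPError as err:
        # Some servers answer with an error code when unhealthy;
        # the body may still carry the status.
        body = err.read()
    return parse_health(body.decode("utf-8", errors="replace"))


print(parse_health('{"status": "DOWN"}'))
```

Given the known issue above, a DOWN result on a production server that only serves entity models does not by itself indicate a failure.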
September 2021
New
Naming: Rosette Model Training Suite refers to the complete set of annotation, training, and extraction tools. This includes the following products:
Event annotation: Rosette Adaptation Studio now supports annotating event mentions.
Events endpoints: The /events and /events/info endpoints have been added to Rosette Server to support extraction of event mentions.
Events Training Server: An events training server has been added to support training event extraction models.
Events Inference Server: An events inference server has been added to support extracting event mentions in Rosette Server.
Update Rosette Server: When installing events, the Rosette Server installation must be updated using the rs-configuration-update.sh script. The script must be run on both the training and inference instances of Rosette Server.
REX Training Server Modifications and Improvements
New URL: The base URL is now rts instead of model.
Renamed endpoints: All endpoints that started with /rex (training, annotating, etc.) are now under /workspaces/{workspaceId}. Endpoints that took a workspace ID as part of their POST request or as a request parameter now use the route to specify a workspace.
/rts/workspaces/{workspaceId}/train-model replaced /model/rex/train-model
/rts/workspaces/{workspaceId}/generate-wordclasses replaced /model/rex/generate-wordclasses
/rts/workspaces/{workspaceId}/annotate replaced /model/rex/annotate. The language request parameter is no longer necessary.
/rts/workspaces/{workspaceId} replaced /model/rex/{workspaceID}/status
Serialization improvements: The REX Training Server now starts serializing once annotation has paused.
Training sessions: At most 2 training sessions run simultaneously.
Info endpoint: The GET /rts/info/server endpoint now returns the various configuration properties along with the version.
Improved status: The status endpoint for workspaces now includes additional information.
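Client code that still uses the old routes can be updated with a small lookup. The sketch below encodes only the four renames listed above; the function name migrate_path and the example workspace IDs are illustrative, and any other old path is rejected rather than guessed.

```python
import re

# Old routes that did not carry the workspace ID in the path.
_RENAMED = {
    "/model/rex/train-model": "/rts/workspaces/{workspaceId}/train-model",
    "/model/rex/generate-wordclasses": "/rts/workspaces/{workspaceId}/generate-wordclasses",
    "/model/rex/annotate": "/rts/workspaces/{workspaceId}/annotate",
}

# The old status route already had the workspace ID in the path.
_OLD_STATUS = re.compile(r"^/model/rex/(?P<workspace>[^/]+)/status$")


def migrate_path(old_path, workspace_id=None):
    """Translate an old /model/rex/... route to its /rts/workspaces/... form."""
    status_match = _OLD_STATUS.match(old_path)
    if status_match:
        return f"/rts/workspaces/{status_match.group('workspace')}"
    if old_path in _RENAMED:
        if workspace_id is None:
            raise ValueError("workspace_id is required for this route")
        return _RENAMED[old_path].format(workspaceId=workspace_id)
    raise KeyError(f"no known replacement for {old_path}")


print(migrate_path("/model/rex/train-model", workspace_id="ws-123"))
print(migrate_path("/model/rex/ws-123/status"))
```

When migrating annotate calls, remember to drop the language request parameter as well, since it is no longer necessary.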
Documentation enhanced and reorganized: The Model Training Suite documentation set now consists of 4 documents:
Adaptation Studio User Guide
A guide for the managers, adjudicators, and annotators using Rosette Adaptation Studio describing how to use the tool to create and maintain projects, annotate and train entity and event extraction models, and create event schemas.
Developing Models
A guide for the system architects and model administrators to aid in defining the modeling strategy and understanding the theory of model training. It includes an explanation of event modeling and how to design an event schema in preparation for training event extraction models, as well as guidelines for gathering and preparing data for model training.
System Administrator Guide
A guide for installing and maintaining both the training and production environments of the Rosette Model Training Suite. Included are instructions for moving trained models from the training environment into the production environment, as well as the documentation for the API calls for entity and event extraction.
Deploying Models
A guide for Rosette model administrators, discussing how to deploy the models generated by the training suite and how to configure Rosette Server for production use. This includes installing a version of Events Training Server for extracting event mentions using previously trained models.
Versions included in this release:
Rosette Server: 1.19.4
REX Training Server: 1.0.1
Events Training Server: 0.0.24
Rosette Adaptation Studio: 0.9.3
Bug Fixes
Known Issues
Inter-Annotator Agreement (IAA) is not supported for event annotation.
Adjudication is not supported for event annotation.
SSL is not supported in this release.
Event extraction against multiple event models in a single call in Rosette Server is not supported.
All events endpoints and features are supported in English only.
The help file has not been updated.
The User Guide updates for events and new features are still in progress.
The headless install capability of ETS, RTS, RS, and RAS is not yet supported.
November 2020
New
SSL enabled: SSL can now be enabled between all servers. Scripts are provided to enable and disable SSL.
Upgrade script removed: The upgrade-rs-0.8-0.9.sh script has been removed from the installation package. This script only supported upgrading the Rosette Server release when installing 0.9.0.
Rosette Server version: The Rosette Server version is 1.17.5.
Bug Fixes
Known Issues
The healthcheck scripts cannot check connectivity in SSL-enabled environments since they lack access to the cacert, certificates, and keys. The healthcheck scripts can be run after installation but before enabling SSL.
When SSL is initially enabled on Rosette Server, an exception is written to wrapper.log. This exception can be safely ignored; it is caused by a transport rule containing an http route in an SSL-enabled environment.
October 2020
Note
When you reinstall the Rosette Adaptation Studio package, all components are reinstalled. Any customizations you make to configuration files should be saved and reapplied after installation.
New
Word classes status: The manage page now includes the training status of word classes. You can export a model before word classes are available.
Filter improvements: You can now filter on labels in the View Annotations page.
Adjudication counts modified: The Adjudicated field on the Project Dashboard now includes the total samples adjudicated, both manually and auto-adjudicated.
Case-insensitive models: You can now build case-insensitive models.
Help file: The Help file has been updated to match the User Guide.
Restart policy added: The restart policy was added to all services. The default is restart: "no".
Installation Improvements: We've changed the following files in the Rosette Adaptation Studio installation.
Renamed services, adding the ras_ prefix (ras_server, ras_proxy), to distinguish them from other services running on the machine.
Removed the ./config directory from the deployment; it is no longer used.
Improved the docker-compose.yml file by removing incorrect comments and unnecessary volume declarations.
Installing SSL now comments out the unused port rather than deleting the declaration.
Rosette Server version: The Rosette Server version is 1.17.4.
Bug Fixes
Export Model now always downloads the model. Previously the model was not always downloaded as expected.
Installation script no longer creates unused nginx files.
The Rosette Server Entity Extractor no longer modifies the normalized field for mentions that are extracted by a custom processor.
Known Issues
To force a model to retrain (for example, after word classes are completed), you must annotate a couple of samples. Models are trained automatically as you annotate samples.
Usage Note
The ad-suggestions profile deployed with Adaptation Studio is designed for use by the Adaptation Studio server only. Calling the Rosette Server /entities endpoint using the typical calling conventions and specifying the ad-suggestions profile produces unpredictable results. This is because the ad-suggestions profile expects hidden parameters to be passed in the call and for an RTS model to have been trained beforehand. Additionally, the response differs from a typical response from the /entities endpoint.
The ad-suggestions profile can be modified like any other profile to customize the behavior of Adaptation Studio. You'll need to use a separate "testing" profile to test the changes.
Create a new custom profile for testing.
Modify the profile with the customizations (such as gazetteers) you want to implement.
Test the changes with the regular /entities endpoint, specifying the profile.
Once testing is complete, apply the customizations to the ad-suggestions profile.
Refer to Section 10.2 of the Adaptation Studio User Guide for information on how to create a custom profile.
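The testing step above (calling the regular /entities endpoint with a separate testing profile) might look like the following sketch. It only builds the request and does not send it. The profileId request field, the /rest/v1 base path, the localhost:8181 address, and the profile name my-test-profile are all assumptions for illustration; confirm them against the Rosette Server documentation for your installation.

```python
import json
from urllib.request import Request


def build_entities_request(base_url, text, profile_id):
    """Build a POST request for the /entities endpoint using a custom profile.

    The "profileId" field name is an assumption, not taken from these notes.
    """
    payload = {"content": text, "profileId": profile_id}
    return Request(
        url=f"{base_url.rstrip('/')}/entities",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


# Hypothetical server address and profile name.
request = build_entities_request(
    "http://localhost:8181/rest/v1",
    "Sample text to check the new gazetteer entries.",
    "my-test-profile",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To send the request, pass it to urllib.request.urlopen. Do not point this kind of call at the ad-suggestions profile itself, for the reasons given in the usage note.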
October 2020
Upgrade Script
Rosette Server does not require a full install when upgrading from 0.8 to 0.9. Only the roots for the additional languages need to be added; there is no change to the rest of the installation.
To upgrade Rosette Server from 0.8 to 0.9:
Stop Rosette Server
Unzip the file rs-installation-0.9.zip
From the directory rs-installation-0.9, run the upgrade script:
./upgrade-rs-0.8-0.9.sh
Start Rosette Server
New
New languages: Rosette Adaptation Studio now supports Arabic, Chinese, Korean, and Russian in addition to English and Japanese.
Use Basis Training data: You now have the option of using the Basis training data to augment the stock model or to build a model from scratch with just the annotations provided by the Studio. Note that you cannot add new entity types when using the Basis training data.
UI support for case sensitive and case insensitive models: When creating a new project you now select whether the trained model will be case sensitive or case insensitive. Note: you cannot build case insensitive models in this release.
User management improvements: Users can now modify their personal information, including password.
Cache management: Dormant models are now automatically ejected from the REX training server memory.
Healthcheck scripts: Scripts are now provided to check the health of each of the servers.
Status improvements: The Manage page now displays system status, model status, and project status.
Online help available: The User Guide is now available from the Adaptation Studio page.
New labels: Adding and modifying labels has been improved.
Reconcile added: The Finalize task on the project menu has been renamed to Reconcile to more accurately reflect the task performed.
Install improvements: Containers are now zipped, so the install is slightly smaller and the containers load about 20% faster.
Improved error handling: Error handling has been improved in the Rosette Server and REX Training Server installers.
Bug Fixes
Unused labels can now be deleted, even if they were previously in use.
All errors that occur while loading documents are now displayed.
RTS_URL is now updated in the nginx.conf file after install. The value in the .env file is now used.
Known Issues
Case sensitive and case insensitive models: You can only build case sensitive models in this release.
Help file: The section Adjudicate in the online Help does not match the section in the User Guide. The User Guide is the latest version.
REX Training Server log file messages: The log file for the REX Training Server may contain multiple train-model request failed messages. These are not actual failures. RTS ignores new training requests from a project while there is an active training for that project in process and generates this message. RTS is working as expected.
September 2020
New Features
Japanese is now supported.
The newly-trained models can now be used along with the standard REX extractions to select samples and make label predictions.
Rosette Adaptation Studio supports extracting entities by mixing the newly trained model with all other REX extractions, including the standard REX statistical model, gazetteers, and regexes. Which extractions you use depends on your requirements and is configured through the ad-suggestions custom profile; finding the right combination may require experimentation.
Rosette Adaptation Studio now supports model permanence across restarts. If the REX Training Server (RTS) crashes at any point in the annotation process, it can recover by reloading the model from disk back into memory.
Known Issues
The training data for the new model comes only from the annotations provided by the user and does not currently use the existing REX model training data.
We recommend waiting fifteen minutes after the last annotation before downloading the model to ensure all annotations have been incorporated into the training of the model. If not provided sufficient time, the downloaded model may be an earlier version that doesn't include the latest annotations.
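The fifteen-minute recommendation above is easy to check before exporting. The sketch below computes how much of the wait remains. It assumes you track the time of the last annotation yourself; the function and constant names are illustrative and not part of Adaptation Studio.

```python
from datetime import datetime, timedelta

# Wait recommended in the known issue above.
RECOMMENDED_WAIT = timedelta(minutes=15)


def remaining_wait(last_annotation, now, wait=RECOMMENDED_WAIT):
    """Return how long to keep waiting before downloading the model.

    A zero timedelta means the recommended wait has already elapsed.
    """
    elapsed = now - last_annotation
    return max(wait - elapsed, timedelta(0))


last = datetime(2020, 9, 1, 12, 0)
print(remaining_wait(last, datetime(2020, 9, 1, 12, 10)))
print(remaining_wait(last, datetime(2020, 9, 1, 12, 20)))
```

Ten minutes after the last annotation, five minutes of the wait remain; twenty minutes after, none remains.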
August 2020
Supported languages: English.
This release of Adaptation Studio trains new models for REX as annotations are completed, but the newly-trained models are not used by the Active Learning to select samples or make label predictions.
The Export Model option in the Manage page downloads the newly-trained model. Copy this file into Rosette Server to deploy the new model.
The NER-Rosette template is included in this package.