Rosette Server can support multiple profiles, each with different data domains (such as user dictionaries, regular expression files, and custom models) as well as different parameter and configuration settings. Each profile is defined by its own root directory, so any data or configuration files that live in the root directory of an endpoint can be part of a custom profile.
Using custom profiles, a single endpoint can simultaneously support users with different processing requirements within a single instance of Rosette Server. For example, one user may work with product reviews and have a custom sentiment analysis model they want to use, while another user works with news articles and wants to use the default sentiment analysis model.
Each unique profile in Rosette Server is identified by a string, profileId. The profile is specified when calling the API by adding the profileId parameter, which indicates the set of configuration and data files to be used for that call.
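As a minimal sketch of how a client might attach the profile to a call, the snippet below builds a JSON request body with and without profileId. The field names content and profileId come from the request examples in this document; the helper name build_request_body is hypothetical and not part of any Rosette client library.

```python
import json
from typing import Optional


def build_request_body(content: str, profile_id: Optional[str] = None) -> str:
    """Return a JSON request body, adding profileId only when one is given.

    Hypothetical helper for illustration; it only assembles the JSON text.
    """
    body = {"content": content}
    if profile_id is not None:
        # profileId selects the configuration and data files used for this call.
        body["profileId"] = profile_id
    return json.dumps(body)


text = "The black bear fought the white tiger at London Zoo."
default_body = build_request_body(text)
group1_body = build_request_body(text, profile_id="group1")
```

Omitting profileId leaves the body unchanged, so the same helper covers calls that use the default configuration.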
Custom profiles and their associated data are contained in a <profile-data-root> directory. This directory can be anywhere in your environment; it does not have to be in the Rosette Server install directory.
Table 6. Examples of types of customizable data by endpoint
Endpoint | Applicable data files for custom profile
/categories | Custom models
/entities | Gazetteers, regular expression files, custom models, linking knowledge base
/morphology | User dictionaries
/sentiment | Custom models
/tokens | Custom tokenization dictionaries
Note
Custom profiles are not currently supported for the address-similarity, name-deduplication, name-similarity, and name-translation endpoints.
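A client that routes calls to several endpoints may want to check this information before sending a profileId. The following is a rough sketch only: the dictionary mirrors the examples in Table 6 (which are examples, not an exhaustive list), the set mirrors the note above, and the function names are hypothetical.

```python
# Endpoints that the note lists as not supporting custom profiles.
UNSUPPORTED_ENDPOINTS = frozenset(
    {"address-similarity", "name-deduplication", "name-similarity", "name-translation"}
)

# Example data types from Table 6; the table gives examples, so this is not exhaustive.
CUSTOMIZABLE_DATA = {
    "categories": ["custom models"],
    "entities": [
        "gazetteers",
        "regular expression files",
        "custom models",
        "linking knowledge base",
    ],
    "morphology": ["user dictionaries"],
    "sentiment": ["custom models"],
    "tokens": ["custom tokenization dictionaries"],
}


def normalize(endpoint: str) -> str:
    """Strip whitespace and slashes so '/entities' and 'entities' compare equal."""
    return endpoint.strip().strip("/").lower()


def custom_profiles_unsupported(endpoint: str) -> bool:
    """True only for the endpoints named in the note."""
    return normalize(endpoint) in UNSUPPORTED_ENDPOINTS


def customizable_data(endpoint: str) -> list:
    """Return the example data types from Table 6, or an empty list if none are listed."""
    return list(CUSTOMIZABLE_DATA.get(normalize(endpoint), []))
```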
Setting up Custom Profiles
- Create a directory to contain the configuration and data files for the custom profile.
The directory can have any name and can be anywhere on your server; it does not have to be in the Rosette Server directory structure. This is the profile-data-root.
- Create a subdirectory for each profile, identified by a profileId.
For each profile, create a subdirectory named profileId in the profile-data-root. The profile-path for a profile is profile-data-root/profileId.
- Edit the Rosette Server configuration files to look for the profile directories.
The configuration files are in the launcher/config/ directory. Set the profile-data-root value in these files:
# profile data root folder that may contain profile-id/{rex,tcat} etc
profile-data-root=file:///Users/rosette-users
- Add the customization files for each profile. They may be configuration and/or data files.
When you call the API, add "profileId": "myProfileId" to the body of the call.
{"content": "The black bear fought the white tiger at London Zoo.",
"profileId": "group1"
}
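The directory steps above can be sketched in a few lines. This is an illustration under assumptions, not an official setup script: the helper names are hypothetical, the demonstration runs in a temporary directory rather than a real server location, and the setting line is formatted by simple string concatenation (no percent-encoding of special characters).

```python
import tempfile
from pathlib import Path
from typing import List


def create_profile_dirs(profile_data_root: Path, profile_ids: List[str]) -> List[Path]:
    """Create profile-data-root and one subdirectory per profileId; return the profile paths."""
    paths = []
    for profile_id in profile_ids:
        profile_path = profile_data_root / profile_id  # profile-data-root/profileId
        profile_path.mkdir(parents=True, exist_ok=True)
        paths.append(profile_path)
    return paths


def profile_data_root_setting(absolute_posix_path: str) -> str:
    """Format the profile-data-root line as a file:// URI for an absolute POSIX path."""
    if not absolute_posix_path.startswith("/"):
        raise ValueError("expected an absolute POSIX path")
    return "profile-data-root=file://" + absolute_posix_path


# Demonstrate against a temporary directory so nothing outside it is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "rosette-users"
    created = create_profile_dirs(root, ["group1", "group2"])
    created_names = sorted(p.name for p in created)
    all_exist = all(p.is_dir() for p in created)

setting_line = profile_data_root_setting("/Users/rosette-users")
```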
New profiles are automatically loaded in Rosette Server. You do not have to bring down or restart the instance to add new models or data to Rosette Server.
To add or update models or data, assuming the custom profile root rosette-users and profiles group1 and group2, use one of these approaches:
- Add a new profile with the new models or new data, for example group3.
- Delete the profile and re-add it. Delete group1 and then recreate the group1 directory with the new models and/or data.
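The delete-and-re-add approach might look like the following sketch. The helper replace_profile and the file names old-model.bin and new-model.bin are hypothetical placeholders, and the demonstration uses a temporary directory; how quickly a running server picks up the change is not something this sketch can show.

```python
import shutil
import tempfile
from pathlib import Path


def replace_profile(profile_data_root: Path, profile_id: str, new_contents: Path) -> Path:
    """Delete an existing profile directory and recreate it from new_contents."""
    profile_path = profile_data_root / profile_id
    if profile_path.exists():
        shutil.rmtree(profile_path)  # delete the old profile
    shutil.copytree(new_contents, profile_path)  # re-add it with the new models/data
    return profile_path


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "rosette-users"
    (root / "group1").mkdir(parents=True)
    (root / "group1" / "old-model.bin").write_text("old")  # placeholder file

    staged = Path(tmp) / "staged-group1"
    staged.mkdir()
    (staged / "new-model.bin").write_text("new")  # placeholder file

    updated = replace_profile(root, "group1", staged)
    remaining = sorted(p.name for p in updated.iterdir())
```

Staging the new contents in a separate directory first keeps the window in which group1 is missing as short as a delete followed by a copy.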
The configurations for each endpoint are contained in the factory configuration files. The worker-config.yaml file describes which factory configuration files are used by each endpoint as well as the pipelines for each endpoint. To modify default parameter values or any other configuration values, copy the factory configuration file into the profile path and modify the values.
Example 3. Modifying entities parameters default values
Let's go back to our example with profile-ids of group1 and group2. Group1 wants to modify the default entities parameters, setting entity linking to true and case sensitivity to false. These parameters are set in the rex-factory-config.yaml file.
- Copy the file /config/rosapi/rex-factory-config.yaml to rosette-users/group1/config/rosapi/rex-factory-config.yaml.
- Edit the new rex-factory-config.yaml file as needed. This is an excerpt from a sample file.
# rootDirectory is the location of the rex root
rootDirectory: ${rex-root}
# startingWithDefaultConfiguration sets whether to fill in the defaults with CreateDefaultExtractor
startingWithDefaultConfiguration: true
# calculateConfidence turns on confidence calculation
# values: true | false
calculateConfidence: true
# resolvePronouns turns on pronoun resolution
# values: true | false
resolvePronouns: true
# rblRootDirectory is the location of the rbl root
rblRootDirectory: ${rex-root}/rbl-je
# case sensitivity model defaults to auto
caseSensitivity: automatic
# linkEntities is default true for the Cloud
linkEntities: true
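The copy-then-edit pattern in this example can be sketched as follows. Both helper names are hypothetical, the edit is a plain line-based replacement that only handles unindented key: value lines (it is not a YAML parser), and the demonstration file contains just two of the keys from the excerpt above.

```python
import re
import shutil
import tempfile
from pathlib import Path


def copy_factory_config(server_root: Path, profile_path: Path, name: str) -> Path:
    """Copy config/rosapi/<name> from the server into the same relative path in the profile."""
    source = server_root / "config" / "rosapi" / name
    target = profile_path / "config" / "rosapi" / name
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return target


def set_top_level_scalar(yaml_text: str, key: str, value: str) -> str:
    """Replace the value on an unindented 'key: value' line; raise KeyError if absent."""
    pattern = re.compile(rf"^{re.escape(key)}:.*$", flags=re.MULTILINE)
    new_text, count = pattern.subn(lambda _m: f"{key}: {value}", yaml_text)
    if count == 0:
        raise KeyError(key)
    return new_text


with tempfile.TemporaryDirectory() as tmp:
    server_root = Path(tmp) / "server"
    (server_root / "config" / "rosapi").mkdir(parents=True)
    original = server_root / "config" / "rosapi" / "rex-factory-config.yaml"
    original.write_text("calculateConfidence: true\nlinkEntities: false\n")

    profile_path = Path(tmp) / "rosette-users" / "group1"
    copied = copy_factory_config(server_root, profile_path, "rex-factory-config.yaml")
    copied.write_text(set_top_level_scalar(copied.read_text(), "linkEntities", "true"))

    server_text = original.read_text()
    profile_text = copied.read_text()
```

Only the copy under the profile path changes, so calls without a profileId keep using the server's own file.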
Each profile can include custom data sets. For example, the entities endpoint includes multiple types of data files including regex and gazetteers.
Example 4. Custom regex for the Entities Endpoint
The custom regex file used in this example is named custom-regexes.xml. It is assumed that you have already created the custom regex file as described in the section Supplemental Regexes in the Rosette Entity Extractor Application Developer Guide.
- Copy the file /config/rosapi/rex-factory-config.yaml to rosette-users/group1/config/rosapi/rex-factory-config.yaml.
- Edit the new rex-factory-config.yaml file, setting the dataOverlayDirectory and adding a supplemental regex.
# rootDirectory is the location of the rex root
rootDirectory: ${rex-root}
dataOverlayDirectory: "/Users/rosette-users/group1/rex/data"
supplementalRegularExpressionPaths:
- "/Users/rosette-users/group1/rex/data/regex/eng/accept/supplemental/custom-regexes.xml"
- Add the file custom-regexes.xml to the directory /Users/rosette-users/group1/rex/data/regex/eng/accept/supplemental. This file contains the new regular expressions.
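Because the overlay directory and the supplemental regex path must agree, it can help to derive both from the profile path. The sketch below is an illustration only: overlay_settings is a hypothetical helper, and the regex/<lang>/accept/supplemental layout is taken from the paths used in this example rather than from a formal specification.

```python
from pathlib import PurePosixPath


def overlay_settings(
    profile_path: str, overlay_subdir: str, regex_file: str, lang: str = "eng"
) -> str:
    """Return the two overlay settings as text, with the regex path inside the overlay."""
    overlay = PurePosixPath(profile_path) / overlay_subdir
    regex_path = overlay / "regex" / lang / "accept" / "supplemental" / regex_file
    return "\n".join(
        [
            f'dataOverlayDirectory: "{overlay}"',
            "supplementalRegularExpressionPaths:",
            f'- "{regex_path}"',
        ]
    )


fragment = overlay_settings(
    "/Users/rosette-users/group1", "rex/data", "custom-regexes.xml"
)
print(fragment)
```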
In this example we're going to add the entity types COLOR and ANIMAL to the entities endpoint, using a regex file.
- Create a profile-data-root called rosette-users in the Users directory.
- Create a user with the profileId of user1. The new profile-path is:
/Users/rosette-users/user1
- Edit the Rosette Server configuration files, adding the profile-data-root.
# profile data root folder that may contain app-id/profile-id/{rex,tcat} etc
profile-data-root=file:///Users/rosette-users
- Copy the rex-factory-config.yaml file from /config/rosapi into the new directory:
/Users/rosette-users/user1/config/rosapi/rex-factory-config.yaml
- Edit the copied file, setting the dataOverlayDirectory parameter and adding the path for the new regex file. The overlay directory is a directory shaped like the data directory. The entities endpoint will look for files in both locations, preferring the version in the overlay directory.
dataOverlayDirectory: "/Users/rosette-users/user1/custom-rex/data"
supplementalRegularExpressionPaths:
- "/Users/rosette-users/user1/custom-rex/data/regex/eng/accept/supplemental/custom-regexes.xml"
- Create the file custom-regexes.xml in the /Users/rosette-users/user1/custom-rex/data/regex/eng/accept/supplemental directory.
<regexps>
<regexp type="COLOR">(?i)red|white|blue|black</regexp>
<regexp type="ANIMAL">(?i)bear|tiger|whale</regexp>
</regexps>
- Call the entities endpoint without using the custom profile:
curl -s -X POST \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Cache-Control: no-cache" \
-d '{"content": "The black bear fought the white tiger at London Zoo." }' \
"http://localhost:8181/rest/v1/entities"
The only entity returned is London Zoo:
{
  "entities": [
    {
      "type": "LOCATION",
      "mention": "London Zoo",
      "normalized": "London Zoo",
      "count": 1,
      "mentionOffsets": [
        {
          "startOffset": 41,
          "endOffset": 51
        }
      ],
      "entityId": "T0"
    }
  ]
}
- Call the entities endpoint, adding the profileId to the call:
curl -s -X POST \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "Cache-Control: no-cache" \
-d '{"content": "The black bear fought the white tiger at London Zoo.",
"profileId": "user1"}' \
"http://localhost:8181/rest/v1/entities"
The new colors and animals are also returned:
"entities": [
  {
    "type": "COLOR",
    "mention": "black",
    "normalized": "black",
    "count": 1,
    "mentionOffsets": [
      {
        "startOffset": 4,
        "endOffset": 9
      }
    ],
    "entityId": "T0"
  },
  {
    "type": "ANIMAL",
    "mention": "bear",
    "normalized": "bear",
    "count": 1,
    "mentionOffsets": [
      {
        "startOffset": 10,
        "endOffset": 14
      }
    ],
    "entityId": "T1"
  },
  {
    "type": "COLOR",
    "mention": "white",
    "normalized": "white",
    "count": 1,
    "mentionOffsets": [
      {
        "startOffset": 26,
        "endOffset": 31
      }
    ],
    "entityId": "T2"
  },
  {
    "type": "ANIMAL",
    "mention": "tiger",
    "normalized": "tiger",
    "count": 1,
    "mentionOffsets": [
      {
        "startOffset": 32,
        "endOffset": 37
      }
    ],
    "entityId": "T3"
  },
  {
    "type": "LOCATION",
    "mention": "London Zoo",
    "normalized": "London Zoo",
    "count": 1,
    "mentionOffsets": [
      {
        "startOffset": 41,
        "endOffset": 51
      }
    ],
    "entityId": "T4"
  }
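Before deploying a regex file, it can be useful to preview what its patterns match in a sample sentence. The sketch below parses the custom-regexes.xml content from this example and runs each pattern with Python's re module. This is an approximation only: Rosette's regex engine may differ from Python's, the patterns have no word boundaries so they also match inside longer words, and preview_matches is a hypothetical helper, not a Rosette API.

```python
import re
import xml.etree.ElementTree as ET

# Contents of custom-regexes.xml from the example above.
REGEX_XML = """<regexps>
<regexp type="COLOR">(?i)red|white|blue|black</regexp>
<regexp type="ANIMAL">(?i)bear|tiger|whale</regexp>
</regexps>"""


def preview_matches(regex_xml: str, text: str) -> list:
    """Return (type, mention, start, end) tuples sorted by start offset."""
    found = []
    for node in ET.fromstring(regex_xml).iter("regexp"):
        pattern = re.compile(node.text)
        for match in pattern.finditer(text):
            found.append((node.get("type"), match.group(0), match.start(), match.end()))
    return sorted(found, key=lambda item: item[2])


sentence = "The black bear fought the white tiger at London Zoo."
preview = preview_matches(REGEX_XML, sentence)
for entity_type, mention, start, end in preview:
    print(entity_type, mention, start, end)
```

For this sentence the previewed offsets agree with the startOffset and endOffset values of the COLOR and ANIMAL entities in the response above; the LOCATION entity comes from the default extractor, so it does not appear in the preview.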