Configuring the JVM Heap Size
There is no one-size-fits-all value. The best maximum heap size depends on several factors:
activated endpoints and features
usage pattern
data characteristics such as size (both character and token lengths), language, and genre
Java garbage collector and its settings
Our recommendation is to follow guidance from well-known sources and to experiment with heap settings by testing your own usage of Rosette Server, in order to identify the settings that suit you best.
Please note that setting the max heap to the amount of physical RAM in the system is not recommended. More heap does not always translate to better performance, depending on your garbage collection settings. Also, we require a sufficient amount of free memory for memory-mapped files.
Use this table to estimate the minimum heap required based on your selection of endpoints. Note that endpoints may have implicit code dependencies on other endpoints, so the dependencies' heap needs to be added if they have not been accounted for.
Tip
We recommend setting the initial and max heap to the same value.
Table 1. Per endpoint JVM heap recommendation
Endpoint | Min Heap | Note
language | 0.25GB |
morphology | 1.5GB |
transliteration | 0.5GB |
entities | 1GB | add 1.5GB if morphology is not already enabled
sentiment | 1GB | add 1.5GB if morphology is not already enabled; add 1GB if entities is not already enabled
categories | 1GB | add 1.5GB if morphology is not already enabled
topics | 1.5GB | add 1.5GB if morphology is not already enabled; add 1GB if entities is not already enabled
text-embeddings | 1GB | add 1.5GB if morphology is not already enabled
relationships | 3GB | add 1.5GB if morphology is not already enabled; add 1GB if entities is not already enabled
dependencies | 0.4GB |
name-similarity | 2GB | combined with name-translation
name-translation | 2GB | combined with name-similarity
name-deduplication | 2GB | add 2GB if neither name-similarity nor name-translation is on
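The arithmetic behind Table 1 can be sketched in a few lines of Python. This is only an illustration of one reading of the table, not a Rosette tool: the function name, the dictionaries, and the "name-matching" label for the 2GB shared by name-similarity and name-translation are all hypothetical. Each required unit is counted once, so a dependency's heap is added only if it has not already been accounted for.

```python
# Minimum heap per unit in GB, copied from Table 1.
# "name-matching" is a made-up label for the 2GB that the table lists as
# combined between name-similarity and name-translation.
BASE_HEAP_GB = {
    "language": 0.25,
    "morphology": 1.5,
    "transliteration": 0.5,
    "entities": 1.0,
    "sentiment": 1.0,
    "categories": 1.0,
    "topics": 1.5,
    "text-embeddings": 1.0,
    "relationships": 3.0,
    "dependencies": 0.4,
    "name-matching": 2.0,
    "name-deduplication": 2.0,
}

# Endpoints that map onto the shared name-matching unit.
SHARED_UNIT = {
    "name-similarity": "name-matching",
    "name-translation": "name-matching",
}

# Implicit dependencies, taken from the Note column of Table 1.
REQUIRES = {
    "entities": ["morphology"],
    "sentiment": ["morphology", "entities"],
    "categories": ["morphology"],
    "topics": ["morphology", "entities"],
    "text-embeddings": ["morphology"],
    "relationships": ["morphology", "entities"],
    "name-deduplication": ["name-matching"],
}


def estimate_min_heap_gb(endpoints):
    """Sum the minimum heap of the selected endpoints and their dependencies."""
    needed = set()
    stack = [SHARED_UNIT.get(name, name) for name in endpoints]
    while stack:
        unit = stack.pop()
        if unit in needed:
            continue  # already accounted for, do not add its heap twice
        if unit not in BASE_HEAP_GB:
            raise ValueError(f"unknown endpoint: {unit}")
        needed.add(unit)
        stack.extend(REQUIRES.get(unit, []))
    return sum(BASE_HEAP_GB[unit] for unit in needed)
```

For example, `estimate_min_heap_gb(["entities", "sentiment"])` counts entities, sentiment, and morphology once each. Treat the result as a lower bound to start testing from, not as a final setting.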
On macOS/Linux or Windows:
Edit the file server/conf/wrapper.conf
Modify the value of wrapper.java.maxmemory
With Docker:
Edit the file docker-compose.yaml
Modify the value of ROSETTE_JVM_MAX_HEAP
Setting Rosette to Pre-Warm
To speed up first call response time, Rosette can be pre-warmed by loading data files at startup at the cost of a larger memory footprint.
Most components load their data lazily, meaning that the data required for processing is only loaded into memory when an actual call arrives. This is particularly true for language-specific data. The consequence is that when the very first call with text in a given language arrives at a worker, the worker can take quite a bit of time loading data before it can process the request.
Pre-warming is Rosette's attempt to address the first-call penalty by hitting the worker with text in every licensed language it supports at boot time. Then, when an actual customer request comes in, all data will have already been memory mapped, so you won't experience a first-call delay while the data is loaded. Only languages licensed for your installation will be pre-warmed.
The default is false: pre-warm is not enabled.
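The trade-off between lazy loading and pre-warming can be illustrated with a toy Python model. This is a conceptual sketch only and does not reflect Rosette's internals. The class, the loader function, and the language codes are all hypothetical.

```python
class LazyLanguageData:
    """Toy model of per-language data that is loaded on first use."""

    def __init__(self, loader):
        self._loader = loader   # function: language code -> loaded data
        self._cache = {}
        self.load_count = 0     # how many times the slow loader has run

    def get(self, lang):
        # Lazy path: the first request for a language pays the loading cost.
        if lang not in self._cache:
            self._cache[lang] = self._loader(lang)
            self.load_count += 1
        return self._cache[lang]

    def pre_warm(self, licensed_langs):
        # Pre-warm path: pay the loading cost for every language at startup,
        # trading a larger memory footprint for a fast first request.
        for lang in licensed_langs:
            self.get(lang)


def fake_loader(lang):
    # Stand-in for loading or memory-mapping large data files.
    return f"models for {lang}"
```

After `pre_warm(["eng", "jpn"])`, a later `get("eng")` is served from the cache and the loader does not run again.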
To set Rosette to warm up the worker upon activation:
On macOS/Linux or Windows:
Edit the file /com.basistech.ws.worker.cfg
Set warmUpWorker=true
Tip
When installing on macOS or Linux, Rosette can be set to pre-warm during the installation. Select Y when asked Pre-warm Rosette at startup? You can always change the option by editing the com.basistech.ws.worker.cfg file.
With Docker:
Edit the file docker-compose.yaml
Set ROSETTE_PRE_WARM=true
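For non-Docker installations, the warmUpWorker edit could be scripted. The following is a minimal sketch, assuming the .cfg file consists of plain key=value lines with # comments. The helper name set_cfg_value is hypothetical and is not part of Rosette.

```python
def set_cfg_value(text, key, value):
    """Return cfg text with `key` set to `value`, appending the line if absent."""
    out = []
    found = False
    for line in text.splitlines():
        stripped = line.strip()
        name = stripped.split("=", 1)[0].strip()
        is_setting = "=" in stripped and not stripped.startswith("#")
        if is_setting and name == key:
            out.append(f"{key}={value}")  # replace the existing setting
            found = True
        else:
            out.append(line)              # keep comments and other keys as-is
    if not found:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"
```

Apply it to the contents of com.basistech.ws.worker.cfg, for example `set_cfg_value(text, "warmUpWorker", "true")`, and write the result back to the file. Keep a backup of the original file before overwriting it.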
Configuring the Transport Rules
The transport rules provide a means of mapping an endpoint to a processing URL.
There is a special URL, local:, which routes requests within the JVM, bypassing the overhead associated with a network connection. Generally, our recommendation is to use local: if the worker resides on the same machine and same JVM as the frontend.
Typically, for large-scale deployments, we recommend placing endpoints with a high hit rate on a machine separate from the server’s frontend. The /language endpoint is an exception to this rule. It is called internally on every request that does not have the language preset. However, it does not consume a lot of the server’s resources, so we advise keeping it on local: to minimise the networking overhead.
Tip
When using the single-box monolith Rosette Server deployment, we recommend setting local: as the URL for all licensed endpoints.
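The routing guidance above can be summarised as a small Python sketch. It models the decision only and is not the syntax of the actual transport rules file. The function name and the worker URL in the usage note are hypothetical.

```python
LOCAL = "local:"


def build_rules(licensed_endpoints, remote_urls=None):
    """Map each licensed endpoint to a processing URL, defaulting to local:."""
    remote_urls = dict(remote_urls or {})
    # /language is called on every request without a preset language and is
    # cheap to run, so it stays in-JVM even in a distributed deployment.
    remote_urls.pop("/language", None)
    return {endpoint: remote_urls.get(endpoint, LOCAL) for endpoint in licensed_endpoints}
```

Calling `build_rules(["/language", "/entities"])` models the single-box case, where everything maps to local:. Passing `{"/entities": "http://worker-1:8181"}` as the second argument (a made-up address) models moving a high-traffic endpoint to a separate machine.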
Configuring Worker Threads for HTTP Transport
Multiple worker threads allow requests to be processed in parallel. Generally, we recommend that the number of threads be less than the number of physical cores, or less than the total number of hyperthreads if hyperthreading is enabled.
You can experiment with 2-4 worker threads per core. More worker threads may improve throughput slightly, but typically won't improve latency. The default number of worker threads is 2.
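When planning a tuning run, it can help to list the worker thread counts worth benchmarking. The sketch below derives them from the guidance above: the default of 2, one fewer than the core count, and 2-4 threads per core. The function name is hypothetical, and the right value should come from measuring your own workload.

```python
def candidate_thread_counts(cores, per_core=(2, 3, 4)):
    """Return sorted worker thread counts to benchmark for a given core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    counts = {2}                                # the documented default
    counts.add(max(1, cores - 1))               # fewer threads than cores
    counts.update(cores * n for n in per_core)  # 2-4 threads per core
    return sorted(counts)
```

On a machine with 4 cores this yields 2, 3, 8, 12, and 16. If you pass `os.cpu_count()`, note that it reports logical CPUs (hyperthreads), not physical cores.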
If the URLs for all licensed endpoints are set to local: (not distributed):
Edit the file /launcher/config/com.basistech.ws.transport.embedded.cfg
Modify the value of workerThreadCount
If using transport rules in a distributed deployment on macOS/Linux or Windows:
Edit the file /launcher/config/com.basistech.ws.transport.embedded.cfg
Modify the value of workerThreadCount
Edit the file /launcher/config/com.basistech.ws.worker.cfg
Modify the value of workerThreadCount
If using Docker, only the docker-compose.yaml file must be modified:
Edit the file docker-compose.yaml
Modify the value of ROSETTE_WORKER_THREADS
Setting the Language Parameter
If the language of the input text is known, you can add the language parameter to bypass the language identification step in the processing pipeline, speeding up the processing time and increasing throughput.
Each document endpoint accepts an optional language parameter:
{"content": "your_text_here", "language":"eng"}
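As a minimal sketch, a client could build the request body like this in Python, including the language field only when the language is known. The helper name is hypothetical. Sending the body to a document endpoint is left out because the server address depends on your deployment.

```python
import json


def build_request_body(content, language=None):
    """Build the JSON body for a document endpoint request."""
    body = {"content": content}
    if language:
        # A known language lets the server skip language identification.
        body["language"] = language
    # ensure_ascii=False keeps non-Latin text readable in the payload.
    return json.dumps(body, ensure_ascii=False)
```

Calling `build_request_body("your_text_here", "eng")` produces the same fields as the example above. Omitting the second argument leaves language identification to the server.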
Optimizing the /entities Endpoint
If the data consists of many relatively small individual files, concatenating them will improve throughput. Be aware, however, that this can affect the accuracy of the model. The statistical model includes a consistency feature which reflects a tendency of the model to label recurring tokens with the same type. This may cause entities to be labelled incorrectly when concatenating text samples that don't share the same context.
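One way to concatenate small documents is to batch them greedily up to a size limit. The sketch below is an illustration under assumptions: the function name, the max_chars limit, and the blank-line separator are hypothetical choices, not Rosette settings. Because of the consistency feature described above, it is safest to batch only documents that share the same context.

```python
def batch_documents(docs, max_chars, separator="\n\n"):
    """Greedily join consecutive documents into batches of at most max_chars."""
    batches = []
    current = []
    size = 0
    for doc in docs:
        extra = len(doc) + (len(separator) if current else 0)
        if current and size + extra > max_chars:
            # The next document does not fit: close the current batch.
            batches.append(separator.join(current))
            current = []
            size = 0
            extra = len(doc)
        current.append(doc)
        size += extra
    if current:
        batches.append(separator.join(current))
    return batches
```

A document that is longer than max_chars on its own is kept whole and sent as its own batch.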
Regular Expressions
Regular expressions (regexes) are used for finding entities which follow a strict pattern with a rigid form and infinite combinations, such as URLs and credit card numbers. In the default REX installation the regex files are:
language specific: data/regex/<lang>/accept/regexes.xml, where <lang> is the ISO 639-3 language code
cross-language: data/regex/xxx/accept/regexes.xml
supplemental: data/regex/<lang>/accept/supplemental
Regular expressions can decrease throughput. The /entities endpoint is pre-configured with a set of regular expressions. You can improve performance by removing unused expressions, either by:
moving the files with the unused expressions out of the directory, or
commenting out specific expressions within the file.
The supplemental regular expressions are configured in the rex-factory-config.yaml file. Remove or comment out values from the supplementalRegularExpressionPaths parameter to remove unused supplemental regex files.
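Moving an unused regex file out of the data directory can be scripted so that it is easy to restore later. The following is a minimal sketch, assuming the data/regex/<lang>/accept/ layout listed above. The function name and the disabled_root backup directory are hypothetical.

```python
import shutil
from pathlib import Path


def disable_regex_file(regex_file, disabled_root):
    """Move one regex file out of the data directory, keeping it for restore."""
    regex_file = Path(regex_file)
    # Keep the trailing <lang>/accept/<file> part so the origin stays clear.
    target = Path(disabled_root).joinpath(*regex_file.parts[-3:])
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(regex_file), str(target))
    return target
```

To re-enable the expressions, move the file back to its original location. Only move files whose expressions you are sure your application does not use.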
Disable Entity Linking. By default, entity linking is disabled; enabling it can slow down the response time of Rosette Server.
Disable Pronominal Resolution. By default, pronominal resolution is disabled; enabling it can slow down the response time of Rosette Server.
Disable In-document Coreference. Documents often contain multiple references to a single entity. In-document coreference (indoc coref) chains together all mentions of the same entity. By default, indoc coref is disabled (NULL).