Building and Running the Sample Applications
To build and run the sample applications, you must have the Java SDK (11 or later). To use the Ant build files we provide to build and run the samples, you need Ant (1.7.1 or later) with the JAVA_HOME
environment variable set to the root of your Java SDK. For more information, see http://ant.apache.org.
The source files for these applications and the Ant build file for compiling and running them (build.xml) are located in $BT_ROOT/rlpnc/samples/java.
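The prerequisites above can be checked before invoking Ant. The following is a minimal sketch, not part of the RNI-RNT distribution: check_java_home and check_ant are hypothetical helper names, and the version comparison is left to the reader because the exact output of the version commands varies by vendor.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with RNI-RNT) that check the
# prerequisites described above.

check_java_home() {
    # JAVA_HOME must be set and must contain an executable bin/java.
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "No executable java found under $JAVA_HOME/bin" >&2
        return 1
    fi
    return 0
}

check_ant() {
    # Ant must be on the PATH. Its version is printed so that you can
    # compare it with the 1.7.1 minimum yourself.
    if ! command -v ant >/dev/null 2>&1; then
        echo "ant not found on PATH" >&2
        return 1
    fi
    ant -version
}
```

A typical use would be to run check_java_home && check_ant, then "$JAVA_HOME/bin/java" -version to confirm that the SDK is version 11 or later.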
Tip
The Ant scripts and build files require one input property: bt.arch=$BT_BUILD (bt.arch=amd64-glibc217-gcc48, for example). If you set this property in the script (build.xml), you do not need to include it on the command line.
Table 2. Sample Applications
| Source File | Description |
|---|---|
| AddNamesSample.java | Adds names from a UTF-8 file to an RNI Index. |
| LoadGazetteerSample.java | Loads an XML gazetteer into an RNI Index. |
| IndexQuerySample.java | Submits a series of queries (names) to an index and reports on the results. |
| DistributedTransactionSample.java | Queries an index, deletes the names returned from that index, and adds the names to a second index. The deletions and additions are performed in a single distributed transaction with two-phase commit. |
| MatchNamesSample.java | Determines the similarity of two or more names. |
| MatchPhenomenaSample.java | Demonstrates the different name matching phenomena that RNI supports. |
| AutomatedTranslationSample.java | Translates one or more names. |
| InteractiveTranslationSample.java | Simulates a series of user interactions resulting in the translation of an Arabic name. |
| RNISolrjSample.java | Integrates RNI with Solr to add and query Solr documents with multiple and multivalued name fields. |
| AddressIndexQuerySample.java | Submits a series of queries (addresses) to an index and reports on the results. |
| AddressMatchPhenomenaSample.java | Demonstrates the different address matching phenomena that RNI supports. |
Your License
You must copy the license file you obtained from BasisTech to $BT_ROOT/rlp/rlp/licenses. If the license is not in place, you cannot access any RNI-RNT functionality. The license defines the scope of the activities you may perform with RNI-RNT.
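Because a missing license blocks all functionality, it can be worth confirming that the license directory is populated before you run a sample. The following sketch assumes only the directory named above. license_in_place is a hypothetical helper, and because this section does not name the license file, the check accepts any regular file in that directory.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with RNI-RNT).
# $1: the BT_ROOT directory.
# Succeeds if $1/rlp/rlp/licenses exists and contains at least one file.
# The license file name is not assumed; any regular file counts.
license_in_place() {
    license_dir="$1/rlp/rlp/licenses"
    [ -d "$license_dir" ] &&
        [ -n "$(find "$license_dir" -type f 2>/dev/null | head -n 1)" ]
}
```

For example: license_in_place "$BT_ROOT" || echo "Copy your license file to $BT_ROOT/rlp/rlp/licenses" >&2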
Using the Ant Build Script
Change directory to $BT_ROOT/rlpnc/samples/java and run Ant:
ant -Dbt.arch=$BT_BUILD target
where target is one of the Ant build targets in the following table.
As you create your own applications, you can use the Ant build file as the starting point for establishing your own build procedures.
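The change-directory and run-Ant steps can be wrapped in a small function. This is a sketch under stated assumptions: run_ant_target and the DRY_RUN variable are hypothetical conveniences, BT_ROOT and BT_BUILD are expected to be set in your environment, and the target name you pass must be one that build.xml actually defines.

```shell
#!/bin/sh
# Hypothetical wrapper (not shipped with RNI-RNT).
# $1: the name of an Ant target defined in build.xml.
# With DRY_RUN set, the command is printed instead of executed.
run_ant_target() {
    target="$1"
    if [ -z "${BT_ROOT:-}" ] || [ -z "${BT_BUILD:-}" ]; then
        echo "Set BT_ROOT and BT_BUILD first" >&2
        return 1
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        echo "cd $BT_ROOT/rlpnc/samples/java && ant -Dbt.arch=$BT_BUILD $target"
        return 0
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    (cd "$BT_ROOT/rlpnc/samples/java" && ant "-Dbt.arch=$BT_BUILD" "$target")
}
```

Setting DRY_RUN=1 before the call shows the command that would run, which is a convenient way to confirm the bt.arch value before building.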
RNI Command-Line Interface
RNI includes a Java command-line interface, RNICLI. It is provided in binary form only, in btrlpnc.jar.
Use RNICLI to create RNI indexes, to add names from gazetteers or UTF-8 flat files, to run RNI queries, and to pair-wise match names in a query file against names in an input file.
Use the command-line utility script in rlpnc/samples/java/bin. For Unix, the script is rnicli.sh. For Windows, the script is rnicli.bat. Include the arguments shown below.
Tip
Windows PowerShell users
The Windows PowerShell command line does not automatically use UTF-8 encoding. To use the command-line interface with non-Latin script languages:
- Set your system locale to UTF-8.
- Add this flag to each command: -Dfile.encoding=65001
Table 3. RNICLI Commands
| Description | Command |
|---|---|
| Create a new index | RNICLI create -root -index [-in] [-lang] [-entity] [-threads] [ignoreBadData] [-langOfOrigin] |
| Add names to an existing index | RNICLI add -root -index -in [-lang] [-entity] [-threads] [ignoreBadData] [-langOfOrigin] |
| Query an index | RNICLI query [ignoreBadData] -root -index -query [-report] [-lang] [-entity] [-threads] [ignoreBadData] [-threshold] [-maxToConsider] [-scoreToCheckRestriction] [-langOfOrigin] [-maxToCheck] [-namesToCheckAllowance] [-max] |
| Pair-wise match | RNICLI match -root -in -query [-report] [-lang] [-entity] [-threads] [ignoreBadData] [-langOfOrigin] [-threshold] |
For more information on the command-line interface, including definitions of the arguments, display the help using one of the following commands:
RNICLI -?
RNICLI -help
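The wrapper script to use depends on the platform. The following sketch selects it from the directory and file names given in this section. rnicli_script is a hypothetical helper, and it assumes that a Windows environment is detected through uname output such as CYGWIN, MINGW, or MSYS; it also assumes that the wrapper script passes its arguments through to RNICLI.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with RNI-RNT).
# $1: the BT_ROOT directory.
# Prints the path of the RNICLI wrapper script for the current platform.
rnicli_script() {
    case "$(uname -s)" in
        CYGWIN*|MINGW*|MSYS*) script_name="rnicli.bat" ;;
        *)                    script_name="rnicli.sh" ;;
    esac
    echo "$1/rlpnc/samples/java/bin/$script_name"
}
```

To display the help through the wrapper, you might run "$(rnicli_script "$BT_ROOT")" -help.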
You can also use the Ant build file in $BT_ROOT/rlpnc/samples/java to run the RNI command-line interface.
Change directory to $BT_ROOT/rlpnc/samples/java and run Ant:
ant -Dbt.arch=$BT_BUILD args target
where args provide argument values and target is one of the Ant build targets in the following table. The argument values are described in the RNICLI help file. For any arguments you omit, the Ant build file supplies default values. You can also edit build.xml to change these values.
RNT Command-Line Interface
The RNT command-line interface, RNTCLI, translates names from one text domain to another, as described in Translating Names. For each name in an input file, RNTCLI generates a translation and a confidence score (0 to 1) associated with the translation.
Use the command-line utility script in rlpnc/samples/java/bin. For Unix, the script is rntcli.sh. For Windows, the script is rntcli.bat. Include the arguments shown below.
Tip
Windows PowerShell users
The Windows PowerShell command line does not automatically use UTF-8 encoding. To use the command-line interface with non-Latin script languages:
- Set your system locale to UTF-8.
- Add this flag to each command: -Dfile.encoding=65001
Table 4. RNTCLI Arguments
| Argument | Description |
|---|---|
| root | $BT_ROOT directory; always required |
| in | path to the input file; always required |
| targTr | transliteration for the target language; only required if not specified in the input file |
| entity | entity type of the names in the input file, set to PERSON if omitted |
| srcScr | script for the source, set to the default script if omitted |
| srcLang | language of the names in the input file, statistically guessed if omitted |
| srcTr | transliteration scheme for the source, set to native if omitted |
| targScr | script for the target, set to the default script if omitted |
| targLang | target language, set to eng if omitted |
| threads | number of threads to use for translation, 1 if omitted |
| results | maximum number of results returned, 16384 if omitted |
| tokens | maximum number of tokens to allow in Name data, 10 if omitted |
| report | path to the report file, report to console if omitted |
| langOfOrigin | language of origin, statistically guessed if omitted |
| O= | flags; the valid options are listed after this table |

Valid flag options:
-Oortho=yes|no
-Ostat=yes|no
-Operftrade=normal|fast|careful|precise
-Osegment=yes|no
-OvariantSpelling=yes|no
-Oregion=default|north|south
-OkorGeography=default|northKorean|southKorean
-Onormalize=yes|no
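Several RNTCLI arguments fall back to defaults when omitted, so spelling them out can make a run easier to reproduce. The sketch below builds an argument string with the defaults from Table 4. It rests on an assumption that this section does not confirm: that each argument is passed as a dash-prefixed name followed by a space and its value. Check the RNTCLI help before relying on that syntax. rnt_args and the RNT_* environment variables are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with RNI-RNT).
# $1: the BT_ROOT directory, $2: path to the input file.
# Prints an argument string in which the documented defaults are explicit.
# ASSUMPTION: "-name value" syntax; confirm it with the RNTCLI help.
# Paths that contain spaces are not handled by this sketch.
rnt_args() {
    echo "-root $1 -in $2" \
         "-entity ${RNT_ENTITY:-PERSON}" \
         "-targLang ${RNT_TARG_LANG:-eng}" \
         "-threads ${RNT_THREADS:-1}" \
         "-results ${RNT_RESULTS:-16384}" \
         "-tokens ${RNT_TOKENS:-10}"
}
```

Setting one of the RNT_* variables, for example RNT_THREADS=4, overrides the corresponding default while leaving the others unchanged.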
For more information on the command-line interface, display the help using one of the following commands:
RNTCLI -?
RNTCLI -help
You can also use the Ant script in $BT_ROOT/rlpnc/samples/java to run the RNT command-line interface.
Change directory to $BT_ROOT/rlpnc/samples/java and run Ant:
ant -Dbt.arch=$BT_BUILD args target
where args provide argument values and target is one of the Ant build targets in the following table.