RNI includes plugins for Solr 7.6, Solr 8.11.1, and Solr 9.0.0 that support the use of RNI with Solr documents that contain names, addresses, and dates along with other data. The plugins support single-valued and multi-valued name, address, and date fields. With them you can run Solr queries against documents that include, but are not limited to, name, address, and date fields.
Getting Started with the Solr Plugin
To index and search documents with RNI in a Solr application, you must add JARs to the Solr classpath, add the name, address, and date fields to schema.xml, and modify solrconfig.xml.
Placing the Solr Plugin Jar
The Solr plugin JAR should be in your Solr sharedLib directory.
JAR files used by all of the cores in your Solr application (including bt-rni-solr<version>-plugin.jar) should be placed in a sharedLib directory that is defined in solr.xml.
We have placed the Solr plugin JAR in rlpnc/data/rnm/sample/solr_shared_lib and included the corresponding sharedLib setting in solr.xml in our sample Solr home.
Example, from rlpnc/data/rnm/sample/solr9x_home/solr.xml:
<!--Adjust the sharedlib setting if you move bt-rni-solr9x-plugins.jar to a different location.-->
<str name="sharedLib">${bt.root}/rlpnc/data/rnm/sample/solr_shared_lib</str>
Add fieldType and field definitions to schema.xml.
In types, define the NameField field type.
<fieldType name="bt_rni_name" class="com.basistech.rni.solr.NameField" needNameStore="true"/>
In types, define the AddressField field type.
<fieldType name="bt_rni_addr" class="com.basistech.rni.solr.AddressField" needAddressStore="true"/>
In types, define the DateField field type.
<fieldType name="bt_rni_date" class="com.basistech.rni.solr.DateField" needDateStore="true"/>
Add your name, address, and date fields in fields. For example:
<field name="primaryName" type="bt_rni_name" indexed="true" stored="true" multiValued="false"/>
<field name="aka" type="bt_rni_name" indexed="true" stored="true" multiValued="true"/>
<field name="residence" type="bt_rni_addr" indexed="true" stored="true" multiValued="false"/>
<field name="dateOfBirth" type="bt_rni_date" indexed="true" stored="true" multiValued="false"/>
You can copy fragments from rlpnc/data/rnm/sample/solr8x_home/collection1/conf/schema-xml-sample-fragments.xml.
These changes can also be made using the Solr Schema API in the Solr Admin page.
As top-level elements, add the reRank queryParser definitions included in the RNI release, along with the rniMatch valueSourceParser definitions, to solrconfig.xml.
<queryParser name="rniRerank" class="com.basistech.rni.solr.RNIReRankQParserPlugin"/>
<queryParser name="rniAddrRerank" class="com.basistech.rni.solr.RNIAddressReRankQParserPlugin"/>
<queryParser name="rniDateRerank" class="com.basistech.rni.solr.RNIDateReRankQParserPlugin"/>
<valueSourceParser name="rniMatch" class="com.basistech.rni.solr.NameMatchValueSourceParser"/>
<valueSourceParser name="rniAddrMatch" class="com.basistech.rni.solr.AddressMatchValueSourceParser"/>
<valueSourceParser name="rniDateMatch" class="com.basistech.rni.solr.DateMatchValueSourceParser"/>
If your documents include one or more multivalued name fields, include an RNI updateRequestProcessorChain.
<updateRequestProcessorChain name="RNIName">
<processor class="com.basistech.rni.solr.MultiValueNameUpdateRequestProcessorFactory"/>
<processor class="solr.LogUpdateProcessorFactory"/>
<processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
Modify the /update requestHandler to use the RNI update chain.
<requestHandler name="/update"
class="solr.UpdateRequestHandler">
<lst name="defaults">
<str name="update.chain">RNIName</str>
</lst>
</requestHandler>
If your documents include one or more multivalued address fields, include an RNI updateRequestProcessorChain.
<updateRequestProcessorChain name="RNIAddr">
<!--Custom processor required when using multivalued address fields-->
<processor class="com.basistech.rni.solr.MultiValueAddressUpdateRequestProcessorFactory"/>
<processor class="solr.LogUpdateProcessorFactory"/>
<processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
Modify the /update requestHandler to use the RNI update chain.
<requestHandler name="/update"
class="solr.UpdateRequestHandler">
<lst name="defaults">
<str name="update.chain">RNIAddr</str>
</lst>
</requestHandler>
If your documents include one or more multivalued date fields, include an RNI updateRequestProcessorChain.
<updateRequestProcessorChain name="RNIDate">
<!--Custom processor required when using multivalued date fields-->
<processor class="com.basistech.rni.solr.MultiValueDateUpdateRequestProcessorFactory"/>
<processor class="solr.LogUpdateProcessorFactory"/>
<processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
Modify the /update requestHandler to use the RNI update chain.
<requestHandler name="/update"
class="solr.UpdateRequestHandler">
<lst name="defaults">
<str name="update.chain">RNIDate</str>
</lst>
</requestHandler>
If your documents include multivalued fields of more than one type (name, address, or date), include a single RNI updateRequestProcessorChain that lists the processor for each type you use.
<updateRequestProcessorChain name="RNI">
<!--Custom processor required when using multivalued name fields-->
<processor class="com.basistech.rni.solr.MultiValueNameUpdateRequestProcessorFactory"/>
<!--Custom processor required when using multivalued address fields-->
<processor class="com.basistech.rni.solr.MultiValueAddressUpdateRequestProcessorFactory"/>
<!--Custom processor required when using multivalued date fields-->
<processor class="com.basistech.rni.solr.MultiValueDateUpdateRequestProcessorFactory"/>
<processor class="solr.LogUpdateProcessorFactory"/>
<processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
Modify the /update requestHandler to use the RNI update chain.
<requestHandler name="/update"
class="solr.UpdateRequestHandler">
<lst name="defaults">
<str name="update.chain">RNI</str>
</lst>
</requestHandler>
You can copy fragments from rlpnc/data/rnm/sample/solr8x_home/collection1/conf/solrconfig-xml-sample-fragments.xml.
When starting Solr, you must include a Java system property that points to the root of the RNI SDK and increase the heap size. If you are running JDK 17, you must also allow the security manager by including -Djava.security.manager=allow in the -a options. For example:
bin/solr -a "-Dbt.root=$BT_ROOT -Djava.security.manager=allow" -m 2g
Note
If you are using Solr 7, include -XX:+IgnoreUnrecognizedVMOptions in the -a options instead. For example:
bin/solr -a "-Dbt.root=$BT_ROOT -XX:+IgnoreUnrecognizedVMOptions" -m 2g
Documents or records typically contain multiple names and not all are the same type. For instance, in the OFAC Specially Designated Nationals list, a record may contain a primary name and a list of akas (also known as). Ideally these would all be stored in a single Solr document to efficiently process complex queries involving multiple document fields, especially in a distributed setting.
For example, a Solr document might contain the following data:
<field name="primary">Muhammad Ali</field>
<field name="aka">Cassius Clay Jr</field>
<field name="aka">The Greatest</field>
<field name="dob">1/7/1942</field>
Solr documents may also include multiple names referring to different persons, locations, or organizations. A single news document, for example, may contain references to a number of individuals.
An address may include any of the fields in Table 9, “Supported Address Fields” below. At least one field must be specified, but no specific fields are required.
Addresses can be defined either as a set of address fields or as a single string. When defined as a string, the jpostal library is used to parse the address string into address fields.
To represent an address with fields, concatenate one or more non-empty address fields. Each field consists of an AddressField name in lower camel case (house, houseNumber, road, unit, level, staircase, entrance, suburb, cityDistrict, city, island, stateDistrict, state, countryRegion, country, worldRegion, postCode, poBox) followed by the field value enclosed in angle brackets. Special characters in the value are hex encoded and preceded by a percent sign.
RNI optimizes the matching algorithm to the field type. Named entity fields, such as street name, city, and state, are matched using a linguistic, statistically-based algorithm that handles name variations. Numeric and alphanumeric fields, such as house number, postal code, and unit, are matched using character-based methods.
Table 9. Supported Address Fields
Field Name | Description | Example
house | venue and building names | house<Brooklyn Academy of Music>
houseNumber | usually refers to the external (street-facing) building number | houseNumber<123>
road | street name(s) | road<Harrison Avenue>
unit | an apartment, unit, office, lot, or other secondary unit designator | unit<Apt. 123>
level | expressions indicating a floor number | level<3rd Floor>
staircase | numbered/lettered staircase | staircase<2>
entrance | numbered/lettered entrance | entrance<front gate>
suburb | usually an unofficial neighborhood name | suburb<Crown Heights>
cityDistrict | these are usually boroughs or districts within a city that serve some official purpose | cityDistrict<Brooklyn>
city | any human settlement including cities, towns, villages, hamlets, localities, etc. | city<Boston>
island | named islands | island<Maui>
stateDistrict | usually a second-level administrative division or county | stateDistrict<Saratoga>
state | a first-level administrative division | state<Massachusetts>
countryRegion | informal subdivision of a country without any political status | countryRegion<South/Latin America>
country | sovereign nations and their dependent territories, which have a designated ISO-3166 code | country<United States of America>
worldRegion | currently only used for appending "West Indies" after the country name, a pattern frequently used in the English-speaking Caribbean | worldRegion<Jamaica, West Indies>
postCode | postal codes used for mail sorting | postCode<02110>
poBox | post office box: typically found in non-physical (mail-only) addresses | poBox<28>
In the string that defines the content of an address field, place a tilde (~) after the address, followed by the attribute-value pair fielded=true or fielded=false to specify whether the address consists of a set of fields or a single string.
The above example of a Solr document might contain the following additional data where the address is defined as a set of fields:
<field name="primary">Muhammad Ali</field>
<field name="aka">Cassius Clay Jr</field>
<field name="aka">The Greatest</field>
<field name="dob">1/7/1942</field>
<field name="address">houseNumber<3302>road<Grand Av.>city<West Louisville>state<KY>~fielded=true</field>
The address field can also consist of a single string, and the above example of a Solr document would look like this:
<field name="primary">Muhammad Ali</field>
<field name="aka">Cassius Clay Jr</field>
<field name="aka">The Greatest</field>
<field name="dob">1/7/1942</field>
<field name="address">3302 Grand Av., West Louisville, KY~fielded=false</field>
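The two address forms above can also be assembled in code before a document is posted. The following is a minimal sketch, not part of the RNI API: the class AddressValueBuilder and its methods are hypothetical, and the set of characters treated as special (<, >, ~, and %) is an assumption, since the exact set is not listed here.

```java
/** Sketch only: builds address field values in the two forms described above. */
final class AddressValueBuilder {
    // Assumption: these are the characters that need percent-hex encoding.
    private static final String SPECIAL = "<>~%";

    private final StringBuilder fields = new StringBuilder();

    /** Appends one address field, e.g. add("city", "Boston") yields city<Boston>. */
    AddressValueBuilder add(String fieldName, String value) {
        if (value == null || value.isEmpty()) {
            return this; // the format requires non-empty fields, so skip empties
        }
        fields.append(fieldName).append('<').append(encode(value)).append('>');
        return this;
    }

    /** Returns the fielded form, e.g. city<Boston>~fielded=true. */
    String build() {
        if (fields.length() == 0) {
            throw new IllegalStateException("at least one address field is required");
        }
        return fields + "~fielded=true";
    }

    /** Returns the single-string form, which is parsed into fields at index time. */
    static String unfielded(String address) {
        return address + "~fielded=false";
    }

    /** Replaces each special character with % followed by its two-digit hex code. */
    static String encode(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('%').append(String.format("%02X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(new AddressValueBuilder()
                .add("houseNumber", "3302")
                .add("road", "Grand Av.")
                .add("city", "West Louisville")
                .add("state", "KY")
                .build());
    }
}
```

The builder keeps fields in insertion order and silently skips empty values, which matches the requirement that every field in the fielded form be non-empty.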
Documents or records may also contain one or more dates, and you can specify the format of each. To specify a date format, in the string that defines the content of a date field, place a tilde (~) after the date, followed by an attribute-value pair such as format=dd-MM-yyyy or format=MMdd-yyyy, which specifies the format used to parse the date string.
An example including a date which is defined without specifying a format:
<field name="primary">Muhammad Ali</field>
<field name="aka">Cassius Clay Jr</field>
<field name="aka">The Greatest</field>
<field name="dob">1/7/1942</field>
<field name="address">houseNumber<3302>road<Grand Av.>city<West Louisville>state<KY>~fielded=true</field>
The date field can also consist of a date string with a specified format:
<field name="primary">Muhammad Ali</field>
<field name="aka">Cassius Clay Jr</field>
<field name="aka">The Greatest</field>
<field name="dob">01/07/42~format=dd/MM/yy</field>
<field name="address">3302 Grand Av., West Louisville, KY~fielded=false</field>
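When dates are held as java.time values in the calling application, the date string and its format attribute can be generated together so that they cannot drift apart. This is a sketch under one loud assumption: that the patterns accepted by the format attribute are compatible with java.time.format.DateTimeFormatter patterns, which the examples above suggest but do not state. The class DateValues is hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Sketch only: renders a date and appends the matching ~format= attribute. */
final class DateValues {
    /**
     * Formats the date with the given pattern and appends ~format=pattern.
     * Assumption: the pattern syntax matches DateTimeFormatter.
     */
    static String withFormat(LocalDate date, String pattern) {
        String text = date.format(DateTimeFormatter.ofPattern(pattern));
        return text + "~format=" + pattern;
    }

    public static void main(String[] args) {
        // Prints 04/03/1971~format=dd/MM/yyyy
        System.out.println(withFormat(LocalDate.of(1971, 3, 4), "dd/MM/yyyy"));
    }
}
```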
You can process names with data fields. Use "|" to separate the fields. For example, "Mr|Jon|Q|Smith" has four fields. You can define names with empty fields: in "|Jon|Q|Smith", the first field is ""; in "Mr|Jon||Smith", the third field is "".
You have the option of specifying that there is an unknown value in a field. To specify an unknown name field, replace the field with *?*.
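As an illustration of the fielded name syntax, the sketch below joins name fields with "|", keeps empty strings as empty fields, and writes the unknown marker for null entries. The class NameFields and the convention of using null for "unknown" are assumptions made for this example only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Sketch only: joins name fields using the "|" separator described above. */
final class NameFields {
    static final String UNKNOWN = "*?*";

    /** Null entries become the unknown marker; empty strings stay empty. */
    static String join(List<String> fields) {
        return fields.stream()
                .map(f -> f == null ? UNKNOWN : f)
                .collect(Collectors.joining("|"));
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("Mr", "Jon", "", "Smith")));   // empty third field
        System.out.println(join(Arrays.asList("Mr", "Jon", null, "Smith"))); // unknown third field
    }
}
```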
Specifying Name Attributes
In addition to the name itself, an RNI Name object may contain attributes that you can specify when you index the name or perform a query.
Note
The entityType field in the query must match the entityType field in the indexed name. If the query does not specify an entity type, the indexed name must also not specify an entity type.
In the string that defines the content of a name field, place a tilde (~) after the name, followed by a comma-delimited list of attribute-value pairs.
Examples:
When posting a Solr document:
<field name="primaryName">Muhammad Ali~language=eng,languageOfOrigin=ara,entityType=PERSON</field>
In a query:
primaryName:"Muhamid Ali~language=eng,languageOfOrigin=ara,entityType=PERSON"
In a pairwise name match:
&rq={!rniRerank reRankQuery=$rrq}
&rrq={!func}rniMatch(primaryName,"Muhammad Ali~language=eng,languageOfOrigin=ara,entityType=PERSON")
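The name~attribute=value,... syntax shown in these examples is easy to get wrong when concatenated by hand. The following sketch builds it from a name and an ordered set of attributes. The class NameValue is hypothetical and does not belong to the RNI plugin; only the output format is taken from the examples above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Sketch only: builds a name string with optional attributes after a tilde. */
final class NameValue {
    private final String name;
    // LinkedHashMap keeps attributes in the order they were added.
    private final Map<String, String> attributes = new LinkedHashMap<>();

    NameValue(String name) {
        this.name = name;
    }

    NameValue with(String attribute, String value) {
        attributes.put(attribute, value);
        return this;
    }

    /** Returns the bare name, or name~k1=v1,k2=v2 when attributes are set. */
    String build() {
        if (attributes.isEmpty()) {
            return name;
        }
        return name + "~" + attributes.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(new NameValue("Muhammad Ali")
                .with("language", "eng")
                .with("languageOfOrigin", "ara")
                .with("entityType", "PERSON")
                .build());
    }
}
```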
Attributes in the bt Namespace. You can include bt attributes as query parameters. These attributes are then used in both the base query and the reRank pairwise match query. For example: &bt.language=jpn&bt.script=Kana
Setting Default Attribute Values. You can include name attributes in field or field type definition as defaults that can be overridden by individual name entries. For example:
<field name="primaryKoreanName" type="bt_rni_name" indexed="true" stored="true" multiValued="false"
language="kor" script="Hang" entityType="PERSON"/>
Then you only need to include these attributes in name entries when you want to override the defaults.
It is often necessary to query on fields other than name fields, such as date of birth and address. The plugin integrates RNI into your Solr queries. To apply Boolean logic to queries, combine multiple fields with Boolean operators. The plugin supports all Boolean operators supported by the standard Lucene query parser (AND, OR, NOT, +, -). OR is the default operator; if there is no Boolean operator between two terms (fields), OR is used.
Example of a query with name and date fields:
primaryName:"Chuy Lopez A Deyas~entityType=PERSON" AND dateOfBirth:"1960-09-30"
Example of a query including an address field:
primaryName:"Chuy Lopez A Deyas~entityType=PERSON" AND residence:"road<Avenida Const. Pedro L Zavala 1957>house<Colonia Libertad>city<Culiacan>state<Sinaloa>postCode<80180>country<Mexico>~fielded=true"
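Queries like the two above can be composed from per-field clauses. The sketch below quotes each value as a phrase and joins the clauses with AND. The class QueryClauses is hypothetical; the backslash escaping of embedded quotes follows standard Lucene query syntax, and it is an assumption that the RNI field types accept values escaped this way.

```java
/** Sketch only: composes field:"value" clauses into a Boolean query string. */
final class QueryClauses {
    /** Returns field:"value", escaping backslashes and double quotes in the value. */
    static String phrase(String field, String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    /** Joins clauses with the AND operator. */
    static String and(String... clauses) {
        return String.join(" AND ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(and(
                phrase("primaryName", "Chuy Lopez A Deyas~entityType=PERSON"),
                phrase("dateOfBirth", "1960-09-30")));
    }
}
```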
You can include name fields and other fields in your base query in conjunction with an RNI Solr reRank query and a custom valueSourceParser. The base query identifies candidate documents. The reRank query sends the top N candidates to the rniMatch valueSourceParser for pairwise matching. You can combine multiple fields in function queries, which enable you to generate a relevancy score from those fields. The plugin supports all the functions available for function queries in Solr.
In a pairwise name match, we return the maximum score of querying for primaryName and contactName:
&rq={!rniRerank reRankQuery=$rrq reRankMode=replace reRankWeight=1.0}
&rrq={!func}max(rniMatch(primaryName, "Chuy Lopez A Deyas"), rniMatch(contactName, "Chuy Lopez"))
In a pairwise address match, we return the maximum score of querying for primaryAddress and residence:
&rq={!rniAddrRerank reRankQuery=$rrq reRankMode=replace reRankWeight=1.0}
&rrq={!func}max(rniAddrMatch(primaryAddress, "road<Calle Lago Cuitzeo 1394>house<Colonia Las Quintas>city<Culiacan>state<Sinaloa>postCode<80060>country<Mexico>~fielded=true"),
rniAddrMatch(residence, "road<Avenida Const. Pedro L Zavala 1957>house<Colonia Libertad>city<Culiacan>state<Sinaloa>postCode<80180>country<Mexico>~fielded=true"))
In a pairwise date match, we return the maximum score of querying for dateOfBirth and dob:
&rq={!rniDateRerank reRankQuery=$rrq reRankMode=replace reRankWeight=1.0}
&rrq={!func}max(rniDateMatch(dateOfBirth, "01/07/42~format=dd/MM/yy"), rniDateMatch(dob, "1/7/1940"))
You can combine the score of multiple RNI fields of the same type where each field can be given a weight to reflect its importance in the overall matching logic.
For example, in a pairwise name match we can return the combined score of querying for aka and primaryName, where aka has a weight of 0.3 and the remaining 0.7 is assigned to the primaryName field:
&rq={!rniRerank reRankQuery=$rrq reRankMode=replace reRankWeight=1.0}
&rrq={!func}sum(linear(rniMatch(aka, "Jesus Alfonso Diaz"), 0.3, 0),
linear(rniMatch(primaryName, "Jesus Diaz"), 0.7, 0))
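Solr's linear(x, m, c) function computes m*x + c, so the query above evaluates to 0.3 times the aka score plus 0.7 times the primaryName score. The sketch below generates such an rrq value from a list of weighted fields. The class WeightedNameMatch is hypothetical, and the check that the weights sum to 1.0 is an assumption drawn from the wording above, not a requirement enforced by the plugin.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: builds a weighted-sum rrq value from rniMatch terms. */
final class WeightedNameMatch {
    private final List<String> terms = new ArrayList<>();
    private double totalWeight = 0.0;

    /** Adds linear(rniMatch(field, "queryName"), weight, 0), i.e. weight * score. */
    WeightedNameMatch add(String field, String queryName, double weight) {
        terms.add("linear(rniMatch(" + field + ", \"" + queryName + "\"), " + weight + ", 0)");
        totalWeight += weight;
        return this;
    }

    /** Returns {!func}sum(term, term, ...). */
    String build() {
        // Assumption: weights are meant to sum to 1.0 (tolerance for rounding).
        if (Math.abs(totalWeight - 1.0) > 1e-9) {
            throw new IllegalStateException("weights sum to " + totalWeight + ", expected 1.0");
        }
        return "{!func}sum(" + String.join(", ", terms) + ")";
    }

    public static void main(String[] args) {
        System.out.println(new WeightedNameMatch()
                .add("aka", "Jesus Alfonso Diaz", 0.3)
                .add("primaryName", "Jesus Diaz", 0.7)
                .build());
    }
}
```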
Setting reRank Parameters
The RNIReRankQParserPlugin provides parameters that you can set to customize your reRank query:
- reRankDocs (an integer) specifies the maximum number of documents from the base query to pass to the RNI pairwise match.
  Use this parameter to limit the number of compute-intensive name matches that need to be performed, thus decreasing maximum query latency.
- reRankMode ("add" or "replace") specifies whether the RNI match score is added to the Solr score (the default) or replaces the Solr score.
- reRankWeight (a float) specifies the weighting of the maximum RNI pairwise match score when it is combined with the Solr score. This parameter is ignored if reRankMode is set to "replace".
  The RNI score, multiplied by the reRankWeight (the Solr default is 2.0), is added to the Solr score to provide the document score that is used to determine the ordering of the documents in the result set. Use this parameter to influence the role that the RNI pairwise match plays in the ordering of the result set. If you want to prioritize the RNI score and de-emphasize the Solr score, specify a large reRankWeight.
- reRankDocsAllowance (a float from 0 to 1) controls the general proportion of documents from the base query to pass to the RNI pairwise match. This is used at query time to dynamically determine the number of documents to rescore based on the commonality of the query name in the index. Setting this to 1.0 ensures that the maximum number of documents (reRankDocs) is always rescored.
  Use this parameter to limit the number of compute-intensive name matches that need to be performed, thus decreasing query latency.
- scoreToRerankRestriction (a float from 0 to 1) influences the minimum similarity score, calculated based on the results of the base query, that documents returned by the base query must have in order to be passed to the RNI pairwise match for rescoring.
- reRankFilter (a Solr query) further restricts which results from the main query are passed to the RNI pairwise match.
In the following example, pairwise matching is performed on the top 200 names returned by the base query, and the RNI score is multiplied by 3 before it is added to the Solr score.
q=primaryName:"Lopez Diaz"
&fl=primaryName,aka,score
&rq={!rniRerank reRankQuery=$rrq reRankDocs=200 reRankWeight=3}
&rrq={!func}rniMatch(primaryName, "Lopez Diaz")
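The arithmetic behind reRankMode and reRankWeight can be written out directly. The sketch below mirrors the description above and is not the plugin's implementation; the class RerankScore and the sample scores are made up for illustration.

```java
/** Sketch only: the document score as described for reRankMode and reRankWeight. */
final class RerankScore {
    enum Mode { ADD, REPLACE }

    /**
     * ADD: solrScore + reRankWeight * rniScore.
     * REPLACE: the RNI score alone; reRankWeight is ignored.
     */
    static double combine(double solrScore, double rniScore, double reRankWeight, Mode mode) {
        return mode == Mode.REPLACE ? rniScore : solrScore + reRankWeight * rniScore;
    }

    public static void main(String[] args) {
        // Hypothetical scores: Solr score 1.5, RNI score 0.8, reRankWeight=3.
        System.out.println(combine(1.5, 0.8, 3.0, Mode.ADD));
        System.out.println(combine(1.5, 0.8, 3.0, Mode.REPLACE));
    }
}
```

With reRankWeight=3 as in the example query, a document with an RNI score of 0.8 gains 2.4 on top of its Solr score in "add" mode, which is why a large weight lets the pairwise match dominate the ordering.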
This example walks you through the steps for using the Solr 9 Admin example to perform queries.
Basic Procedure
- Download and expand Solr 9.0.0.
- Start the Solr webserver.
  You can point it at a Solr core included in the RNI package that contains the OFAC list already indexed. From the Solr-9.0.0 directory, run the following:
  bin/solr -f -s $BT_ROOT/rlpnc/data/rnm/sample/ofac_solr_home -a \
  "-Dbt.root=$BT_ROOT -Djava.security.manager=allow" -m 2g
- Use a Web browser to navigate to http://localhost:8983/solr/#/collection1/query. This form provides the full interface for submitting queries in Solr Admin.
Submit a Solr Query
- Fill in the q (query) textbox with a query that includes a name string and a date-of-birth range starting at 9/30/1960:
  name:"Chuy Lopez A Deyas~entityType=PERSON" AND dateOfBirth:[1960-09-30T00:00:00Z TO *]
- Fill in the fl (fields to return) textbox:
  name,aka,dateOfBirth,address,nationality,score
- Set raw query parameters to define the reRankQuery:
  &rq={!rniRerank reRankQuery=$rrq reRankMode=replace reRankWeight=1.0} &rrq={!func}rniMatch(name, "Chuy Lopez A Deyas~entityType=PERSON")
- Click Execute Query.
  Solr Admin displays a response.
For this query, Solr returns the appropriate Diaz document.
{
"responseHeader": {
"status": 0,
"QTime": 387,
"params": {
"rrq": "{!func}rniMatch(name, \"Chuy Lopez A Deyas~entityType=PERSON\")",
"q": "name:\"Chuy Lopez A Deyas~entityType=PERSON\" AND dateOfBirth:[1960-09-30T00:00:00Z TO *]",
"fl": "name,aka,dateOfBirth,address,nationality,score",
"_": "1631115490163",
"rq": "{!rniRerank reRankQuery=$rrq reRankMode=replace reRankWeight=1.0} "
}
},
"response": {
"numFound": 145,
"start": 0,
"maxScore":0.6820811,
"numFoundExact":true,
"docs": [
{
"name": "Jesus Alfonso LOPEZ DIAZ~uid=10353,entityType=PERSON",
"address": [
"c/o ESTABLO PUERTO RICO S.A. DE C.V.\nCuliacan Sinaloa\nMexico",
"Avenida Const. Pedro L Zavala 1957\nColonia Libertad\nCuliacan Sinaloa 80180\nMexico"
],
"nationality": [
"Mexico"
],
"dateOfBirth": [
"1962-09-30T00:00:00Z"
],
"score": 0.6820811
},
...
]
}
}
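The same query can be sent outside the Admin page as an HTTP GET request. The sketch below URL-encodes the parameters used in this walkthrough. It assumes the core is reachable through Solr's standard /select handler at http://localhost:8983/solr/collection1/select, which is not stated above, and the class SelectUrl is hypothetical. URLEncoder.encode(String, Charset) requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Sketch only: builds a URL-encoded select request from query parameters. */
final class SelectUrl {
    /** Encodes each parameter value and joins the pairs with "&". */
    static String build(String baseUrl, Map<String, String> params) {
        String query = params.entrySet().stream()
                .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return baseUrl + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "name:\"Chuy Lopez A Deyas~entityType=PERSON\""
                + " AND dateOfBirth:[1960-09-30T00:00:00Z TO *]");
        params.put("fl", "name,aka,dateOfBirth,address,nationality,score");
        params.put("rq", "{!rniRerank reRankQuery=$rrq reRankMode=replace reRankWeight=1.0}");
        params.put("rrq", "{!func}rniMatch(name, \"Chuy Lopez A Deyas~entityType=PERSON\")");
        // Assumed endpoint: the standard select handler of the collection1 core.
        System.out.println(build("http://localhost:8983/solr/collection1/select", params));
    }
}
```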
Example Using the solrj API
You can use the org.apache.solr.client.solrj API to integrate the RNI Solr plugin into a Solr application.
The basic steps are as follows:
- Add bt-rni-solr8.11-plugin.jar (distributed in rlpnc/data/rnm/sample/solr_shared_lib/lib) to the classpath.
- Set solr.solr.home to a Solr directory that contains a collection with a modified schema.xml and solrconfig.xml as described in previous sections.
- Instantiate an EmbeddedSolrServer (or another SolrClient) and use it to add documents to a Solr index. The documents should contain one or more name fields along with any other fields of interest. Name, address, and date fields may be multivalued.
- Define a Solr query that involves name fields and other fields of interest, and that reranks the documents according to RNI's pairwise name match score.
- Run the query and examine the documents that are returned.
The following sample code snippets use these imports:
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.common.params.CommonParams;
import org.apache.solr.core.CoreContainer;
Setup:
// Set the bt.root property to point to the RNI installation
String btRoot = args[0];
System.setProperty("bt.root", btRoot);
// Set solr.solr.home to the parent of a collection1/conf directory that contains
// a modified schema.xml and solrconfig.xml.
String solrHome = btRoot + "/rlpnc/data/rnm/sample/solr8x_home";
System.setProperty("solr.solr.home", solrHome);
CoreContainer coreContainer = new CoreContainer(solrHome);
coreContainer.load();
// For simplicity, use an EmbeddedSolrServer rather than an HTTP-based SolrClient.
EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "");
Add a Solr document with fields of interest, including name fields:
SolrInputDocument doc = new SolrInputDocument();
// Primary name field
doc.addField("primaryName", "Midiam Patricia ZAMBADA NIEBLA");
// Multivalued also-known-as name field
doc.addField("aka", "Midian Patricia ZAMBADA NIEBLA");
doc.addField("aka", "Miriam ZAMBADA NIEBLA");
doc.addField("aka", "Midian Patricia LOPEZ LANDEY");
doc.addField("id", "3");
// Entity id field.
doc.addField("uid", "10358");
// Date field
doc.addField("dob", "1971-03-04");
// Address field
doc.addField("address", "road<Calle Lago Cuitzeo 1394>house<Colonia Las Quintas>"
+ "city<Culiacan>state<Sinaloa>postCode<80060>"
+ "country<Mexico>~fielded=true");
doc.addField("nationality", "Mexico");
// Add the document to the index.
server.add(doc);
Commit updates:
// When you have completed updates, commit the updates.
server.commit();
Define and run a query against name and other fields, using RNI pairwise matching to rerank the documents returned:
// Define a query that combines name fields and other fields, and uses RNI
// pairwise matching to rerank the documents returned.
String queryName = "Chuy A Lopez";
SolrQuery solrQuery = new SolrQuery("aka" + ":\"" + queryName +
"\" AND dob:[1960-09-30T00:00:00Z TO *]");
// Set the rerank query parser parameters
solrQuery.set(CommonParams.RQ,
"{!rniRerank reRankQuery=$rrq reRankDocs=100 reRankWeight=1}");
// Create a rerank query that uses the RNI pairwise matching function
solrQuery.set("rrq", "{!func}rniMatch(" + "aka" + ", \"" + queryName + "\")");
// Set which fields to include in the results
solrQuery.setFields("uid", "primaryName", "address", "dob", "score");
QueryResponse qResults = server.query(solrQuery);
//QueryResponse qResults = server.query(createQuery());
Define and run a query against name, address, and other fields, using RNI pairwise matching to rerank the documents returned:
// Define a query that combines name, address and other fields, and uses RNI
// pairwise matching to rerank the documents returned.
String queryName = "Chuy A Lopez";
String queryAddress = "road<Avenida Const. Pedro L Zavala 1957>house<Colonia Libertad>"
+ "city<Culiacan>country<Mexico>~fielded=true";
SolrQuery solrQuery = new SolrQuery("aka" + ":\"" + queryName +
"\" AND dob:\"1960-09-30\"");
// Set the rerank query parser parameters
solrQuery.set(CommonParams.RQ,
"{!rniAddrRerank reRankQuery=$rrq reRankMode=replace reRankDocs=100 reRankWeight=1}");
// Create a rerank query that uses the RNI pairwise matching function
solrQuery.set("rrq", "{!func}rniAddrMatch(" + "address" + ", \"" + queryAddress + "\")");
// Set which fields to include in the results
solrQuery.setFields("uid", "primaryName", "address", "dob", "score");
QueryResponse qResults = server.query(solrQuery);
//QueryResponse qResults = server.query(createQuery());
Define and run a query against name, date, and other fields, using RNI pairwise matching to rerank the documents returned:
// Define a query that combines name, date and other fields, and uses RNI
// pairwise matching to rerank the documents returned.
String queryName = "Chuy A Lopez";
String queryDate = "04/03/1971~format=dd/MM/yyyy";
SolrQuery solrQuery = new SolrQuery("aka" + ":\"" + queryName +
"\" AND dob:\"1960-09-30\"");
// Set the rerank query parser parameters
solrQuery.set(CommonParams.RQ,
"{!rniDateRerank reRankQuery=$rrq reRankMode=replace reRankDocs=100 reRankWeight=1}");
// Create a rerank query that uses the RNI pairwise matching function
solrQuery.set("rrq", "{!func}rniDateMatch(" + "dob" + ", \"" + queryDate + "\")");
// Set which fields to include in the results
solrQuery.setFields("uid", "primaryName", "address", "dob", "score");
QueryResponse qResults = server.query(solrQuery);
//QueryResponse qResults = server.query(createQuery());
Display the results:
// Print information about the documents returned with their Solr score.
for (SolrDocument rdoc : qResults.getResults()) {
    System.out.println("Returned Entity: " + rdoc.getFieldValue("uid") +
        "\n Name: " + rdoc.getFieldValue("primaryName") +
        "\n Address: " + rdoc.getFieldValue("address") +
        "\n DOB: " + rdoc.getFieldValue("dob") +
        "\n Document Score: " + rdoc.getFieldValue("score"));
}
The RNI-RNT SDK ships with an example that illustrates the use of the org.apache.solr.client.solrj API to integrate the RNI Solr plugin into a Solr application. See RNISolrjSample. This sample also illustrates a procedure for posting Solr documents from an XML file.
For convenience utilities for working with RNI names in a Solrj environment, see the Javadoc for com.basistech.rni.solr.index.