The gazetteer and regex files for customizing entity extraction are in the
Figure 4. Customization File Structure
The structure for both regex and gazetteer entries is the same. For each type of customization, the files are organized by language. The
xxx directory contains files that are not language-dependent, but are relevant for all languages. Each language directory can have both
reject subdirectories, which determine whether matches are added to or removed from the entity extraction results.
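
To check which customization files are present, the layout described above can be indexed with a short script. The following is a minimal sketch, not part of the product: it assumes a hypothetical layout of `<root>/<type>/<language>/<subdirectory>/<file>`, and the function name `index_customization_files` and the root path are illustrative only. Substitute the actual customization directory from your installation.

```python
from collections import defaultdict
from pathlib import Path


def index_customization_files(root):
    """Group customization files by (type, language, subdirectory).

    Assumes the hypothetical layout
    <root>/<type>/<language>/<subdirectory>/<file>, where <type> is the
    kind of customization (for example regex or gazetteer) and <language>
    is a language directory such as xxx for language-independent files.
    """
    root = Path(root)
    index = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        parts = path.relative_to(root).parts
        if len(parts) != 4:
            continue  # skip files that do not sit at the assumed depth
        kind, language, subdir, filename = parts
        index[(kind, language, subdir)].append(filename)
    return dict(index)
```

Calling `index_customization_files("/path/to/customization/root")` returns a dictionary keyed by type, language, and subdirectory, which makes it easy to see which languages have entries that add matches and which have entries that remove them.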