Manage Project Patterns (Regular Expressions)

Home > selected Project > menu or right-click > Settings > Patterns
Project > Settings drop-down > Project Settings > Patterns

Requires Project - Patterns - View, Add/Edit, Delete Permissions

A Pattern (regular expression) is a sequence of characters typically used to perform a pattern match and identify patterned data during the parsing process. The Digital Reef software includes a number of predefined Patterns, called System Patterns, included in the Default Project Patterns. The email, UNC, and URI System Patterns are enabled by default. In new Projects, these Patterns store values by default. Users with permissions can also create Custom Patterns.

  • System Patterns – A set of preconfigured Patterns (also known as regular expressions) that match specific types of data. The content of a System Pattern cannot be edited and a System Pattern cannot be deleted. System Patterns can be enabled and disabled with or without storing values and can be copied to serve as the basis for Custom (user-defined) Patterns. They must be enabled before they are available for use. A subset of System Patterns is enabled automatically.
  • Custom Patterns – A locally defined pattern (also known as a regular expression), identified by its name, that is used during the parsing process to match specific data patterns. Custom Patterns can be created, deleted, enabled, and disabled. Custom Patterns must be enabled before data is added or reprocessed.

Note: Patterns are applied during initial import, a Pattern update, or reprocessing. If you change the System Patterns or Custom Patterns for your Project, you can update the Patterns for a given Data Set in either of two ways: use the standalone Update Patterns option (for example, by right-clicking a Data Set in the tree), or use the Reprocess option from results (of a search for all documents in a Data Set), which picks up the latest Pattern changes and reflects them in View Configuration. Until you run one of these operations, your Pattern changes have no effect on the existing Data Set documents.

Users in a role with the appropriate permissions can view, add, and/or manage the System and Custom Patterns supported by the Project.

Patterns Summary

The Patterns summary shows information about each Pattern, as follows:

  • Name – The name of the Pattern. For data processed prior to 4.3.11.0, the Pattern name serves as a searchable Token Name, and is the name you include in searches and that can appear in the Top Terms list for a Cluster. For newly processed, updated, or reprocessed data as of Release 4.3.11.0, Tokens no longer apply, and you use the Pattern name in a pattern metadata field search to find documents that match an enabled Pattern (using the format pattern::<pattern_name>). For example, to search for the Pattern email, you type pattern::email.
  • Searchable Token Name (optional column, not shown by default) – In data processed prior to 4.3.11.0, this column displays how to search for the token using the 'token-<token_name>' format. This format requires you to place the search within single quotes and specify the Token name in lowercase, since the software normalizes a Token name to lowercase. Example: 'token-ssn'. In newly processed, updated, or reprocessed data as of Release 4.3.11.0, you do not use this format and instead search for an enabled Pattern using the pattern metadata field and the Pattern name (pattern::<pattern_name>, such as pattern::email).
  • Description – A description of the Pattern, if applicable.
  • Enable – A check mark indicates that this Pattern is enabled.
  • Store Value – When enabled, as indicated by a check mark, you can search for an enabled Pattern as well as individual Pattern values. For data processed prior to 4.3.11.0, both the tokens and the individual values are added to the system dictionary and are available for search and clustering operations. For documents processed as of 4.3.11.0, this means you can search for an enabled Pattern using the pattern field and search for a Pattern value using the patternvalue field. If values are not stored for an enabled Pattern, documents processed prior to 4.3.11.0 have the token applied, which means that you can identify the documents that contain matching data but you cannot search for specific values that triggered the match. For documents processed as of 4.3.11.0, this means you can search for an enabled Pattern using the pattern metadata field, but you cannot search for specific Pattern values using the patternvalue field. To change settings for a Pattern, edit the Pattern.
  • Pattern – The contents of the Pattern.
  • Created By – The login name of the user who created the Pattern. System indicates a System Pattern.
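
The two search formats described for the Name and Searchable Token Name columns can be illustrated with a short Python sketch. The helper names pattern_query and legacy_token_query are hypothetical and are not part of the Digital Reef software; they only assemble the search strings that a user would type.

```python
def pattern_query(pattern_name: str) -> str:
    """Build the search string for data processed as of Release 4.3.11.0.

    Hypothetical helper: uses the pattern metadata field format
    pattern::<pattern_name> described above.
    """
    return f"pattern::{pattern_name}"


def legacy_token_query(token_name: str) -> str:
    """Build the search string for data processed prior to 4.3.11.0.

    Hypothetical helper: the Token name is lowercased and the whole
    search is wrapped in single quotes, as the legacy format requires.
    """
    return f"'token-{token_name.lower()}'"


print(pattern_query("email"))     # pattern::email
print(legacy_token_query("SSN"))  # 'token-ssn'
```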

Store Values Tips:

  • Control characters, such as newline, tab, and so forth, can be stored but they are not searchable. When you search for Pattern matches that include control characters, you need to use wildcards to represent the control characters.
  • Patterns that can return long matches might not be the best candidates for storing values. Patterns that are concise and that would not span lines or have a lot of embedded control characters are better candidates for value storage and subsequent searching.
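
To illustrate the tips above, the following sketch contrasts a concise Pattern with one that spans lines. It uses Python's re module purely for illustration; the regular expression syntax accepted by the Digital Reef software may differ, and the sample text, both expressions, and the has_control_chars helper are assumptions made for this example.

```python
import re

# Sample text (made up for this illustration).
text = "SSN: 123-45-6789\nBEGIN note\n\tline two\nEND\n"

# A concise pattern: short, single-line matches with no control characters.
concise = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# A pattern that can span lines: matches may be long and may embed
# newlines and tabs.
spanning = re.compile(r"BEGIN.*?END", re.DOTALL)


def has_control_chars(value: str) -> bool:
    """Return True if the value contains ASCII control characters."""
    return any(ord(ch) < 32 for ch in value)


short_match = concise.search(text).group()
long_match = spanning.search(text).group()

print(repr(short_match), has_control_chars(short_match))
print(repr(long_match), has_control_chars(long_match))
```

The first match is a good candidate for value storage. The second embeds a newline and a tab, so searching for its stored value would require wildcards in place of those characters.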

New and Selected Pattern Options

To add a new Pattern (a Custom Pattern), use the top-level New Pattern option.

For a selected Pattern, right-click the Pattern or click the ellipses at the far right to see a menu with the following options, as long as you have permissions to perform those actions (actions that are not permitted will be grayed out):

Note: The Copy, Edit, and Delete options require that you first select an item in the list. See Add, Edit or Copy a Pattern for more information about adding, editing, or copying a Pattern.

  • Copy – Creates a new Custom Pattern by copying a System Pattern or an existing Custom Pattern. A user with permissions can perform a copy of regex content with standard Ctrl-C operations.
  • Edit – Enables you to edit a selected Pattern. A user with appropriate permissions can edit all fields for a Custom Pattern. For a System Pattern, you can edit only the Enable and Store Value options.
  • Delete – Deletes a Custom Pattern upon confirmation. A user with permissions can delete a Custom Pattern. System Patterns are not eligible for deletion.

Patterns: Save to and Load from Template Options

If you have the appropriate permissions, you can save your Patterns settings to a template or load Patterns settings from a template. To do this, click the ellipses to the right of Patterns in the tree of eDiscovery Project Settings, as follows:

Note: Save to and Load from Template operations for this setting observe an "append" behavior with regard to Custom Patterns. For example, for a Load from operation, your current Custom Pattern settings are preserved and only new, unique items from the source template/settings are added. Items with any name collisions are not added. Note that other settings, such as Index settings and Analytic settings, observe an "overwrite" behavior instead.

  • Save to Template – If you have Add/Edit permissions to Patterns templates in the Organization, you can use the Save to Template option to save your current Patterns to a selected Organization template. You can either select an existing Organization template (including the Default Patterns template), or you can select the top-level (New Template), which launches the New Template dialog.
  • Load from Template – If you have Add/Edit permissions to Patterns in the Project, you can use the Load from Template option to load settings from a selected template (from a list of available Organization templates). The loaded settings then appear and are saved automatically. Note that loading from a System template requires System-level View permission for a given Setting. (This means you must be a System User in a role with at least View permission to see a list of System templates for a particular type of template.)
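
The "append" behavior described in the note above can be sketched as a simple merge. This is an illustration only, not the product's implementation: the function name load_custom_patterns and the dictionary representation (Pattern name mapped to regex content) are hypothetical, and the sketch assumes that a name collision means an exact name match.

```python
def load_custom_patterns(current: dict, template: dict) -> dict:
    """Merge template Custom Patterns into the current set (append behavior).

    Current Patterns are preserved. A template Pattern is added only when
    its name does not collide with an existing one (exact-match comparison
    is an assumption of this sketch).
    """
    merged = dict(current)  # never modify or overwrite existing settings
    for name, regex in template.items():
        if name not in merged:
            merged[name] = regex
    return merged


current = {"case_id": r"CASE-\d{6}"}
template = {"case_id": r"C-\d{4}", "invoice": r"INV-\d{8}"}

result = load_custom_patterns(current, template)
print(result)
```

Here the existing case_id Pattern keeps its current content, and only invoice is added. Under an "overwrite" behavior, as used by Index and Analytic settings, the template would instead replace the current settings.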

Usage Notes

  • Parsed Patterns can affect all data in a Project. The enabled Patterns affect how data is parsed.
  • Enabling Patterns increases processing overhead and the disk space required to support the Data Set. Storing values can increase disk space requirements.
  • In general, you should enable Patterns before adding documents to a Data Set or before reindexing a Data Set, because that is when the parsing operation takes place.
