Set Up and Perform a Bulk Search

Main search bar in Project > Bulk

While working in a Project, you may find it useful to test the responsiveness of a known set of words or phrases against a given Search target. You can use the Bulk Search option to perform a set of Searches based on either a list of Searches entered in a Queries box, or an uploaded text file with the queries you want to perform. A Bulk Search expects one query per line in the Queries box or the file.

Each query in a standard Bulk Search runs as a separate Search against the target selected for your Bulk Search. (Running a Metadata Bulk Search or running a Bulk Search as a Combined Search will behave differently, as described later in this topic.) When you run a standard Bulk Search, a parent Bulk Search task appears in the Work Basket, and you can watch the progress of each individual Search that is part of the Bulk Search. A parent Bulk Search node also appears in the tree, and each Search appears indented under this parent node. The parent Bulk Search node controls display of the Bulk Search Report that is generated after all searches in the Bulk Search are complete.

What You Need to Know

Before you run a Bulk Search, you should be aware of the following:

  • Your Search syntax is determined by a Project Search Setting. By default, a Project uses Standard for the Project Search Setting, which means that you must use the Standard Search Syntax to form your Searches. See Standard Search Query Syntax for information on how to build Standard syntax queries. You may want to become familiar with this syntax before performing a Search other than a simple term search.
  • Your current view serves as the default Target of your Search. When you first enter a Project, you can select an available view, such as Project Data. As you work within the Project and create items such as Folders, you can always use the Navigation Tree to change to the view you want before performing a Search, or you can use Select Target to choose a Target to search. You may want to search the entire Project Data, or you may only want to search the data associated with a given Custodian, a set of search results, or a Folder you created to contain documents of interest. The Select Target option enables you to select from a list of targets, such as a Custodian, a Folder, a Tag view, or a search result.
  • You always see a Bulk Search Report when you issue a Bulk Search (unless you run the Bulk Search as a Combined Search). The report enables you to see a summary of the Bulk Search automatically and tag one or more searches from the Bulk Search.
  • Bulk Search enables you to control Search Settings that affect the results of your Search, such as Include Metadata (enabled by default) and Include Families (enabled by default). See Bulk Search Options for more information.
  • You can control the collapsed or expanded state of the Bulk Search area. To collapse the Bulk Search area, click the up arrow that appears on a gray panel resize bar beneath the checkbox options (center of screen). To expand it again, click the down arrow that appears on the gray panel resize bar (center of screen).

How to Run a Bulk Search

A Bulk Search involves these steps:

  1. Select where you want to Search. Make sure you are in the context of the view you want to Search, or click Select Target, which launches the Select Target dialog and enables you to select a Target. When you make your selection, click OK. You will then see your Target selection in the box.
  2. Either type a series of queries in the Queries box (one query per line), browse to upload a local file, or, if you have the appropriate permissions, use a Connector file location. (A Connector file location is beneficial for a large text file of queries or metadata field values.) Without the appropriate permissions, you can still type queries or browse to select and upload a local file with the queries you want to run.
  3. Validate your queries by clicking the Validate button. This step is now required before you can perform the search.
  4. Click to run the search against your selected target or current view/results. (The target is shown in the target box.) If no target is selected or applicable, Project Data is used. See Select a Search Target or Search within Current Results for more information.

See Use the Standard Search Query Syntax for Basic Queries for detailed information on how to build Standard syntax queries using the other primary Digital Reef search methods.

Bulk Search Options

You can perform the Bulk Search by typing queries, cutting and pasting queries, or uploading a text file with queries. If you have permissions, you can also use a file at a Connector file location.

A key decision when you set up a Bulk Search is whether you need to run it as a Bulk Metadata Search.

Evaluate By Metadata Field Option for a Bulk Metadata Search

By Metadata Field (disabled by default) — Selecting this option enables you to perform a Bulk Search based on the values you supply for one of the available metadata fields. For a large number of supplied values, consider using a Connector file to upload the values. When you select this option, the following restrictions apply:

  • You cannot select or control the Expand Synonyms, Include Metadata, or Run as Combined Search options.

  • You cannot include any regular (term-based) queries.

  • The Generate Reports option is unavailable.

To use the By Metadata Field option, select one of the following metadata fields (all untokenized) and specify the appropriate format:

  • File MD5 — Searches for filemd5 field values. A document's file MD5 is a cryptographic hash that represents both the content and the embedded metadata for the document. This field is not case-sensitive for search purposes. Example: 0be5d0edc5e9d870added630f78ce091
  • Message ID — Searches for messageid field values. A message ID is a unique alphanumeric value that identifies an email message, and can be used as part of an email deduplication strategy (set using the Project Analytic Settings or Organization Analytic Settings template). This field is case-sensitive for search purposes. Example: <302354BC.2D065C93@example.com>
  • Entry ID — Searches for entryid field values. An Entry ID is extracted directly from a Microsoft Outlook PST file. It is a 24-byte value (48 characters) in which the first 4 bytes are flags and zeroed, the next 16 bytes are the Provider UID of the PST, and the next 4 bytes are the internal identifier for the entry. This field is case-sensitive for search purposes. Example: 00000000331e9d6c9614304e97a4c0bd1859251ce4042000
  • UNID — Searches for unid field values, which apply to files extracted directly from Lotus Notes NSF files. A UNID is a 16-byte value (32 characters) in which the first 8 bytes represent the file (database) component and the second 8 bytes represent the Notes component (essentially, both components are internal timestamps). This field is not case-sensitive for search purposes. Example: 615e68637a552aaa8525755b00517535
  • Dupefingerprint — Searches for dupe_fingerprint field values associated with Project Data and views of Project Data. (This search does not apply to a view for a Data Set, or all Imports.) The dupe_fingerprint value is computed according to the Email Deduplication Settings. For files that are not email, the dupe_fingerprint value is always the filemd5 value. This field is not case-sensitive for search purposes. Example: e1a97bc0fb62039fbe752f3478d863e4
  • Document Handle — Searches for handle field values. A handle is a unique value assigned to each document. This field is not case-sensitive for search purposes. Example: 042195748222da695ed5d190d0540961d034105154a44bd48afe18424c0d93e9
  • Document Number — Searches for docnum field values. The docnum value is a three-part number in the format C.V.N, where C = a Data Collection (Data Set) number, unique per Organization, V = a Data Collection volume ID, unique per Data Collection, and N = a document number, unique within the Data Collection volume. When searching this field, specify the entire value, since wildcards are not supported; you can also use a range search. Example: 3.0.900
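Because a Bulk Metadata Search matches whole, untokenized values, it can help to check a value list for obvious format problems before you upload it. The following Python sketch is a local pre-check only, not part of the product. It assumes that the fixed-length fields contain only hexadecimal characters (based on the examples above), that a handle is 64 characters long (based on the single example shown), and that a dupe_fingerprint has the same shape as a filemd5 value. The function names are hypothetical, and docnum range searches are not covered.

```python
import re

# Expected lengths in characters, taken from the field descriptions above.
# The hex-only character set, the 64-character handle length, and the
# dupe_fingerprint length are assumptions based on the examples shown.
HEX_LENGTHS = {
    "filemd5": 32,
    "entryid": 48,
    "unid": 32,
    "dupe_fingerprint": 32,
    "handle": 64,
}
DOCNUM_RE = re.compile(r"\d+\.\d+\.\d+")  # C.V.N, no wildcards


def looks_valid(field: str, value: str) -> bool:
    """Return True if value has the shape described for the given field."""
    value = value.strip()
    if field == "docnum":
        return DOCNUM_RE.fullmatch(value) is not None
    if field == "messageid":
        # No fixed format is documented, so only reject empty lines.
        return bool(value)
    length = HEX_LENGTHS[field]
    return re.fullmatch(r"[0-9a-fA-F]{%d}" % length, value) is not None


def split_entry_id(value: str) -> dict:
    """Split a 48-character Entry ID into its three documented parts."""
    value = value.strip()
    if not looks_valid("entryid", value):
        raise ValueError("not a 48-character hexadecimal Entry ID")
    return {
        "flags": value[:8],            # first 4 bytes (zeroed)
        "provider_uid": value[8:40],   # next 16 bytes
        "entry": value[40:],           # last 4 bytes
    }


if __name__ == "__main__":
    print(looks_valid("docnum", "3.0.900"))
    print(split_entry_id("00000000331e9d6c9614304e97a4c0bd1859251ce4042000"))
```

Running a check like this over each line of a value file flags truncated or padded values before they silently return no results.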

Note: A Bulk Search that is performed using the By Metadata Field option has its own Search Results report, a Bulk Metadata Value Summary available from the Reports tab. (A regular Bulk Search without this option generates the standard Bulk Search Report.)

Use the Queries Box

You can use the Queries box to enter a series of search queries (one per line), or paste a series of search queries copied from a file. These may include clauses consisting of simple terms, phrases, field searches, or other forms of supported syntax. You can also populate the Queries box with queries by uploading a local text file of queries (one per line).

To clear a line in the Queries box, click the icon that appears at the far right of the line in the Queries box.

To delete a given line, click the icon to the right of the line.

To clear all entries in the Queries box, click Clear All. Note that if you have selected a Connector file, this option also clears the Connector file selection and location/name information from the box and returns the Queries box to its default state.

You might want to clear the Queries box if your Queries box has many searches that need editing after you click Validate and discover many invalid queries. In this case, you may want to clear the box, modify your original file and repeat the upload.

Note: When working with the Queries box, whether you type in your Queries a line at a time, cut and paste queries, or upload a file with queries, you can only edit the queries in the box one line at a time (or delete one query at a time). The active line is highlighted for you once you click in it. A horizontal scroll bar appears if a query in the box exceeds the typical line length, so that you can see the entire query. You can delete highlighted text, backspace to delete an empty line, click Return to get a new line, and use the Up and Down arrow keys to navigate among lines.

The Queries box supports a maximum of 5000 lines. When the maximum is reached, you see the following message, and you must decide how to proceed:

You have reached the Queries box limit. Perform the Bulk Search using a Connector File instead. If you do not have permissions to use a Connector File, contact your Administrator.

To ensure that the search includes all of your queries, you should perform the search using a Connector File instead. However, if you proceed to run the search after seeing this error, be aware that your search will only include the first 5000 lines.
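If you prepare your queries in a file, you can count the lines before uploading to see whether the list fits in the Queries box. The following Python sketch is a local helper under stated assumptions: the function name and the returned keys are hypothetical, and it counts every line, since this topic does not state how blank lines are counted.

```python
QUERIES_BOX_LIMIT = 5000  # maximum number of lines stated above


def plan_bulk_search(queries: list[str]) -> dict:
    """Report whether a list of queries fits in the Queries box."""
    total = len(queries)
    over = max(0, total - QUERIES_BOX_LIMIT)
    return {
        "total": total,
        "fits_in_queries_box": over == 0,
        # Lines past the limit would be left out if the search is run
        # from the Queries box anyway.
        "left_out_if_run_from_box": over,
        "suggested_source": "Queries box" if over == 0 else "Connector file",
    }


if __name__ == "__main__":
    sample = [f"term{i}" for i in range(5200)]
    print(plan_bulk_search(sample))
```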

Load Queries from a Local File or Connector File

Note: The Connector file option appears only if you have the appropriate permissions (the Connector Access permission).

Load from

  • Local file... — Select this option to launch a popup that enables you to navigate to a text file on your local computer or network location and upload that text file. The software will read the contents of the text file with line-delimited search queries. Each line in the file will appear in the Queries box as part of the search and will generate a Work Basket entry.
  • Connector file... (only available for users with Connector Access permissions) — Select this option to use a text file with queries or metadata field values from a selected Connector and Data Area. If you select this option, you must use the Select Connector File popup. After you select a text file and close the popup, the location/name of the text file is displayed in the Queries box. (The individual queries from the file do not appear, however, since a Connector file is generally expected to contain a large number of queries or metadata field values.) Note that a text file with queries cannot be over 100 MB in size, or you will get an error stating the file exceeds the limit. (This size limit does not apply to By Metadata Field Connector file searches.) If you plan to use this option with the Run as Combined Search using option, review the information for that option first.

Note: If you plan to perform a Bulk Search with language-specific characters, such as Spanish characters, make sure that your file uses UTF-8 encoding.
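One way to confirm the encoding of a query file before you upload it is to decode it strictly as UTF-8 and look for the Unicode replacement character (U+FFFD), which usually indicates that an earlier conversion lost characters. The following Python sketch is a local pre-check, not the product's validation; the function name and message wording are hypothetical.

```python
REPLACEMENT_CHAR = "\ufffd"  # U+FFFD, the Unicode replacement character


def check_query_file(data: bytes) -> list[str]:
    """Return a list of encoding problems found in the raw file bytes."""
    try:
        # Strict decoding raises on any byte sequence that is not UTF-8.
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8 at byte {exc.start}"]
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if REPLACEMENT_CHAR in line:
            problems.append(f"line {number}: contains U+FFFD")
    return problems


if __name__ == "__main__":
    # A Spanish term saved as UTF-8 passes; the same term saved as
    # Latin-1 is reported as not valid UTF-8.
    print(check_query_file("señor\ncontrato\n".encode("utf-8")))
    print(check_query_file("señor\n".encode("latin-1")))
```

To check a file on disk, read it in binary mode (for example, `open(path, "rb").read()`) and pass the bytes to the function.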

Use the Bulk Search Checkbox Options

  • Run as Combined Search using — Select this checkbox to run your Bulk Search as a Combined Search. This option is not available if you select the By Metadata Field checkbox. Both local file and Connector file upload are available for this option. If you select this option with a local file (or typed queries), there is no parent Bulk Search Report item generated in the tree; instead, the search will appear in the tree as a Freeform search. If you select this option with a Connector file, the Connector file queries are subject to chunking, so you will see the parent Bulk Search Report entry and one or more child entries for the chunked Connector file queries, by default on the system. When you select this checkbox option, you select the operator, as follows:
    • {OR | AND} — From the drop-down, select the appropriate operator to use between search clauses. The default is OR.
  • Include Families (enabled by default) — This checkbox option is enabled by default to ensure that all available family members of a Message Attachment Group (MAG) or Document Attachment Group (DAG) are included in the results of the Search operation. This includes a selected parent email, parent document, associated attachments, and embedded messages or documents. For example, with this option enabled, a search that returns a parent email in the results also includes that parent's attachments and any associated embedded files. Similarly, if a document attachment appears in the results, its other family members, such as its parent, also appear in the results. Disabling this option causes the results to include just the selected documents, not the entire family (MAG or DAG).
  • Include Metadata (enabled by default) — This checkbox option is enabled by default to expand the search of each keyword in a query to include a set of metadata fields as well as content. You can select the Search Fields you want to have searched automatically. See Using the Include Metadata Option for a list of the default fields searched. When the Include Metadata option is enabled, all individual keywords as well as the keywords in phrases are subject to expansion. By default, the Include Metadata checkbox option is enabled for Freeform, Advanced, and Bulk Search. When this option is enabled, you can control the expansion on a per-term basis and limit a search with a given keyword to just content by specifying content::<keyword> for a given keyword or content::(<keyword1> <keyword2> <keyword3>) for a group of keywords. Note that for emails, a content:: search applies to both the email subject and the email body. Disable this option if you want to limit the entire Bulk Search to content only (for example, if you are specifying queries with language-specific terms, since Language Detection applies to content only).
  • Expand Synonyms (disabled by default) — You can decide whether you want to include synonyms for all specified individual search terms when you submit your query. By default, the Expand Synonyms option is disabled. Select the checkbox to enable it. (Note that support for Expand Synonyms is included in the installation process; contact an Administrator if this feature appears unavailable.) If you use this option, you can check the Search Results to ensure that your Search had Expand Synonyms enabled. Note that Expand Synonyms works for information in the contents, title, subject, edocsubject, comment, or comments fields. It will not work for terms accompanied by ~ or other syntax used for special searches. This option is not available if you select the By Metadata Field checkbox. Also, there is no metadata expansion performed on the synonyms that become part of the search, just on the user-provided terms.
  • Generate Reports (disabled by default) — Select this check box option to have reports generated at the time of an individual search in the Bulk Search instead of waiting until the Reports tab is selected for that search. This option is disabled (cleared) by default. Once you select the By Metadata Field option for a Bulk Metadata Search, this option is disabled and cannot be selected.
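When you generate a query file with a script, two of the options above are easy to preview locally: the content:: prefix and the effect of joining clauses with OR or AND. The following Python sketch is illustrative only. The content:: forms follow the syntax described above, but the parenthesized join in combine() is an assumption made for preview purposes; this topic does not describe how the product builds a Combined Search internally. Both function names are hypothetical.

```python
def content_only(*keywords: str) -> str:
    """Restrict one or more keywords to content with the content:: prefix."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    if len(keywords) == 1:
        return f"content::{keywords[0]}"
    return "content::(" + " ".join(keywords) + ")"


def combine(queries: list[str], operator: str = "OR") -> str:
    """Preview one-per-line queries as a single joined expression.

    Assumption: each line becomes one parenthesized clause.
    """
    if operator not in ("OR", "AND"):
        raise ValueError("operator must be OR or AND")
    clauses = [f"({q.strip()})" for q in queries if q.strip()]
    return f" {operator} ".join(clauses)


if __name__ == "__main__":
    print(content_only("legal"))
    print(content_only("legal", "court", "order"))
    print(combine(["legal", "court order"], "AND"))
```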

Validate — Click this button to validate the queries in your Bulk Search before you submit the search. This is required for a standalone Bulk Search. When you click this button, each query in the Queries box is validated (whether you uploaded a file with the queries, cut and pasted queries, or added each query manually). A given validation generates a Work Basket task (Validating Bulk Search Query).

Note: When performing validation of queries shown in the Queries box, each query displays a line number to help you keep track of each query, and you will see messages about the status of the validation, as described in the following bullets. When there are queries with warnings or errors shown in the Queries box, or queries with warnings or errors in a Connector file that you have specified, the Validating Bulk Search Query task in the Work Basket will use the icon to indicate that the task completed with warnings/errors. Hovering over the icon changes it to a download icon, which you can then click to download a WARNING_DETAILS_REPORT.csv file that contains columns for the Line Number, Error, and Query. The Error column will identify the appropriate syntax error or the warning message Warning: This query contains the Unicode replacement character. If these characters are not intended, change the query or encoding of the file before proceeding. UTF-8 is expected.

  • If all queries are valid, you will see All queries are valid. You can then click the icon to clear the message. This status line indicates that you can proceed with the Bulk Search knowing that your queries are constructed properly.
  • When validation reveals that there are queries with warnings and/or errors, you will see the status message <count> queries contain warnings/errors. You cannot proceed with the Bulk Search until you address the warnings and/or errors and re-validate successfully. To navigate through and fix each warning and/or error, use the (previous and next) controls shown in the message. These controls find the previous or next warning/error that is not currently visible.
  • A line with a search syntax error will have the icon, and will be highlighted in red. Hovering over the icon on a line will display information about the syntax error.
  • A line with a warning will have the icon, and will be highlighted in a pale yellow. Hovering over the icon will display the warning message for the presence of a Unicode replacement character in an uploaded local file or a Connector file. You should evaluate whether this character was intended and address it before re-validating. You may want to change the encoding of the file before proceeding. UTF-8 encoding is expected.
  • A line with both an error and a warning will display the icon (with the line highlighted in red, and hover text for the syntax error). When the syntax error is addressed and validation is performed again, the line will then display the remaining warning by showing the icon (with the line highlighted in yellow, and hover text about the presence of a Unicode replacement character).
  • When you are done addressing all of the warnings and/or errors, another message will appear with the icon and the text Please validate your modified queries. Upon successful re-validation, you can then run the search.
  • Note that Bulk Search will handle any non-breakable space characters that may be present in a supplied file so that these characters do not impact your results. Validation will not fail for these characters.
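For a large query file, it can be easier to triage the downloaded WARNING_DETAILS_REPORT.csv with a script than to step through each line. The following Python sketch separates syntax errors from encoding warnings. It assumes that the column headers are spelled exactly Line Number, Error, and Query, and that warning rows begin with the text Warning: as in the message quoted above. The function name and the sample error text are hypothetical.

```python
import csv
import io


def summarize_report(csv_text: str) -> dict:
    """Group report rows into syntax errors and encoding warnings."""
    errors, warnings = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = (int(row["Line Number"]), row["Query"])
        if row["Error"].startswith("Warning:"):
            warnings.append(entry)
        else:
            errors.append(entry)
    return {"errors": errors, "warnings": warnings}


if __name__ == "__main__":
    # Hypothetical report content for illustration only.
    sample = (
        "Line Number,Error,Query\n"
        '2,Hypothetical syntax error,"(legal"\n'
        "5,Warning: This query contains the Unicode replacement character.,caf\ufffd\n"
    )
    print(summarize_report(sample))
```

Fixing the listed lines in the original file and repeating the upload avoids editing queries one line at a time in the Queries box.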

Note: A search can fail if a query is too expansive or if memory is insufficient:

  • If your search query turns out to be overly expansive and running it would match more terms than the current allocated memory can accommodate, you will see the error message: Search query [(<query>)] Query Expansion Exceeded Memory Limit of <value>. In this case, please review your search terms and modify your query to be less expansive. In general, it is best to avoid an overly broad use of wildcards.
  • If your Search query is the equivalent of having a standalone wildcard, the search will generate an error with a message that identifies the first occurrence of the error in a clause: The provided query has a phrase containing a standalone wildcard, potentially due to characters treated as spaces. Please adjust and rerun the query (In Phrase: <phrase> At position <value>). This error prevents your query from expanding to more terms than the system can reasonably accommodate based on resources.
  • If your search encounters an Out of Memory condition, the search will fail with the error message: Your search could not be performed due to insufficient memory. Please contact your System Administrator.

Select a Search Target or Search within Current Results

In a Project, you use the Navigation Tree to move to an available view based on your permissions, such as Project Data. In general, your current view serves as your default target and is shown in the target box. (If you do not have a view selected, Project Data is the default target.) If you do not want to use the current view as your search target, change to the view you want before performing the search, or use the main search bar to select a different target before you run the search. For example, you may want to search the entire Project Data, search within a current set of search results, or search the data associated with a given Custodian or a Folder you created to contain documents of interest.

To select where to search for a given view, you can use the Select Target split button from the main search bar to pick one of the following:

  • Choose Select Target... if you want to launch a popup that enables you to select from a list of available targets, such as Project Data, a Custodian, a MediaID, a Batch, a Folder, a Tag view, a search result, or a Workflow step. The list depends on your permissions. For example, if you have the appropriate Data Set permissions, you can use a Data Set under Imports as a search target. Note that if you perform a Metadata Bulk Search by selecting the By Metadata Field option for the Bulk Search, the Filter by Data Set/s option in the target picker does not apply. When you make your selection, click OK. (As an alternative, you can double-click a target, which selects the target and closes the dialog.) You will see your selection in the box.
  • Use the down arrow shown in the button to choose Current Results if you want to search within a current set of search results. When you select this option, the target box changes to show the current results (with the query and its associated target) and enables you to enter a query as a Freeform Search. For example, if you search for the term legal in Custodian Bob, you can click Current Results and type a query such as court and then search the results view, (legal) in Bob, instead of searching the Custodian Bob view again. This option is not available for non-result views.

After you have performed validation and selected a target, click to start the Bulk Search.

Usage Notes

  • The Bulk Search Report opens automatically after the Bulk Search runs. (An individual Search result associated with a Bulk Search does not open automatically upon completion, unlike a regular Search Result.)
  • Within the Report, you can double-click a particular Search task that is part of the Bulk Search to see the results of that query.
  • When you execute the Bulk Search operation, the software issues a Search for each line in the text file or line that you have entered manually.
  • Each Search line generates an entry. You can monitor the status of the individual Search results from the Work Basket, and you can use the Work Basket entry for the Bulk Search task, which provides the Summary Report.
  • Failure of an individual search in a Bulk Search does not prevent tagging of all other valid searches in the Bulk Search (that is, you can select all searches for tagging in the Bulk Search Report, or you can tag the entire Bulk Search Report from the Work Basket task). The tagging operation will ignore any failed search in the Bulk Search. (The Bulk Search Report identifies a failed search with a value of -1 for the number of results.) Tagging does not apply to an individually selected failed search, or to a search that has zero results.
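The tagging rules in the last note can be summarized as a simple filter over the result counts shown in the Bulk Search Report. The following Python sketch works on a hypothetical list of (query, result count) pairs that you might copy from a report; it does not read any product file format, and the function names are illustrative.

```python
FAILED = -1  # result count the report shows for a failed search


def taggable(report_rows: list[tuple[str, int]]) -> list[str]:
    """Return queries whose results can be tagged (at least one result)."""
    return [query for query, count in report_rows if count > 0]


def failed(report_rows: list[tuple[str, int]]) -> list[str]:
    """Return queries that the report marks as failed."""
    return [query for query, count in report_rows if count == FAILED]


if __name__ == "__main__":
    # Hypothetical report rows for illustration only.
    rows = [("legal", 120), ("court", 0), ("(contract", -1)]
    print(taggable(rows))
    print(failed(rows))
```

Tagging the entire Bulk Search has the same effect as this filter: failed searches are ignored, and searches with zero results contribute nothing to tag.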

Bulk Tagging Methods

There are multiple ways for you to manage bulk tagging for a Bulk Search:

  • From the Work Basket:
    • You can select a search from a Bulk Search and click Add Tags to launch the Tag dialog and tag all of the results in that search. You can also remove Tags applied to the search.
  • From the Bulk Search Report:
    • You can use the top checkbox to Tag all of the searches using the Add Tags option.
    • You can use the individual checkboxes to select the searches you want to Tag using the Add Tags option.