Document Information for Data Set or Imports Search Results

Search History > Selected Search

If you have permissions to perform searches of a Data Set or all Imports, you can run a search of these views and see the search results under Search History. The results appear automatically as the top search under Search History after the search completes. By default, you see an All Docs tab with all documents in the results set, document information, and actions for the results.

This topic focuses on the columns and options that generally apply to a results document list for a Data Set or Imports search (that is, the results of a query-based search performed on a Data Set or all Imports).

Note: Within the Document Viewer for a document in a term-based search result, each term in a query is highlighted in a different color. Metadata searches do not support highlighting.

Supported Tabs in Data Set or Imports Results

You can use different tabs to display a list of all result documents, or information for specific items such as Communication Grid, Domain Grid, and Reports:

  • All Docs – A general view of all documents, depending on what you have in focus (for example, all documents in a Data Set results view, for users with permissions). You can double-click a document to open the Document Viewer either inline or in a new window. The All Docs tab offers the superset of toolbar options; the other tabs may not offer all options.
  • Communication Grid – This tab shows you a Communication Grid that helps you analyze the email communication for the current view and see how many emails were sent from a given email address to another email address. It shows the Top 50 email address FROM and TO combinations and lets you make grid selections.
  • Domain Grid – This tab shows you a Domain Grid that helps you analyze the sending and receiving email domain information for the current view and see how many emails were sent from a given domain to another domain. It shows the Top 50 FROM and TO domain combinations and lets you make grid selections.
  • Reports – This tab shows you the appropriate report information for the Data Set results view.
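The idea behind the two grids (counting FROM and TO combinations and keeping the 50 most frequent) can be illustrated with a minimal Python sketch. This is not the software's implementation; the function names and the sample messages are hypothetical and exist only to show the counting.

```python
from collections import Counter

def top_address_pairs(messages, limit=50):
    """Count (sender, recipient) address pairs and return the most frequent."""
    counts = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            counts[(sender.lower(), recipient.lower())] += 1
    return counts.most_common(limit)

def domain_of(address):
    """Return the part of an email address after the last '@', lowercased."""
    return address.rsplit("@", 1)[-1].lower()

def top_domain_pairs(messages, limit=50):
    """Same counting as above, but grouped by sending and receiving domain."""
    counts = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            counts[(domain_of(sender), domain_of(recipient))] += 1
    return counts.most_common(limit)

# Hypothetical sample data: (sender, [recipients]) for each email.
messages = [
    ("jjones@someco.com", ["asmith@otherco.com", "bdoe@someco.com"]),
    ("jjones@someco.com", ["asmith@otherco.com"]),
    ("asmith@otherco.com", ["jjones@someco.com"]),
]

address_grid = top_address_pairs(messages)
domain_grid = top_domain_pairs(messages)
```

Each entry in the returned lists is a ((from, to), count) pair, which corresponds to one cell in a grid of this kind.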

Note: A Loading message appears while documents are being loaded into a view. In a large Data Set or view, the documents list (or sorting or report generation) may take some time to complete. If you see the Loading message for a while, you may want to go to another view, perform other operations, and return to this view later.

Document Information for Data Set or Imports Result Views

By default, the All Docs tab for a Data Set or Import results view provides the following columns, with information about each document:

  • Doc Number – A three-part number representing the Document Number in the format C.V.N, where C = a Data Collection (Data Set) number, unique per Organization; V = a Data Collection Checkpoint value, unique per Data Collection; and N = a document number, unique within the Data Collection Checkpoint. When searching for a Document Number using the docnum metadata field, specify the entire value, since wildcards are not supported for this field. You can also use a range search. Example: docnum::[3.101.50000~~3.101.60000]. Family members (a parent and its children) have sequential document numbers.
  • Family/Thread – Enables you to identify whether a document in a Data Set or all Imports results view is part of a Family. You can open a Family by clicking the Family icon for the document. (Threads apply only to views of Project Data.) Note that this column is not included in downloaded manifests.
  • Name – This column displays key information about the file. The information displayed depends on the type of file:
    • The icon indicates that the file is a document found on disk and is followed by the filename. Note that embedded documents extracted during import are assigned a filename in the format <parentfilename>_OLE_<value>.<ext>. Embedded documents include Microsoft Excel files and text files. For example, for an Excel (xls) file that is the first OLE child linked to a Word document named spreadsheet.doc, the OLE filename would be spreadsheet.doc_OLE_1.xls. Parent OLE documents (Message attachments with Message OLE attachments or eDocs with eDoc OLE attachments) will include the text of each OLE document, and each OLE document in the parent document is separated by a row of dashes.
    • The icon indicates that the file is an email and is followed by the subject line of the email. This applies to email messages, calendar items, as well as journal entries and tasks. (Note that embedded images are not extracted for MSGs or EMLs or eDocs during import by default, but are identified in the embeddedchildren metadata field with a value of image.)
    • The icon indicates a directory, followed by the directory name. Directories, when they appear in a document list, are not useful for viewing in the Document Viewer, as they are identified as an empty file, followed by the filetype of directory and the parsing status. Moreover, a directory cannot be downloaded.
    • The icon indicates that the file is an attachment. An attachment is shown indented under its parent in a non-search results view.

Note: A document's file extension will not always reflect the document's real file type. For example, a mydoc.txt file may actually be an MBOX from which emails are extracted. You can rely on the Digital Reef software to detect the correct file type, which you can verify in the document metadata.

  • To – For emails, this is derived from the display name, if available (for example, Joe Jones), or the email address of the email sender and recipient (for example, jjones@someco.com). Each recipient is separated by a comma or semicolon, depending on the source data.
  • Size – Shows the document size on disk.
  • Date – For Data Set or All Imports Results document lists, this column reports either the last modified date for files or the sent date for emails in the format yyyy-MM-dd-HH-mm-ss. The date information is shown according to the Project time zone, either the default time zone of Coordinated Universal Time (UTC), or a time zone selected using the Project Preferences.
  • Score (default sort column for search results) – This column shows how relevant a document is to the query or similarity search. Documents are sorted according to their Score with the most relevant documents (higher Score) listed first. A Score can be in the range 0 to 100, and the value appears in the Score column to two decimal places (for example, 18.33). The score is always 100 (100.00) for duplicates. Score for a Cluster rates how relevant a document is within a given Cluster. The first document listed for a given Cluster has a Score of 100 and is the seed document. Duplicate documents in a Cluster all have a Score of 100. Note that this column is now populated in downloaded manifests.
  • Group – For a Find Exact Duplicates or Find Content Duplicates operation on a selected view, identifies groups of duplicates. The assigned identifier makes it easy for you to locate and analyze groups of duplicates. Note that this column is not currently populated in downloaded manifests.
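To make the C.V.N Document Number format concrete, here is a minimal Python sketch that splits a document number into its three parts and builds a range query string in the same shape as the docnum example above. The helper names are hypothetical, and the in_range check assumes that a range compares the three parts numerically from left to right; that ordering is an assumption for illustration, not documented behavior.

```python
from typing import NamedTuple

class DocNum(NamedTuple):
    collection: int   # C: Data Collection (Data Set) number
    checkpoint: int   # V: Data Collection Checkpoint value
    number: int       # N: document number within the checkpoint

def parse_docnum(text):
    """Split a C.V.N document number into its three integer parts."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected C.V.N, got {text!r}")
    return DocNum(*(int(part) for part in parts))

def range_query(first, last):
    """Build a range query string shaped like the documented example."""
    return f"docnum::[{first}~~{last}]"

def in_range(doc, first, last):
    """Assumption: parts are compared numerically, left to right."""
    return parse_docnum(first) <= parse_docnum(doc) <= parse_docnum(last)

query = range_query("3.101.50000", "3.101.60000")
```

Because wildcards are not supported for docnum, a helper like range_query is one way to cover a block of sequential numbers, such as a parent and its children.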

Additional columns for the Data Set or Imports Results:

The following columns are hidden by default for a Data Set or Imports result view, but you can change your column selections to display them, and you can change the column order by dragging a column to the desired position:

  • Attachments – For an email message only, indicates the number of attachments (childcount) that this email has. This field remains blank for documents that are not email messages.
  • Is Attachment – For a document that serves as an attachment to an email message, confirms whether or not this document is an attachment by showing true or false. Note that this field will show true for any direct attachment as well as any files associated with the attachment, such as an embedded file or .zip file. This field remains blank for regular documents that are neither attachments nor files associated with an attachment of an email message.
  • Author – For a document, the author of the document, if author information is available. For an email, the person or entity responsible for sending an email (derived from the from field information).
  • Scan Date – Time stamp of when this document was scanned and added to the system in the format yyyy-MM-dd-HH-mm-ss. For example, an imported file might report 2022-05-02-20-07-43 for the scan date (based on the metadata field datescanned). You can see where the document came from by viewing its metadata.
  • Date Created – The document creation date in the format yyyy-MM-dd-HH-mm-ss (for example, 2021-03-10-20-19-25). The corresponding metadata field is createdtime, which applies to NTFS with CIFS (for example, an import from a CIFS Connector).
  • Import Path – The import path for the document, which includes the import location label and/or the method of import and archive information, if applicable.
  • File Type – The identified file type for the document (for example, Microsoft Word 2000, Adobe Acrobat (PDF), or Internet HTML).
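If you work with these date values outside the application (for example, in a downloaded manifest), the following minimal Python sketch shows one way to parse them. It assumes the hour field is a 24-hour value, as the sample values with an hour of 20 suggest, and it assumes UTC, the default Project time zone, unless another zone is passed. The function name is hypothetical.

```python
from datetime import datetime, timezone

# strptime equivalent of the yyyy-MM-dd-HH-mm-ss pattern (24-hour clock).
DISPLAY_FORMAT = "%Y-%m-%d-%H-%M-%S"

def parse_display_date(text, tz=timezone.utc):
    """Parse a date column value and attach the Project time zone."""
    return datetime.strptime(text, DISPLAY_FORMAT).replace(tzinfo=tz)

scan_date = parse_display_date("2022-05-02-20-07-43")
created = parse_display_date("2021-03-10-20-19-25")
```

The resulting datetime objects can be compared or sorted directly, which is harder to do reliably with the raw hyphen-separated strings once different time zones are involved.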

Note: When you change column selections and/or position for a view, your current selections are retained for that type of view for the duration of your session. This enables you to keep your column preferences for a given view type in effect as you navigate to different places in the application. For example, if you make column selection and/or position changes for a view such as a Data Set or Imports result view, you can open the Document Viewer and see those selections, then close the Viewer and still see your selections. Your selections are maintained whenever you switch from one view to another view of the same type (for example, you switch from one Data Set or Imports result view to another), even if you move to another type of view in between. For example, if you change column selections and positions for a Data Set or Imports result view, then move to a Project Data-based view (which shows its column selections), and then move to another Data Set or Imports result view, you would still see your Data Set result view column selections and positions.

Note: You can change the sort order of the Documents list in two ways. Click any column heading to toggle the sorting order of that column. Alternatively, to change which column determines the sorting order, place your mouse pointer over a column heading and click the down arrow that appears in the right corner to display the pop-up menu of sorting options. When you sort by a column title, you affect only the documents listed in the current page.

Document Menu Options for Data Set or Imports Results

Once you select one or more documents in the Data Set or Imports result view, use the Document drop-down menu to see a list of available options based on permissions. For more information about operations and their associated permissions, see View and Manage Role-Based Permissions.

Note: If you perform an operation that adds selected documents in a Data Set or Imports results view instead of the entire view to Project Data, the software does not run the Exclusion Searches, which by default exclude archives, directories, disk images, and NIST files from Project Data. This means that if you select one or more directories, archives, disk images, or NIST files and add/save them to Project Data, or tag them, they will become part of Project Data.

For options that require an entire view, use the right-click options for the Data Set or Imports results view in the Navigation Tree.

Note: For operations that require you to select a target Folder or other view, be aware that the available target options change based on your context. For example, if you are removing documents from a Folder, you cannot create a new Folder.

  • Add Tags... – Launches the Tag dialog, from which you can select Tags to apply. You can also create a Tag and use it right away. If you Tag documents in a Data Set or Imports result view, the software adds the documents to Project Data and performs the tagging.
  • Remove Tags... – Launches the Tag dialog, from which you can select Tags to remove.
  • Add to... – Enables you to add documents to a selected Custodian, MediaID, Batch, or Folder view in Project Data based on permissions. For more information, see Add to or Remove Documents from Select Project Data Views. For more information about managing Custodian views, see Manage Custodians and Data Assigned to Custodians.
  • Add to Project Data – Adds the selected documents in the Data Set or Imports results to Project Data. The backing Data Sets for the results must be at a Content or Analytic Index level. This command is unavailable for results whose Data Set is at the File Metadata or System Metadata Index level. Note that when you select this operation, the software performs Custodian, MediaID, and Batch view generation for all documents in Project Data, not just the documents to be added with the results.
  • Remove from... – Removes selected documents from a selected Custodian, MediaID, Batch, or Folder view in Project Data. The documents are still available within the Project; they just no longer reside within that view. Removing documents from a given named Custodian (or MediaID or Batch) automatically reassigns the documents to the Unassigned view of that type. (Removing documents from an Unassigned view is not permitted; if you want to assign documents from Unassigned to another view such as a Custodian, perform an Add to operation to the appropriate view. For more information about managing Custodian views, see Manage Custodians and Data Assigned to Custodians.)
  • Remove from Project Data – Removes the selected documents from Project Data entirely, including the Discard Pile, if the selected documents reside there. A Work Basket task called Removing Documents from Project Data reports the results. Documents removed from Project Data/Discard Pile are still available in the appropriate Data Set in the Project, in the event that you need to add them to Project Data again, but the documents no longer have any Project Data information that was previously applied, such as Tags.
  • Download as PDF – Enables you to download a single document, multiple selected documents on a page, or all documents in the view as a PDF to your local environment so that you can view the documents in PDF format. When you select this operation, you can select the Stamp Document Number option if you want to include a stamp with the document number (docnum) on the bottom right of each page in the PDF. If you select the top checkbox to save all documents as PDFs, you will see a Warning popup that states the following: You are attempting to download all documents in this list as PDFs. Depending on the size of the documents, this could take considerable time and/or render the browser unresponsive. Consider creating a new export stream to produce the PDFs directly to an export location instead. At this point, you must either confirm the operation by clicking Continue, or click Cancel instead. Whether you select one, multiple, or all documents to download, the software will prepare a ZIP file, by default named <projectname>_PDFs.zip. An information popup indicates that the PDFs are being prepared for downloading, and once finished, the archive (ZIP) can be downloaded from the Work Basket. Note that certain file types are ignored for PDF generation, including any selected directory folders not removed from your Project during setup by your administrator, disk images, file archives, mail archives, empty files, and files for which the native is not available. A WARNING_DETAILS_REPORT.csv file identifying the files that were skipped or failed PDF generation can be downloaded from the appropriate PDF-related Work Basket task. See About Downloading Documents as PDFs and Natives for more information.
  • Download Native – From the Exceptions tab or All Docs tabs, enables you to download a single document, multiple selected documents on a page, or all documents in the view to your local environment so that you can view the documents in their native format. Any selected directory folders are ignored for the download. A WARNING_DETAILS_REPORT.csv file identifies any native files that were not downloaded. (See About Downloading Documents as PDFs and Natives for more information.) If you select the top checkbox for all documents, you will see a Warning popup that states the following: You are attempting to download all natives in this list. Depending on the size of the documents, this could take considerable time and/or render the browser unresponsive. Consider creating a new export stream to produce the natives directly to an export location instead. At this point, you must either confirm the operation by clicking Continue, or click Cancel instead. Whether you select one, multiple, or all documents to download, the software will prepare a ZIP file, by default named <projectname>_Documents.zip. An information popup indicates that the documents are being prepared for downloading, and once finished, the archive (ZIP) can be downloaded from the Work Basket.
  • Update Metadata... – From an all Imports or Data Set search result only, enables you to update the Custodian (directory location), MediaID, or Batch value using the Update Metadata dialog.
  • Reprocess... (search results only, enabled for all docs or individual document selection) – Launches the Reprocess dialog, which enables you to reprocess the search results set using selected reprocess options. Reprocessing causes the software to rerun the parsing and indexing of the eligible documents based on the selected reprocess options. See How to Perform Document Reprocessing for detailed information about reprocessing. From the toolbar, you can select all the documents to reprocess, or select a subset of documents. You may want to use this option after you inspect the Warning and Errors section of the Data Set Scan Report and notice that you have many damaged, encrypted, or password-protected files that could be reprocessed after the situations have been addressed (for example, you have configured password-cracking options and supplied password files, repaired a PST, or decrypted an NSF file). You may also perform a Search and find that you need to reprocess certain documents due to a parsing change (and therefore get updated metadata information). In the first situation, you drill through the Damaged, Encrypted, or Protected entries in the Warning and Errors section, or you can search for them using the parsing status (for example, parsingstatus:00027 for encrypted files, parsingstatus:00028 for damaged files, and parsingstatus:00029 for protected files). From the drill-through Search results, you can select files or all files and select Reprocess. After reprocessing, you check the report again. For example, you may see that a repaired PST now has children that have been added to the Index or you may have performed password cracking for encrypted/protected PDFs, ZIP or RAR files, or Microsoft Office documents. See Configure Password Cracking for Reprocessing. See Add and Manage Container Key Files.
  • Delete File(s) from Project – This option deletes the selected files or all files from the appropriate Data Set (or from Imports) result view when the files are no longer needed in the Project. Because this deletion is permanent (that is, you would have to perform an import of the affected Data Sets again), a caution is displayed for this operation. (This operation does not delete the actual files from the import location, just from the affected Data Sets in the Project.) Note that the files you select cannot be in Project Data (that is, in the Project Data view or the eDiscovery Project Discard Pile) or from a Shared Data Set; if they are, the operation will fail. See How to Delete Files from a Data Set or All Imports for more information.
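As a quick reference for the parsing status searches mentioned under Reprocess, the following minimal Python sketch maps the three documented status codes to their query strings. The dictionary and function names are hypothetical; only the codes and the parsingstatus field come from this topic, and no other codes or query operators are assumed.

```python
# Status codes as listed in this topic for files that may need reprocessing.
PARSING_STATUS = {
    "encrypted": "00027",
    "damaged": "00028",
    "protected": "00029",
}

def parsing_status_query(kind):
    """Return the search string for one documented parsing status."""
    return f"parsingstatus:{PARSING_STATUS[kind]}"

queries = [parsing_status_query(kind) for kind in PARSING_STATUS]
```

You would run each query as a separate search and then select Reprocess from the results, as described above.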

You can also use the following options from drill-through search results of an entry in the appropriate Report. The OCR option is available from an entry in the OCR Candidates section of a Data Set Report, or from an all Imports Search Results Report, and is enabled for all docs or individual document selection:

  • OCR...  – This option applies to documents that have been identified as OCR Candidates by the OCR Candidates chart in the Data Set Report or the all Imports Report (for Data Sets at the Content or Analytic Index level). This option launches the Select OCR Settings for Processing dialog and uses those settings to perform OCR processing of the selected documents or all documents in a drill-through Search results view. You can select all the documents in the drill-through Search results view or use the Document drop-down menu of the results document list to select a subset of documents. After OCR processing, check the appropriate Report Summary to see values for OCR Documents and OCR Pages. Note that OCR processing can be performed, regardless of whether the documents have been added to Project Data; the information will be updated for the affected files.
  • Extract Office 365 Data (available only if there are Office 365 Connectors available) – This option applies to documents that have been identified as Modern Attachment Retrieve Warning in the Warnings and Errors section of the Data Set Report or the all Imports Report (for Data Sets at the Content or Analytic Index level). This option is available from the drill-through Search results, as a right-click option from the Navigation Tree (for the view), or as an option in the Document drop-down menu of a document list in the results (for selection of one or more documents). It launches the Extract Office 365 Data dialog, which uses the selected Document Processing Timeout and an available Office 365 Connector for the operation.

Selected Document Options

When you select a single document in a list and right-click, a document context menu appears with a list of options:

  • Open Document Inline – Launches the Document Viewer so that it appears in place of your Document List content in the lower portion of the screen. When working inline, you can select view modes, navigate documents by using the page controls at the bottom, and perform operations such as tagging.
  • Open Document in New Window – Launches the Document Viewer in a new browser window (or tab, depending on your browser options). This version of the Document Viewer enables you to select any document in the paged Document List and see the full content of that document (or other views, such as Metadata or History). You can also launch multiple windows for different documents to perform side-by-side reviews of multiple documents. When you open the Document Viewer in a new browser window, you can select view modes in the top center portion of the screen, navigate documents by using the page controls at the bottom, and perform operations such as tagging.
  • Open Family Inline – Launches a Family-specific version of the Document Viewer for a given Family (MAG or DAG) inline so that it appears in place of your Document List content in the lower portion of the screen.
  • Open Family in New Window – Launches the Document Viewer for a Family (MAG or DAG) in a new browser window (or tab). Launching this version of the Document Viewer enables you to focus on the other family members of a selected parent email/document or email or embedded attachment. Family members are indented under their parent. MAGs are sorted by the email sent date.
  • Find Exact Duplicates of This... – Searches for documents that have exactly the same content and metadata as the selected document. An exact duplicate would have the same file MD5 value.
  • Find Content Duplicates of This... – Searches for documents that have the same content as the selected document.
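The notion of an exact duplicate (same file MD5 value) can be illustrated with a minimal Python sketch that hashes file bytes and groups names that share a digest. This is a conceptual illustration only, not how the software performs its duplicate search; the function names and sample files are hypothetical. A content duplicate would instead compare a hash of the extracted content, which is not shown here.

```python
import hashlib
from collections import defaultdict

def file_md5(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a file's raw bytes."""
    return hashlib.md5(data).hexdigest()

def group_exact_duplicates(files):
    """Group file names whose bytes hash to the same MD5 value."""
    groups = defaultdict(list)
    for name, data in files.items():
        groups[file_md5(data)].append(name)
    # Keep only groups with more than one member (actual duplicates).
    return [sorted(names) for names in groups.values() if len(names) > 1]

# Hypothetical in-memory files standing in for documents in a view.
files = {"a.txt": b"hello", "b.txt": b"hello", "c.txt": b"world"}
duplicate_groups = group_exact_duplicates(files)
```

Each inner list plays a role similar to one value in the Group column: a set of documents identified as duplicates of one another.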

Navigation Tree Options for Data Set- or Imports-based Results

For a list of options that apply to an entire results set for a Data Set view or all Imports, you can use the right-click options for the results view in the Navigation Tree.

In general, options that include ... in the name indicate that they have an associated dialog. Options without ... run when you select them and do not have an associated dialog.

The right-click options for Data Set and Imports results are as follows:

  • Add Tags... – Launches the Tag dialog, from which you can select Tags to apply. You can also create a Tag and use it right away.
  • Add to... – Enables you to add the result documents to a selected Custodian, MediaID, Batch, or Folder view in Project Data based on permissions. For more information, see Add or Remove Documents or Search Results to a Folder. For more information about managing Custodian views, see Manage Custodians and Data Assigned to Custodians.
  • Remove From Project Data – For users with Project Data Add/Edit permissions, removes the result documents from Project Data entirely, including the Discard Pile, if the selected documents reside there. A Work Basket task called Removing Documents from Project Data reports the results. Documents removed from Project Data/Discard Pile are still available in the appropriate Data Set in the Project, in the event that you need to add them to Project Data again, but the documents no longer have any Project Data information that was previously applied, such as Tags.
  • Find Exact Duplicates – Enables you to search the view for Exact Duplicates. An exact duplicate would have the same file MD5 value.
  • Find Content Duplicates – Enables you to search the view for Content Duplicates. A content duplicate has the same content and content MD5 value.
  • Create Data Set from Search Results... (Available only from Data Set at a File Metadata or System Metadata Index level or Imports search results, and enabled for the entire view only) – Launches the Create Data Set from Result dialog, from which you can create a Data Set from all search results of an existing Data Set (one built with a System Metadata or File Metadata representation level). For example, you may want to create a Data Set after importing Logical Evidence (LEF) Files, imported using the representation level File Metadata and without populating Project Data.
  • Create Manifest... — Launches the Create Manifest dialog, from which you can generate a CSV or XML manifest of a view, using either the current fields or all fields. From the Work Basket task for the manifest generation, you can then right-click and select Download to download the file to a destination local to your computer. Users with permissions can also save the manifest to a server location. For download of a large manifest file (over 200 MB), the software places the manifest in a ZIP file, which you can then unzip. Note that this process can take time.
  • Download All as PDFs — Enables you to download all documents in the view as PDFs to your local environment so that you can view the documents in PDF format. When you select this operation, you can select the Stamp Document Number option if you want to include a stamp with the document number (docnum) on the bottom right of each page in the PDF. Note that this operation will also show a Warning popup that states the following: You are attempting to download all documents in this list as PDFs. Depending on the size of the documents, this could take considerable time and/or render the browser unresponsive. Consider creating a new export stream to produce the PDFs directly to an export location instead. At this point, you must either confirm the operation by clicking Continue, or click Cancel instead. If you proceed, the software will prepare a ZIP file, by default named <projectname>_PDFs.zip. An information popup indicates that the PDFs are being prepared for downloading, and once finished, the archive (ZIP) can be downloaded from the Work Basket. Note that certain file types are ignored for PDF generation, including any selected directory folders not removed from your Project during setup by your administrator, disk images, file archives, mail archives, empty files, and files for which the native is not available. A WARNING_DETAILS_REPORT.csv file identifying the files that were skipped or failed PDF generation can be downloaded from the appropriate PDF-related Work Basket task. See About Downloading Documents as PDFs and Natives for more information.
  • Download All Natives — Enables you to download all documents in the view to your local environment so that you can view the documents in their native format. You will see a Warning popup that states the following: You are attempting to download all natives in this list. Depending on the size of the documents, this could take considerable time and/or render the browser unresponsive. Consider creating a new export stream to produce the natives directly to an export location instead. At this point, you must either confirm the operation by clicking Continue, or click Cancel instead. If you proceed, the software will prepare a ZIP file, by default named <projectname>_Documents.zip. An information popup indicates that the documents are being prepared for downloading, and once finished, the archive (ZIP) can be downloaded from the Work Basket. Note that any directory folders are ignored for the download. See About Downloading Documents as PDFs and Natives for more information.
  • Update Metadata... (Available only from all Imports or Data Set search results) – Enables you to update the MediaID value or the Custodian Directory location value using the Update Metadata dialog.
  • Reprocess... – Launches the Reprocess dialog, which enables you to reprocess the search results set using selected reprocess options. Reprocessing causes the software to rerun the parsing and indexing of the eligible documents based on the selected reprocess options. See How to Perform Document Reprocessing for detailed information about reprocessing. From the toolbar, you can select all the documents to reprocess, or select a subset of documents. You may want to use this option after you inspect the Warning and Errors section of the Data Set Scan Report and notice that you have many damaged, encrypted, or password-protected files that could be reprocessed after the situations have been addressed (for example, you have configured password-cracking options and supplied password files, repaired a PST, or decrypted an NSF file). You may also perform a Search and find that you need to reprocess certain documents due to a parsing change (and therefore get updated metadata information). In the first situation, you drill through the Damaged, Encrypted, or Protected entries in the Warning and Errors section, or you can search for them using the parsing status (for example, parsingstatus:00027 for encrypted files, parsingstatus:00028 for damaged files, and parsingstatus:00029 for protected files). From the drill-through Search results, you can select files or all files and click Process > Reprocess. After reprocessing, you check the report again. For example, you may see that a repaired PST now has children that have been added to the Index or you may have performed password cracking for encrypted/protected PDFs, ZIP or RAR files, or Microsoft Office documents. See Configure Password Cracking for Reprocessing. See Add and Manage Container Key Files.
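Because the Create Manifest option can deliver a CSV manifest either directly or inside a ZIP file (for large manifests), a script that consumes the download may need to handle both. The following minimal Python sketch shows one way to do that. The function name is hypothetical, the column headers in the usage example are placeholders rather than the actual manifest headers, and the sketch assumes a ZIP contains a single CSV file encoded as UTF-8.

```python
import csv
import io
import zipfile

def read_manifest_rows(data: bytes):
    """Return manifest rows as dicts from a CSV, or from a ZIP holding a CSV."""
    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            csv_names = [n for n in archive.namelist()
                         if n.lower().endswith(".csv")]
            if not csv_names:
                raise ValueError("no CSV manifest found in the ZIP file")
            raw = archive.read(csv_names[0])
    else:
        raw = data
    # utf-8-sig tolerates an optional byte order mark (encoding is assumed).
    text = raw.decode("utf-8-sig")
    return list(csv.DictReader(io.StringIO(text)))

# Placeholder manifest content; real column headers may differ.
sample_csv = b"Doc Number,Name\r\n3.101.50000,spreadsheet.doc\r\n"
rows = read_manifest_rows(sample_csv)
```

Reading the whole file into memory keeps the sketch short; for a manifest large enough to be zipped, streaming the rows instead would be a more careful design.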

The following two options apply only to a Data Set search results view:

  • Copy to External Area... (Selectable from Data Set search results only, and enabled for the entire view only) – Launches the Copy to Data Area dialog. Use this dialog to select an export data area and path as a target location for copying a view of documents that you want to process externally.
  • Load from External Area... (Selectable from Data Set search results only, and enabled for the entire view only) – Launches the Load from Data Area dialog. Use this dialog to select an export data area and path from which to load externally processed documents that you want to load back to the system.

The following Navigation Tree option applies only to the entire drill-through search results of an entry in the OCR Candidates report for either a Data Set or all Imports:

  • OCR...  – This option applies to an entire view of OCR Candidates, as reported by the OCR Candidates chart in the Data Set Report or the all Imports Report (for Data Sets at the Content or Analytic Index level). This option launches the Select OCR Settings for Processing dialog and uses those settings to perform OCR processing of the selected documents or all documents in a drill-through Search results view. After OCR processing, check the appropriate Report Summary to see values for OCR Documents and OCR Pages. Note that OCR processing can be performed, regardless of whether the documents have been added to Project Data; the information will be updated for the affected files.