Document Information for Project Data-based Search Results

Search History > Selected Search > All Docs tab

As long as you have permissions to perform searches (for example, of Project Data and views of Project Data), you can run a search and view the search results under Search History. The results appear automatically as the top search under Search History after the search completes. By default, you see an All Docs tab with all documents in the results set, document information, and actions for the results.

This topic focuses on the columns and options that generally apply to a Project Data-based results document list (that is, the results of a query-based search performed on a view of Project Data, or the results of a Sample, which appears under Search History as a type of result view). This topic also generally applies to Workflow Steps that have been run and to Saved Searches, although some options permitted on result views are not permitted on Saved Searches (for example, Reprocess and Send to Discard Pile).

Supported Tabs

Depending on the results view of Project Data, such as the results of a Custodian, MediaID, Batch, Tag, or Folder view, you can use different tabs to display a list of all documents, or information for specific items such as Email Threads, Communication Grid, Domain Grid, Reports, or Clusters.

The supported tabs are as follows:

  • All Docs — A general view of all documents in the Search Results. This topic focuses on the All Docs tab view. You can double-click any document in the Search results to open and examine a document with highlighting of responsive terms or phrases in the Document Viewer. Within the Document Viewer for a document in a term-based search result, each term in a query is highlighted in a different color. Metadata searches do not support highlighting.
  • Email Threads — When Email Threading has been performed in Project Data, this tab provides a view of all Email Threads.
  • Communication Grid — This tab shows you a Communication Grid that helps you analyze the email communication for the current view and see how many emails were sent from a given email address to another email address. It shows the Top 50 email address FROM and TO combinations and lets you make grid selections.
  • Domain Grid — This tab shows you a Domain Grid that helps you analyze the sending and receiving email domain information for the current view and see how many emails were sent from a given domain to another domain. It shows the Top 50 FROM and TO domain combinations and lets you make grid selections.
  • Reports — A view that shows you the appropriate report based on your Search.
  • Clusters — A view that shows you the Clusters, or logical groups of your documents, if Clustering has been performed. If not, click Build Clusters to initiate Clustering. The Clustering is for the documents in the current Project Data-based view. Note that the Clustering process can take some time.

Note: A Loading message appears while documents are being loaded into a view. In a large view, the documents list (or sorting or report generation) may take some time to complete. If you see the Loading message for a while, you may want to go to another view, perform other operations, and return to this view later.

Documents List Details

The documents list for a view such as a Project Data-based result view provides information about each document. The information depends on the view.

View Family Documents in the List

You can also use the following icons to help you identify and view family and/or thread members in the Family/Thread column:

  • The Family icon identifies a document that is part of a family (message attachment group or document attachment group). You can click this icon to open the family inline.
  • The Thread icon identifies a document that is part of an email thread. You can click this icon to open the email thread inline.

Document Information for Project Data-based Result Views

By default, the All Docs tab for a Project Data-based result view provides the following columns:

  • Doc Number – A three-part number representing the Document Number in the format C.V.N, where C = a Data Collection (Data Set) number, unique per Organization; V = a Data Collection Checkpoint Value, unique per Data Collection; and N = a document number, unique within the Data Collection Checkpoint. When searching for a Document Number using the docnum metadata field, specify the entire value, since wildcards are not supported for this field. You can also use a range search. Example: docnum::[3.101.50000~~3.101.60000]. Family members (a parent and its children) have sequential document numbers.
  • Family/Thread – Enables you to identify whether a document is part of a Family and/or Thread. You can open a Family by clicking the Family icon and you can open a Thread by clicking the Thread icon. Note that this column is not included in downloaded manifests.
  • Tags – This column displays Tags. Each Tag has a color assigned to it. Select a Tag from the list to apply it. You can see up to 3 individual Tags applied to a document in the Tag column. To see a complete list of Tags, you can hover over the icon, which shows you the full list of applied Tags. (To Tag using the complete list of Tags, use the Tag option listed in the document entry, or the Tag option from the toolbar.) Note that this column is not included in downloaded manifests.
  • Name (always displayed) – This column displays key information about the file. The information displayed depends on the type of file:
    • The icon indicates that the file is a document found on disk and is followed by the filename. Note that embedded documents extracted during import are assigned a filename in the format <parentfilename>_OLE_<value>.<ext>. Embedded documents include Microsoft Excel files and text files. For example, for an Excel (xls) file that is the first OLE child linked to a Word document named spreadsheet.doc, the OLE filename would be spreadsheet.doc_OLE_1.xls. Parent OLE documents (Message attachments with Message OLE attachments or eDocs with eDoc OLE attachments) will include the text of each OLE document, and each OLE document in the parent document is separated by a row of dashes.
    • The icon indicates that the file is an email and is followed by the subject line of the email. This applies to email messages, calendar items, as well as journal entries and tasks. (Note that embedded images are not extracted for MSGs, EMLs, or eDocs during import by default, but are identified in the embeddedchildren metadata field with a value of image.)
    • The icon indicates a directory, if directories have not already been excluded from the Project Data. Your eDiscovery Administrator can take advantage of default exclusion queries in the Analytic Settings to exclude items such as directories, NIST files, and archive files from Project Data. If directories are included in the Project Data, then this column identifies the name of the directory.
    • The icon indicates that the file is an attachment. An attachment is shown indented under its parent in a non-search results view.
  • Note: A document's file extension will not always reflect the document's real file type. For example, a mydoc.txt file may actually be an MBOX from which emails are extracted. You can rely on the Digital Reef software to detect the correct file type, which you can verify in the document metadata.

  • To – For emails, this is derived from the display name, if available (for example, Joe Jones), or the email address of the email sender and recipient (for example, jjones@someco.com). Each recipient is separated by a comma or semicolon, depending on the source data.
  • Size – Shows the document size on disk.
  • Date – This column displays the primary date information based on the file type of the source file, displayed in the format yyyy-MM-dd-HH-mm-ss. The date information is shown according to the Project time zone, either the default time zone of Coordinated Universal Time (UTC), or a time zone selected using the Project Preferences. The value in this field is propagated from parent files to their child files (the children have that primary date only, not their own). Date is the default sort column for all Project Data-based views except result views, Cluster views, and export-related views, and it enables you to see families grouped. This field displays information associated with the dateprimary field, which determines the primary date as follows (in order of precedence):
    • For eDocs – The primary date is selected by checking the following fields, in this order:

    1. datemodified

    2. lastmodifiedtime

    3. datecreated

    4. createdtime

    5. dateaccessed

    • For email messages – The primary date is selected by checking the following fields, in this order:

    1. sent

    2. received

    3. datecreated

    • For the non-email Message class (for example, Calendar items) – The primary date is selected by checking the following fields, in this order:

    1. sent

    2. received

    3. datestarted

    4. datemodified

    5. lastmodifiedtime

    6. datecreated

    7. createdtime

  • Score (default sort column for search results, Saved Searches, and Cluster Views) – This column shows how relevant a document is to the query or similarity search. Documents are sorted according to their Score with the most relevant documents (higher Score) listed first. A Score can be in the range 0 to 100, and the value appears in the Score column to two decimal places (for example, 18.33). The score is always 100 (100.00) for content or exact duplicates. Score for a Cluster rates how relevant a document is within a given Cluster. The first document listed for a given Cluster has a Score of 100 and is the seed document. Duplicate documents in a Cluster all have a Score of 100. Note that this column is now populated in downloaded manifests.
  • % Terms (similarity search results only) – Shows a percentage that identifies how closely the terms match between the search document and the result document, in the range 0 to 100 (reported to two decimal places, such as 96.87). Note that a 100% terms match does not necessarily indicate identical documents, only that 100% of the search document terms were present in the result document. This percentage can be used as a risk assessment score to identify documents with the highest percentage of terms that match the document/email selected as the pivot document for a Find More Like These, Search by Synthetic Document, or Find Near Duplicates of This document search. Note that this column is not currently populated in downloaded manifests.
  • Group – For a Find Exact Duplicates or Find Content Duplicates operation on a selected view, identifies groups of duplicates. The assigned identifier makes it easy for you to locate and analyze groups of duplicates. Note that this column is not currently populated in downloaded manifests.
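
The Doc Number format and the dateprimary precedence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software's implementation: the function names, the kind labels (edoc, email, message), and the metadata dictionary are hypothetical, and the propagation of a parent's primary date to its children is not modeled. The field names and the range-search syntax are taken from the column descriptions above.

```python
from typing import Dict, List, Mapping, Optional, Tuple

# Field precedence for dateprimary, copied from the Date column description.
# The dictionary keys ("edoc", "email", "message") are hypothetical labels.
PRIMARY_DATE_FIELDS: Dict[str, List[str]] = {
    "edoc": ["datemodified", "lastmodifiedtime", "datecreated",
             "createdtime", "dateaccessed"],
    "email": ["sent", "received", "datecreated"],
    "message": ["sent", "received", "datestarted", "datemodified",
                "lastmodifiedtime", "datecreated", "createdtime"],
}


def primary_date(kind: str, metadata: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty date field, in order of precedence."""
    for field in PRIMARY_DATE_FIELDS[kind]:
        value = metadata.get(field)
        if value:
            return value
    return None


def parse_docnum(docnum: str) -> Tuple[int, int, int]:
    """Split a C.V.N document number into its three integer parts."""
    parts = docnum.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected a C.V.N document number, got {docnum!r}")
    c, v, n = (int(p) for p in parts)
    return c, v, n


def docnum_range_query(first: str, last: str) -> str:
    """Build a docnum range search in the form shown in this topic."""
    parse_docnum(first)  # validate only; complete values are required
    parse_docnum(last)
    return f"docnum::[{first}~~{last}]"
```

For example, primary_date("email", {"received": "2000-02-17-06-17-13"}) falls through the missing sent field and returns the received value, and docnum_range_query("3.101.50000", "3.101.60000") reproduces the range example given for the Doc Number column.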

Optional Columns for a Project Data-based Result View:

The following columns are hidden by default for a Project Data-based result view, but you can change your column selections to display them, and you can change the column order by dragging a column to the desired position:

  • Sent (optional field to display in Project Data-based views) – The sent display date for emails in the format yyyy-MM-dd-HH-mm-ss (for example, 2000-02-17-06-17-13). You can sort on this column in order to see families grouped. To search by sent, always use the format YYYY-MM-DD-HH-MM-SS.
  • Received (optional field to display in Project Data-based views) – The received date for emails in the format yyyy-MM-dd-HH-mm-ss. You can sort on this column in order to see families grouped.
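
As a rough sketch of the yyyy-MM-dd-HH-mm-ss format used by the Date, Sent, and Received columns, the equivalent Python pattern is %Y-%m-%d-%H-%M-%S. The helper names below are hypothetical, and the fixed UTC-5 offset in the usage note simply stands in for a Project time zone selected in the Project Preferences; this is not how the software itself formats dates.

```python
from datetime import datetime, timedelta, timezone, tzinfo

# Python equivalent of the yyyy-MM-dd-HH-mm-ss display format.
DISPLAY_FORMAT = "%Y-%m-%d-%H-%M-%S"


def parse_display_date(text: str, tz: tzinfo = timezone.utc) -> datetime:
    """Parse a displayed date and attach the time zone it was shown in."""
    return datetime.strptime(text, DISPLAY_FORMAT).replace(tzinfo=tz)


def format_display_date(moment: datetime, tz: tzinfo = timezone.utc) -> str:
    """Render an aware datetime in the given time zone."""
    return moment.astimezone(tz).strftime(DISPLAY_FORMAT)
```

With these helpers, parsing 2000-02-17-06-17-13 as UTC and formatting it with timezone(timedelta(hours=-5)) gives 2000-02-17-01-17-13, which shows how the same sent time would read under a different time zone setting.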

Note: When you change column selections and/or position for a view, your current selections are retained for that type of view for the duration of your session. This enables you to keep your column preferences for a given view type in effect as you navigate to different places in the application. For example, if you make column selection and/or position changes for a Project Data-based result view such as a search of Project Data, you can open the Document Viewer and see those selections, then close the Viewer and still see your Project Data result view selections. Your selections are maintained whenever you switch from one view to another view of the same type (for example, you switch from a Project Data result view to a Custodian result view), even if you move to another type of view in between. For example, if you change column selections and positions for a Project Data-based result view, then move to a Data Set result view (which shows its column selections), and then move to another Project Data-based result view (such as a Custodian, Tag, or Folder result view), you would still see your Project Data-based result view column selections and positions.

Document Menu Options for Search Results

Once you select one or more documents in the Project Data-based result view, use the Document drop-down menu to see a list of available options.

These options are available based on permissions. For more information about operations and their associated permissions, see View and Manage Role-Based Permissions.

For options that require an entire Project Data-based results view, use the right-click options for the view in the Navigation Tree.

Note: For operations that require you to select a target Folder or other view, be aware that the available target options change based on your context. For example, if you are removing documents from a Folder, you cannot create a new Folder.

  • Add Tags... – Launches the Tag dialog, from which you can select Tags to apply. You can also create a Tag and use it right away.
  • Remove Tags... – Launches the Tag dialog, from which you can select Tags to remove.
  • Add to... – Enables you to add documents to a selected Custodian, MediaID, Batch, or Folder view in Project Data based on permissions. For more information, see Add to or Remove Documents from Select Project Data Views. For more information about managing Custodian views, see Manage Custodians and Data Assigned to Custodians.
  • Remove from... – Removes documents from the selected Custodian, MediaID, Batch, or Folder view in Project Data based on permissions. The documents are still available within the Project; they just no longer reside within that view. Removing documents from a given named Custodian (or MediaID or Batch) automatically reassigns the documents to the Unassigned view of that type. (Removing documents from Unassigned is not permitted; if you want to assign documents from Unassigned to another view such as a Custodian, perform an Add to operation to the appropriate view. For more information about managing Custodian views, see Manage Custodians and Data Assigned to Custodians.)
  • Remove from Project Data — Removes the result documents from Project Data entirely, including the Discard Pile, if the selected documents reside there. A Work Basket task called Removing Documents from Project Data reports the results. Documents removed from Project Data/Discard Pile are still available in the appropriate Data Set in the Project, in the event that you need to add them to Project Data again, but the documents no longer have Project Data information that was previously applied, such as Tags.
  • Find More Like These... – Launches the Find More Like These dialog. This search uses selected documents or all documents in the current Project Data-based result view. The selected documents serve as comparison criteria to search for similar content. This type of search finds documents that have the most content similarity to the documents submitted as the focus of the search. It assesses whole-document similarity and reports a Score and %Terms match.
  • Download as PDF – Enables you to download a single document, multiple selected documents on a page, or all documents in the view as a PDF to your local environment so that you can view the documents in PDF format. When you select this operation, you can select the Stamp Document Number option if you want to include a stamp with the document number (docnum) on the bottom right of each page in the PDF. If you select the top checkbox to save all documents as PDFs, you will see a Warning popup that states the following: You are attempting to download all documents in this list as PDFs. Depending on the size of the documents, this could take considerable time and/or render the browser unresponsive. Consider creating a new export stream to produce the PDFs directly to an export location instead. At this point, you must either confirm the operation by clicking Continue, or click Cancel instead. Whether you select one, multiple, or all documents to download, the software will prepare a ZIP file, by default named <projectname>_PDFs.zip. An information popup indicates that the PDFs are being prepared for downloading, and once finished, the archive (ZIP) can be downloaded from the Work Basket. Note that certain file types are ignored for PDF generation, including any selected directory folders not removed from your Project during setup by your administrator, disk images, file archives, mail archives, empty files, and files for which the native is not available. A WARNING_DETAILS_REPORT.csv file identifying the files that were skipped or failed PDF generation can be downloaded from the appropriate PDF-related Work Basket task. See About Downloading Documents as PDFs and Natives for more information.
  • Download Native – Enables you to download a single document, multiple selected documents on a page, or all documents in the view to your local environment so that you can view the documents in their native format. Any selected directory folders not removed from your Project during setup by your administrator are ignored for the download. A WARNING_DETAILS_REPORT.csv file identifies any native files that were not downloaded. (See About Downloading Documents as PDFs and Natives for more information.) If you select the top checkbox for all documents, you will see a Warning popup that states the following: You are attempting to download all natives in this list. Depending on the size of the documents, this could take considerable time and/or render the browser unresponsive. Consider creating a new export stream to produce the natives directly to an export location instead. At this point, you must either confirm the operation by clicking Continue, or click Cancel instead. Whether you select one, multiple, or all documents to download, the software will prepare a ZIP file, by default named <projectname>_Documents.zip. An information popup indicates that the documents are being prepared for downloading, and once finished, the archive (ZIP) can be downloaded from the Work Basket.
  • Reprocess... (For result views, not Saved Searches) – Launches the Reprocess dialog, which enables you to reprocess the search results set using selected reprocess options. Reprocessing causes the software to rerun the parsing and indexing of the eligible documents based on the selected reprocess options. See How to Perform Document Reprocessing for detailed information about reprocessing. From the toolbar, you can select all the documents to reprocess, or select a subset of documents. You may want to use this option after you perform a Search and find that you need to reprocess certain documents due to a parsing change (and therefore get updated metadata information).
  • Send to Discard Pile... (For result views, not Saved Searches, and requires Discard Pile Add/Edit Permissions) – For a search result under Search History, enables you to remove selected documents from Project Data and place a copy in the Discard Pile. When you select this option, a confirmation popup enables you to provide an optional comment before performing the operation (when you click Send to Discard Pile). A Work Basket task called Sending Documents to Discard Pile reports the results. Documents removed from Project Data can later be restored with their Project Data information (for example, Tags).
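
Once a download archive has been retrieved from the Work Basket, its contents can be checked locally. The following is a minimal sketch using Python's standard zipfile module; the helper names are hypothetical, and nothing is assumed about the internal layout of the archive beyond the default names <projectname>_PDFs.zip and <projectname>_Documents.zip given above.

```python
import zipfile
from typing import List

# Default archive suffixes, as stated for Download as PDF and Download Native.
ARCHIVE_SUFFIX = {"pdf": "PDFs", "native": "Documents"}


def default_archive_name(project_name: str, kind: str) -> str:
    """Return the default ZIP name for a PDF or native download."""
    return f"{project_name}_{ARCHIVE_SUFFIX[kind]}.zip"


def list_archive(path_or_file) -> List[str]:
    """List the member names of a downloaded archive (path or file object)."""
    with zipfile.ZipFile(path_or_file) as archive:
        return archive.namelist()
```

For a Project named Acme (a made-up name), default_archive_name("Acme", "pdf") gives Acme_PDFs.zip, and list_archive can then be pointed at that file to confirm which documents were included.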

Note: If you see a CAE_ERROR with a description of PAGE_JOB:null, ask your System Administrator to check your NAS storage timing. If the NAS timing is off, you may see this error when generating certain document lists that rely on the availability of created files (for example, if you try to use View Exceptions for a data set after Project Data is populated).

Selected Document Options

When you select a single document in a list and right-click, a document context menu appears with a list of options:

  • Open Document Inline – Launches the Document Viewer inline, within your current browser window.
  • Open Document in New Window – Launches the Document Viewer in a new browser window (or tab, depending on your browser options). This enables you to select any document in the paged Document List and see the full content of that document (or other views, such as Metadata or History). You can also launch multiple windows for different documents to perform side-by-side reviews of multiple documents. When you open the Document Viewer in a new browser window, you can select view modes in the top center portion of the screen, navigate documents by using the page controls at the bottom, and perform operations such as tagging.
  • Open Family Inline – Launches a Family-specific view of the Document Viewer for a given Family (MAG or DAG) inline, within your current browser window.
  • Open Family in New Window – Launches the Document Viewer for a Family (MAG or DAG) in a new browser window (or tab). This enables you to focus on the other family members of a selected parent email/document or email or embedded attachment. Family members are indented under their parent. MAGs are sorted by the email sent date.
  • Open Thread Inline – Launches a Thread-specific view of the Document Viewer inline, within your current browser window.
  • Open Thread in New Window – Launches a Thread-specific view of the Document Viewer in a new browser window (or tab). This enables you to focus on each message in the Thread and the associated attachments, if applicable.
  • Find Exact Duplicates of This... – Searches for documents that have exactly the same content and metadata as the selected document. An exact duplicate would have the same file MD5 value.
  • Find Content Duplicates of This... – Searches for documents that have the same content as the selected document.
  • Find Near Duplicates of This... – Searches for documents whose content is almost the same as a selected document. Evaluation of what constitutes a near-duplicate document includes comparison of the overall term length, but not file type or format. A Threshold setting enables you to specify the level of content match for the operation. Find Near Duplicates minimally requires an Analytic Index.

Navigation Tree Options for Project Data-based Results

For a list of options that apply to an entire results set of Project Data or a Project Data-based view (including a Workflow Step), you can use the right-click options for the results view in the Navigation Tree.

In general, options that include ... in the name indicate that they have an associated dialog. Options without ... run when you select them and do not have an associated dialog.

The right-click options for Project Data-based results are as follows:

  • Add Tags... – Launches the Tag dialog, from which you can select Tags to apply. You can also create a Tag and use it right away.
  • Remove Tags... – Launches the Tag dialog, from which you can select Tags to remove.
  • Add to... – Enables you to add the result documents to a selected Custodian, MediaID, Batch, or Folder view in Project Data based on permissions. For more information, see Add or Remove Documents or Search Results to a Folder. For more information about managing Custodian views, see Manage Custodians and Data Assigned to Custodians.
  • Remove from... – Removes the result documents from the selected Custodian, MediaID, Batch, or Folder view in Project Data based on permissions. The documents are still available within the Project; they just no longer reside within that view. Removing documents from a given named Custodian (or MediaID or Batch) automatically reassigns the documents to the Unassigned view of that type. (Removing documents from Unassigned is not permitted; if you want to assign documents from Unassigned to another view such as a Custodian, perform an Add to operation to the appropriate view. For more information about managing Custodian views, see Manage Custodians and Data Assigned to Custodians.)
  • Remove From Project Data — For users with Project Data Add/Edit permissions, removes the result documents from Project Data entirely, including the Discard Pile, if the selected documents reside there. A Work Basket task called Removing Documents from Project Data reports the results. Documents removed from Project Data/Discard Pile are still available in the appropriate Data Set in the Project, in the event that you need to add them to Project Data again, but the documents no longer have any Project Data information that was previously applied, such as Tags.
  • Save Search... — Launches the New Saved Search dialog, from which you assign a name to the Saved Search (and an optional description). Both the query and the results become part of the Saved Search. This option is not available from the following types of results: a Sample, drill-through Search, Search by Synthetic Document, Find More Like These, Find Exact Duplicates, or Find Content Duplicates.
  • Find Exact Duplicates – Enables you to search the view for Exact Duplicates. An exact duplicate would have the same file MD5 value.
  • Find Content Duplicates – Enables you to search the view for Content Duplicates. A content duplicate has the same content and content MD5 value.
  • Calculate Word List — Calculates the Word List for all documents in Project Data. A task appears in the Work Basket while the Word List is being generated. When the task completes, you can view the Word List.
  • View Word List... — Launches the Word List dialog and enables you to view the calculated Word List for all documents in Project Data.
  • Create Manifest... — Launches the Create Manifest dialog, from which you can generate a CSV or XML manifest of a view, using either the current fields or all fields. From the Work Basket task for the manifest generation, you can then right-click and select Download to download the file to a destination local to your computer. Users with permissions can also save the manifest to a server location. For download of a large manifest file (over 200 MB), the software places the manifest in a ZIP file, which you can then unzip. Note that this process can take time.
  • Create Comparison... — Launches the Create Comparison dialog, from which you can set up a Comparison of two sets of data or views in the Navigation Tree so that you can see how much content is in both, or just in one of the sets versus the other.
  • Create Sample... — Launches the Create Sample dialog, from which you can set up a Sample view of data based on certain criteria.
  • Create Export Comparison Report... — For a single search result, launches a Select Export popup that enables you to select an available Export as the target of a deduplication operation. The operation compares the entire search results view against the selected Export. The operation generates an Export Comparison Report based on your selected options, with the information that would apply if you included the search result documents in an Export. After the operation completes, you can view the Report or download it.
  • Download All as PDFs — Enables you to download all documents in the view as PDFs to your local environment so that you can view the documents in PDF format. When you select this operation, you can select the Stamp Document Number option if you want to include a stamp with the document number (docnum) on the bottom right of each page in the PDF. Note that this operation will also show a Warning popup that states the following: You are attempting to download all documents in this list as PDFs. Depending on the size of the documents, this could take considerable time and/or render the browser unresponsive. Consider creating a new export stream to produce the PDFs directly to an export location instead. At this point, you must either confirm the operation by clicking Continue, or click Cancel instead. If you proceed, the software will prepare a ZIP file, by default named <projectname>_PDFs.zip. An information popup indicates that the PDFs are being prepared for downloading, and once finished, the archive (ZIP) can be downloaded from the Work Basket. Note that certain file types are ignored for PDF generation, including any selected directory folders not removed from your Project during setup by your administrator, disk images, file archives, mail archives, empty files, and files for which the native is not available. A WARNING_DETAILS_REPORT.csv file identifying the files that were skipped or failed PDF generation can be downloaded from the appropriate PDF-related Work Basket task. See About Downloading Documents as PDFs and Natives for more information.
  • Download All Natives — Enables you to download all documents in the view to your local environment so that you can view the documents in their native format. You will see a Warning popup that states the following: You are attempting to download all natives in this list. Depending on the size of the documents, this could take considerable time and/or render the browser unresponsive. Consider creating a new export stream to produce the natives directly to an export location instead. At this point, you must either confirm the operation by clicking Continue, or click Cancel instead. If you proceed, the software will prepare a ZIP file, by default named <projectname>_Documents.zip. An information popup indicates that the documents are being prepared for downloading, and once finished, the archive (ZIP) can be downloaded from the Work Basket. Note that any directory folders are ignored for the download. See About Downloading Documents as PDFs and Natives for more information.
  • Reprocess... — Launches the Reprocess dialog, which enables you to reprocess the search results set using selected reprocess options. Reprocessing causes the software to rerun the parsing and indexing of the eligible documents based on the selected reprocess options. See How to Perform Document Reprocessing for detailed information about reprocessing. From the toolbar, you can select all the documents to reprocess, or select a subset of documents. You may want to use this option after you inspect the Warning and Errors section of the Data Set Scan Report and notice that you have many damaged, encrypted, or password-protected files that could be reprocessed after the situations have been addressed (for example, you have configured password-cracking options and supplied password files, repaired a PST, or decrypted an NSF file). You may also perform a Search and find that you need to reprocess certain documents due to a parsing change (and therefore get updated metadata information). In the first situation, you can drill through the Damaged, Encrypted, or Protected entries in the Warning and Errors section, or you can search for them using the parsing status (for example, parsingstatus:00027 for encrypted files, parsingstatus:00028 for damaged files, and parsingstatus:00029 for protected files). From the drill-through Search results, you can select specific files or all files and select Reprocess. After reprocessing, check the report again. For example, you may see that a repaired PST now has children that have been added to the Index, or you may have performed password cracking for encrypted/protected PDFs, ZIP or RAR files, or Microsoft Office documents. See Configure Password Cracking for Reprocessing. See Add and Manage Container Key Files.
  • Copy for External Imaging... (Selectable from a Search Result of Project Data or a view of Project Data, such as a Custodian or Folder view) – Launches the Copy for External Imaging dialog. Use this dialog to copy the contents of the entire Search Result view to an Export Data Area. This enables the external imaging of files outside the application.
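
The parsing status searches mentioned under Reprocess can be generated with a small helper. This is a minimal sketch: only the three status codes and the parsingstatus:<code> form come from this topic, while the dictionary and function names are hypothetical. No syntax for combining the searches into one query is assumed, so each search string is returned separately.

```python
from typing import Dict, List

# Parsing status codes listed under the Reprocess option.
PARSING_STATUS: Dict[str, str] = {
    "encrypted": "00027",
    "damaged": "00028",
    "protected": "00029",
}


def parsing_status_query(condition: str) -> str:
    """Return the parsing status search for one condition."""
    return f"parsingstatus:{PARSING_STATUS[condition]}"


def reprocess_candidate_queries() -> List[str]:
    """Return one search string per condition, in the order listed above."""
    return [parsing_status_query(name) for name in PARSING_STATUS]
```

Running each returned search and then selecting Reprocess from the results mirrors the drill-through approach described for the Data Set Scan Report.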