View an Export Comparison Report

Search History > Search Result > child entry for exportcompare_<exportname> > Reports tab

Work Basket > Export Comparison Report > double-click > Reports tab

An Export Comparison report estimates what would be exported based on a set of search results.

You can create an Export Comparison report as follows:

  • As a standalone Export Comparison report for a selected search result (from a search in the Search History, or from a search result).
  • As part of running a Workflow and Workflow report generation (where the Export Comparison runs against the results of the final search in the Workflow). For more information about this scenario, see Generate a Workflow Report.

When you generate an Export Comparison report, the options you select affect what you see in the generated report and the download report. These include the search term report information and the calculation of HTML/MHTML sizes.

Note: When you first create a standalone Export Comparison report, the generated report is available immediately as a separate child entry under the search result (and if you do not navigate away, the Reports tab is selected automatically). After the operation completes and you navigate away to perform other tasks, you can open the Export Comparison report by double-clicking the task in the Work Basket, or by navigating to the child entry under the search result. When you open the Export Comparison in this manner, you are placed on the All Docs tab, as with any other view. You can then switch to the Reports tab to view, download, or print the information from the report.

When you run an Export Comparison as part of a Workflow, you can either view the report information as part of the downloaded Workflow report, or view only the Export Comparison Report information in the UI on the Reports tab for the Export Comparison, which is available as a child entry of the final search of the Workflow under the Search History.

The Export Comparison Report shown on the Reports tab includes a timestamp and identifying information, as follows:

  • Oldest report: yyyy-MM-dd HH:mm:ss – Identifies the date and time based on the oldest report generated or updated for this view. The timestamp is shown in the local time zone.
  • Title of Export Comparison Report: Deduplicate Search Result against Export Stream
  • Search Result: Identifies the query used for the Export Comparison
  • Export: Identifies the Export Stream used for the Export Comparison
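
If you copy the Oldest report timestamp into a script (for example, to log when a report was last refreshed), the yyyy-MM-dd HH:mm:ss pattern maps directly onto a standard parse format. The following is a minimal sketch only; the function name and sample value are hypothetical, and the result carries no time zone because the report shows local time.

```python
from datetime import datetime

# Python equivalent of the "yyyy-MM-dd HH:mm:ss" pattern shown on the Reports tab.
REPORT_TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_report_timestamp(text: str) -> datetime:
    """Parse an 'Oldest report' timestamp into a naive (local-time) datetime."""
    return datetime.strptime(text.strip(), REPORT_TIMESTAMP_FORMAT)


# Hypothetical sample value, for illustration only.
stamp = parse_report_timestamp("2024-03-05 14:07:09")
print(stamp.isoformat())
```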

The Export Comparison Report for a search result contains the following sections:

Search Deduped against Export Details

This section summarizes the comparison details (number of records, total file size in GB, % Count, and % Size):

  • Search Target ─ The total number of documents for the search target used in the comparison, as well as the total file size, %Count, and %Size information.
  • Search Hits ─ The total number of documents responsive to the search that were considered for export based on the export criteria, as well as the total file size, %Count, and %Size information.
  • Search Hits Previously Processed ─ The total number of documents that are responsive to the search but have already been exported in a previous Volume (if they still meet the export criteria), as well as the total file size, %Count, and %Size information.
  • Family Documents Added ─ The number of missing documents that would be added because they are part of the family (MAG or DAG), as well as the total file size, %Count, and %Size information.
  • Threaded Documents Added ─ The number of files (e.g., email messages) that would be added because they are part of an email thread (the associated thread and all of its contained items are considered), as well as the total file size, %Count, and %Size information.
  • Email Attachments Removed ─ The number of email attachments that would be removed based on the export settings for the handling of email attachments, as well as the total file size, %Count, and %Size information.
  • OLE Attachments Removed ─ The number of OLE attachments (message or eDoc OLE) that would be removed based on the export settings for separate OLE attachments, as well as the total file size, %Count, and %Size information.
  • Archive Attachments Removed ─ The number of archive attachments that would be removed based on the export settings for the handling of attachments and attached archives, as well as the total file size, %Count, and %Size information.
  • Duplicates Removed ─ The number of duplicate documents that would be removed, as well as the total file size, %Count, and %Size information.
  • Total to be Exported ─ The total number of documents that would be produced (all document classes), as well as the total file size, %Count, and %Size information.
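
To illustrate how the %Count and %Size columns relate to the counts and sizes in the rows above, the following sketch computes each row as a percentage of a baseline row. This is an assumption for illustration: this section does not state which row the product uses as the denominator, so the sketch uses Search Target. The class name, function name, and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SummaryRow:
    label: str
    count: int       # number of records
    size_gb: float   # total file size in GB


def with_percentages(rows, baseline):
    """Return (label, pct_count, pct_size) for each row, relative to baseline.

    ASSUMPTION: percentages are taken against the baseline row (here, Search
    Target). The actual denominator used by the report is not documented here.
    """
    result = []
    for row in rows:
        pct_count = 100.0 * row.count / baseline.count if baseline.count else 0.0
        pct_size = 100.0 * row.size_gb / baseline.size_gb if baseline.size_gb else 0.0
        result.append((row.label, round(pct_count, 2), round(pct_size, 2)))
    return result


# Hypothetical figures, for illustration only.
target = SummaryRow("Search Target", 1000, 4.0)
rows = [
    target,
    SummaryRow("Search Hits", 250, 1.0),
    SummaryRow("Duplicates Removed", 50, 0.5),
]

for label, pct_count, pct_size in with_percentages(rows, target):
    print(f"{label}: {pct_count}% of count, {pct_size}% of size")
```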

Export Statistics

The Export Statistics information identifies what would be exported based on the document class (Count and Total Size, in GB):

Note: The deduplicated counts calculated for the Export Comparison report give you the best idea of what will be exported based on the search results used for the comparison. These counts may or may not match the deduplicated counts shown in a given Document Classification report for those search results. This is because the two reports function differently. The Export Comparison report uses the dupe_fingerprint value to determine if a document is a duplicate, and the Document Classification report will consider both the dupe_fingerprint value and the family fingerprint value to determine the dedupe nature of the documents and place them in appropriate document classes for the report.

  • Email Export ─ The number of emails that would be exported and their associated file size. If non-native emails are exported (e.g., HTML and/or MHTML) then this size is an estimate.
  • Email Attachment (Message & OLE) Export ─ The number of message attachments and message OLE attachments that would be exported, along with their associated file size.
  • Edoc Export ─ The number of eDocs that would be exported and their associated file size.
  • OLE Export ─ The number of eDoc OLE attachments that would be exported and their associated file size.
  • Container Export ─ The number of containers that would be exported and their associated file size, if containers were included in the results and eligible for export.
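
The note above explains that Export Comparison counts can differ from Document Classification counts because the two reports identify duplicates differently. The sketch below shows how that can happen. It is a simplified assumption, not the product's actual logic: it treats "considering both values" as deduplicating on the pair of values. The dupe_fingerprint name comes from the note, while family_fingerprint, the function name, and the sample records are hypothetical.

```python
def dedupe_by(docs, key_fields):
    """Keep the first document seen for each distinct combination of key_fields."""
    seen = set()
    unique = []
    for doc in docs:
        key = tuple(doc[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


# Hypothetical records: documents 1 and 2 share content but sit in different families.
docs = [
    {"id": 1, "dupe_fingerprint": "a1", "family_fingerprint": "f1"},
    {"id": 2, "dupe_fingerprint": "a1", "family_fingerprint": "f2"},
    {"id": 3, "dupe_fingerprint": "b2", "family_fingerprint": "f2"},
]

# Export Comparison style: dupe_fingerprint alone decides what is a duplicate.
export_style = dedupe_by(docs, ["dupe_fingerprint"])

# Document Classification style (assumed): both fingerprints are considered.
classification_style = dedupe_by(docs, ["dupe_fingerprint", "family_fingerprint"])

print(len(export_style), len(classification_style))
```

In this sample, the first approach keeps two documents and the second keeps three, which is the kind of mismatch the note describes.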

Download the Export Comparison Report

When you download this report from the Reports tab of the Export Comparison Results, or right-click the Export Comparison task in the Work Basket and click Download, you can download the available report information as an XLSX file, a multi-tab workbook of expanded information. This Download operation requires Document Reports permissions.

If you generate the search term report information, the downloaded XLSX provides key information over multiple tabs, as follows, and contains the columns enabled in the Search Settings:

  • Glossary — This tab applies to all search result views and contains both a Glossary and a Legend. The Glossary helps you follow the Search Term Hit Count information in columns of the Total tab as well as the appropriate Batch and/or Custodian tabs that apply to the Search Result view. For example, the definition for Total Dedupe Docs indicates that this value is the number of documents that hit upon a search term (deduped across all search terms). The Legend section identifies the prefixes used for each Batch and Custodian tab that applies to the search results view. For example, B1 may be the prefix used to represent a Batch called data1 (where the tab name is B1-data1), and C1 may be used to represent a Custodian called mikeg (where the tab name is C1-mikeg).
  • Summary — This tab summarizes the key counts and Search Settings for the Search used in the comparison, including the Total Records in the search target view, the Total Dupes in that search target view (where Global refers to Horizontal deduplication and Custodial refers to Vertical deduplication), the Total Search Hits (Deduped), and the Total Search Hits with Family (Deduped). The Search Settings for the Search used in the Dedupe against Export comparison include the Search Target, Query Entered, Include Families, Include Metadata, and Synonym Expansion. Special Search Deduped Against Export information highlights the key calculations for the comparison, Export Settings information identifies the key settings from the associated Export (for example, Family Documents Added, Threaded Documents Added, Email Attachments Removed, OLE Attachments Removed, Archive Attachments Removed, and Duplicates Removed), and an Export Statistics section identifies what would be exported based on the document class. This tab also reserves a section for a logo and other job and client information.

Note: If you elected to calculate HTML/MHTML Sizes as part of the report generation, the Export Size area of the Search Deduped Against Export information will reflect the detailed calculation in the Email Export size (in GB), based on the email format selected for the Export (Native, HTML, MHTML, or HTML/MHTML). As such, the calculated Email Export Size should be a close estimate under normal circumstances. Note also that the Attachment Export entry includes both Message Attachments and Message OLE Attachments. The OLE Export entry includes EDoc OLE Attachments only.

  • Total — If you elected to include the search term report information when generating the report, this tab provides the Search Term Hit Count information (per-Clause, as it appears on the Reports tab for the Search used in the Deduplicate against Export comparison). An additional TOTAL line provides the statistics for the overall search (that is, all clauses in the search, combined).
  • Per-Custodian tabs — If you elected to include the search term report information when generating the report, these tabs provide per-Custodian Hit Count details for the search results view. Custodian tabs apply only to Project Data and views of Project Data. Only Custodians that are responsive to the search will have tabs. An additional TOTAL line provides the per-Custodian statistics for the overall search (that is, all clauses in the search, combined).
  • Per-Batch tabs — If you elected to include the search term report information when generating the report, these tabs provide per-Batch (imported Data Set) Hit Count details for the search result view. Only Batches that are responsive to the search will have tabs. An additional TOTAL line provides the per-Batch statistics for the overall search (that is, all clauses in the search, combined).
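
If you post-process the downloaded workbook, the tab names described above can be sorted into fixed, per-Batch, and per-Custodian tabs. The sketch below is based only on the Legend examples (B1-data1 and C1-mikeg). It assumes the fixed tabs are named exactly Glossary, Summary, and Total, and that every Batch or Custodian tab follows the prefix, number, hyphen, name pattern; actual workbooks may differ. The function name is hypothetical.

```python
import re

# Pattern inferred from the Legend examples: B<n>-<batch> and C<n>-<custodian>.
TAB_PATTERN = re.compile(r"^(?P<kind>[BC])(?P<index>\d+)-(?P<name>.+)$")

# ASSUMPTION: the fixed tabs carry exactly these names.
FIXED_TABS = {"Glossary", "Summary", "Total"}


def classify_tab(tab_name):
    """Return (kind, name), where kind is fixed, batch, custodian, or unknown."""
    if tab_name in FIXED_TABS:
        return ("fixed", tab_name)
    match = TAB_PATTERN.match(tab_name)
    if match is None:
        return ("unknown", tab_name)
    kind = "batch" if match.group("kind") == "B" else "custodian"
    return (kind, match.group("name"))


for tab in ["Glossary", "Summary", "Total", "B1-data1", "C1-mikeg"]:
    print(tab, "->", classify_tab(tab))
```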