View Reports for Project Data

Project Data > Reports

Requires Project Data - View and Project - Reports - View Permissions

The Project Data Reports tab provides charts that help you get information about all documents that have become part of the Project Data (for example, by direct Custodian assignment, or by manual assignment to a Custodian).

Note: This information applies to the Reports tab for all of Project Data. For information about the Reports tab for a non-results view of Project Data, such as a Custodian, Folder, Tag, or Comparison, see Get the Report for a Selected View within Project Data.

Reports Tab Toolbar

The Reports tab toolbar provides the following:

  • Oldest report: yyyy-MM-dd HH:mm:ss – Identifies the full date and time based on the oldest report generated or updated for this view (for a date other than today or yesterday). The timestamp is shown in the local time zone. If the report was generated or updated today, the report will show Today HH:mm:ss; if it was generated or updated yesterday, it will show Yesterday HH:mm:ss.
  • Update – Click this button to recalculate the reports for this view with the latest information. You should click Update on reports generated prior to a change such as a Custodian or Custodian Priority change, a Dedupe setting change, or changes in the number of documents in the view. An update will also reflect any changes to the enabling/disabling of reports by your eDiscovery Administrator.
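The Oldest report label rules above can be sketched in a few lines. This is only an illustration of the Today / Yesterday / full-date logic as described; the function name `report_age_label` is hypothetical, and the sketch assumes both timestamps are already in the local time zone.

```python
from datetime import datetime, timedelta

def report_age_label(generated: datetime, now: datetime) -> str:
    """Format a local timestamp following the toolbar rules described above."""
    time_part = generated.strftime("%H:%M:%S")
    if generated.date() == now.date():
        return f"Today {time_part}"
    if generated.date() == (now - timedelta(days=1)).date():
        return f"Yesterday {time_part}"
    # Any other day: full yyyy-MM-dd HH:mm:ss timestamp
    return generated.strftime("%Y-%m-%d %H:%M:%S")

# Made-up reference time for the demonstration
now = datetime(2024, 3, 15, 9, 30, 0)
print(report_age_label(datetime(2024, 3, 15, 8, 5, 0), now))
print(report_age_label(datetime(2024, 3, 14, 23, 59, 59), now))
print(report_age_label(datetime(2024, 3, 1, 12, 0, 0), now))
```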

About Report Generation

  • By default, if you have Project - Reports - View permission, a given Reports tab shows all reports that apply to that particular view. Your eDiscovery Administrator may decide to disable the generation of selected reports for Project Data views. (Therefore, if you do not see all of the reports in your Project, consult your eDiscovery Administrator.) You may want to click Update to ensure that the most recent enabled/disabled settings are reflected.
  • Reports for a large number of documents (for example, greater than 50,000) take time to generate, especially if all reports for a given view are generated. If report generation is taking a considerable amount of time, an In Progress Generate Report task appears in the Work Basket (for example, if you click the Reports tab without selecting the Generate Reports option at the time of a Freeform, Advanced, or Bulk Search). If you want, you can cancel the report generation.

Note: For certain searches, such as Freeform, Advanced, and Bulk Search, you can use the Generate Reports option to have the reports generated at the time you perform the Search instead of waiting until the Reports tab is selected. (Your search may take longer initially, but you will not see a Generate Report task in the Work Basket when you click the Reports tab, and the reports should load more quickly.)

How to Use Summary Information and Charts

You can use the charts to get detailed information about the data in Project Data.

Drill-through Support

For the Summary sections and charts, you can generally use the drill-through capabilities to get additional information, as follows:

  • Double-click an entry in a chart to perform a drill-through that generates an additional search result view focusing on the information you selected. You can drill through a particular entry in a Summary (for example, Mail Container Errors), a pie chart section, or an entry in the legend (for example, a Document Types entry). When you drill through an entry, an additional search is generated, and a task is created in the Work Basket for the new drill-through search result view. The drill-through search result view launches automatically to list the documents responsive to the search, based on the entry into which you drilled.

Note: The document count displayed for a given report entry may not always match the document count calculated for the drill-through search of that entry. Depending on the report, the drill-through search results may yield a higher count than the original report entry, since the drill-through search is generally more inclusive, looking for all records containing the associated value in some format. Note also that if you try to perform drill-through searches on the reports for a Shared Data Set (assuming you have permissions to do so), you may need to click Update to update the report information appropriately for use in the current Project.

Disabled Reports

Any report that has been explicitly disabled in the Report Settings is identified in the Reports tab as Report Disabled.

Reports without Data to Display

Any report for which there is no data to display is identified on the Reports tab as No Data to Display.

Hover Text Support

You can hover over any bar, pie, or column section to get more information about that section.

View by Count or Size

For charts that can be based on either a Count or Size, you can click to select either Count or Size, depending on which one is already active. For example, if Size is active, you can select Count.

Views with zero documents do not appear in the charts or the Details popups.

Chart Download Support

  • If you have Document Reports permission, you can click the download icon, located at the top right of a chart, to download the appropriate file (in most cases, a CSV file).

  • Some charts may take time to download. During report downloads, a message will appear at the top of the screen to notify you about the number of downloads in progress. You can hover over the in-progress message to see the individual downloads in progress.

View Chart Details

  • For most charts in this report (except Document Classification), you can click the Details icon at the top right of a chart to view a Details pop-up with the document count and size information for the appropriate items. The Details pop-up for a chart supports a Document drop-down menu with Add Tags, Remove Tags, Add to, and Remove from actions. For most charts, the report Details show the name, non-zero count, and size of documents for an item, such as a Tag, Date, or Custodian. Some Details pop-ups (for example, Document Type Details) provide multiple tabs of information.

Note: Digital Reef .CSV files are generated using UTF-8 encoding. If the content includes non-ASCII or multi-byte characters, opening the .CSV using Microsoft Excel will not render the multi-byte characters properly (because Excel uses a different default encoding). To address this issue, change the extension of the downloaded file from .CSV to .TXT. When you open Excel and then open the file, change the File Origin to UTF-8 and add the Comma as a delimiter. For example, on Microsoft Windows, the selection is 65001: Unicode (UTF-8).
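If you process downloaded CSV files with a script rather than in Excel, the encoding can be stated explicitly. The following is a minimal sketch, not a documented Digital Reef procedure: it reads a UTF-8 CSV and writes a copy prefixed with a UTF-8 byte order mark (BOM), which many versions of Excel use to detect the encoding. The function names and the sample file contents are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path

def add_utf8_bom(src: Path, dst: Path) -> None:
    """Copy a UTF-8 CSV to dst, prefixing a UTF-8 byte order mark."""
    # "utf-8-sig" on read drops a BOM if one is already present;
    # newline="" keeps the original line endings untouched.
    with open(src, "r", encoding="utf-8-sig", newline="") as fin:
        text = fin.read()
    # "utf-8-sig" on write emits the BOM (EF BB BF) before the content.
    with open(dst, "w", encoding="utf-8-sig", newline="") as fout:
        fout.write(text)

def read_report_csv(path: Path) -> list[list[str]]:
    """Read a report CSV; works for files with or without a BOM."""
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        return list(csv.reader(f))

# Demo with a made-up file containing multi-byte characters
workdir = Path(tempfile.mkdtemp())
original = workdir / "report.csv"
original.write_text("Name,Count\nMüller,3\n東京,5\n", encoding="utf-8")
converted = workdir / "report_bom.csv"
add_utf8_bom(original, converted)
rows = read_report_csv(converted)
print(rows)
```

The renaming procedure in the note above remains the documented approach; the sketch only shows one scripted alternative.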

Document Classification Chart

This chart shows how document classification can help you reduce the amount of content for review. The values are based on the sum of all data added to the Project.

Document Classification Information

The Document Classification chart provides key count and size information based on document class.

Important Notes:

Note: For an all-Imports or Data Set view, the values shown in this chart reflect file MD5 deduplication, because the Deduplication Settings under Analytic Settings and the calculations based on family membership apply only to views of Data (and are initially calculated when a user with Permissions adds data to Data).

  • For document classes subject to deduplication, the deduplicated count and size values are calculated and displayed according to the appropriate Deduplication setting (under Analytic Index Settings): either Global (the default) or Custodial. This setting, which applies for Data and views of Data, determines the processing of email, the reporting of de-dupe counts and size, and how email is handled for an Export that includes duplicates.
  • The Document Classification chart title and the title displayed in the downloaded CSV for the chart identify which deduplication setting is being used to calculate the counts: (for Dedupe Setting: Global) or (for Dedupe Setting: Custodial). If you change the de-dupe mode after initially generating the report, you must click Update to recalculate based on the new setting.
  • The deduplication is calculated based on document class and family membership (MAG or DAG). For example, if the same Word document serves as a Message Attachment to two different email messages, it is counted twice, once per message family.

Note: Keep in mind that reports will only be accurate if you keep the Families intact when running searches and saving documents to different views. This means that you must keep the Include Families checkbox enabled to ensure that Families remain intact.
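The family-based counting rule described above (the same Word document attached to two different messages is counted once per message family) can be illustrated with a small sketch. This is not the product's implementation: the record layout, the hash values, and both function names are hypothetical, and the sketch assumes the two message families are themselves distinct.

```python
from collections import Counter

# (family_id, doc_class, md5): all values are made up for illustration
records = [
    ("msg-1", "Message", "aaa"),
    ("msg-1", "Message_Attachment", "w0rd"),
    ("msg-2", "Message", "bbb"),
    ("msg-2", "Message_Attachment", "w0rd"),  # same Word file, different family
]

def dedup_by_hash_only(rows):
    """Naive deduplication: one count per (class, hash), ignoring families."""
    unique = {(doc_class, md5) for _, doc_class, md5 in rows}
    return Counter(doc_class for doc_class, _ in unique)

def dedup_by_family(rows):
    """Family-aware deduplication: one count per (class, family, hash)."""
    unique = {(doc_class, family, md5) for family, doc_class, md5 in rows}
    return Counter(doc_class for doc_class, _, _ in unique)

print(dedup_by_hash_only(records)["Message_Attachment"])  # attachment counted once
print(dedup_by_family(records)["Message_Attachment"])     # counted once per family
```

The contrast shows why a family-aware count for attachments can be higher than a plain hash-based count of the same view.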

  • The entries in the table now support the double-click drill-through capability.

The table reports the following columns for each document class: Count, Size, Deduplicated Count, and Deduplicated Size.

  • Container Files
    • Count: The number of Container files (Archives, Message Archives, and Disk Images) in the view.
    • Size: The size (by default, in bytes) of Container files in the view.
    • Deduplicated Count and Deduplicated Size: Container files are not subject to this deduplication analysis; therefore, these columns are empty.
  • Directories
    • Count: The number of directories in the view.
    • Size: The size (by default, in bytes) of directories in the view.
    • Deduplicated Count and Deduplicated Size: Directories are not subject to this deduplication analysis; therefore, these columns are empty.
  • EDoc OLE Attachments
    • Count: The number of EDoc OLE attachments in the view.
    • Size: The size of EDoc OLE attachments in the view.
    • Deduplicated Count: The count of EDoc OLE attachments in the view after deduplication (based on family membership).
    • Deduplicated Size: The size (for example, in GBytes) of EDoc OLE attachments in the view after deduplication (based on family membership).
  • Message Attachments
    • Count: The number of message attachments in the view.
    • Size: The size (by default, in bytes) of message attachments in the view.
    • Deduplicated Count: The count of message attachments in the view after deduplication (based on family membership).
    • Deduplicated Size: The size (for example, in GBytes) of message attachments in the view after deduplication (based on family membership).
  • Messages
    • Count: The number of messages in the view.
    • Size: The size (by default, in bytes) of messages in the view.
    • Deduplicated Count: The count of messages in the view after deduplication.
    • Deduplicated Size: The size (for example, in GBytes) of messages in the view after deduplication.
  • Message OLE Attachments
    • Count: The number of Message OLE attachments in the view.
    • Size: The size of Message OLE attachments in the view.
    • Deduplicated Count: The count of Message OLE attachments in the view after deduplication (based on family membership).
    • Deduplicated Size: The size (for example, in GBytes) of Message OLE attachments in the view after deduplication (based on family membership).
  • NIST EDoc Files
    • Count: The number of EDocs that are NIST files in the view.
    • Size: The size (by default, in bytes) of EDocs that are NIST files in the view.
    • Deduplicated Count and Deduplicated Size: NIST EDoc files are not subject to this deduplication analysis; therefore, these columns are empty.
  • Non-NIST EDoc Files
    • Count: The number of non-NIST EDocs (that is, EDocs that are not NIST files) in the view.
    • Size: The size (by default, in bytes) of non-NIST EDocs in the view.
    • Deduplicated Count: The count of non-NIST EDocs (that is, EDocs that are not NIST files) in the view after deduplication.
    • Deduplicated Size: The size (for example, in GBytes) of non-NIST EDocs (that is, EDocs that are not NIST files) in the view after deduplication.
  • Total Documents
    • Count: The total number of documents in the view.
    • Size: The total size (in bytes) of documents in the view.
    • Deduplicated Count and Deduplicated Size: Calculated regardless of whether the values for Directories, Container Files, and NIST EDocs are 0.

Note: NIST refers to the National Institute of Standards and Technology, which provides the National Software Reference Library (NSRL). The NSRL includes a Reference Data Set of digital signatures for known, traceable software applications. The list is used to identify files with no evidentiary value. Digital Reef provides the NSRL database to support detection of files with signatures (hash codes) matching those in the NSRL database upon import. DeNIST refers to the removal of any file that has a digital signature matching one in the NIST NSRL list.

About the Downloaded Document Class CSV File

You can name and download the Document Classification table as a CSV file. The file contains additional entries, including a top entry for Total Documents. The CSV entries use the document class name, where applicable. Entries for directories, archives, message archives, and disk images will not show counts in views of Data as long as the default set of Exclusion Searches is in use. However, for the Imports view, or for a Data Set Scan Report view, these entries will report counts.

Note: All entries in the CSV file report Count and Size values. For document classes that support deduplicated count and size values, you will also see values populated in columns representing the Deduplicated Count and Deduplicated Size.

  • Total_Documents — The total document count and size for the view.
  • EDoc — The count and size of EDocs in the view.
  • Non-NIST_EDoc — The count and size of non-NIST EDocs in the view. For this entry, the Deduplicated Count and Deduplicated Size columns report the count and size of non-NIST EDocs in the view after deduplication.
  • EDoc_OLE_Attachment — The count and size of EDoc OLE Attachments in the view. For this entry, the Deduplicated Count and Deduplicated Size columns report the count and size of EDoc OLE Attachments in the view after deduplication (based on family membership).
  • Message — The count and size of Message Families (MAGs) in the view. For this entry, the Deduplicated Count and Deduplicated Size columns report the count and size of messages (Message Families, or MAGs) in the view after deduplication (based on family membership).
  • Non-NIST_Message_Attachment — The count and size of non-NIST Message Attachments in the view.
  • Message_Attachment — The count and size of all Message Attachments in the view. For this entry, the Deduplicated Count and Deduplicated Size columns report the count and size of Message Attachments in the view after deduplication (based on family membership).
  • Directory — The count and size of directories in the view (for example, the Imports view or a Data Set Scan Report view).
  • NIST_EDoc — The count and size of EDocs that are NIST files in the view.
  • Message_OLE_Attachment — The count and size of Message OLE Attachments in the view. For this entry, the Deduplicated Count and Deduplicated Size columns report the count and size of unique Message OLE Attachments in the view after deduplication (based on family membership).
  • Archive — The count and size of files that are File Archives or in compressed format in the view (for example, a GZIP, ZIP, RAR, or TAR file found on disk). This entry applies to a view that includes archive files (for example, the Imports view or a Data Set Scan Report view).
  • Message_Archive — The count and size of Message Archives in the view (for example, the Imports view or a Data Set Scan Report view).
  • NIST_Message_Attachment — The count and size of Message Attachments that are NIST files in the view.
  • Disk_Image — The count and size of Disk Images in a view that includes disk images (for example, the Imports view or a Data Set Scan Report view).
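As a rough sketch of how the downloaded file could be post-processed, the following computes how many documents deduplication removes per class. The exact CSV layout is an assumption: the header names below mirror the table columns (Document Class, Count, Size, Deduplicated Count, Deduplicated Size), the sample rows and numbers are invented, and the real file may include a title line or different headers.

```python
import csv
import io

# Made-up sample that imitates the assumed layout of the downloaded CSV
SAMPLE = """Document Class,Count,Size,Deduplicated Count,Deduplicated Size
Total_Documents,1200,900000,,
Message,500,300000,420,250000
Message_Attachment,300,400000,240,310000
Directory,40,0,,
"""

def dedup_savings(csv_text: str) -> dict[str, int]:
    """Return Count minus Deduplicated Count for classes that report both."""
    savings = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        dedup = row["Deduplicated Count"].strip()
        if not dedup:
            continue  # class has no deduplicated values in this sample
        savings[row["Document Class"]] = int(row["Count"]) - int(dedup)
    return savings

print(dedup_savings(SAMPLE))
```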

Note that NIST information is reported in the kftdesc metadata field.

Billing Summary

Note: This chart is visible only if you have the appropriate Permissions. In addition, within Project Data, it only applies to the Project view itself, not other Project Data-based views such as a Custodian, Tag, or Folder.

This report provides information about the total included file types as well as the excluded file types in the defined File Type Exclusion Groups.

About the Billing Summary

The Billing Summary provides key count and size information based on the included and excluded File Types. By default, the following File Type Exclusion Groups contain File Types that should be excluded from billing:

  • Compressed Types
  • Disk Image Types
  • Email Archive Types
  • File Archive Types

Note: The Billing Summary does not support drill-through.

You can configure the Exclusion Groups and file types that form the Billing Report information at the Project, Organization, or System level (for System Users in a System-level role with the appropriate permissions):

  • Project Billing Reports
  • Organization Billing Reports Template
  • System-level Billing Reports Template

See Container Files for a list of the File Types included in each Container Files category.

See Supported Files for a list of the Digital Reef Supported File Types.

The Billing Summary appears in the Reports tab for the Imports, Data Set, and Project Data views and is scoped to the appropriate view:

Total Included Types — The total number of files subject to billing based on the included File Types.

  • Count (default sort column) — For the appropriate view (Imports, Data Set, or Project Data), the total number of files subject to billing based on the included File Types.
  • Size — For the appropriate view (Imports, Data Set, or Project Data), the total size (in GB or MB) of the files subject to billing based on the included File Types.

Total Excluded Types — The total number of files that will be excluded from billing based on the File Types identified in each Exclusion Group.

  • Count (default sort column) — For the appropriate view (Imports, Data Set, or Project Data), the total number of files excluded from billing, as determined by the excluded File Types in the Exclusion Groups.
  • Size — For the appropriate view (Imports, Data Set, or Project Data), the total size (in GB or MB) of the files excluded from billing, as determined by the excluded File Types in the Exclusion Groups.

  • <Exclusion Group List> — Each Exclusion Group, in alphabetical order, as defined in the Project Billing Reports or Billing Reports template. There are four default Exclusion Groups initially defined for a Project, but you can add your own and select more file types to exclude.

    • Count (default sort column) — The number of files associated with a given File Type Exclusion Group in the appropriate view.
    • Size — The size (in GB or MB) of the files associated with a given File Type Exclusion Group in the appropriate view.
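The relationship between the Total Included Types, Total Excluded Types, and per-group rows can be sketched as a simple tally. This is illustrative only: the four group names come from the default list above, but the assignment of file types to groups, the sample files, and the function name `billing_summary` are assumptions.

```python
# Default group names from this page; the file types in each group are made up
EXCLUSION_GROUPS = {
    "Compressed Types": {"GZIP"},
    "Disk Image Types": {"E01"},
    "Email Archive Types": {"PST", "OST", "NSF"},
    "File Archive Types": {"ZIP", "RAR", "TAR"},
}

# (file type, size in bytes): invented sample data
FILES = [("DOCX", 2000), ("PST", 5000000), ("ZIP", 40000), ("PDF", 3000), ("GZIP", 1000)]

def billing_summary(files, groups):
    """Tally [count, size] for included files, excluded files, and each group."""
    type_to_group = {t: g for g, types in groups.items() for t in types}
    totals = {"included": [0, 0], "excluded": [0, 0]}
    per_group = {g: [0, 0] for g in sorted(groups)}  # alphabetical order
    for file_type, size in files:
        group = type_to_group.get(file_type)
        bucket = "excluded" if group else "included"
        totals[bucket][0] += 1
        totals[bucket][1] += size
        if group:
            per_group[group][0] += 1
            per_group[group][1] += size
    return totals, per_group

totals, per_group = billing_summary(FILES, EXCLUSION_GROUPS)
print(totals)
print(per_group)
```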

For this report, the following buttons are available:

  • Download button — By default, downloads the summary information and the details in an XLSX file, with a tab for the summary information and a tab for the detailed information.
  • Details button — Displays File Type Exclusion Group Details, which includes a list of the file types within each Exclusion Group, along with the document counts and size values.

Note: You can control whether the Billing Summary appears for the Imports, Data Set, and Project Data views from the Project Reports or a Reports template. By default, this report is enabled to appear in the Imports, Data Set, and Project Data views. Note also that the Billing Summary uses the Document Types report.

Document Class

By default, this chart shows the information By Size (in descending order) in GB, MB, or KB, depending on the size of the data.

What you see for document classes depends on the population of data for your selected view. The document classes are as follows (in display format, not the official search format):

Note: When you search for a document class using the docclass metadata field (which is not case-sensitive and is not tokenized), you must either specify the entire name of the class (for example, Message_Attachment, Message_OLE_Attachment, eDoc_OLE_Attachment, Message_Archive) or use wildcards.

  • EDoc – The total number and size of files imported that are not an email, not from an email, and not any type of archive (for example, not a file or email archive). These are files that do not fall into any of the following other document classes: emails, email archives (containers) such as PST, OST, and NSF, file archives (compressed files such as ZIP), or disk images. A Word document on disk is an EDoc, as is an Excel document found in a ZIP file at the import location, or a Word document that has an embedded email.
  • Message – The total number or size of email messages (but not their attachments). An email file on disk or an email file found in a ZIP file at the import location falls into this category. Email attachments and archive container files such as PST, OST, and NSF are not counted in this category.
  • Message Attachment – The total number or size of all imported email attachments. Examples include an image file or Word document attached to an email, or an archive such as a ZIP file attached to an email.
  • Message OLE Attachment – The total number or size of all files embedded within a Message_Attachment (or another Message_OLE_Attachment). These embedded files are extracted during import. You can drill through this entry to see a list of documents that were embedded within a Message_Attachment (or another Message_OLE_Attachment). An example of this document class is a document embedded within a Word document that is attached to an email.
  • EDoc OLE Attachment – The total number and size of all files that were embedded within an EDoc (or another EDoc_OLE_Attachment). These embedded files are extracted during import. You can drill through this entry to see a list of documents that were embedded within an EDoc (or another EDoc_OLE_Attachment). Examples include a Word document within another Word document, or even an email embedded within a Word document.
  • Message Archive – The total number or size of documents that are Email Archives (email container files such as PST, OST, NSF found on disk). By default, email archives are excluded from Data by an Exclusion Search.
  • Archive – The total number or size of documents that are file archives found on disk or in compressed format (for example, a GZIP, ZIP, RAR, or TAR file found on disk). By default, compressed and file archive types are excluded from Data by an Exclusion Search.
  • Disk Image – The total number or size of documents that are disk images, such as an Expert Witness Compression Format File (for example, for EnCase and SMART). By default, disk images are excluded from Data by an Exclusion Search.
  • Directory – The total number and size of directories present for the imported data (empty, populated, skipped, or with access errors). Note that, by default, directories are excluded from Data by an Exclusion Search.
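The docclass matching behavior described in the note above (case-insensitive, not tokenized, so a partial name matches only with wildcards) can be emulated with a short sketch. This models the behavior only; it is not the product's search engine, and the use of `*` as the wildcard character is an assumption borrowed from Python's `fnmatch` module.

```python
from fnmatch import fnmatchcase

# Official class names as used in the downloaded Document Class CSV
DOC_CLASSES = [
    "EDoc", "Message", "Message_Attachment", "Message_OLE_Attachment",
    "EDoc_OLE_Attachment", "Message_Archive", "Archive", "Disk_Image", "Directory",
]

def match_docclass(pattern: str) -> list[str]:
    """Match whole class names, ignoring case; '*' acts as a wildcard."""
    lowered = pattern.lower()
    return [c for c in DOC_CLASSES if fnmatchcase(c.lower(), lowered)]

print(match_docclass("message"))      # whole-name match only
print(match_docclass("attachment"))   # no match: the field is not tokenized
print(match_docclass("*attachment"))  # wildcard picks up all attachment classes
```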

If you have permissions, you can optionally download the document class report information to a CSV file (by default, DigitalReefReport.csv). When you select this option, you can name the CSV file and select where to save it locally. The CSV also provides a Directory column with the total number and size of any directories present for the imported data (empty, populated, or with access errors).

Document Types

By default, this report shows the information By Size (in descending order) in GB, MB, or KB, depending on the size of the data.

Note: The Document Types report provides information to the Billing Summary.

The document types are categorized as follows:

  • Disk Images – The total number or size of documents that are disk images, such as a Logical Evidence File (LEF) or Expert Witness Compression Format File (for example, for EnCase, an E01). See Container Files for a complete list of disk image types. By default, disk images are excluded from Data by an Exclusion Search.
  • Email Archives – The total number or size of documents that are email archives (email container files such as PST, OST, NSF found on disk). See Container Files for a complete list of email archive types. By default, email archives are excluded from Data by an Exclusion Search.
  • Email Messages – The total number or size of all email documents, including loose emails (such as msg or eml files), emails from an email archive, or email attachments. Documents that are not identified as emails, such as email archives (email container files), are not counted in this category.
  • File Archives – The total number or size of documents that have a compressed type or file archive type (for example, GZIP, ZIP, RAR, and TAR). See Container Files for a complete list of compressed types and file archive types. By default, compressed and file archive types are excluded from Data by an Exclusion Search.
  • Images – The total number and size, in MBytes (MB), of files identified as image files (supported image types, such as PNG, JPEG, and TIFF). For a list of supported file types, see Supported File Types for Analysis.
  • Office Files – The total number or size of Microsoft Office documents (including Microsoft Office supported file types and versions, such as Microsoft Word, Excel, PowerPoint, Write, and Works). For a list of supported file types, see Supported File Types for Analysis.
  • PDF – The total number and size of documents in Adobe Acrobat (PDF), Adobe InDesign, or PDF Image format.
  • Other – The total number or size of documents that do not fall into any of the other categories (for example, a Text 7-Bit File, Internet HTML files, and directories). By default, directories are excluded from Data by an Exclusion Search.
  • Unknown – The total number or size of documents that are of a type not recognized by the system (for example, the Unknown format file type).

If you have permissions, you can optionally download the detailed document type report information (the file types) to a CSV file (FILETYTPE.csv). When you select this option, you are prompted to confirm or name the CSV file, and you can select the directory in which to save the file.

  • The File Type tab provides a list of the official file types, such as Internet HTML or Adobe Acrobat (PDF). Clicking the download icon downloads a FILETYTPE.csv with the information.
  • The File Extension tab provides a list of the extensions for the files (for example, txt, pdf, and docx). For text/plain and unknown file types, the file extension is the actual extension of the file; for all other file types, the file extension is the standard extension associated with that file type. Not present represents files for which there is no discernible extension (for example, a directory does not have an extension). A blank entry indicates that the file had an empty extension (for example, just a space). Clicking the download icon downloads a DOCEXT.csv with the information.
  • The Exceptions tab lists extension exceptions, showing what would be affected if you decide to change the current file extension to the recommended extension. This tab provides columns for the Current extension, the Recommended extension, the Count, and the Size. The default sort order is by Count. Clicking the download icon for this tab downloads a DOCEXT_CONFLICT.csv with the information.

OCR Confidence

For any documents in the view that have been subject to OCR processing, this table displays the calculated OCR Confidence Level (a Name or numeric range, the associated document Count, and the associated document Size).

The information in this chart is reported as follows:

  • Not present identifies documents that were not subject to OCR Processing.
  • unknown identifies documents for which the OCR Confidence level could not be determined.
  • Each OCR Confidence Level is represented using a numeric range: 0-10 (Lowest Confidence), 11-20, 21-30, 31-40, 41-50, 51-60, 61-70, 71-80, 81-90, and 91-100 (Highest Confidence).

Each letter in each page of a document has an OCR Confidence level, and an average Confidence level is computed based on all pages of a document from which text was extracted. (Pages from which no text was extracted do not contribute to the average.)
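A minimal sketch of the averaging and range labeling described above follows. It assumes a per-page confidence value is already available (with None marking a page from which no text was extracted); how the product aggregates letter-level values into a page value, and how it rounds, are not stated here, so both are assumptions, as are the function names.

```python
from typing import Optional, Sequence

def average_confidence(page_levels: Sequence[Optional[float]]) -> Optional[float]:
    """Average page confidence, skipping pages with no extracted text (None)."""
    usable = [level for level in page_levels if level is not None]
    if not usable:
        return None  # no text was extracted from any page
    return sum(usable) / len(usable)

def confidence_bucket(level: float) -> str:
    """Map a 0-100 confidence level to the range labels used in the report."""
    value = int(round(level))  # rounding rule is an assumption
    if not 0 <= value <= 100:
        raise ValueError("confidence level must be in the range 0-100")
    if value <= 10:
        return "0-10"
    low = ((value - 1) // 10) * 10 + 1
    return f"{low}-{low + 9}"

# Made-up document: the second page yielded no text and is ignored
pages = [80.0, None, 90.0, 70.0]
avg = average_confidence(pages)
print(avg, confidence_bucket(avg))
```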

The average OCR Confidence level for a document is reported in the ocraverageconfidencelevel metadata field using a value in the range 0-100. A value of 0 is the lowest confidence level and a value of 100 is the highest confidence level. The lowest confidence level calculated for any page in a document is reported in the ocrlowestconfidencelevel metadata field. The ocrlowestconfidencelevel and ocraverageconfidencelevel fields use a padded 5-digit value (for example, 00010 is Confidence level 10) to support range searching. For example, to search for documents whose average OCR Confidence level is in the inclusive range 20-60, you would specify the following search (using the Standard search syntax):

ocraverageconfidencelevel::[00020~~00060]
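If you build such searches programmatically, the padding can be generated rather than typed. The helper below is a hypothetical sketch that only formats the search string shown above; the function name is an assumption. The last two lines illustrate, under the assumption that values are compared as text, why fixed-width padding keeps numeric order and text order the same.

```python
def ocr_range_query(low: int, high: int,
                    field: str = "ocraverageconfidencelevel") -> str:
    """Build an inclusive range search using padded 5-digit values."""
    if not 0 <= low <= high <= 100:
        raise ValueError("expected 0 <= low <= high <= 100")
    return f"{field}::[{low:05d}~~{high:05d}]"

print(ocr_range_query(20, 60))
print(ocr_range_query(0, 10, field="ocrlowestconfidencelevel"))

# Unpadded strings sort differently from the numbers they represent
print(sorted(["100", "20", "9"]))
print(sorted(f"{n:05d}" for n in (100, 20, 9)))
```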

If you want to see the average number of terms calculated per page, check the averagenumberoftermsperpage metadata field.

Click to save the OCR Confidence information to a CSV file. When you select this option, you can name the CSV file, and you can select where to save the file locally.

Custodians

This chart shows up to 10 entries for Custodians ranked By Size or By Count for the total number of documents that are assigned within the current view. If all documents are assigned to one or more Custodians, then all 10 entries will be for those Custodians. If some or all documents are not yet assigned to a Custodian, the documents will be in an Unassigned entry. (The Unassigned entry for a Project Data view appears in the report as long as there is at least one other Custodian present in the system.) Remember that specifying a non-zero value for the Custodian and Media Directory locations in the Project Index Settings enables you to auto-discover Custodians to add to the Project; otherwise, you will need to add Custodians to the Project. Note that Project Data views include the Custodians report; the Custodian Directories report applies to Imports and Data Set views only.

Click to download a CSV with the Custodian information.

For this chart, you can click to see Details with a detailed document size and count for all of the Custodians with documents assigned, not just the top 10.

Sources

This chart shows each source of data that was imported into the Project, By Size or By Count. This chart focuses on the top five Data Areas for data added to Project Data.

Click to download a CSV with the source data information.

For this chart, you can click for Details about the document size and count information for each source of data.

Tags

This chart shows the top 5 Tags that exist in the Project (by name), as ranked by the total size or count of documents tagged with that value.

Click to download a CSV with the Tags (Tags Summary) information.

For this chart, you can click to see Details with a detailed document count and size for the Tags.

Sending Domains

This chart shows the top 5 sending Domains associated with Project Data, based on the number of email messages sent from each Domain.

Click to download a CSV with the Sending Domain information.

For this chart, you can click for details about the document size and count information for each Sending Domain.

Receiving Domains

This chart shows the top 5 receiving Domains associated with Project Data, based on the number of email messages received by each Domain. A given Domain list can have a maximum of 1000 Domains.

Click to download a CSV with the Receiving Domain information.

For this chart, you can click for details about the document size and count information for each Receiving Domain.

Dominant Languages

This chart shows the document count by dominant language (using the standard ISO 639-1 code for the language, such as en for English). The language-related charts apply if language detection was enabled at the time of import (for each Data Set under Imports).

Document count by dominant language is reported as follows:

  • The chart displays each language (Top 10) using its language code (many are two letters).
  • You can hover over an entry in the chart to see the language name and the total document count and size for a given language code.
  • If a document has multiple languages, the document will be counted for the dominant language only (instead of counted for each language detected).
  • unknown identifies documents for which the language could not be determined.
  • Not present identifies documents that were not subject to Language Detection at import, either because the feature was disabled at import for some of the data, or the documents did not have content, were identified as binary files such as images (when OCR processing is disabled at import), or were not parsed successfully.
  • Click to download a CSV with the dominant language information. The downloaded CSV provides the full language names.
  • Click for Details about the document count by dominant language.

Languages

This chart shows the document count per language (using the standard ISO 639-1 code for the language, such as en for English). These charts apply if language detection was enabled at the time of import (for each Data Set under Imports).

Document count by language is reported as follows:

  • The chart displays each language (Top 10) using its language code (many are two letters).
  • You can hover over an entry in the chart to see the language name and the total document count and size for a given language code.
  • If a document has multiple languages, the document will be counted for each language detected.
  • unknown identifies documents for which the language could not be determined.
  • Not present identifies documents that were not subject to Language Detection at import, either because the feature was disabled at import for some of the data, or the documents did not have content, were identified as binary files such as images (when OCR processing is disabled at import), or were not parsed successfully.
  • Click to download a CSV with the language information. The downloaded CSV provides the full language names.
  • Click for details about the document count by language.

See Supported Languages for Language Detection for a list of languages that can be detected when language detection is enabled, along with their codes.
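
The difference between the Dominant Languages and Languages counts can be illustrated with a short Python sketch. The sample data is hypothetical: each document is assumed to carry a list of detected ISO 639-1 codes with the dominant language listed first, which may not match how the product stores this information.

```python
from collections import Counter

# Hypothetical data: each document lists its detected ISO 639-1 codes,
# with the dominant language first.
docs = [
    ["en", "fr"],
    ["en"],
    ["de", "en"],
    ["fr"],
]

# Dominant Languages: each document is counted once, for its dominant language.
dominant_counts = Counter(langs[0] for langs in docs)

# Languages: each document is counted once for every language detected in it.
language_counts = Counter(code for langs in docs for code in set(langs))

print(dominant_counts.most_common(10))
print(language_counts.most_common(10))
```

In this sketch the dominant-language counts always add up to the number of documents, while the per-language counts can add up to more, because a multi-language document contributes to several entries.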

Email Sent Date and Email Received Date

The Email Sent Date and Email Received Date reports enable you to see the volume of files associated with a range of email sent or received dates. This can help you make decisions about emails that need to be reviewed more carefully based on the date they were sent or received and how much email was involved (for example, you may focus on a large volume of emails sent 9 months ago).

Note: The document count displayed for a given entry (bar) in the Email Sent Date report or Email Received Date report represents one document per family (the parent message for each family). For these reports, when you double-click to perform a drill-through on a given bar, the drill-through search will include each document in each family matching the date range (both messages and attachments) and will have a higher count than the original entry. The document count displayed for a given entry (bar) in the Project Data-based Date report is calculated differently, and includes each document in each family (messages and attachments). Therefore, a drill-through of an entry in the Date report will have the same count as the original entry.
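
The following Python sketch illustrates why a drill-through from an Email Sent Date or Email Received Date bar can return more documents than the bar itself shows. The family records and function names are hypothetical and only model the counting rule described in the note.

```python
# Hypothetical families: one parent message plus its attachments.
families = [
    {"sent_year": 2008, "parent": "msg-1", "attachments": ["a.pdf", "b.xlsx"]},
    {"sent_year": 2008, "parent": "msg-2", "attachments": []},
    {"sent_year": 2009, "parent": "msg-3", "attachments": ["c.docx"]},
]

def bar_count(families, year):
    """Count shown on a bar: one document (the parent) per family."""
    return sum(1 for f in families if f["sent_year"] == year)

def drill_through_count(families, year):
    """Count returned by the drill-through: parents plus their attachments."""
    return sum(1 + len(f["attachments"]) for f in families if f["sent_year"] == year)

print(bar_count(families, 2008))            # families sent in 2008
print(drill_through_count(families, 2008))  # all documents in those families
```

Here the 2008 bar would show 2 (two parent messages), while the drill-through search would return 4 documents (two messages and two attachments).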

Email Sent Date and Email Received Date Options

Both the Email Sent Date and Email Received Date reports provide a number of options that enable you to work with start and end ranges. You can either use the start and end dates in effect when you initially view the report (which are derived from the earliest and latest dates for the view), or you can specify your own start and end dates.

On the left:

  • Start – Enables you to type a start date in the box or click the Calendar icon to use a calendar to specify zero-day email sent or email received date criteria.
  • End – Enables you to type an end date in the box or click the Calendar icon to use a calendar to specify zero-day email sent or email received date criteria.
    1. To use the Calendar, click the icon to the right of the Start or End box.
    2. Click the date you want, either the current date or the date you typed (highlighted for you), or select another day in the current month.
    3. Click the left or right arrow in the top corners on either side of the month name to move back or forward a month.
    4. Click the month and year in the center, and then use the arrows to go back or forward a year. You can also select another month in the year shown.
    5. Once you make a complete date selection, the Calendar closes and you see the date formatted properly in the box.
  • – Click this button to have the report reset to the originally displayed date range (the default start and end date for the report).

On the right:

  • Previous Period – Moves the histogram information to the previous period. The period of time is dictated by your Start and End zero-day dates.
  • Next Period – Moves the histogram information to the next period. The period of time is dictated by your Start and End dates.
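
As an illustration of how a period can be derived from the Start and End dates, the following Python sketch shifts a date window backward or forward by its own length. This is an assumption about the behavior, not a description of the product's implementation, and the function name shift_period is hypothetical.

```python
from datetime import date, timedelta

def shift_period(start, end, direction):
    """Shift an inclusive [start, end] date window by its own length.

    direction: -1 for Previous Period, +1 for Next Period.
    Assumes the period length is the number of days from start through end.
    """
    length = (end - start) + timedelta(days=1)
    return start + direction * length, end + direction * length

start, end = date(2008, 1, 1), date(2008, 1, 31)
prev_start, prev_end = shift_period(start, end, -1)
next_start, next_end = shift_period(start, end, +1)
print(prev_start, prev_end)
print(next_start, next_end)
```

With a 31-day window starting on 2008-01-01, the previous period in this sketch covers 2007-12-01 through 2007-12-31, and the next period covers 2008-02-01 through 2008-03-02.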

Chart Download, Zero Day Details, and Details Options

If you hover over a date block in the Email Sent Date or Email Received Date histogram, you will get a summary of the information for the block. For example, the hover text for the bar of a 2008 entry in the Email Sent Date histogram might display 2008: 113. A tooltip tells you that you can click to drill down or double-click to drill through. If you click once on a sent or received date block to drill down, you can get more date information for that sent or received date block. If you double-click to drill through an item in the report, the software performs a drill-through search. You can perform the drill-through search at any drill-down level. Also, from either the Email Sent Date or Email Received Date histogram, you can click the following:

  • – Enables you to download all available information to a CSV. The CSV always contains all available data (the data initially shown in the chart based on the earliest and latest dates found in the view). The CSV content does not change based on your selected start date or end date.
  • – Displays a Zero Day Details popup that enables you to select and view zero-day dates (dates for which no document was processed). You have the option to hide weekend days.
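
To make the idea of zero-day dates concrete, the following Python sketch lists the dates in a range that have no documents, with an option to leave out weekend days. The per-day counts and the function name zero_days are hypothetical; the sketch only mirrors the behavior described for the Zero Day Details popup.

```python
from datetime import date, timedelta

def zero_days(counts_by_day, start, end, hide_weekends=False):
    """Return the dates in [start, end] that have no documents."""
    result = []
    day = start
    while day <= end:
        is_weekend = day.weekday() >= 5  # 5 = Saturday, 6 = Sunday
        has_no_docs = counts_by_day.get(day, 0) == 0
        if has_no_docs and not (hide_weekends and is_weekend):
            result.append(day)
        day += timedelta(days=1)
    return result

# Hypothetical document counts per day for the first week of March 2008.
counts = {date(2008, 3, 3): 12, date(2008, 3, 5): 4}
all_gaps = zero_days(counts, date(2008, 3, 1), date(2008, 3, 7))
weekday_gaps = zero_days(counts, date(2008, 3, 1), date(2008, 3, 7), hide_weekends=True)
```

In this example, March 1 and March 2, 2008 fall on a weekend, so they appear in all_gaps but are left out of weekday_gaps.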

Date Report

The Date report enables you to see the volume of files associated with a range of dates. This Date Report is based on the dateprimary field, which populates the Date column of document lists derived from Data and accommodates the date information for different types of source files (such as dates from documents on disk or email dates).

Note: The Date report document counts are calculated differently than the Email Sent Date and Email Received Date reports. The Date Report counts include each document in each family, while the Email Sent and Received counts include one document per family.

This report can help you make decisions about documents or emails that need to be reviewed more carefully based on a date and how much data was involved (for example, you may focus on a large volume of documents modified this year).

Date Options

The Date report provides a number of options that enable you to work with start and end ranges. You can either use the start and end dates in effect when you initially view the report (which are derived from the earliest and latest dates for the view), or you can specify your own start and end dates.

On the left:

  • Start – Enables you to type a start date in the box or click the Calendar icon to use a calendar to specify zero-day date criteria.
  • End – Enables you to type an end date in the box or click the Calendar icon to use a calendar to specify zero-day date criteria.
    1. To use the Calendar, click the icon to the right of the Start or End box.
    2. Click the date you want, either the current date or the date you typed (highlighted for you), or select another day in the current month.
    3. Click the left or right arrow in the top corners on either side of the month name to move back or forward a month.
    4. Click the month and year in the center, and then use the arrows to go back or forward a year. You can also select another month in the year shown.
    5. Once you make a complete date selection, the Calendar closes and you see the date formatted properly in the box.
  • – Click this button to have the Date report reset to the originally displayed date range (the default start and end date for the report).

On the right:

  • Previous Period – Moves the histogram information to the previous period. The period of time is dictated by your Start and End zero-day dates.
  • Next Period – Moves the histogram information to the next period. The period of time is dictated by your Start and End dates.

Chart Download, Zero Day Details, and Details Options

If you hover over a date block in the chart, you will get a summary of the information for the block. For example, the hover text for the bar of a 2007 entry in the Date Report might display 2007: 750. A tooltip tells you that you can click to drill down or double-click to drill through. If you click once on a date block to drill down, you can get more date information for that date block. If you double-click to drill through an item in the report, the software performs a drill-through search. You can perform the drill-through search at any drill-down level. Also, from the bottom of the Date report, you can click the following:

  • – Enables you to download all available Date information to a CSV. The CSV always contains all available data (the data initially shown in the chart based on the earliest and latest dates found in the view). The CSV content does not change based on your selected start date or end date.
  • – Displays a Zero Day Details popup that enables you to select and view zero-day dates (dates for which no document was processed). You have the option to hide weekend days.

Email Addresses Sent and Email Addresses Received

The Email Addresses Sent and Received reports enable you to see the top 10 email addresses associated with sent and received email. This can help you make decisions about emails that need to be reviewed more carefully based on the email addresses.

An Email Address Sent entry is based on the from and sender metadata fields.

An Email Address Received entry is based on the to, cc, and bcc metadata fields.
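
The field mapping above can be sketched in Python as follows. The sample messages and the function name top_addresses are hypothetical. Counting each address once per message and comparing addresses in lowercase are assumptions made for this sketch, not confirmed product behavior.

```python
from collections import Counter

# Hypothetical messages; the keys mirror the metadata fields named above.
messages = [
    {"from": "ann@example.com", "sender": "",
     "to": ["bob@example.com"], "cc": ["cy@example.com"], "bcc": []},
    {"from": "ann@example.com", "sender": "assistant@example.com",
     "to": ["bob@example.com", "cy@example.com"], "cc": [], "bcc": ["dee@example.com"]},
    {"from": "bob@example.com", "sender": "",
     "to": ["ann@example.com"], "cc": [], "bcc": []},
]

def top_addresses(messages, fields, limit=10):
    """Count addresses found in the given fields and return the top entries."""
    counts = Counter()
    for msg in messages:
        addresses = set()  # count each address once per message (assumption)
        for field in fields:
            value = msg.get(field) or []
            if isinstance(value, str):
                value = [value]
            addresses.update(a.lower() for a in value if a)
        counts.update(addresses)
    return counts.most_common(limit)

top_sent = top_addresses(messages, ["from", "sender"])
top_received = top_addresses(messages, ["to", "cc", "bcc"])
```

In this sample, ann@example.com leads the sent list with two messages, while bob@example.com and cy@example.com each appear twice in the received list.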

From either the Email Addresses Sent or Email Addresses Received report, you can click the following:

  • – Enables you to download all available Email Address Sent or Received information to an XLSX file. The download file always contains all available data. By default, the file is called DigitalReefReport.xlsx, but you can select the appropriate name when you save the file.
  • – Displays Details with more information about the email addresses.