View and Manage Imports

Imports > Summary

Requires Imports - View Permissions to view Imports, Imports - Add/Edit Permissions to perform an Import and Import-related operations, and Imports - Delete Permissions to delete an Import.

For users with the appropriate permissions, the Imports Summary tab provides information about all of the source data added to the Project (each imported Data Set, each Data Set added from a Load File, or each Shared Data Set).

Note: Digital Reef now restricts import and reprocessing of data to Projects using Parsing Library V2. You can no longer import, reindex, or reprocess data in a Parsing Library V1 Project.

When you add source data to your Project, you have several options. You can create a new Data Set, create a new Data Set from a Load File, or add a Shared Data Set from another Project in the Organization. To select how you want to add source data, you can use the New Data Set... option or its accompanying drop-down with the additional options from the Imports Summary tab, or you can make a selection from the Imports menu of options in the tree (available by either right-clicking Imports or by clicking the ellipses to the right of Imports). The following summarizes the options that enable you to add source data:

  • New Data Set... — Use this option to create a Data Set from one or more source locations associated with a selected Connector. Once you select a Connector, you can browse its contents to select one or more source locations (Data Areas) for import as a named Data Set at a selected Index Level (by default, an Analytic Index). You can also verify the current Project Index Settings and Pattern Settings that would apply to the import.
  • New Data Set from Load File... — Use this option to create a Data Set based on the contents of a Load File. Once you select the appropriate Connector, you can browse to select the Load File you want to import (for example, a character-delimited file such as a Concordance DAT file, or an EDRM XML load file) and select the corresponding Load File Type. You can select a Project-level Load File Import template and review it to ensure that it contains the appropriate field information and mapping for the Load File. For the Load File import, you can also verify the Index level, current Project Index Settings, and Pattern Settings that would apply to the import.
  • New Data Set for Short Message Format... — Use this option to create a Data Set for Short Message Format (currently Cellebrite) from one or more source locations associated with a selected Connector. Once you select a Connector, you can browse its contents to select one or more source locations (Data Areas) for import as a named Data Set for Short Message Format at a selected Index Level (by default, an Analytic Index). You can also verify the current Project Index Settings and Pattern Settings that would apply to the import.
  • Add Shared Data Set... — Use this option to add an existing Data Set (not Load File) that has been made available for sharing in the Organization by an originating Project. You cannot edit the Index level or Shared state for this type of Data Set in the originating Project. You can only add this type of Data Set once.

Note: Once you have added some data under Imports, and at least one Data Set is at the Analytic Index level, you can also use the Imports options Calculate Word List and then View Word List. You can select any index level for each Data Set under Imports, but be aware that mixing levels of processing within Imports (for example, Data Sets at a File or System Metadata Index level alongside Data Sets at a Content or Analytic Index level) affects searches across all of Imports. Metadata searches (for example, field searches, or a search run with the Include Metadata checkbox enabled) return results that meet the metadata search query. A search that includes a content (keyword) query and is run without the Include Metadata option returns an error message indicating that a Content Index configuration file is not present.
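The mixed-level search behavior described in the preceding note can be sketched as a small decision function. This Python sketch is illustrative only: the IndexLevel enum, is_mixed, and mixed_search_outcome are hypothetical names, not part of any Digital Reef API, and the sketch covers only the mixed-level case described in the note.

```python
from enum import Enum


class IndexLevel(Enum):
    """Index levels named in the note (hypothetical enum for illustration)."""
    SYSTEM_METADATA = "System Metadata Index"
    FILE_METADATA = "File Metadata Index"
    CONTENT = "Content Index"
    ANALYTIC = "Analytic Index"


# Levels that include a content index, per the note.
CONTENT_CAPABLE = {IndexLevel.CONTENT, IndexLevel.ANALYTIC}


def is_mixed(levels):
    """True if Imports holds both metadata-only and content-capable Data Sets."""
    has_content = any(level in CONTENT_CAPABLE for level in levels)
    has_metadata_only = any(level not in CONTENT_CAPABLE for level in levels)
    return has_content and has_metadata_only


def mixed_search_outcome(has_content_query, include_metadata):
    """Expected outcome of searching all of Imports when levels are mixed.

    has_content_query: the search contains a content (keyword) query.
    include_metadata: state of the Include Metadata checkbox.
    """
    if has_content_query and not include_metadata:
        return "error: Content Index configuration file is not present"
    return "results matching the metadata query"


levels = [IndexLevel.FILE_METADATA, IndexLevel.ANALYTIC]
if is_mixed(levels):
    print(mixed_search_outcome(has_content_query=True, include_metadata=False))
```

In practice, keeping all Data Sets under Imports at a Content or Analytic Index level avoids the mixed case that this sketch models.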

The Imports summary provides a Refresh option and supports paging for viewing information for a large number of Data Sets (see the Page Controls section for more information).

Imports Summary

This Imports summary provides an entry for each Data Set or Load File Import with the following information:

  • Data Set Name — The name assigned to the Data Set when it was added. You can click a given Data Set name (shown in blue to indicate it is a hyperlink) to open that Data Set view. You can also double-click in the row to open that Data Set view.
  • Index Level — The current indexing state of the Data Set (for example, In Progress, Analytic Index, or Do Not Index). If you cancel a Data Set Import or an import fails, you will see an indexing state of Do Not Index, which reflects that no indexing was performed.
  • Date Created — The date (full timestamp) on which the Data Set was added (for example, 2018-09-05 10:13:13).
  • In Project Data — Displays a checkmark if one or more documents in the Data Set are present in Project Data (for example, the entire Data Set was added to Project Data, or documents from results of a Data Set search or from the Data Set doc list were saved to Project Data). In general, you can identify Adding documents to Project Data tasks in the Work Basket. The state of this checkbox does not change in response to Remove from Project Data operations when you are in a Project Data view. The state of this checkbox will change if you explicitly do a Remove from Project Data for a selected Data Set under Imports (either by right-clicking on the Data Set in the tree, or by selecting a Data Set from the Imports Summary).
  • Connector Name — The name of the Connector selected to provide the Data Set. Note that when a Copy to Document Storage operation successfully copies everything in the Data Set, you will no longer see the Connector or Data Area information displayed for the Data Set in the Imports Summary, since the Data Area import location is no longer associated with that Data Set. If the operation was a partial copy with exclusions, or files that failed to copy, you will still see the Connector and Data Area information.
  • Data Areas — One or more Data Areas selected for the Data Set at import. The list is comma-separated. Ellipses (...) indicate when there is more information to view. You can hover over the text in this column to see all of the Data Area information.
  • Sharing — The current Sharing state for a Data Set (not an imported Load File):
    • Private — In the Project owning the Data Set, indicates that Sharing is not enabled for a given Data Set. This is the default state for a Data Set created in a Project.
    • Shared — Indicates that a Data Set is public and available for sharing in the Organization. Sharing is enabled for the Data Set (as set using the Share option). In general, processing operations such as reindexing, reprocessing, OCR, and updating metadata (from Data Set or Imports results) are not permitted once the Data Set is Shared, either in the originating Project or a Project using the Shared Data Set. However, you are permitted to update the Patterns in the originating Project for a Shared Data Set.
  • Description — A description assigned to the Data Set.
  • Batch — The Batch name/value for the Data Set, as it is stored in the index and reflected in the metadata. If you did not specify a Batch name/value as part of import, the name you assign to the Data Set is used by default.
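As a rough illustration of two of the column rules above (the default Batch value, and the Connector and Data Area details that are no longer shown after a complete Copy to Document Storage), the following Python sketch models one summary entry. The ImportSummaryRow class and its field names are hypothetical and do not correspond to any product API. The ", " separator is also an assumption, since the text states only that the list is comma-separated.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ImportSummaryRow:
    """Hypothetical model of one Imports Summary entry (not a product API)."""
    name: str
    index_level: str
    connector: Optional[str] = None
    data_areas: List[str] = field(default_factory=list)
    batch: Optional[str] = None
    fully_copied: bool = False  # True after everything was copied to Document Storage

    def batch_value(self) -> str:
        # If no Batch name/value was given at import, the Data Set name is used.
        return self.batch if self.batch else self.name

    def connector_text(self) -> str:
        # Connector details are no longer displayed after a complete copy.
        return "" if self.fully_copied or not self.connector else self.connector

    def data_areas_text(self) -> str:
        # Data Areas are shown as a comma-separated list unless fully copied.
        return "" if self.fully_copied else ", ".join(self.data_areas)


row = ImportSummaryRow("Custodian A", "Analytic Index", "NAS1", ["/mail", "/docs"])
print(row.batch_value())       # Custodian A
print(row.data_areas_text())   # /mail, /docs
```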

Selected Data Set Options

You can see a menu of available options for a selected Data Set in the Imports Summary by either right-clicking the Data Set or by clicking the ellipses at the far right of the Data Set entry in the table (to the right of the Batch column). These options are also available when you right-click a selected Data Set in the Navigation Tree:

  • Add Tags... — Launches the Tag dialog, from which you can select Tags to apply. You can also use the Tag dialog to Create a Tag and use it right away.
  • Add To... — If you have the appropriate permissions, you can either add documents from a Data Set view to all of Project Data, or to a specific view in Project Data (a selected Custodian, MediaID, Batch, or Folder view based on permissions). For more information, see Add to or Remove Documents from Select Project Data Views. Note that you can only add data at the Analytic Index or Content Index level to Project Data. You cannot add data at the System or File Metadata Index level to Project Data.
  • Add To Project Data — If the selected Data Set is at the Content or Analytic Index level, enables you to add the new Data Set to Project Data automatically. By default, this option is cleared. You cannot add data at the System or File Metadata Index level to Project Data. Note that when you perform an import at the Analytic Index level with this option (or select Add to Project Data as a right-click option for the Data Set after import), the software performs Custodian, MediaID, and Batch view generation for all documents in Project Data, not just the documents to be added with this Data Set batch.
  • Remove from Project Data — Removes all Data Set documents that are in Project Data, if applicable. The documents will still reside in the Data Set, just not in Project Data.
  • Find Exact Duplicates — Searches for documents in the Data Set that have exactly the same content and metadata as the selected document. An exact duplicate would have the same file MD5 value.
  • Find Content Duplicates — Searches for documents in the Data Set that have the same content as the selected document.
  • Calculate Word List — Calculates the Word List for all documents in the Data Set. A task appears in the Work Basket while the Word List is being generated. When the task completes, you can view the Word List.
  • View Word List... — Launches the Word List dialog and enables you to view the calculated Word List for all documents in the Data Set.
  • Create Manifest... — For a Data Set selected from the Imports Summary, launches the Create Manifest dialog, from which you can generate a CSV or XML manifest for a Data Set, using either the current fields or all fields, or fields from a template. From the Work Basket task for the manifest generation, you can then right-click and select Download to download the file to a destination local to your computer. Users with Server Access permissions can also save the manifest to a server location. For download of a large manifest file (over 200 MB), the software places the manifest in a ZIP file, which you can then unzip. Note that this process can take time. (You can also issue Create Manifest for eligible tasks in the Work Basket, including the Create representation task for a Data Set and search result views with results.)
  • Download All as PDFs — Enables you to download all documents in the view as PDFs to your local environment so that you can view the documents in PDF format. When you select this operation, you can select the Stamp Document Number option if you want to include a stamp with the document number (docnum) on the bottom right of each page in the PDF. Note that this operation will also show a Warning popup that states the following: You are attempting to download all documents in this list as PDFs. Depending on the size of the documents, this could take considerable time and/or render the browser unresponsive. Consider creating a new export stream to produce the PDFs directly to an export location instead. At this point, you must either confirm the operation by clicking Continue, or click Cancel instead. If you proceed, the software will prepare a ZIP file, by default named <projectname>_PDFs.zip. An information popup indicates that the PDFs are being prepared for downloading, and once finished, the archive (ZIP) can be downloaded from the Work Basket. Note that certain file types are ignored for PDF generation, including any selected directory folders not removed from your Project during setup by your administrator, disk images, file archives, mail archives, empty files, and files for which the native is not available. A WARNING_DETAILS_REPORT.csv file identifying the files that were skipped or failed PDF generation can be downloaded from the appropriate PDF-related Work Basket task. See About Downloading Documents as PDFs and Natives for more information.
  • Download All Natives — Enables you to download all documents in the view to your local environment so that you can view the documents in their native format. You will see a Warning popup that states the following: You are attempting to download all natives in this list. Depending on the size of the documents, this could take considerable time and/or render the browser unresponsive. Consider creating a new export stream to produce the natives directly to an export location instead. At this point, you must either confirm the operation by clicking Continue, or click Cancel instead. If you proceed, the software will prepare a ZIP file, by default named <projectname>_Documents.zip. An information popup indicates that the documents are being prepared for downloading, and once finished, the archive (ZIP) can be downloaded from the Work Basket. Note that any directory folders are ignored for the download. A WARNING_DETAILS_REPORT.csv file identifies any native files that were not downloaded. See About Downloading Documents as PDFs and Natives for more information.
  • View Configuration... — For a selected Data Set, displays the set of index settings that were in effect when the Data Set was processed. See View Configuration for more information.
  • Update Patterns — For a Private Data Set or Shared Data Set in the originating Project that is at the Content or Analytic Index level, this option submits a request to update the Patterns for the Data Set (that is, Patterns that have been modified since import). You cannot update Patterns for a Data Set at index levels other than Content or Analytic (for example, File Metadata or System Metadata). This option is grayed out and not available for a Shared Data Set in a sub-Project (that is, a Project using the Shared Data Set), but it is available for a Shared Data Set in the originating Project. If you update Patterns in the originating Project for a Shared Data Set, all sub-Projects using the Shared Data Set will receive the Pattern changes for that particular Data Set. Sub-Projects will be blocked while the Pattern update is in progress. Enabled Pattern matches will be stored in the pattern metadata field; Pattern matches for enabled Patterns with Store Value enabled will be stored in the patternvalue field. This option will act on the text representations that already exist from initial import, reprocessing, or OCR. If you see a Warning icon for this operation in the Work Basket, one or more documents did not have their Patterns updated. You can then click the icon to request a download or use the right-click Download option for the completed Work Basket task to download a WARNING_DETAILS_REPORT.csv file that identifies the reason why the Patterns were not updated for certain files. (See How to Update Patterns for more information about the exceptions that apply to this operation.)
  • Reindex... — Submits a request to reindex a Private Data Set at an available level (for example, to go from Do Not Index to a selected level for a failed or canceled Import, or to go from Content to Analytic Index level). If the Data Set is already at an Analytic Index level, this option will be grayed out and unavailable. You cannot reindex a Shared Data Set, even in the originating Project. The Index Level for the Data Set, shown in the Data Sets Summary, will change to In Progress if the operation is permitted. This option is grayed out and unavailable for a Shared Data Set in a sub-Project (that is, a Project using the Shared Data Set).
  • Share | Unshare — For a Data Set eligible for sharing in the Organization, makes the Data Set Public (Shared). A Data Set is initially Private. Note that imported Load Files cannot be shared. If you want to share a Private Data Set with other Projects in your Organization, make sure that you have the Index Level that you want, and have performed any OCR or other reprocessing before you share the Data Set. (Reindexing, OCR, and reprocessing are unavailable once the Data Set is Shared.) When you are satisfied with the Index Level and processing of the Data Set, select the Share option for the Data Set. A popup confirms that the named Data Set has been shared. Note that once a Data Set is Shared, you will get an error if you try to change the Shared state by clicking Unshare while the Data Set is in use by any other Project. Only originating Projects can unshare the Shared Data Set, and only when it is not in use by other Projects. Projects that add the Shared Data Set (that is, they are a sub-Project but not the originating Project) cannot use either the Share or Unshare options for that Data Set; these options will be grayed out and unavailable.
  • Copy to Document Storage... (available only for Data Sets that have not already been copied or were partially copied) — For a Data Set created in your Project, this option enables you to review the Copy to Document Storage Exclusion Options and then copy the source files from an imported Data Set's import location to the Organization’s designated Document Storage. (A System Administrator manages the Document Storage for an Organization using the Admin interface.) This helps free the storage associated with the import location. Some document metadata fields (dahandle and darelativepath) will be updated to reflect the new location of the documents after the copy. If this option appears grayed out for a Data Set, everything in the Data Set has already been copied to Document Storage. When you perform this operation, a copy task (Copy to Document Storage <Data Set Name>) appears in the Work Basket. If necessary, you can cancel this task from the Work Basket. If you see a Warning icon in a completed Copy to Document Storage task in the Work Basket, one or more files were either excluded from the copy or failed to copy. You can then use the right-click Download option for the Work Basket task to download a WARNING_DETAILS_REPORT.csv file that identifies the reason why each file was not copied. (See How to Perform a Copy to Document Storage for more information about the exclusions or errors that apply to the copy operation.) Since the Data Set is considered partially copied, you can then select the Copy to Document Storage option from the Imports Summary again to retry the copy and potentially copy more of the files that were previously excluded or that failed to copy. You can also perform a copy of the Data Set source files to Document Storage as part of a given import, although doing so will impact your import time.
    This option does not apply to a Load File and is grayed out and not available for a Shared Data Set in a sub-Project (that is, a Project using the Shared Data Set). Note that if the originating Project performs this operation with a copy of all files, you will no longer see the Connector or Data Area information displayed for the Data Set in the Imports Summary of a sub-Project.

Note: The Copy to Document Storage operation has the ability to preserve the staging used by certain file types (as long as their associated document classes are not excluded by the operation), such as Forensic Image file types, multi-part RAR files, and Bloomberg Message Dump files. For more information about Copy to Document Storage, see How to Perform a Copy to Document Storage Operation.
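The availability rules stated above for Update Patterns, Reindex, Share | Unshare, and Copy to Document Storage can be summarized as a small set of checks. The following Python sketch is illustrative only: the DataSetState class, its field names, and the can_* functions are hypothetical and are not part of any Digital Reef API. It encodes only the conditions stated in this topic. Other conditions, such as user permissions or a Shared Data Set being in use by another Project at Unshare time, are not modeled.

```python
from dataclasses import dataclass

# Index levels at which Patterns can be updated, per this topic.
CONTENT_LEVELS = {"Content Index", "Analytic Index"}


@dataclass
class DataSetState:
    """Hypothetical snapshot of a Data Set as seen from one Project."""
    index_level: str
    shared: bool = False          # Sharing state is Shared
    in_sub_project: bool = False  # this Project added the Shared Data Set
    from_load_file: bool = False  # created from an imported Load File
    fully_copied: bool = False    # everything already in Document Storage


def can_update_patterns(ds: DataSetState) -> bool:
    # Private or Shared in the originating Project, Content or Analytic only.
    return ds.index_level in CONTENT_LEVELS and not ds.in_sub_project


def can_reindex(ds: DataSetState) -> bool:
    # Private Data Sets only, and not when already at the Analytic Index level.
    return (not ds.shared and not ds.in_sub_project
            and ds.index_level != "Analytic Index")


def can_share_or_unshare(ds: DataSetState) -> bool:
    # Load Files cannot be shared; sub-Projects cannot Share or Unshare.
    return not ds.from_load_file and not ds.in_sub_project


def can_copy_to_document_storage(ds: DataSetState) -> bool:
    # Not for Load Files, sub-Projects, or fully copied Data Sets.
    return (not ds.from_load_file and not ds.in_sub_project
            and not ds.fully_copied)


shared_origin = DataSetState("Content Index", shared=True)
print(can_update_patterns(shared_origin))  # True
print(can_reindex(shared_origin))          # False
```

A table of booleans like this is one way to reason about why a menu option appears grayed out for a given Data Set.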

  • Edit... — For users with the appropriate Imports - Add/Edit permissions, launches the Edit Data Set dialog to permit editing of the name or description of the Data Set. This option is grayed out and unavailable for a Shared Data Set in a sub-Project (that is, a Project using the Shared Data Set). Note that if the originating Project edits the name of a Shared Data Set, the updated name will appear in a sub-Project.
  • Delete — For users with the appropriate Imports - Delete permissions, initiates a request to delete a Data Set and its documents. You can also use this option to delete a Data Set based on an imported Load File or detach from a Data Set that you have added to your Project from the Organization. When you select this option, confirmation is required; select OK to proceed or Cancel to cancel the operation. Proceeding with the deletion generates a Work Basket task that you can monitor for progress of the deletion. Your ability to delete a Data Set from a Project is based on whether the Data Set is currently in use within the Project (for example, in Project Data, or a Folder). If the Data Set is not in use, the operation deletes documents and references to the selected Data Set from the Project. For a Data Set owned by the Project (Private), the operation also deletes the Data Set from the Organization and frees all associated resources.

Note: If you are in the originating Project that Shared a Data Set and you Delete that Data Set, the Data Set will be removed from the Project, but any other attached Projects will still be able to use the Data Set. This is because the Organization level technically owns the Shared Data Set and internally provides an Organization default "home" for the Shared Data Set. If you are in a Project that has added a Shared Data Set, you can use the Delete option to detach the Shared Data Set instance from the Project (and the Project will no longer appear in the attached Project list under Organization > Shared Data Sets). The detached Shared Data Set can be added to that Project again, or added to another Project.
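The Find Exact Duplicates option described earlier in this section notes that exact duplicates share the same file MD5 value. The following Python sketch shows the general idea of grouping files by MD5 digest using the standard hashlib module. It is a generic illustration, not the product's implementation: the function names file_md5 and group_exact_duplicates are hypothetical, and the sketch compares file bytes only.

```python
import hashlib
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_exact_duplicates(paths):
    """Group file paths whose MD5 digests match; drop files with no match."""
    groups = defaultdict(list)
    for path in paths:
        groups[file_md5(path)].append(str(path))
    return [group for group in groups.values() if len(group) > 1]
```

For example, calling group_exact_duplicates with three paths, two of which hold identical bytes, returns one group containing those two paths.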

Page Controls

For multi-page lists, you can select a page to display. By default, a given page will display 100 items.

The paging area shows a range and enables you to enter the page number in the box or use the appropriate arrows to navigate.
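As a simple illustration of how a page number maps to the displayed range, the following Python sketch computes the number of pages and the item range for a page, assuming the default of 100 items per page. The function names page_count and page_range are hypothetical. How the interface handles an empty list or an out-of-range page number is not stated here, so the sketch's choices for those cases are assumptions.

```python
import math

PAGE_SIZE = 100  # default number of items per page, as stated above


def page_count(total_items, page_size=PAGE_SIZE):
    """Number of pages needed to list total_items (at least one page)."""
    return max(1, math.ceil(total_items / page_size))


def page_range(page, total_items, page_size=PAGE_SIZE):
    """Return the 1-based (first, last) item numbers shown on a page."""
    if total_items == 0:
        return (0, 0)  # assumption: an empty list shows an empty range
    if not 1 <= page <= page_count(total_items, page_size):
        raise ValueError("page number is out of range")
    first = (page - 1) * page_size + 1
    last = min(page * page_size, total_items)
    return (first, last)


print(page_count(250))     # 3
print(page_range(3, 250))  # (201, 250)
```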

You can perform an immediate refresh by clicking the Refresh icon on the Page Control bar.