Add Data from a Load File

Imports > New Data Set from Load File

Requires Imports - Add/Edit Permissions

Note: Digital Reef now restricts import and reprocessing of data to Projects using Parsing Library V2. You can no longer import, reindex, or reprocess data in a Parsing Library V1 Project.

Users in a role with the appropriate permissions can import data from a load file. This process enables you to use a selected Connector (such as NFS or CIFS, Microsoft Exchange, or SharePoint) to import a supported load file and build an index representation based on the information in the load file.

Supported Load Files include the following:

  • A Concordance DAT file (character-delimited)
  • A CSV file (character-delimited)
  • An XML file (assumed to be EDRM-compliant)

Note: An imported Load File cannot be Shared across Projects in the Organization.

Character-Delimited File Usage Notes

Prior to importing a character-delimited load file (DAT or CSV), you must define Load File Import Settings as part of your Project Settings to serve as a Project Load File Import template for the Load File type. You can also define Load File Import information in an Organization-level Load File Import template.

The Project Load File Import template you configure contains all the information needed to accommodate the format of the Load File type and its field mappings. You must supply Load File Field information in the Project-level Load File Import template screen.

Note: For a successful Load File import, you must supply the correct information. For example, you must specify the appropriate encoding type and delimiter for a DAT or CSV file, both of which are required to parse data from the Load File. You must also provide the required mapping entries based on your Load File. For example, if the Preserve Families option stays enabled to preserve family information, you must supply the required mapping entries in the Load File Fields. Moreover, if you want to process Load File Images, you must also select the Process Load File Images option.
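To illustrate why the encoding and delimiter matter, here is a minimal Python sketch of how a character-delimited load file might be read outside the application. It is not how Digital Reef parses load files. The helper name read_load_file is hypothetical, and the delimiter (ASCII 20) and text qualifier (the thorn character, þ) are assumed Concordance-style defaults; substitute the values from your own Load File Import template.

```python
import csv
import io

# Assumption: Concordance-style defaults. Use the values from your own template.
FIELD_DELIMITER = "\x14"  # ASCII 20
TEXT_QUALIFIER = "\xfe"   # the thorn character

def read_load_file(raw, encoding, delimiter=FIELD_DELIMITER, qualifier=TEXT_QUALIFIER):
    """Decode load file bytes and return (header, rows). Hypothetical helper."""
    text = raw.decode(encoding)
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=delimiter, quotechar=qualifier)
    rows = list(reader)
    if not rows:
        return [], []
    return rows[0], rows[1:]

# Build a two-line sample: one header line and one document line.
q, d = TEXT_QUALIFIER, FIELD_DELIMITER
sample = (
    f"{q}DocID{q}{d}{q}Custodian{q}\n"
    f"{q}DOC0001{q}{d}{q}Smith, Jane{q}\n"
).encode("utf-8")

header, rows = read_load_file(sample, "utf-8")
print(header)
print(rows)
```

If the wrong delimiter or qualifier is supplied, the header comes back as a single field instead of two, which is the kind of mismatch the template settings are meant to prevent.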

For information on adding and managing a Project-level Load File template, see Manage Load File Import in a Project-level Template. For information on adding and managing an Organization-level Load File Import Settings template, see Add and Manage Load File Import Settings in a Template.

When you are ready to perform the import, you must select a Connector and select the load file at the volume level. Then be sure to select the correct Load File Import template based on the mapping and settings you need.

EDRM XML Usage Notes

  • You must use an NFS or CIFS Connector for an EDRM XML Load File.
  • When selecting the EDRM XML file location associated with the Connector, you can select one Data Area only.
  • You can browse for the Data Area. The Data Area you select must contain the load file and the data, as specified in the EDRM XML load file. Be sure to select the right location that contains the data and the EDRM load file (for example, you might have a load file directory called XML_load_example, which has folders with the data under it and the load file itself, for example, XMLLoadfile1.xml).
  • You must make sure that the correct load file name is specified.
  • You cannot perform an index update (reindex) of a data set based on an EDRM XML load file.
  • The system does not process any relationship information provided in the EDRM XML load file.
  • The EDRM XML import process uses the native file if it is provided. If no native file type is provided, the process uses the image and text data.
  • The EDRM XML import process supports a multi-page TIFF document. A combined image is created using the naming convention docid-combined-img.tif. If the TIFF images have any corresponding text, that text is combined as well, and associated with the ocrpath metadata field of the document.
  • The ocrstatus metadata field for the document will be marked as EXTOCR if OCR processing took place externally, outside the application. If the document has an ocrstatus of EXTOCR, it is not subject to OCR processing again when the load file is added.
  • To see a sample mapping of EDRM fields to Digital Reef fields, see About EDRM XML Load File Content.

How to Add a New Data Set from Load File

The New Data Set from Load File screen is divided into several areas that enable you to set up an import of a Load File. You do not have to follow any particular order (that is, it does not matter if you name the Load File before selecting the Connector).

Select a Connector

From a list of available Connectors, select a Connector by clicking the entry for the Connector in the table. Each Connector is shown with the following information:

  • Connector Name — The name assigned to the Connector.
  • Description — The description for that Connector.
  • Type — The type of Connector (for example, CIFS, NFS, Exchange, or SharePoint). For details on the settings used to create a Connector, see Create a Connector.
  • Mode — The Connector mode, either Read or Read/Write. For import, either is valid. The Connector mode is determined when a Connector is created.
  • Server — The IP address or URL associated with the Connector.
  • Path — The mount point for the Connector, which determines what you see in the Folder area. (You should see what is available from the mount point on down.)

Navigate to Select Your Load File

As soon as you select a Connector, the appropriate Folder (Path) information appears below the Connector list. In this area, you can browse the Connector contents to view the available folders and load files.

At the appropriate location (for example, the Volume level for a DAT file), click to select the appropriate Load File, which populates the Path field. You can edit the Path field, if necessary.

Assign a Name to the Load File Import and Optional Description

In this area, you supply a name and optional description for the Load File import.

  • Name - You must assign a name that will be used to represent the Load File. The name must be unique within the Organization and can include alphanumeric characters, spaces between characters in the name (leading and trailing spaces are ignored), and some supported characters (such as a hyphen, underscore, and apostrophe), as well as characters from some foreign languages (for example, Korean characters). However, the following characters are not supported for Data Set names:

! " # $ % & * + . / : ; < = > ? @ [ \ ] ^ { | } ~ “ ”

Note: These character restrictions apply to most tree items, such as Imports, Exports, Tags, Folders, Saved Searches, Workflows, Comparisons, Samples, and Synthetic Documents. To support auto-discovery of Custodians based on staging, a Custodian name has fewer restrictions regarding invalid characters.

  • Description - Optionally assign a helpful description of this Load File.
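The character rules for the Name field can be checked before you open the screen. The following Python sketch tests a candidate name against the unsupported characters listed above. The function names are hypothetical, and treating an empty name as invalid is an assumption based on the Name field being required. The sketch does not check uniqueness within the Organization.

```python
# Characters listed above as not supported for Data Set names,
# including the curly double quotes (U+201C and U+201D).
UNSUPPORTED = set('!"#$%&*+./:;<=>?@[\\]^{|}~\u201c\u201d')

def unsupported_characters(name):
    """Return the sorted unsupported characters found in the trimmed name."""
    trimmed = name.strip()  # leading and trailing spaces are ignored
    return sorted({ch for ch in trimmed if ch in UNSUPPORTED})

def is_valid_name(name):
    """True if the trimmed name is non-empty and has no unsupported characters."""
    trimmed = name.strip()
    return bool(trimmed) and not unsupported_characters(trimmed)

print(is_valid_name("Smith-Jones_Q3 'final'"))
print(unsupported_characters("a.b:c"))
```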

Select a Load File Template

You must use the Load File Template field to select a Project-level Load File template from the drop-down. (From the Project Settings, select Load File Import and the appropriate Load File template to define the Load File Fields and other Load File information.)

Select an Index Level

For Index Level, select the appropriate Index representation level. Use the default of Analytic Index if you want the Load File to take advantage of all analytic capabilities. The Index levels are as follows:

  • Do Not Index – Does not perform any Indexing of the files.
  • System Metadata – Restricts users to a system (structural metadata) view and a restricted subset of related operations. The Metadata List identifies the system (structural) metadata fields.
  • File Metadata – Restricts users to a metadata-only view of file (embedded) metadata as well as system metadata. This type is also associated with a restricted subset of related operations. When you select File Metadata, RAR, TAR, and ZIP archives are expanded by default to reveal the file metadata for the archive content, but you have the option to disable the expansion of RAR, TAR, and ZIP archives. File Metadata mode always supports the identification and import of Forensic Images (for example, EWF Files that collectively form a disk image).
  • Content Index – Gives users a view of document content and document metadata, thereby providing operations that enable analysis of both content and metadata. This is the only Index level you can later upgrade to an Analytic Index.
  • Analytic Index (default) – Enables users to take advantage of the additional analytic operations such as Document Similarity and Clustering. With this Indexing type, you can use a Project Analytic setting to ignore or include Stop Words for Document Similarity operations and Clustering, if applied.

Select Index Settings

You can view or change the current Index Settings for the Project by clicking Edit. This launches the Project Settings screen, from which you can control the Index Settings for the Load File import.

Review Pattern Detection Settings

When you create a Load File, you can review and manage your Pattern Detection Settings for the Load File Import. By default, Pattern Detection is enabled, which enables you to click Edit and view the current Patterns screen for the Project. You can then use the Patterns screen to control the Patterns for the Load File import.

Assign an Optional Batch Name

If you want, you can specify a Batch name or number for the new Data Set. If you do not set a Batch name or number as part of import, the Data Set name is used. To verify the Batch name or number after import, you can check any of the following: the batch field in the document metadata eDiscovery Properties; Data Set Report > View Details > Scan History, after selecting the import from the Scan History tab; or the View Configuration information for a selected Data Set in the Imports Summary. The Batch value also appears in the appropriate file manifest upon export.

View Other Legal Discovery Options

The optional Batch name or number is one of the Legal Discovery options you can set for a Load File. To define the other available eDiscovery options, select Other Legal Discovery Options. In the Other Legal Discovery Options popup that appears, you can view or set the complete set of eDiscovery options.

Note: If you do not set a Batch name or number for the Load File at import, the Load File name is used as the Batch value.

Submit or Cancel the New Data Set from Load File Operation

When you have finished the setup of the Load File Import, click New Data Set from Load File to complete the process and return to the Data Sets Summary. If you do not want to perform the operation, click Cancel instead.

When the import is complete, the new Load File import appears in the Imports Summary.

You can monitor your import task in the Work Basket. Right-click the task and select View Details when the task is in progress to see the state of the task, the various system components, and the configuration settings you used.

Load File Import Failures and Warnings

If the encoding of the provided Load File does not match the encoding selected in the Load File template, the entire Load File import fails with the following error message: "The encoding of the provided load file does not match the selection on the corresponding Load File Import Setting."
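You can catch many encoding mismatches before import by checking whether the file decodes under the encoding you plan to select. The Python sketch below shows one such check. The helper name decodes_cleanly is hypothetical, and the source does not describe how the application itself detects a mismatch, so treat this only as a pre-flight aid. Some encodings (for example, latin-1) accept any byte sequence, so a clean decode does not prove the encoding is correct.

```python
def decodes_cleanly(raw, encoding):
    """Return True if the bytes decode without error under the given encoding."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True

# A line that uses the thorn character as a text qualifier, saved two ways.
as_utf8 = "þDocIDþ".encode("utf-8")
as_cp1252 = "þDocIDþ".encode("cp1252")

print(decodes_cleanly(as_utf8, "utf-8"))
print(decodes_cleanly(as_cp1252, "utf-8"))
```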

A Warning icon on the Work Basket task indicates that the Load File Import task completed with exceptions. You can then right-click the task and use the Download option to download the line-by-line errors in a CSV file. (Lines that do not have errors are imported.)

Note: If any child within a given family has an error, the entire family will fail to be imported.
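The family rule means that one bad line can exclude more documents than the error file lists. As a rough illustration, the Python sketch below expands a set of failed documents to every document in the same family. The function name and the family_of mapping (document ID to family ID) are hypothetical, and the sketch assumes that an error on any family member, not only a child, fails the whole family.

```python
def expand_family_failures(family_of, failed_docs):
    """Return all documents excluded once whole families are failed.

    family_of: dict mapping document ID to family ID (hypothetical structure).
    failed_docs: set of document IDs that had a line-level error.
    """
    failed_families = {family_of[doc] for doc in failed_docs if doc in family_of}
    excluded = {doc for doc, fam in family_of.items() if fam in failed_families}
    # Documents with no family entry still fail on their own.
    return excluded | set(failed_docs)

family_of = {"DOC0001": "FAM1", "DOC0002": "FAM1", "DOC0003": "FAM2"}
print(sorted(expand_family_failures(family_of, {"DOC0002"})))
```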

Examples of errors include the following:

  • Native file not found at specified path.
  • Text file not found at specified path.
  • Error in formatting for line of document (for example, the number of parameters for a given line does not match the number of parameters in the load file header).
  • Duplicate extbegdoc value encountered. (Typically, load files contain unique values for a DocID column, which should be the field mapped to extbegdoc.)
  • Blank value for required field extbegdoc.
  • Unexpected error encountered during import of a document.
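Several of the errors above can be found by inspecting the load file before import. The following Python sketch checks already-parsed rows for a field count mismatch, a blank DocID value, and a duplicate DocID value. The function name, the message strings, and the docid_column parameter (the column you map to extbegdoc) are hypothetical, and the messages are not the text the application writes to its error CSV. The sketch does not check whether native or text files exist at the specified paths.

```python
def find_line_errors(header, rows, docid_column):
    """Return {line_number: message} for rows with structural problems."""
    errors = {}
    seen = set()
    docid_index = header.index(docid_column)
    # Line 1 is the header, so data rows start at line 2.
    for line_number, row in enumerate(rows, start=2):
        if len(row) != len(header):
            errors[line_number] = "field count does not match header"
            continue
        docid = row[docid_index].strip()
        if not docid:
            errors[line_number] = "blank value for extbegdoc"
        elif docid in seen:
            errors[line_number] = "duplicate extbegdoc value"
        else:
            seen.add(docid)
    return errors

header = ["DocID", "NativePath"]
rows = [
    ["DOC0001", "natives/a.msg"],
    ["DOC0001", "natives/b.msg"],  # duplicate DocID
    ["", "natives/c.msg"],         # blank DocID
    ["DOC0004"],                   # missing a field
]
result = find_line_errors(header, rows, "DocID")
for line_number in sorted(result):
    print(line_number, result[line_number])
```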