Add Data from a Load File

Imports > New Data Set from Load File

Requires Imports - Add/Edit Permissions

Note: Digital Reef now restricts import and reprocessing of data to Projects using Parsing Library V2. You can no longer import, reindex, or reprocess data in a Parsing Library V1 Project.

Users in a role with the appropriate permissions can import data from a load file. This process enables you to use a selected Connector (such as NFS or CIFS) to import a supported load file and build an index representation based on the information in the load file.

The process of adding a body of data to a Project assumes that a service provider or enterprise System Administrator has made data available to your Organization using one or more Connectors, such as NFS or CIFS, Microsoft Exchange, or SharePoint.

Supported Load Files include the following:

  • A Character-Delimited File, such as a Concordance DAT file or a CSV file
  • An EDRM XML load file

Note: An imported Load File cannot be Shared across Projects in the Organization.

Each Load File type has usage notes that you should review before importing.

Character-Delimited File Usage Notes

Before importing a character-delimited file such as a DAT load file, you must define Load File Import Settings as part of your Project Settings. These settings serve as the Project Load File Import template for the Load File type. You can also define Load File Import information in an Organization-level Load File Import template.

The Project Load File Import template you configure contains all the information needed to accommodate the format of the Load File type and its field mappings. You must supply Load File Field information in the Project-level Load File Import template screen.

Note: Your Load File import will fail if you do not supply the correct information. For example, you must specify the appropriate encoding type and delimiter for the DAT file, since these are required to parse data from the Load File. You must also provide the required mapping entries based on your Load File. For example, if the Preserve Families option stays enabled to preserve family information, you must supply the required mapping entries in the Load File Fields. Moreover, if you want to process Load File Images, you must also select the Process Load File Images option.
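To illustrate why the encoding and delimiter settings matter, the following minimal Python sketch parses the text of a DAT file. It is not Digital Reef code. It assumes the common Concordance defaults (ASCII 20 as the field delimiter and þ as the text qualifier), and the column names and paths in the sample are hypothetical. Your Load File may use different characters.

```python
import csv
import io

# Assumed Concordance-style defaults; the real values must match your Load File.
FIELD_DELIMITER = "\x14"   # ASCII 20 (DC4)
TEXT_QUALIFIER = "\u00fe"  # the character þ

def parse_dat(raw_bytes, encoding="utf-8"):
    """Decode raw DAT bytes and return a list of {column: value} dicts."""
    text = raw_bytes.decode(encoding)
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=FIELD_DELIMITER,
        quotechar=TEXT_QUALIFIER,
    )
    rows = list(reader)
    header, records = rows[0], rows[1:]
    return [dict(zip(header, record)) for record in records]

# Hypothetical two-column load file with a header line and one document line.
sample = "þDocIDþ\x14þPathþ\nþDOC001þ\x14þnatives/doc001.msgþ\n".encode("utf-8")
print(parse_dat(sample))
```

If the same bytes are decoded with a different encoding (for example, latin-1), the text qualifier is no longer recognized and the column names come out garbled, which is the kind of mismatch the template settings are meant to prevent.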

For information on adding and managing a Project-level Load File template, see Manage Load File Import in a Project-level Template. For information on adding and managing an Organization-level Load File Import Settings template, see Add and Manage Load File Import Settings in a Template.

When you are ready to perform the import, you must select a Connector and select the load file at the volume level. Then be sure to select the correct Load File Import template based on the mapping and settings you need.

EDRM XML Usage Notes

  • You must use an NFS or CIFS Connector for an EDRM XML Load File.
  • When selecting the EDRM XML file location associated with the Connector, you can select one Data Area only.
  • You can browse for the Data Area. The Data Area you select must contain both the load file and the data, as specified in the EDRM XML load file. For example, you might have a load file directory called XML_load_example that contains the load file itself (XMLLoadfile1.xml) and folders with the data under it.
  • You must know the load file name before you begin the load process. Make sure that the correct load file name is specified, and select EDRM_XML as the Load File type from the drop-down next to the load file name.
  • You cannot perform an index update (reindex) of a data set based on an EDRM XML load file.
  • The system will not process any relationship information provided in the EDRM XML load file.
  • The EDRM XML import process uses the native file if it is provided. If no native file type is provided, the process uses the image and text data.
  • The EDRM XML import process supports a multi-page TIFF document. A combined image is created using the naming convention <docid>-combined-img.tif. If the TIFF images have any corresponding text, that text is combined as well, and associated with the ocrpath metadata field of the document.
  • The ocrstatus metadata field for the document will be marked as EXTOCR if OCR processing took place externally, outside the application. If the document has an ocrstatus of EXTOCR, it is not subject to OCR processing again when the load file is added.
  • To see a sample mapping of EDRM fields to Digital Reef fields, see About EDRM XML Load File Content.

How to Add a New Data Set from Load File

The New Data Set from Load File screen is divided into several areas that enable you to set up an import of a Load File. You do not have to follow any particular order (that is, it does not matter if you name the Load File before selecting the Connector).

Select a Connector

From a list of available Connectors, select a Connector by clicking the entry for the Connector in the table. Each Connector is shown with the following information:

  • Connector Name — The name assigned to the Connector.
  • Description — The description for that Connector.
  • Type — The type of Connector (for example, CIFS, NFS, Exchange, or SharePoint). For details on the settings used to create a Connector, see Create a Connector.
  • Mode — The Connector mode, either Read or Read/Write. For import, either is valid. The Connector mode is determined when a Connector is created.
  • Server — The IP address or URL associated with the Connector.
  • Path — The mount point for the Connector, which determines what you see in the Folder area. (You should see what is available from the mount point on down.)

Navigate to Select Your Load File

As soon as you select a Connector, the appropriate Folder (Path) information appears below the Connector list. In this area, you can browse the Connector contents to view the available folders and load files. By default, All Files are shown at a given location, but you can select Load Files from the drop-down to display only Load Files (.dat, .csv, etc.).

At the appropriate location (for example, the Volume level for a DAT file), click to select the appropriate Load File, which populates the Path field. You can edit the Path field, if necessary.

Assign a Name to the Load File Import and Optional Description

In this area, you supply a name and optional description for the Load File import.

The name cannot contain any of the following characters:

! " # $ % & * + . / : ; < = > ? @ [ \ ] ^ { | } ~ “ ”

Note: These character restrictions apply to most tree items, such as Imports, Exports, Tags, Folders, Saved Searches, Workflows, Comparisons, Samples, and Synthetic Documents. To support auto-discovery of Custodians based on staging, a Custodian name has fewer restrictions regarding invalid characters.
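If you generate import names in a script before entering them, a check along the following lines can catch restricted characters early. This is a minimal Python sketch, not part of the product. It assumes the characters listed above are the complete restricted set, and the function name and sample names are hypothetical.

```python
# Characters listed above, treated here as not allowed in an import name.
INVALID_NAME_CHARS = set('!"#$%&*+./:;<=>?@[\\]^{|}~\u201c\u201d')

def invalid_chars_in(name):
    """Return the sorted list of restricted characters found in name."""
    return sorted(set(name) & INVALID_NAME_CHARS)

print(invalid_chars_in("Custodian_Smith-2014"))  # no restricted characters
print(invalid_chars_in("Q3/Q4: review?"))        # contains /, :, and ?
```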

Select a Load File Type

You must select a Load File Type for the Load File import. You can choose one of the following types of Load Files:

  • Character-delimited file (.dat, .csv, etc.) (the default) - This can be any character-delimited file, such as a DAT file or CSV.
  • EDRM_XML (.xml) - Select this type for an EDRM Load file (.xml).

Select a Load File Template

Use the Load File Template field to select the Project-level Load File template from the drop-down. You can use the System Created template (Project-level) if you have populated it with the correct information, or select another template you have created, but whichever template you choose must contain the appropriate information to accommodate the load file mappings. From the Project Settings, select Load File Import and the appropriate Load File template to define the Load File Fields and other Load File information.

Select an Index Level

For Index Level, select the appropriate Index representation level. Use the default, Analytic Index, if you want the Load File import to take advantage of all analytic capabilities. The Index levels are as follows:

  • System Metadata – Restricts users to a system (structural metadata) view and a restricted subset of related operations. The Metadata List identifies the system (structural) metadata fields.
  • File Metadata – Restricts users to a metadata-only view of file (embedded) metadata as well as system metadata. This type is also associated with a restricted subset of related operations. When you select File Metadata, RAR, TAR, and ZIP archives are expanded by default to reveal the file metadata for the archive content, but you have the option to disable the expansion of RAR, TAR, and ZIP archives. File Metadata mode always supports the identification and import of Forensic Images (for example, EWF Files that collectively form a disk image).
  • Content Index – Gives users a view of document content and document metadata, thereby providing operations that enable analysis of both content and metadata. This is the only Index level you can later upgrade to an Analytic Index.
  • Analytic Index (default) – Enables users to take advantage of the additional analytic operations such as Document Similarity and Clustering. With this Indexing type, you can use a Project Analytic setting to ignore or include Stop Words for Document Similarity operations and Clustering, if applied.

Select Index Settings

You can view or change the current Index Settings for the Project by clicking Edit. This launches the Project Settings screen, from which you can control the Index Settings for the Load File import.

Review Pattern Detection Settings

When you create a Load File import, you can review and manage the Pattern Detection Settings for the import. By default, Pattern Detection is enabled, so you can click Edit to view the current Patterns screen for the Project. You can then use the Patterns screen to control the Patterns for the Load File import.

Assign an Optional Batch Name

If you want, you can specify a Batch name or number for the new Data Set. If you do not set a Batch name or number as part of import, the Data Set name is used. To verify the Batch name or number after import, you can check any of the following: the batch field in the document metadata eDiscovery Properties; Data Set Report > View Details > Scan History, after selecting the import from the Scan History tab; or the View Configuration information for a selected Data Set in the Imports Summary. The Batch value also appears in the appropriate file manifest upon export.

View Other Legal Discovery Options

The optional Batch name or number is one of the Legal Discovery options you can set for a Load File. To define the other available eDiscovery options, select Other Legal Discovery Options. In the Other Legal Discovery Options popup that appears, you can view or set the complete set of eDiscovery options.

Note: If you do not set a Batch name or number for the Load File at import, the Load File name is used as the Batch value.

Submit or Cancel the New Data Set from Load File Operation

When you have finished the setup of the Load File Import, click New Data Set from Load File to complete the process and return to the Data Sets Summary. If you do not want to perform the operation, click Cancel instead.

When you click New Data Set from Load File, you will see a message if you have not yet supplied the information for a required field, such as Name. When the import is complete, you will see your new Load File import appear in the Imports Summary. Unless you have a very small Load File, you will see that the Index Level appears as In Progress while indexing is in progress. This changes to the appropriate level when the indexing completes.

You can monitor your import task in the Work Basket. Right-click the task and select View Details when the task is in progress. This will show you the state of the task, the various system components, and the configuration settings you used.

Load File Import Failures and Warnings

If the encoding of the provided Load File does not match the encoding selected in the Load File template, the entire Load File import fails with the following error message: "The encoding of the provided load file does not match the selection on the corresponding Load File Import Setting."
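Before importing, you can run a rough local check that a Load File decodes under the encoding you plan to select. The following Python sketch only illustrates the idea. How the application itself detects a mismatch is not described here, and the function name and sample bytes are hypothetical.

```python
def decodes_as(raw_bytes, encoding):
    """Return True if raw_bytes can be decoded with the given encoding."""
    try:
        raw_bytes.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True

# Hypothetical load file header saved as UTF-16.
sample = "þDocIDþ".encode("utf-16")
print(decodes_as(sample, "utf-16"))
print(decodes_as(sample, "utf-8"))
```

A successful decode does not prove the encoding is correct, because some single-byte encodings (such as latin-1) accept any byte sequence. A failed decode, however, is a reliable sign that the selection is wrong.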

A Warning icon for the Work Basket task indicates that the Load File Import task completed with exceptions. You can then right-click the task and use the Download option to download the line-by-line errors in a CSV file. (Lines that do not have errors are imported.)

Note: If any child within a given family has an error, the entire family will fail to be imported.

Examples of errors include the following:

  • Native file not found at specified path.
  • Text file not found at specified path.
  • Error in formatting for line of document (for example, the number of parameters for a given line does not match the number of parameters in the load file header).
  • Duplicate extbegdoc value encountered. (Typically, load files contain unique values for a DocID column, which should be the field mapped to extbegdoc.)
  • Blank value for required field extbegdoc.
  • Unexpected error encountered during import of a document.
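Several of these errors can be found before import by scanning the parsed load file yourself. The following Python sketch checks for field-count mismatches and for blank or duplicate document IDs. It is an illustration only, not the validation the application performs. The column name DocID (standing in for the field mapped to extbegdoc), the function name, and the sample rows are all hypothetical.

```python
def find_row_errors(header, rows, id_column="DocID"):
    """Return (line_number, message) tuples for rows with likely import errors.

    id_column is a hypothetical name for the column mapped to extbegdoc.
    """
    errors = []
    seen_ids = set()
    id_index = header.index(id_column)
    # Line 1 is the header, so data rows start at line 2.
    for line_number, row in enumerate(rows, start=2):
        if len(row) != len(header):
            errors.append((line_number, "field count does not match header"))
            continue
        doc_id = row[id_index].strip()
        if not doc_id:
            errors.append((line_number, "blank value for " + id_column))
        elif doc_id in seen_ids:
            errors.append((line_number, "duplicate value for " + id_column))
        else:
            seen_ids.add(doc_id)
    return errors

header = ["DocID", "Path"]
rows = [
    ["DOC001", "natives/doc001.msg"],
    ["DOC001", "natives/copy.msg"],   # duplicate DocID
    ["", "natives/doc003.msg"],       # blank DocID
    ["DOC004"],                       # missing a field
]
for line_number, message in find_row_errors(header, rows):
    print(line_number, message)
```

The sketch does not check whether native or text files exist at the referenced paths, and it does not model family relationships, so a clean result here does not guarantee a clean import.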