Managing Project Load File Import Templates

Project > Settings > Load File Import > selected Load File Import template

A Load File template is used to map the fields in a Load File to Digital Reef metadata fields when creating a new Data Set from a Load File.

Users in a role with the appropriate permissions can manage the Load File Field information needed for a Load File Import. At the Project level, this is done in a Load File template. Users with the appropriate permissions can also manage Load File information in an Organization Load File template.

With the Project-level Load File Import selected, a user with the appropriate permissions can use the top-level Templates context menu to perform the following action:

  • Create a Project-level Load File Import template by clicking the (New Template) option, which launches the New Template properties dialog.

For a selected Project-level Load File Import template, a user with the appropriate permissions can click the ellipsis and use the context menu to perform the following actions:

  • Save to Template – Launches the Save to Template dialog, which enables you to save current settings to an available template based on your permissions. For example, at the Project level, you can save to a Project-level or Organization-level template, depending on your permissions, or select New Template, which launches the New Template dialog.
  • Load from Template – Launches the Load from Template dialog, which enables you to load the settings from a template that you select from a list of available templates. At the Project level, you can load settings from a Project-level or Organization-level Load File Import template, depending on your permissions. The loaded settings then appear and are saved automatically. You must have Add/Edit permissions for Load File Import in the Project to use this option.
  • Set As Default – Marks the selected template as the default template. This option is not available for the Default template of a given type, or for any other template already set as the Default.
  • Edit – Launches the Edit Template dialog, which enables you to edit the template name and/or description of the selected template.
  • Delete – Deletes the selected template. A popup asks you to confirm the deletion of the template from the Project.

Note: Save to and Load from operations for this setting observe an "overwrite" behavior. For example, for a Load from operation, your current settings are replaced by the settings from the selected template. Some other settings, such as Patterns, Tags, Domain Lists, Alias Lists, and Excluded Content, observe an "append" behavior instead.

Creating a Load File Import Template

Because Load Files can vary widely, it is important to thoroughly familiarize yourself with the contents and characteristics of the Load File in which you will discover fields before creating a Load File Import template.

Once you have created a Load File Import template, you can select it when you import a Load File into Digital Reef using the Imports > New Data Set from Load File option, as long as the Load File to be imported is of the same type as the one in which you discovered fields.

To add mappings and create a template, follow these steps:

  1. Click Discover Fields on the toolbar to discover the fields in a Load File and add them to the Load File Fields list on the left. The DR Metadata Fields list on the right is automatically populated. Both lists have filter boxes to help you find the field you need more quickly.
  2. Populate the Field Mappings section by dragging a field from the Load File Fields list into the Load File Field column, dragging a field from the DR Metadata Fields list into the DR Metadata Field column, and if necessary selecting a transformer in the Transformer column between them. By default no transformer is selected, which means that values of the field in the Load File are placed in the mapped DR metadata field in the new Data Set without alteration. If required, you can select among transformers for date/time, numeric, or text field values; for example, for a text field you can replace a specific character in values with a different character.
  3. Repeat the previous step until you have created all the needed mappings. (Be sure to see Mapping Requirements and Guidelines for information about fields that must be mapped.) To speed the process, you can select multiple fields in the Load File Fields list and drag them together into the Field Mappings section; you can also delete fields in the Load File Fields list or Field Mappings section by selecting them and clicking Delete Selected.
  4. If desired, select Preserve Families (see Mapping Needed to Preserve Families) and/or Process Load File Images (see Mapping Needed to Process Load File Images).
  5. Save your changes. You will be reminded if your changes do not include one or more required field mappings.
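
The rows in the Field Mappings section can be pictured as data. The following Python sketch is illustrative only: FieldMapping, replace_char, and map_record are hypothetical names, not part of Digital Reef, and the sample values are made up. It assumes each row pairs one Load File field with one DR metadata field and an optional transformer, where no transformer means the value passes through unaltered.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class FieldMapping:
    """Hypothetical model of one row in the Field Mappings section."""
    load_file_field: str
    dr_metadata_field: str
    # None models the default case: no transformer, value copied unaltered.
    transformer: Optional[Callable[[str], str]] = None

    def apply(self, value: str) -> str:
        return value if self.transformer is None else self.transformer(value)


def replace_char(old: str, new: str) -> Callable[[str], str]:
    """Hypothetical text transformer: replace one character with another."""
    return lambda value: value.replace(old, new)


def map_record(record: Dict[str, str], mappings: List[FieldMapping]) -> Dict[str, str]:
    """Build DR metadata values for one Load File record; unmapped fields are skipped."""
    return {
        m.dr_metadata_field: m.apply(record[m.load_file_field])
        for m in mappings
        if m.load_file_field in record
    }


mappings = [
    FieldMapping("BegDoc#", "extbegdoc"),                         # pass-through
    FieldMapping("Filename", "filename", replace_char("_", " ")),  # text transformer
]
print(map_record({"BegDoc#": "ABC_0001", "Filename": "status_report.doc"}, mappings))
```

Modeling the missing transformer as None keeps the pass-through case explicit, which mirrors the default described in step 2.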

Mapping Requirements and Guidelines

When the load file includes natives or text versions, you typically need at least one mapping entry to identify the location of the files. Depending on the discovered fields, entries may exist for one or both of the following:

  • The location of native files, using a Load File field such as NativeLink/NativeFile (hereafter referred to as NativeLink).
  • The location of text files from OCR processing, using a Load File field such as TextLink/OCRPath (hereafter referred to as TextLink).

In general, you must also have a mapping entry for extbegdoc to identify the external starting document with a document number. This mapping entry is required for any load file import setup.

If the load file is set up for image processing only, you might not have a mapping entry to darelativepath or ocrpath at all, but you will need to provide the required mapping entries, as described in Mapping Needed to Process Load File Images.

Use the following mapping guidelines:

  • If you are providing only one mapping entry to identify the location of the files (whether it is NativeLink for native files or TextLink for text files from OCR processing), you must map to the Digital Reef darelativepath field (categorized under Digital Reef Properties). If your load file uses the DR\ prefix for the mapping entry, you can use a text transformer to trim the first three characters of the DR Base Path (DR\). If your load file does not use the DR\ prefix (and does not require any other adjustment), leave the mapping using a Pass-through Transformer.
  • If you are supplying more than one mapping entry for the location of the files, you can use a Load File field such as NativeLink for native files and map it to darelativepath, and then use a Load File field such as TextLink (for the OCR-processed text files) and map it to the Digital Reef ocrpath field (categorized under Digital Reef Properties). Supplying more than one mapping entry for the location of the files provides additional information.
  • As long as you provide a field mapping to identify the location of a native file and the native is available at that location, the filetype field for the document will report the appropriate filetype; if not, the filetype field will report an Unknown format because there is no native.
  • You must have a mapping entry to the Digital Reef extbegdoc field to ensure that the external starting document has a document number.
  • You may want to supply a mapping entry to the Digital Reef filename field to ensure that you see the expected filename.
  • You cannot provide mapping entries to the following fields (listed under the appropriate properties): contentmd5, docext, filemd5, and filetype. For restrictions concerning the mapping of family-related fields, see Mapping Needed to Preserve Families.
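
As a rough illustration of the guidelines above, the following Python sketch checks a set of mappings before saving. It is not Digital Reef code: check_mappings and trim_dr_prefix are hypothetical helpers, and the checks cover only the rules stated in this section (a mapping to extbegdoc, the four fields that cannot be mapped, and darelativepath as the target when only one location mapping is supplied).

```python
from typing import Dict, List

# Fields this section says cannot be used as mapping targets.
UNMAPPABLE_FIELDS = {"contentmd5", "docext", "filemd5", "filetype"}


def check_mappings(mappings: Dict[str, str]) -> List[str]:
    """Return a list of problems for {Load File field: DR metadata field}."""
    problems = []
    targets = set(mappings.values())
    if "extbegdoc" not in targets:
        problems.append("missing required mapping to extbegdoc")
    for field in sorted(targets & UNMAPPABLE_FIELDS):
        problems.append(f"{field} cannot be a mapping target")
    # A single location mapping must point at darelativepath, not ocrpath.
    if "ocrpath" in targets and "darelativepath" not in targets:
        problems.append("a single location mapping must target darelativepath")
    return problems


def trim_dr_prefix(path: str) -> str:
    """Mimic a text transformer that trims the three-character DR\\ prefix."""
    prefix = "DR\\"
    return path[len(prefix):] if path.startswith(prefix) else path


print(check_mappings({"NativeFile": "darelativepath", "BegDoc#": "extbegdoc"}))
print(trim_dr_prefix("DR\\VOL001\\NATIVES\\doc1.msg"))
```

An image-only load file with no location mapping passes these checks, which matches the earlier statement that such a load file might not map to darelativepath or ocrpath at all.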

Note: Digital Reef can accommodate the situation in which the Load File provides natives for some files, but not all. In this case, a document without a populated NativeLink (or equivalent) field might have a populated TextLink (or equivalent) field that points to extracted text. To ensure that the software can identify the darelativepath information if the NativeLink (or equivalent) field is not populated, you can include an entry that maps the TextLink (or equivalent) field to the Digital Reef ocrpath field.

Viewing Load File Source Information

After import, you can verify Load File source information for a given document in the metadata field loadfiledocsource. This field provides either a semicolon-delimited list of the sources that were used to identify the document (for example, NATIVE;TEXT;IMAGE) or the value NONE if the Load File import did not provide any of those items for a document. Note the following:

  • NATIVE indicates that the native file for a given document was available based on Load File field mapping such as NativeFile/NativeLink to the darelativepath field.
  • TEXT indicates that a text file (for example, an OCR-processed text file) for a given document was available based on Load File field mapping such as TextLink or OCRPath to the ocrpath field.
  • IMAGE indicates that Load File images for a given document were available and processed separately based on an OPT or LFP Image Load File (available at the same location as the DAT Load File upon import).

The values in the loadfiledocsource field determine how representations for a document are made available:

  • When a native file is available (that is, the loadfiledocsource field includes the value NATIVE), that native is used for the native representation. Otherwise, an empty file is generated to represent the native.
  • When a text file is available (that is, the loadfiledocsource field includes the value TEXT), that text file is used for the text representation. Otherwise, text will be derived from the native.
  • When images are available (that is, the loadfiledocsource field includes the value IMAGE), those images are used to provide an image representation in PDF format.
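
The rules above can be summarized in a short Python sketch that interprets a loadfiledocsource value. The function names and the descriptive strings are hypothetical and exist only for illustration; the only facts taken from this section are the semicolon-delimited format, the NONE value, and how each representation is supplied. The section does not say what happens to the image representation when IMAGE is absent, so the sketch returns None in that case.

```python
from typing import Dict, Optional, Set


def parse_sources(value: str) -> Set[str]:
    """Split a loadfiledocsource value such as 'NATIVE;TEXT;IMAGE' into a set."""
    cleaned = value.strip().upper()
    if cleaned in ("", "NONE"):
        return set()
    return {part.strip() for part in cleaned.split(";") if part.strip()}


def describe_representations(value: str) -> Dict[str, Optional[str]]:
    """Describe where each representation comes from for one document."""
    sources = parse_sources(value)
    return {
        "native": "native file from the Load File" if "NATIVE" in sources
                  else "empty file generated to represent the native",
        "text": "text file from the Load File" if "TEXT" in sources
                else "text derived from the native",
        # Not stated in this section for the case without images.
        "image": "PDF built from the Load File images" if "IMAGE" in sources
                 else None,
    }


print(describe_representations("NATIVE;IMAGE"))
```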

Mapping Needed to Preserve Families

The Preserve Families option lets you require mapping of family information in an imported DAT file; when it is enabled, you must provide mappings to the extbegdoc and extbegattach DR metadata fields, as follows:

  • extbegdoc (generally required) — The external starting document with a document number (under Digital_Reef_Properties).
  • extbegattach (required)  — The external starting attachment document with a document number (for example, for an email). This field is typically populated for the beginning attachment for an entire family, but a standalone document may also have this field populated. This field is categorized under Digital_Reef_Properties.

The following fields are also recommended to provide optimal family handling:

  • extenddoc (recommended) — The external ending document with a document number (under Digital_Reef_Properties). If this field is populated, the software will also populate the extdocrange and extdocattachrange fields.
  • extendattach (recommended) — The external ending attachment document with a document number (under Digital_Reef_Properties). If this field is populated, the software will attempt to preserve families when a family would otherwise be considered broken and generate an error during the Load File import.

If you clear this option, you are not required to map these fields, but you still have the option of doing so and preserving the family information. Without the required fields, all documents are treated as individual documents (without any MAG or DAG relationships).
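
A minimal Python sketch of the Preserve Families requirement follows. The function family_mapping_gaps is a hypothetical helper, not a Digital Reef API; it only restates the rule that extbegdoc and extbegattach must be mapped when the option is enabled, and that extenddoc and extendattach are recommended.

```python
from typing import Iterable, List, Tuple

REQUIRED_FOR_FAMILIES = ["extbegdoc", "extbegattach"]
RECOMMENDED_FOR_FAMILIES = ["extenddoc", "extendattach"]


def family_mapping_gaps(
    mapped_fields: Iterable[str], preserve_families: bool
) -> Tuple[List[str], List[str]]:
    """Return (missing required, missing recommended) DR metadata fields."""
    mapped = set(mapped_fields)
    # The family fields are required only when Preserve Families is enabled.
    missing_required = (
        [f for f in REQUIRED_FOR_FAMILIES if f not in mapped]
        if preserve_families
        else []
    )
    missing_recommended = [f for f in RECOMMENDED_FOR_FAMILIES if f not in mapped]
    return missing_required, missing_recommended


print(family_mapping_gaps({"extbegdoc", "extenddoc"}, preserve_families=True))
```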

Mapping Needed to Process Load File Images

To process the images for an OPT or LFP Image Load File (available at the same location as the DAT Load File upon import), you must enable and apply the Process Load File Images option on the Load File Settings tab, and you must supply a mapping entry for the following:

  • extbegdoc (generally required) — The external starting document with a document number (under Digital_Reef_Properties).

Optionally, you can provide a mapping entry for the following Digital Reef Properties:

  • extenddoc (optional) — The external ending document with a document number (under Digital_Reef_Properties).
  • extnumpages (optional) — The number of pages detected for a multi-page document (the number of pages for an attachment in a Load File Image PDF). A Load File field such as PgCount can be mapped to this field.

Upon successful processing of the images for an OPT or LFP Image Load File, the software creates a PDF that represents all of the images associated with a given Load File document.

After a successful import of a Load File with processed images, you can check the stored_image metadata field, which reports the value External to indicate that the images were generated externally. You can also view the PDF created to contain the images for a document in the Image tab/view mode of the Document Viewer. The Image tab only appears for a document that has stored images as a result of a Load File Import with processed images, External Image Import, or an Export that requests PDFs.

Sample Load File Field Mappings

The following lists sample Load File Field Mappings (with no Transformers included).

Load File Field ---> DR Metadata Field

NativeFile ---> darelativepath

BegDoc# ---> extbegdoc

EndDoc# ---> extenddoc

BegAttach ---> extbegattach

Filename ---> filename

OCRPath ---> ocrpath
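
To show how the sample mappings relate to the contents of a Load File, the following Python sketch renames the columns of a delimited file according to the table above. It rests on assumptions that this document does not state: the DAT file is assumed to use Concordance-style delimiters (ASCII 20 between fields and the þ character as the quote), and read_mapped_rows is a hypothetical helper rather than anything Digital Reef provides. The sample row is made up.

```python
import csv
import io
from typing import Dict, Iterator

# The sample Load File Field -> DR Metadata Field mappings listed above.
SAMPLE_MAPPINGS = {
    "NativeFile": "darelativepath",
    "BegDoc#": "extbegdoc",
    "EndDoc#": "extenddoc",
    "BegAttach": "extbegattach",
    "Filename": "filename",
    "OCRPath": "ocrpath",
}


def read_mapped_rows(
    dat_text: str, delimiter: str = "\x14", quotechar: str = "\u00fe"
) -> Iterator[Dict[str, str]]:
    """Yield each record keyed by DR metadata field (pass-through values).

    The default delimiter and quote character are assumptions about the
    DAT format, not values taken from this document.
    """
    reader = csv.DictReader(
        io.StringIO(dat_text), delimiter=delimiter, quotechar=quotechar
    )
    for row in reader:
        # Columns without a mapping are ignored.
        yield {SAMPLE_MAPPINGS[k]: v for k, v in row.items() if k in SAMPLE_MAPPINGS}


q, d = "\u00fe", "\x14"
sample = (
    f"{q}BegDoc#{q}{d}{q}NativeFile{q}{d}{q}Custodian{q}\n"
    f"{q}ABC0001{q}{d}{q}NATIVES\\doc1.msg{q}{d}{q}Smith{q}\n"
)
for record in read_mapped_rows(sample):
    print(record)
```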

Next Steps: Add Data from a Load File

After you have defined your Load File information, you can import a Load File. See Add Data from a Load File for more information.