Manage Project Export Settings

Home > selected Project > Settings > Export Settings
Project > Settings drop-down > Project Settings > Export Settings

When you create a new Export Stream, most of the initial settings in the Export dialog are loaded from the Project's Default Export Settings template, which is maintained on the Project Settings > Export Settings page. The settings can then be modified as needed for the individual Export.

Export Settings templates can also be created and maintained at the Organization and System levels, and these can be selected in place of the Default Export Settings template when creating a new Export Stream. Maintaining a set of templates that address an organization's typical Export use cases greatly simplifies the task of setting up an Export.

The Project's Default Export Settings template is initially created from the template selected by the Organization Administrator when creating the Project. Typically, this is the Organization's Default Export Settings template.

Note: When you create a new Export Volume for an existing Export Stream, no Export Settings template is used; instead, the initial values for the Export Settings are inherited from the Export Stream.

Modifying, Creating, and Loading Export Settings Templates

If you have the required permissions, you can modify the Project's default template. It may be more useful, however, to create a new template at the Organization level, which is then available to all Projects within the Organization.

After you make changes to the Default Export Settings template, you can save the changes or discard them. You can also save your Export Settings to an existing template at the Organization level by selecting the Save to Template option from the context menu to the right of Export Settings in the Project Settings tree. This displays the Save to Template dialog, which lets you select one of the existing Organization templates to save to (overwrite), or instead select New Template, which launches the New Template dialog, allowing you to create a new template.

You can also select Load from Template to display the Load from Template dialog, from which you can load settings from one of the existing Organization or (with the required permissions) System-level templates. (Doing so overwrites the current settings without saving them.)

Export Template Settings

The following sections cover the settings included in Export Settings templates at all levels -- the Project's Default Export Settings template as well as those created and managed at the Organization and System levels.

Template Type

If your Project is integrated with a review platform, you can select one of two options for creating a new Export Stream: New Digital Reef Export Stream for creating a standard Export Stream, and one for creating the Export Stream on the review platform, for example New Reef Review Export Stream. Because the Export Settings template selected for an Export must match the Export's destination, you must choose each template's type.

Suppose, for example, that your Digital Reef site is connected to Reef Review, and all of your Organization's Projects are integrated with Reef Review -- that is, the creating user uses the Project Setup control to select Digital Reef and Reef Review, enabling direct export of data to Reef Review. In this case you might change the type of the Default Export Settings template to Reef Review. If only some of your Organization's Projects are integrated with Reef Review, however, you might want to leave the default template set to Digital Reef Only and create an Organization Export Settings template with type Reef Review for users of integrated Projects to select when exporting to Reef Review.

Differences in the Export settings when exporting to a review platform rather than Digital Reef are noted in the sections that follow.

File Output Options

Documents per Folder sets a maximum number of files per folder. This is a matter of preference and in some instances, perhaps, of optimizing performance.

Image Exceptions

This section provides three default queries to search for documents that have certain types of text and flags so that they can later be addressed separately at Export, as such documents may require special handling before they can be produced in PDF format. The queries are initially run when you add documents to Project Data, when each query generates a Work Basket task. A document flagged as an image exception will also have the image_exception metadata field populated with true in the Document Viewer.

The default queries find comments and other annotations, hidden content, and Microsoft Word track changes flags, as described below; you can modify or delete these if you choose.

  • Comments and Other Annotations - Identifies documents with comments (Word, Excel, or PDF comments), Word Revisions, Excel Track Changes, and other Excel annotations, such as Excel Auto Filter, Excel Protected Sheets, and Excel Protected Workbook.

  • Hidden Content - Identifies hidden text in Word documents, hidden PowerPoint slides, or hidden content in Excel documents such as Excel Hidden Columns, Excel Hidden Rows, Excel Hidden Worksheets, and Excel Very Hidden Worksheets.

  • Word Track Changes - Identifies Word documents with Track Changes enabled.

You can add one or more queries of your own; the maximum query length is 1024 characters. All Image Exception queries you have added or modified are validated by Digital Reef when you click Save or use the Save to Template option. Errors in metadata field names or query syntax are indicated by an error icon or button; click the indicator to display information about the error.

Other Options

This section provides several options that affect export, as described below.

Export Database

The Organization Settings > Export Databases option lets you set up and manage one or more MS SQL databases to support the export of load file information from Export Streams in the Project. If you use the Export Database drop-down to select one of the Organization's existing export databases, a user creating a new Export Stream based on the Export Settings template you are creating or modifying can then select the SQL DB output format and provide the database table name to supplement the exported load file with export to this database.

For more information about all configuration steps required to export to an MS SQL database, see How to Export to a Database.

Note: Once you have selected an export database in a template and the template has been used to create Export Streams, do not set the Export Database drop-down to blank; if no database is selected in the template, all subsequent Volume Exports to existing database-enabled Export Streams will fail.

Include Numeric Values in Near-Duplicate Processing

To exclude numerics from near-duplicate processing, clear this checkbox. Near-duplicate processing is an Advanced Analytics operation that assesses document similarity but behaves differently in Export than it does in near-duplicate search operations.

Enable File Extension Correction

By default, file extensions of Native files are corrected (if necessary) at export based on the intended file type. To exclude this step, clear the checkbox.

When a file's extension (the docext field) does not match its file type, the origdocext field holds the extension that the file should have based on its file type. Correction is applied during export to documents that have a populated origdocext field as well as a populated docext field. Your extension conversion setting also determines how the NativeLink field is populated in the manifest at Export.

For example, a file with the Extensible Markup Language (XML) file type may have a docext of twb (for Tableau) and the origdocext field may contain xml to indicate what the extension should have been based on the file type. Converting the extension at Export will ensure that the file can be opened properly in a downstream review tool. Another example is a tab-delimited file with a Text 7-bit File type and a docext of xml that can be converted to the origdocext of txt.

By default, the Extension Conversion setting is enabled. If you disable the Extension Conversion setting in your Project Export settings, the next Export produces the native files based on the docext field, which is the extension seen on disk.

For an existing Export, you can use the Update operation with Redo Entire Production if you want to redo your export production after enabling the setting.
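
The extension correction rule described above can be summarized in a short sketch. This is an illustration only, not Digital Reef code: the function name and its parameters are hypothetical, and it assumes that an empty string stands for an unpopulated field.

```python
def export_extension(docext, origdocext, correction_enabled=True):
    """Return the extension a native file would be exported with.

    Hypothetical helper: docext is the extension seen on disk and
    origdocext is the extension implied by the detected file type.
    """
    # Correction applies only when the setting is enabled and BOTH
    # fields are populated (empty string = unpopulated, an assumption).
    if correction_enabled and docext and origdocext:
        return origdocext
    # Otherwise the file keeps the extension seen on disk.
    return docext


# The Tableau example: XML file type stored with a .twb extension.
print(export_extension("twb", "xml"))
# Same kind of mismatch with the setting disabled: docext is kept.
print(export_extension("xml", "txt", correction_enabled=False))
```

With the setting enabled, the first call yields xml; with it disabled, the second call keeps xml, the extension seen on disk.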

Show Export dialog for Export to Reef Review

This checkbox displays only for Projects uploaded from Reef Express, on systems connected to Reef Review (or Relativity Server). Selecting it displays a limited Export Settings dialog when such a Project is exported to Reef Review (or Relativity Server). When it is not selected, the export to Reef Review is made automatically according to options selected at Project creation in Reef Express.

Family/Thread Options

When creating a new Export Stream, you can select one of the following email or document family options to control the scope of files flagged for export:

  • Selected Documents Only — Exports only those files, attachments, or emails that were explicitly tagged (for example, as Potentially Responsive). An explicitly tagged email attachment or document attachment is always included in the Export when this option is enabled, so the Separate Email Attachments, Separate PDF OLE Attachments, and Separate OLE Attachments options are selected and locked. Selected Documents Only does not export the associated email of a tagged attachment unless that email has been explicitly tagged as well. Using Selected Documents Only means that family relationships will not be maintained in the appropriate load files upon Export (that is, metadata fields such as AttachmentID, AttachmentRange, and BegAttach will be blank).
  • Associated Family Docs — For each item being exported, exports the other contents of its associated document family. Remember to set Separate Email Attachments and Separate PDF OLE Attachments (with or without its nested option) if you want to Export email attachments and PDF OLE attachments (with or without other OLE attachments) as separate files.
  • Associated Threads — For each file being exported, exports the other contents of its associated thread (for example, all associated messages and contained attachments).

Additional family options are as follows and apply to the initial Export of an Export Stream only:

  • Separate Email Attachments — When selected (with the Associated Family Docs mode), the export process includes email attachments of a parent email as separate files; when clear, attachments are instead embedded within their parent email (for example, an EML). Note that this option must be enabled if you want to use the PDF option Highlight Search Terms.
  • Separate PDF OLE Attachments — When selected on its own without its nested option, restricts the export of separate OLE Attachments to just PDF OLE attachments (files embedded within parent PDF files, called PDF Portfolios). You can control this and its nested option when Associated Family Docs or Associated Threads is enabled. These modes expand the initial document selection to include any missing parent or child documents prior to applying this option. When Selected Documents Only is enabled, this option is fixed as enabled. Make sure that the Separate Email Attachments option is selected to ensure expected results regarding the export of separate OLE attachments attached to emails.
    • Separate OLE Attachments — Exports copies of files embedded within other files through Object Linking and Embedding. For example, if you export a Word document that contains an Excel spreadsheet with Separate OLE Attachments enabled, you also export a discrete copy of the spreadsheet. You can control this option when Associated Family Docs or Associated Threads is enabled. These modes expand the initial document selection to include any missing parent or child documents prior to applying this option. When Selected Documents Only is enabled, this option is fixed as enabled.
  • Include Container Reference — This option ensures that the load file contains a record for each exported document's container (for example, a PST), if containers have not been removed from the Project (for example, with exclusion searches). When this option is set, each container for an exported document will be assigned a Doc ID so that it can be referenced in the ParentContainer metadata field. Note that container references added by this option appear in the load file only (they are not produced). Even without this option, you may see the ParentContainer field populated (for example, if an exported document's container file is part of the export and already has a Doc ID).
  • Remove Attached Archives — Removes any successfully parsed archives that are family members (that is, part of the MAG or DAG). This option applies to File Archive types, Disk Image types, Message Archives, and Compressed types that have been successfully parsed and have a docclass of Message_Attachment, Message_OLE_Attachment, or EDoc_OLE_Attachment. (Only archives that have been successfully parsed can be removed.) This option does not apply if the Family scope is Selected Documents Only.

Duplicates Processing

Export supports Duplicate processing and Near-Duplicate processing (see Analytic Processing). When creating a new Export Stream, you can select one of the following methods of handling duplicates:

  • No Duplicate Removal — For the initial export, you can optionally keep all exact duplicates of a document flagged for export. Exact duplicates are assigned unique IDs and can be referenced as a group in the appropriate load file if they are included.
  • Remove Duplicates from Export — Removes the duplicates from export but maintains records for the duplicates in the appropriate load file. Only one duplicate within a set of duplicates is physically placed in the appropriate export file location. You can use the Separate Duplicates option, which places the non-duplicate entries in one load file and the duplicates without produced documents in another load file.
  • Remove Duplicates from Export and Load File — Excludes duplicates from both the export process and the export load file. This means that duplicates are neither exported nor tracked.
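
The difference between the three methods can be illustrated with a small sketch. It is not the product's implementation: the mode names, the (doc_id, content_hash) input shape, and the choice of the first document seen as the one that is kept are all assumptions made for the example.

```python
MODES = {"no_removal", "remove_from_export", "remove_from_export_and_load_file"}


def plan_export(docs, mode):
    """Return (produced_ids, load_file_ids) for a list of documents.

    docs is a list of (doc_id, content_hash) tuples; documents with the
    same hash are treated as exact duplicates (hypothetical model).
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    seen_hashes = set()
    produced, load_file = [], []
    for doc_id, content_hash in docs:
        is_duplicate = content_hash in seen_hashes
        seen_hashes.add(content_hash)
        if mode == "no_removal" or not is_duplicate:
            # Kept documents are both produced and recorded.
            produced.append(doc_id)
            load_file.append(doc_id)
        elif mode == "remove_from_export":
            # Duplicate is not produced, but its record is still tracked.
            load_file.append(doc_id)
        # remove_from_export_and_load_file: duplicate is dropped entirely.
    return produced, load_file


docs = [("D1", "h1"), ("D2", "h1"), ("D3", "h2")]
for mode in sorted(MODES):
    print(mode, plan_export(docs, mode))
```

In this sample, D2 duplicates D1, so it is produced only under no_removal and appears in the load file under every mode except remove_from_export_and_load_file.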

The handling of email duplicates at export is determined by the Email deduplication settings defined under Project Settings > Analytic Settings. For example, a set of duplicates is per Custodian if your Project deduplication settings are Custodial.

Analytic Processing

For a new Export Stream or a subsequent volume Export of an Export Stream, you can select the following options.

Group Near Duplicates — Select this option to enable Near-Duplicate processing, under which the scope of the processing is restricted to the documents meeting the Documents to Export criteria. Near-Duplicate processing includes the calculation of pivot documents and the identification of the compliant Near-Duplicate documents. When exporting to an integrated review platform only this option is available, with the other Analytic Processing settings locked.

If a subsequent Export of an Export Stream enables Near-Duplicate processing, any newly added or newly Tagged documents that meet the criteria are evaluated. If you select the Group Near Duplicates option, you must supply values in the appropriate range for the threshold and the minimum terms:

  • Threshold (required) — Specifies the similarity threshold (80 by default) used for Near-Duplicate processing. You can specify any value in the range 0-99, where 0 specifies detection of a nonzero amount of similarity or commonality. To require a higher degree of similarity or commonality, select a higher value. In general, the lower the threshold, the more results you will see, since you are requiring less similarity or commonality.
  • Minimum Terms <value> (required) — Specifies the minimum number of terms for Near-Duplicate Processing, 25 by default. The permitted range is 0 to 9999.
  • Process Attachments (optional) — Specifies whether email or OLE attachments are processed as part of Near-Duplicate processing. By default, attachments are not processed independently for Near-Duplicate Processing. This option is available when Separate Email Attachments and/or Separate OLE Attachments are selected.

Bear in mind that near-duplicate processing during Export differs from the near-duplicate processing performed as part of searches for near-duplicate documents, and is characterized by the following:

  • Handling of Numerics based on Export Settings — By default, the Project Export Settings include Numeric values for Export near-duplicate processing. If you want, you can have Numeric values ignored for Export near-duplicate processing by changing the Export Settings, but you must make this change before you perform any Export near-duplicate processing.
  • Removal of Tokens — For documents processed prior to Release 4.3.11.0, Digital Reef always removes Tokens for Export near-duplicate processing. For regular document processing during Import prior to Release 4.3.11.0, Tokens are used to identify the type of content in a document, errors, and supported Patterns (regular expressions).
  • Inclusion of Stop Words — Stop Words are always included in Export near-duplicate processing, that is, the Stop Words list is not used. (Stop Words are always included for Indexed operations such as Term Searches and are ignored by default for Clustering and similarity comparisons, including a search for near duplicates of a document.)
  • Export near-duplicate processing observes a minimum term length setting of 1 character and a maximum term length setting of 64 characters.
  • Use of Shingling for 2 Adjacent Terms — Export near-duplicate processing employs shingling for each set of 2 adjacent terms (that is, it evaluates and partially overlaps adjacent terms, two terms at a time). This imposes an order on the terms (and means that the two terms are not considered in the reverse order). For example, with two-term shingling, an occurrence of The quick brown fox has The quick in one set, quick brown in a second set, and brown fox in a third set.
  • Near Duplicates are calculated at the end of the Export Prepare stage (after duplicates have been handled for the Export, and the Export contents have been determined).
  • A near-duplicate Work Basket task shows the potentially long-running Near Dupe task, which you can cancel if necessary. If all of your Export data is not backed by an Analytic Index, this Work Basket task displays a failure.
  • If you export many documents, be aware that the Near-Duplicate processing can be a time-consuming process.
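
Two-term shingling, as described in the list above, can be sketched as follows. This only illustrates the technique; splitting on whitespace is an assumption, since the tokenization Digital Reef actually uses is not described here, and the function name is hypothetical.

```python
def two_term_shingles(text):
    """Return the ordered pairs of adjacent terms in text."""
    terms = text.split()  # whitespace tokenization is an assumption
    # Each shingle overlaps the next one by a single term.
    return [(terms[i], terms[i + 1]) for i in range(len(terms) - 1)]


shingles = two_term_shingles("The quick brown fox")
print(shingles)
# Order matters: the reversed pair is not one of the shingles.
print(("quick", "The") in shingles)
```

For the phrase The quick brown fox, the result is the three pairs named above: The quick, quick brown, and brown fox.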

ThreadGroup includes Attachments — When selected, this option enables the ThreadGroup fields (ThreadGroupID, ThreadGroupIndent, and ThreadGroupSort) to be populated for attachments that are part of the Thread Group (and exported separately using the Separate Email Attachments and Separate OLE Attachments options). In the hierarchy reported in the ThreadGroupSort field, the attachments are with their associated parent message in the appropriate position (for example, DOC0000000011.1!A00000001). By default, the ThreadGroup fields are not populated for any attachments that are part of the Thread Group.

Output Options

  • Export formats - The following output formats (or a combination thereof) can be selected for any Export of an Export Stream, except that the SQL DB option can be chosen only when creating an Export Stream and not for subsequent Volume Exports. You can also clear all formats and export files only, which may be suitable in some cases.

    • DAT — Exports tagged and associated files, or all files, from a view with support for the LexisNexis Concordance® format for eDiscovery to a .DAT file. A Concordance DAT file provides all Digital Reef metadata fields subject to export; the metadata list has more information. When exporting to an integrated review platform, this option is automatically selected, with the other Output Options locked.
    • LST — Exports tagged files or all files from a view to a Relativity LST file. The LST file includes a small number of pertinent metadata fields such as DocID and TextLink.
    • DII — Exports tagged files or all files from a view to a CT Summation Document Image Information (DII) file. This file contains the Digital Reef fields mapped to standard DII tokens (for example, bcc becomes @BCC). User-selected fields and fields that do not have standard mappings are identified by a custom token, such as @C.
    • CSV — Exports tagged files or all files from a view to a comma separated value (CSV) file serving as a manifest of files. The CSV file includes all Digital Reef metadata fields subject to export. Select this option to generate search reports.
    • EDRM XML — Exports tagged files, or all files, from a view to an XML file supporting the Electronic Discovery Reference Model (EDRM) and containing EDRM metadata as well as all Digital Reef metadata subject to export. The Electronic Discovery Reference Model defines a standard for eDiscovery products and services so that data can be easily exchanged between organizations and eDiscovery products; the supported version is currently 1.1. This enables an EDRM-compliant, third-party application to import the exported files for further analysis.
    • SQL DB — For the initial Export of an Export Stream only, you can choose to supplement the files exported by also exporting load file information to an active MS SQL database. This database must have been created using the Organization Settings > Export Databases option and must be specified in the Export Settings template you have selected for the Export Stream; if no database is specified, the DB Table Name setting is unavailable. The Project Export Fields template specified for the Export Stream determines which Metadata fields are populated in the database, but renamed fields and field reordering are not recognized. For more information about the configuration steps required to export to an MS SQL database, see How to Export to a Database.

      Note: If a database connection fails during Export, the entire Export fails and the Export remains in a Staged state; when the error has been resolved, you can click the Run Export button to perform the Export.

      • DB Table Name - When you have selected the SQL DB format and an active database is specified in the selected Export Settings template, as described above, this setting specifies the name of the database table in which the exported data is to be stored. You can use the provided default table name, in the format DR_projectname_streamname, or you can specify your own. The table name can be a maximum of 100 characters and cannot contain spaces or a leading digit; characters other than underscore, a-z, A-Z, and 0-9 are converted to underscores automatically. If a table with the specified name does not exist, it is created during the export. For subsequent Volume Exports, the last table name used for the Export Stream is filled in, but you can specify a different one. Note that two additional tables are also generated when the Export Stream is created to provide information about the Export Settings and the status of each exported Volume; for more information about the tables in the database schema, see How to Export to a Database.
  • Include Full Text — Includes all text from the text files subject to Export in the load file. This option is available for any Export of an Export Stream and generally requires selection of the Extracted Text option, described below. When the nested option Exclude Email Headers is not selected, this option ensures that full email header information is included in the load file. This option applies to the following outputs:
    • DAT and EDRM load file formats - The exported DAT load file includes a field called inlinetext1 to hold the included text, while the EDRM XML load file populates the EDRM XML element <InlineContent>. The included text can be up to 12 MB of data per document; if this limit is exceeded, the text is not included, but the load file includes a reference to the extracted file on disk.
    • Export to an MS SQL database - The extracted_text field holds the included text and can be up to 2 GB per document; if this limit is exceeded, the field is empty and the text_link field provides a reference to the extracted text file on disk.
  • The Exclude Email Headers option (available when Include Full Text is selected) excludes email header information from the included text of the text files subject to Export, including only the email body in the load file.
  • Separate Duplicates — This option is available for any Export (or Load File Generation) and places the entries for non-duplicates and duplicates into separate load files (for example, VOL0001.csv and VOL0001-duplicates.csv). If you use the default setting of Remove Duplicates from Export, having a separate duplicates load file segregates records with no TextLink and NativeLink information into a separate load file (for example, for manual loading to Relativity). This option does not apply if you remove duplicates from both the export and the load file.
  • Duplicate Overlays — This option is available for any Export (or Load File Generation) and triggers the generation of overlay manifest files containing any updated records for previous volumes due to processing of the current volume (for example, due to new or changed DuplicateCustodian metadata). DAT is the default output format, but you can select another format, such as CSV. For example, if VOL1 contains an original document, and duplicates of that file appear in VOL3 and VOL5, you would see entries in the CSV files VOL0001.csv, VOL0003-overlay.csv, and VOL0005-overlay.csv. When you select this option, the export also includes two load files in your selected format named PreviousMasterDupes and CurrentMasterDupes. These files respectively list the ExportedVolName and DocID field values of master duplicates from volumes prior to the current one and the master duplicates new to the current export volume. For the first volume in an export stream, the PreviousMasterDupes file is empty. If a load file is generated for the whole stream, all master duplicates are listed in the PreviousMasterDupes file. If you also select the Separate Duplicates option for an export or load file generation, the export includes an overlay file for the duplicates CSV (for example, VOL0003-duplicates-overlay.csv). Overlay manifests are also generated automatically as part of the Generate Search Reports option.
    • Include All Master Duplicates (enabled when you select Duplicate Overlays for Export or Load File Generation): When you opt to generate overlay manifest files, the default behavior is to limit the master duplicate records from prior volumes in the stream and include only those with updates to metadata values based on corresponding duplicates added to the most recent volume. Existing master duplicate records without corresponding duplicates added to the most recent volume are therefore excluded from the overlay manifest files by default. If you want the overlay files to include all master duplicate records instead, select the Include All Master Duplicates checkbox.
    • Export Fields for Overlays (enabled when you select Duplicate Overlays for Export or Load File Generation): When you opt to generate overlay manifest files, you can use the associated drop-down menu to select an available Export Fields template to use for the overlay manifest files. This enables you to configure and then use a custom Export Fields template specifically for the overlay file, perhaps one with a smaller subset of fields. If you do not specify a custom Export Fields template for the overlay file, then your designated Export Fields System Created Template for the Project is used.
  • BegAttach starts with – For any Export of an Export Stream, you can specify this option with one of the following for the starting attachment (or embedded document) value:
    • Parent Email — Uses the parent email or document ID to represent the beginning attachment range (BegAttach value) for an entire family, which may include email attachments or embedded documents (members of a MAG or DAG). For example, if doc1.doc with an ID of 00001 has three embedded documents (embed1.doc with ID 00002, embed2.doc with ID 00003, and embed3.doc with ID 00004), the BegAttach value contains parent ID 00001 for all members of the family. When exporting to an integrated review platform, Parent Email is automatically selected and cannot be changed.
    • First Attachment — Uses the first email attachment ID or first embedded document ID to represent the beginning attachment range (BegAttach value) for an entire family (members of a MAG or DAG). For example, if doc1.doc with an ID of 00001 has three embedded documents (embed1.doc with ID 00002, embed2.doc with ID 00003, and embed3.doc with ID 00004), the BegAttach value contains first embedded document 00002 for all members of the family.
  • Max Records Per File — For any Export of an Export Stream, you can set this option to a non-zero value of up to 12 digits to generate load file batches (chunks) based on a maximum number of records per batch. This eases the loading process in a downstream review tool, and helps reviewers get started with loaded batches while others are loading. Keep in mind that the value you specify sets the upper boundary for each batch; the number of records in a given batch may be less, for example to prevent a family from being split across batches. However, any family that is larger than the Max Records Per File value will be split. This option is not available for the LST and DII output formats.
  • Export Volume Encryption — For any Export of an Export Stream, you can, with the appropriate configuration in place, enable Export Volume Encryption for the Export, which then enables you to select an available Export Location. Export Volume Encryption is based on a supplied Key. You supply this key for the Project using the Volume Encryption Key option in the Project Export Settings.
    • Export Encrypted Files To: <Drop-down to select an Export Location> If you select Export Volume Encryption, this associated drop-down becomes active, and you must specify an Export location, typically a different Export Location (Export Data Area) for the encryption (versus the regular Export Location). Select an available Export Location from the drop-down list (in alphabetical order).
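
The DB Table Name rules described above under the SQL DB format can be sketched as a small helper. This is an assumption-laden illustration, not the product's code: the function names are hypothetical, spaces are treated like any other disallowed character and converted to underscores, and a leading digit or an over-length name is rejected with an error.

```python
import re

MAX_TABLE_NAME_LENGTH = 100


def normalize_table_name(name):
    """Apply the documented character rules to a candidate table name."""
    # Characters other than underscore, a-z, A-Z, and 0-9 become underscores.
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    if not cleaned:
        raise ValueError("table name is empty")
    if cleaned[0].isdigit():
        raise ValueError("table name cannot start with a digit")
    if len(cleaned) > MAX_TABLE_NAME_LENGTH:
        raise ValueError("table name exceeds 100 characters")
    return cleaned


def default_table_name(project_name, stream_name):
    """Build a name in the documented DR_projectname_streamname format."""
    return normalize_table_name(f"DR_{project_name}_{stream_name}")


print(default_table_name("Acme Case", "Stream-1"))
```

Here the space and the hyphen in the sample project and stream names are both replaced by underscores.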

If you want to use Export Volume Encryption, contact Digital Reef Support, since the feature requires additional configuration and software.
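
The batching behavior of the Max Records Per File option, described under Output Options, can be sketched as follows. The function name and the list-of-families input are hypothetical, and the sketch assumes that an oversized family simply continues filling the current batch before spilling into the next one; the product may split such families differently.

```python
def batch_records(families, max_records):
    """Group records into batches of at most max_records each.

    families is a list of lists of record IDs in load-file order.
    A family is kept in one batch unless it is larger than max_records.
    """
    if max_records <= 0:
        raise ValueError("max_records must be a non-zero positive value")
    batches, current = [], []
    for family in families:
        if len(family) > max_records:
            # Oversized family: it cannot fit in any batch, so it is split.
            for record in family:
                if len(current) == max_records:
                    batches.append(current)
                    current = []
                current.append(record)
            continue
        if len(current) + len(family) > max_records:
            # Close the batch early rather than split this family.
            batches.append(current)
            current = []
        current.extend(family)
    if current:
        batches.append(current)
    return batches


print(batch_records([["A1", "A2"], ["B1", "B2"], ["C1"]], 3))
```

With a limit of 3, the first batch closes after two records so that the B family is not split, which is why a batch can hold fewer records than the limit.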

Formatting Settings

These options include the following:

  • Time Zone — For any Export of an Export Stream, you can select a time zone and adjust the exported date and time metadata accordingly. Coordinated Universal Time (UTC) is the default, except when exporting to an integrated review platform, in which case the default is Eastern Standard Time (EST). You can select from the displayed subset of the most common time zones, select Other ... from the drop-down to display an expanded, filterable list of other available time zones, or enter your own time zone using the standard time zone name (for example, America/New_York). The selected time zone affects document conversion to PDF, HTML, or TXT, and the export load file contents.

    Note: Some PDFs, and potentially other document types, contain a time zone offset; this is automatically taken into account when adjusting the exported date and time metadata to your selected time zone.

  • Date Format — For any Export of an Export Stream, you can select a date format. The default complete date and time format is MM/dd/yyyy HH:mm:ss. The Date Format drop-down box enables you to select a format for the date, or type in your own date format using the guidelines for custom date formats:
    • MM/dd/yyyy
    • MM/dd/yy
    • yy/MM/dd
    • yyyy-MM-dd
    • dd-MMM-yy
  • Delimiter – If you type in your own format (for any Export of an Export Stream), you must select a separator (space by default) to separate the date information and the time information. Other common delimiters are a semicolon or a hyphen. You are not limited to a single character; you can supply a text string.
  • Time Format – For any Export of an Export Stream, you can use a drop-down box to select a format for the time, or type in your own time format using the guidelines for custom time formats:
    • HH:mm:ss
    • HH:m:s
    • hh:mm:ss a
    • h:mm:ss a

    Guidelines for Specifying Custom Date/Time Formats:
    When supplying your own Date/Time format, see Formatting Characters for Custom Date/Time Formats, or consult sites such as https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/text/SimpleDateFormat.html to learn about the accepted formatting characters used to create date and time patterns.

    If you type any text other than formatting characters in the Date or Time format boxes and you want that text to be preserved, you must place the text in single quotes; to produce a literal apostrophe, type two single quotes in a row. For example, type hh 'o''clock' a in the Time Format box to produce a Time that preserves o'clock. This format (where a is the formatting character for AM or PM) may yield a Time such as 12 o'clock PM.

    An example of a Date and Time format with the word at as the Separator is yyyy.MM.dd G at h:mm a. In this example, you would type yyyy.MM.dd G in the Date Format box (where G designates an era), the word at in the Separator box (no single quotes are needed for text typed there), and h:mm a in the Time Format box. This format may yield a date and time such as 1996.07.10 AD at 12:08 PM.

    The specified format affects the load file format of the date-only export metadata fields (for example, DateCreated), time-only fields (for example, TimeCreated), and the fields that represent combined dates and times for load files other than EDRM XML. EDRM XML has its own format and does not observe your date/time format.

  • Unit of measure — You can select the desired unit of measure for any Export of an Export Stream: Bytes (the default), KB, MB, or GB.
  • DAT Encoding — For DAT load files only, you can set one of the following options for any Export of an Export Stream:
    • ASCII/UTF-8 — Produces an ASCII-delimited file with UTF-8 encoded values. UTF-8 and ASCII are identical for ASCII values only; for any non-ASCII value (for example, in file names, metadata values, or content), multiple bytes encoded according to the UTF-8 encoding rules will be used to represent the character. In this case, the DAT file would contain multibyte characters. Note that if you use this encoding type and want to import the DAT file back into the system, your Load File Import Settings must use the encoding type MIXEDMODE, which accommodates the ASCII/UTF-8 mix.
    • Unicode — Produces the DAT file using UTF-16 LE encoded values. Note that if you use this encoding type and want to import the DAT file back into the system, your Load File Import Settings must use the encoding type UTF16LE. When exporting to an integrated review platform, this option is automatically selected and cannot be changed.
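
The difference between the two DAT Encoding options can be pictured with a short Python sketch. It only compares how many bytes a value occupies under each encoding; the codec names are Python's own and are not the product's Load File Import Settings values, and the file names are made-up examples.

```python
# Illustration only: compares the byte sizes produced by the two encodings.
# The codec names below are Python's, not the product's import setting names.

def encoded_sizes(value: str) -> dict[str, int]:
    """Return the number of bytes the value occupies in each encoding."""
    return {
        "utf-8": len(value.encode("utf-8")),
        "utf-16-le": len(value.encode("utf-16-le")),
    }

ascii_name = "report.doc"               # hypothetical ASCII-only file name
accented_name = "r\u00e9sum\u00e9.doc"  # hypothetical name with two non-ASCII characters

# ASCII-only values are byte-for-byte identical in ASCII and UTF-8.
assert ascii_name.encode("utf-8") == ascii_name.encode("ascii")

print(encoded_sizes(ascii_name))     # {'utf-8': 10, 'utf-16-le': 20}
print(encoded_sizes(accented_name))  # {'utf-8': 12, 'utf-16-le': 20}
```

Each accented character takes two bytes in UTF-8, which is why a DAT file written with the ASCII/UTF-8 option can contain multibyte characters, while UTF-16 LE uses two bytes for every character in these examples.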

Production Settings

You can manage most of the configurable Production Settings for any Export of an Export Stream. This section starts with the different elements that make up the full path of the Export Volume, as reported in the appropriate export metadata fields (NativeLink, TextLink, and/or PDFLink) in a load file. You can configure many portions of the reported path.

The Volume Label and Document ID Prefix are initially derived from the selected Export Settings template.

The following Production Settings are not available when exporting to an integrated review platform:

  • Base Path — Enables you to specify the base path that you want reported in the load file fields NativeLink, TextLink, and/or PDFLink, which are populated when you include the production of native, text, and/or PDF versions. You can use the default base path (DR), specify your own base path, or omit the base path completely (for example, if you plan on importing a DAT file back into the system and do not want to have to trim the base path in the Load File Import Settings). Note that what you put in the base path determines what appears in the NativeLink, TextLink, and/or PDFLink export fields in the load file (if you include native, text, and/or PDF versions). This base path does not affect what appears at the physical export location after export, just the reporting of the path in the appropriate load file fields. The Base Path can include up to 50 alphanumeric characters, with the hyphen, period, and underscore also allowed, but the following are not allowed: ! " ' # $ % & * + / : ; < = > ? @ [ \ ] ^ { | } ~ “ ”
  • Include Volume Label and # in Path Fields — Select to include the Volume label and Volume # (see below) in the appropriate Export metadata fields (NativeLink, TextLink, and/or PDFLink) of the load file. Output Directory (optional) may appear after the Volume # based on whether one was specified for Native, Text and/or PDF in the Production Options section.
  • Folder # — Displays the current 5-digit folder number that will be part of the path. You cannot configure this value.
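
The Base Path restrictions described above (at most 50 characters, and none of the listed special characters) can be checked before you type a value into the dialog. The following Python sketch is an illustration only; the function name and the wording of the messages are hypothetical and are not part of the product.

```python
# Characters the text lists as not allowed in a Base Path,
# including the two curly quotation marks.
DISALLOWED = set('!"\'#$%&*+/:;<=>?@[\\]^{|}~\u201c\u201d')
MAX_LENGTH = 50


def base_path_problems(base_path: str) -> list[str]:
    """Return a list of problems found in a proposed Base Path (hypothetical helper).

    An empty list means the value passes both documented checks.
    An empty string is accepted, since the base path may be omitted.
    """
    problems = []
    if len(base_path) > MAX_LENGTH:
        problems.append(f"longer than {MAX_LENGTH} characters")
    bad = sorted({ch for ch in base_path if ch in DISALLOWED})
    if bad:
        problems.append("contains disallowed characters: " + " ".join(bad))
    return problems


print(base_path_problems("DR"))         # []
print(base_path_problems("case/2024"))  # ['contains disallowed characters: /']
```

The sketch uses the disallowed-character list rather than an allow-list, because the text does not say how characters outside both lists (for example, a space) are treated.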

The following Production Settings are available only when exporting to an integrated review platform.

  • Document Folder – Lets you select an existing folder within the Project's dedicated Export Data Area as the Export's destination, or create a new destination folder under an existing folder. Not available if you select Export Name.

  • Export Name — Enables name-based foldering, in which Reef Review creates a top-level folder named for the Export Stream and a destination subfolder named for the Volume Label and Volume #, for example ProjA-Export2\VOL0001.

  • Folder by — Distributes exported documents into subfolders of the selected destination folder based on the selected field, for example by Custodian.

The following settings are common to all Projects regardless of review platform integration:

  • Volume Label – Specifies a starting production Volume Label (prefix), which is VOL unless changed in the Export Settings. The Volume Label can include up to 50 alphanumeric characters, with some special characters also allowed, but the following are not allowed: ! " ' # $ % & * + / : ; < = > ? @ [ \ ] ^ { | } ~ “ ”

  • Volume # — Displays the current volume number that will be appended to the Volume Label (for example, VOL0001). You cannot configure this value.
  • ID Prefix (required Document prefix) – Enables you to enter the prefix you want to use for a Document ID, or select an existing prefix from the drop-down; as you enter characters in the text box, the drop-down is automatically filtered so you see only the existing prefixes containing the character sequence you have entered. The default ID Prefix for the first Volume Export in an Export Stream is DOC (or the prefix configured in the Project Export Settings). For subsequent Volume Exports in a given Export Stream, the prefix shown in the dialog is the last ID Prefix used (as reflected in the Stream Export Settings). Your prefix selection then determines the Starting ID, which will be set to the appropriate value based on the existing prefix selected, or reset to 1 (for example, to 0000000001) for a new prefix for the Stream. Note that the Volume Settings will reflect your ID Prefix and Starting ID selections, with the Starting ID reflecting the next-available ID. In addition, the Export Stream Documents tab will reflect the ID Prefix and Starting ID used by each Volume in the Stream. ID prefixes other than the default prefix are specific to a given Export Stream. For restrictions on the contents and length of an ID Prefix, see Volume Label above.
  • Starting ID — Enables you to specify a starting production Document ID for a given export. The default starting ID in a new Export Stream is a 10-digit starting ID, 0000000001. A separator follows the Doc ID, followed by the Page ID (if document-level numbering is used) and then the document extension. You can specify a value greater than (or equal to) the shown starting ID, but not a smaller value than that shown. As you add Export Volumes within a Stream for a given prefix, the starting ID value (and value shown on the Settings tab) will reflect the next-available ID.
  • Separator drop-down (applies only to document-level numbering, not page-level numbering) – Enables you to use the default separator (_, underscore) or select a period or a hyphen as the separator between the Starting Doc ID and the Page ID.
  • Page ID (applies only to document-level numbering, not page-level numbering) — Enables you to specify a starting production Page ID for a given export that includes PDFs. The default is a 4-digit Page ID, 0001. The document extension is displayed at the end of the ID (for example, .pdf).
  • Page-Level Numbering – When enabled, this option uses incremental numbering to assign each page of a document its own Doc ID instead of using document-level numbering, in which the same Doc ID supports suffixes for the different Page IDs. (You cannot change this option after the initial export of an Export Stream.) If you select this option, the Page ID part of the path no longer applies. In addition to the Doc ID, page-level numbering affects a number of Export metadata fields, as follows:
    • Fields that report starting values — AttachmentID, BegAttach, BegDoc, NativeLink, NearDupePivotDocID, OLEChildID, OLEParentID, ParentContainer, ParentID, PDFLink, TextLink, ThreadGroupID, ThreadGroupSort, ThreadID, ThreadIDOrphanRef, and ThreadIDParentRef.
    • Fields that report ending values — EndAttach and EndDoc.
    • Fields that report document ranges — AttachmentRange and DocumentRange.
    • PageCount field — If included in the list of Export Fields for your selected Export Fields template, this field reports the number of pages produced for each document subject to export using page-level numbering.

    An export that uses page-level numbering fails if it encounters missing images (for example, because of a conversion failure). In this case, the Work Basket task shows the error message "Documents that could not be numbered at a page-level were encountered in the export. Please download the errors file for this task for more information." You can then click Download for the Work Basket task to download the errors file, WARNING_DETAILS_REPORT.csv. This file provides a list of document handles and the associated error for each (for example, CONNECTOR_FAILURE, CONNECTOR_READ_ERROR, CONVERSION_FAILURE, NATIVEFILE_NOT_FOUND, or UNKNOWN). An export using page-level numbering also fails if the numbering of a staged volume no longer reflects the page count of the volume once it is actually being exported. In this case, the Work Basket task shows the error message "The image page count of this volume has been altered since staging. Please recreate the volume in order to update the page-level numbering."

  • Doc ID Pad Size — Enables you to specify a pad size for the Document ID. The default is 10, and you can specify a different value of 1-9. You cannot change the value of this option after the initial export of an Export Stream.
  • Page ID Pad Size – This option, which is available for document-level numbering only, enables you to specify a pad size for the Page ID. The default is 4, and you can specify a different value of 1-4. You cannot change the value of this option after the initial export of an Export Stream.
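
The way the ID Prefix, Starting ID, Separator, Page ID, and pad sizes combine can be sketched in Python. This is a minimal illustration under stated assumptions, not the product's implementation: the function names are hypothetical, and the page-level example assumes that each document's BegDoc and EndDoc are simply the first and last IDs consumed by its pages.

```python
def doc_level_id(prefix: str, doc_number: int, page_number: int,
                 doc_pad: int = 10, page_pad: int = 4,
                 separator: str = "_", extension: str = ".pdf") -> str:
    """Document-level numbering: Doc ID, separator, Page ID, then the extension."""
    return (f"{prefix}{doc_number:0{doc_pad}d}"
            f"{separator}{page_number:0{page_pad}d}{extension}")


def page_level_ranges(prefix: str, starting_id: int, page_counts: list[int],
                      doc_pad: int = 10) -> list[tuple[str, str]]:
    """Page-level numbering: every page consumes one Doc ID.

    Returns an assumed (BegDoc, EndDoc) pair for each document, in order.
    """
    ranges = []
    next_id = starting_id
    for pages in page_counts:
        if pages < 1:
            # Assumption: a document with no pages cannot be numbered,
            # loosely mirroring the missing-image failure described above.
            raise ValueError("document has no pages to number")
        beg = f"{prefix}{next_id:0{doc_pad}d}"
        end = f"{prefix}{next_id + pages - 1:0{doc_pad}d}"
        ranges.append((beg, end))
        next_id += pages
    return ranges


print(doc_level_id("DOC", 1, 1))
# DOC0000000001_0001.pdf
print(page_level_ranges("DOC", 1, [3, 2]))
# [('DOC0000000001', 'DOC0000000003'), ('DOC0000000004', 'DOC0000000005')]
```

With page-level numbering, a three-page document followed by a two-page document consumes five consecutive IDs, so the next Export Volume for the same prefix would start at 6 in this sketch.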

Native, Extracted Text, and PDF Production Options

For any Export of an Export Stream, you can specify these additional Production options to export Native, Text, and/or PDF versions of the files marked for export.

When exporting to an integrated review platform, you must select both the Native and Extracted Text options, and you cannot specify a target Output Directory (files are always exported to the volume directory).

  • Native – Select to include Native versions of the files.
    • Output Directory – Enables you to specify a target directory for the native files that are exported. If you do not specify a directory, the files will be exported to the volume directory. Output Directory names are truncated if longer than 50 characters, and cannot contain the following characters: ! " # $ % & * + . / : ; < = > ? @ [ \ ] ^ { | } ~ “ ”

    The Extension Conversion setting in the Project at the time of Export determines whether extension conversion occurs for that Export. When the Extension Conversion setting is On, an Export produces native files with the origdocext file extension, which is based on the intended file type instead of the file extension seen on disk. If you change the Extension Conversion setting to Off in your Project Export Settings, the next Export produces the native files based on the docext field, which is the extension seen on disk. Your Extension Conversion setting determines how the NativeLink field is populated in the manifest at Export. It is important to note that Digital Reef will not use a document extension during native file production if that extension contains any of the following characters: \ / : * ? " < > | or ASCII characters 0 through 31. In this case, the produced native file will not have an extension.

    • Email Format (requires the Separate Email Attachments option):
      • Native — Exports the associated parent email body using native EML or MSG format.

      • HTML — As long as the Export includes Native files (using the Native production option), this option converts all successfully parsed files that are eligible to HTML. Eligible files include those with a filetype of email, a file type of vCalendar (Lotus Notes Calendar items), or an auxfiletype of msg (for example, items from MSGs, such as those with an msgclass of calendar, journal, todo, or contact). Files with a filetype of vCard (Lotus Notes contacts) are not eligible for conversion. Embedded images are not included in the Export.
      • MHTML — As long as the Export includes Native files (using the Native production option), this option converts successfully parsed files that are eligible to MHTML. Eligible files include those with a filetype of email, a file type of vCalendar (Lotus Notes Calendar items), or an auxfiletype of msg (for example, items from MSGs, such as those with an msgclass of calendar, journal, todo, or contact). Files with a filetype of vCard (Lotus Notes contacts) are not eligible for conversion. With MHTML conversion, embedded images are included in the exported mht files. (Embedded images are not subject to extraction.)
      • HTML/MHTML – As long as the Export includes Native files (using the Native production option), this option converts successfully parsed files that are eligible as follows: emails with embedded images (embeddedchildren::image) are converted to MHTML, and all other successfully parsed files that are eligible are converted to HTML. Eligible files include those with a filetype of email, a file type of vCalendar (Lotus Notes Calendar items), or an auxfiletype of msg (for example, items from MSGs, such as those with an msgclass of calendar, journal, todo, or contact). Files with a filetype of vCard (Lotus Notes contacts) are not eligible for conversion.
      • PDF (available as of 5.4.2.0) — As long as the Export includes Native files (using the Native production option), this option converts only successfully parsed emails that are eligible to PDF format. Note that adjusting the time zone for the Export will not apply to the generated PDFs, unlike when you choose HTML or MHTML as the Email Format.
    Upon export, the converted version of an email that has both sender and from field values shows them in the format "sender value on behalf of from value". Encrypted files with a parsing status of 00027 ENCRYPTED are not converted and are always exported in Native format. Files with a parsing status of 00015 NO_EMAIL_BODY are eligible for conversion, even if the only portion converted is the email header. Note that you can control the Separate Email Attachments option when either the Associated Family Docs or Associated Threads option is selected, but not when the Selected Documents Only option is selected.
  • Extracted Text – Select to include extracted text versions. By default, text files of the equivalent Native files are not exported, and text files produced by OCR processing are not exported. Select this option if you want to export text files of the equivalent Native files, as well as the text files produced by OCR processing. Note that if you use the Include Text option to include text in a DAT or EDRM load file, or in an MS SQL Database, you must also enable the Extracted Text option.
    • Output Directory – Enables you to specify a target directory for the extracted text files that are exported. If you do not specify a directory, the files will be exported to the volume directory. Output Directory names are truncated if longer than 50 characters, and cannot contain the following characters: ! " # $ % & * + . / : ; < = > ? @ [ \ ] ^ { | } ~ “ ”
    • Exclude Email Headers – By default, email header information (metadata) is included in the produced text versions. This includes metadata from emails, calendar items, tasks, and journal entries. Set this option if you want the produced text versions to exclude metadata from the email header and include only the email body.
  • PDF – Select to include PDF versions of native files in the Export. If you decide to perform this PDF conversion, Export will see if there are available images for the non-PDF native files (that is, images that were either imported through a Load File Import or part of an External Image Import). If not, Export will convert the non-PDF native files to PDF format. Copies of existing native PDF files are exported if you specify a separate directory for the PDFs. The native PDFs are used if you export native and PDF versions to the same directory or if the export is set up for PDF versions only. If both native and PDF formats are selected and go to the same directory, then a PDF with the naming convention <DocID>.orig.pdf also appears. By default, selecting PDF Conversion does not convert image files.
    • Output Directory – Enables you to specify a target directory for the PDF files that are exported. If you do not specify a directory, the files will be exported to the volume directory that contains the other exported files. Output Directory names are truncated if longer than 50 characters, and cannot contain the following characters: ! " # $ % & * + . / : ; < = > ? @ [ \ ] ^ { | } ~ “ ”
    • Highlight Search Terms – On a per-volume basis, when PDF is selected to export PDF versions, this option enables you to highlight search terms that match queries you supply in the Search Terms section, which displays when you select Highlight Search Terms and/or Generate Search Reports. (For details on using the Generate Search Reports option see Set Up an Export.) Note that your supplied search terms are subject to highlighting for each production of a volume (each time a PDF is generated for a document eligible for Export). This gives you the option of entering new search terms to highlight in the PDFs generated for a subsequent volume.
    • Generate Remaining Images – On a per-volume basis, enables you to export image files in PDF format. By default, the export process does not convert image files. The conversion process supports the following image formats:
      • Portable Network Graphics Format (png)
      • Tagged Image File Format (tiff)
      • Windows Bitmap (bmp)
      • Compuserve GIF (gif)
      • Progressive JPEG (jpg, jpeg)
      • JPEG 2000
      • JPEG 2000 jpf Extension
      • JPEG 2000 mj2 Extension
      • JPEG File Interchange
      • Paintbrush