Export Overview

A user with the appropriate permissions can set up an eDiscovery Export to export files and/or manifests based on selected export criteria for documents in Project Data.

This topic includes the following main topics:

Procedural Overview
Export Locations, Directory Structure, and Generated Files
About the Project Export Settings

Procedural Overview (eDiscovery Export)

To perform an eDiscovery Export of data in the Project:

From Exports in the tree, select Exports, then right-click (or use the ellipsis context menu) and select New Export Stream, or use the Summary to select the New Export Stream option. This launches the eDiscovery Export dialog.
From the Export dialog, select a template to use for the standard load files (the default System Created Template, a Project template you created for the Export Fields, or an Organization template), and then select a name for the Export. The first time you perform an export from Project Data, you must assign a name to identify the Export Stream (up to 32 characters). The stream name represents an export performed using a set of export criteria. A newly created Export Stream uses the selected set of criteria with a Volume Number of 1; thereafter, each subsequent staged preview or Export for this Export Stream uses an incremented Volume number and the same basic Export criteria. (For additional Exports of a Stream, you can change manifest options, Export Data Areas, Export mapping, or a Load File option to split the load file to separate duplicates from non-duplicates.) You can view and manage the Volumes associated with an Export Stream.

Note: If you need to change key export criteria to prevent exposure of documents previously included in an Export Stream (such as documents associated with a specific Tag), create a new, named Export Stream with the appropriate criteria. Since the software maintains the named Export Streams for a given view, you can view and compare them if necessary, or you can delete a previous Export Stream that is no longer applicable, which deletes all associated Volumes in the Stream. You can also delete the last Export Volume in the Export Stream (Open, Staged, or Exported). You cannot delete an Export Volume other than the last Volume in the Stream. See Export Volume Document View for more information about Volume options.

As part of setting up the Export Stream, you must select an available export location (a target Export Data Area associated with an available Organization Connector). When you set up an Export Stream to define Export criteria, you must have at least one export location available. If only one Export location is available, it is selected for you.
Select Export criteria, such as Tags and export format and/or manifest type. (See the eDiscovery Export dialog for details about the various Export options.)
When you are done selecting Export criteria for the Export Stream, select one of the following from the Export dialog:

Stage Export, which effectively performs the calculations for staging and enables you to examine the details of an export stream or individual Volume prior to file production:

For a new export stream, clicking Stage Export adds the named export stream under Exports in the Project Navigation Tree and adds the first Volume under the export stream. To see this new export stream, make sure that Exports is open in the Navigation Tree.

For an existing export stream, clicking Stage Export creates an additional Volume under the export stream in the tree. Since each click of Stage Export creates a Volume, you could have multiple Volumes that have been calculated for export, but not yet exported.

Note: You can selectively include or exclude a Volume from being part of subsequent Export Volumes in the Export Stream. See About Including/Excluding Volumes for more information. Excluding a Volume also affects the deduplication calculations in subsequent Export Volumes in the Export Stream. By default, an Export Volume is included in an Export Stream, and its documents are included in subsequent deduplication processing. To exclude a particular Volume from the Export Stream, and from future deduplication processing in the Export Stream, go to Exports and open an eDiscovery Export Stream in the Navigation Tree to view all of the Volumes, then clear the checkbox next to the Volume that you want to exclude. This prevents that Volume from being part of a subsequent Volume Export, or from being part of the deduplication processing for a subsequent Export. Documents in an excluded Volume can still be produced in a subsequent Volume if they still meet the Export criteria. When you change the state of a Volume, a Work Basket task appears to confirm the state change (for example, Changing state to Excluded for <stream_name> - VOL<#> volume).

Run Export, which performs the export using the export criteria:

For a new Export Stream, clicking Run Export creates the named Export Stream, creates the first Volume under Exports, and persists the Volume to disk.

For an existing export stream, clicking Run Export creates another Volume under the Export Stream, then stages and exports any Volume for this Export Stream that has not already been exported.

Note: You can view Export Stream Documents or Export Stream Settings, or view a Volume with the Export Volume Document View. Options for managing an export stream or volume depend on the status of the stream or volume (Open, Staged, or Exported).

Cancel, which cancels the export and closes the Run Export dialog.

Note: If you cancel an Export in progress, the state of the volume for the Export Stream after the cancellation depends on how far along the Export was before it was canceled. For example, if you start a large Export and cancel it right away, before the Export has crossed into the Staged state, the open volume is deleted upon cancellation. If the Export has already reached the Staged or Exported state, the volume remains in that state. For example, if you cancel an Export at approximately 50%, the Export will be Staged and the Stream and Volume will appear populated in the tree, but no documents will be produced at the Export location. You can then right-click the Export Stream in the tree and use Export All Staged to complete the Export and produce the appropriate files.

General Usage Notes

In general, an Export includes the following:

All files, or all files that meet a qualifying Tag query, with or without selected file load files/manifests. (Tagging can be applied in bulk, to individual documents, emails, or results, or to an entire email thread or Cluster, if applicable.) Even if more than one Tag operation tags the exact same file, that file is only exported once.
By default, the Associated Family Files option ensures that the Export evaluates any email, documents, or attachments that are part of the same Message Attachment Group (MAG) or Document Attachment Group (DAG) as a qualifying tagged file. You must set the Separate Email Attachments and Separate OLE Attachments options if you want to Export email attachments and OLE attachments as separate files.
Depending on the Export Options selected, associate files identified from the set of qualifying files. Associate files can be emails and/or attachments that are part of the same email thread as a qualifying tagged file. You can optionally include all exact duplicates of a qualifying tagged file, or analyze the contents of the Export in terms of near duplicates.
Selected load files/manifests only, if the option is selected to export load files only, without export of any files. Selecting a load file-only option for CSV without any other options yields optimal performance.

Export Locations, Directory Structure, and Generated Files

Any Export that produces files (for example, an eDiscovery Export or a Review Production) requires at least one available export location (Export Data Area), which is associated with a type of Connector such as NFS or CIFS. An Export Data Area is defined as part of the Organization Settings. An Organization Administrator of the provider or Organization data center ensures availability of the appropriate Export Data Areas that define discrete data locations (subdirectories or URLs) for an Organization. Access to an Export Data Area for file retrieval after export can be negotiated between administrators.

Note: Export relies on activated Data Areas: one or more Import Areas for the source data in your Project, and your selected Export Data Area that serves as the Export location. If an Import Area for the source data is unavailable at Export (for example, a directory with the source data has been renamed or removed), you will see the error: At least one Import data area used in this case is not currently available. If the Import Areas in your Case/Project are available, but your selected Export Data Area is no longer available (for example, if it has been removed), you will see an Export Data Area error upon Export: The selected export data area is not available.

For Export Data Areas following the eDiscovery format, the Exports are performed based on Project Data, a named export stream (generation), and Volume. (A named export stream or generation represents multiple exports with a given set of criteria.)

The directory structure for an Export is as follows:

<exportarea>/<casename>-<streamname>/<volume>

With each Export of an Export stream, the Volume ID is incremented. Upon assignment of a new Export Stream, the Volume ID is reset.

For example:

/export01/caseA-streamA/VOL0001

/export01/caseA-streamA/VOL0002

If the user establishes another Export Stream, potentially with different data to export, the new Stream restarts the Volume numbering, which then increments with each Export within that Stream:

/export01/caseA-streamB/VOL0001

/export01/caseA-streamB/VOL0002

/export01/caseA-streamB/VOL0003
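
The layout above can be sketched in Python as follows. This is a minimal illustration only, assuming the default VOL prefix and a four-digit, zero-padded Volume number as shown in the examples. The function name build_volume_path is hypothetical and is not part of the product.

```python
import posixpath


def build_volume_path(export_area: str, case_name: str, stream_name: str,
                      volume_number: int, volume_prefix: str = "VOL") -> str:
    """Return <exportarea>/<casename>-<streamname>/<volume> for one Volume."""
    if volume_number < 1:
        raise ValueError("Volume numbering starts at 1")
    # Assumption: the Volume ID is the prefix plus a four-digit number.
    volume_id = f"{volume_prefix}{volume_number:04d}"
    return posixpath.join(export_area, f"{case_name}-{stream_name}", volume_id)


# Each Stream keeps its own Volume counter, so streamB starts again at VOL0001.
for stream, count in (("streamA", 2), ("streamB", 3)):
    for n in range(1, count + 1):
        print(build_volume_path("/export01", "caseA", stream, n))
```

Running the loop prints the five example paths listed above, in the same order.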

Note: If you generate a new version of an existing load file (see Generate a New Load File for an Export Volume or Stream), the new version will appear in the appropriate Volume directory at the Export Area and will have a timestamp that enables you to track versions. The original version is preserved.

About the Processing Order for Export

Digital Reef software uses the following processing order for loose files upon Export:

If the Export has all loose files, then the order is alphabetical.
If the Export has loose files and flat structured directories, the order starts with the alphabetically first folder; the software descends into that folder, checks for subfolders, and then exports its files alphabetically. Root files are exported last in alphabetical order.
If the Export has loose files and directories with one under another (hierarchical structure), the order starts with the deepest directory first with files in alphabetical order, working backwards. Root files are exported last in alphabetical order.

Digital Reef software uses the following processing order for emails upon Export:

Alphabetical for folders as mentioned above (e.g., Inbox).
Then for individual emails, by date received (date sent for Sent folder).
Within the Message Attachment Group (MAG), the attachments are numbered based on their order.

For example, if ABC00000297 is a parent email with 3 attachments, each attachment is numbered based on its order. Therefore, the first attachment is ABC00000298, the second is ABC00000299, and the third is ABC00000300.
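
The attachment numbering can be illustrated with a short Python sketch. It assumes that a DocID is an alphabetic prefix followed by a zero-padded number, and that attachments take the numbers immediately after the parent. The helper name attachment_ids is hypothetical.

```python
import re


def attachment_ids(parent_id: str, attachment_count: int) -> list[str]:
    """Number attachments sequentially after their parent email."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", parent_id)
    if match is None:
        raise ValueError(f"Unexpected DocID format: {parent_id!r}")
    prefix, digits = match.groups()
    width = len(digits)  # keep the same zero-padded width as the parent
    start = int(digits)
    return [f"{prefix}{start + i:0{width}d}"
            for i in range(1, attachment_count + 1)]


print(attachment_ids("ABC00000297", 3))
```

For the parent ABC00000297 with three attachments, the sketch yields the three consecutive IDs that follow the parent's number.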

About Generated Load Files

The original version of a load file for a Volume identifies the Volume prefix and ID and the load file format (for example, if the first Volume has selected formats of CSV and DAT, you will see VOL0001.csv and VOL0001.dat).
If you use Generate Load File to generate a new version of an existing load file (without the stream-level Generate Consolidated Load File option), the new version will reside under the appropriate Volume directory at the Export Location and will include a timestamp in the filename that enables you to track versions (for example, VOL0001-20130809171945.csv). The original version is preserved.
If you use Generate Load File for a selected stream and select the Generate Consolidated Load File option, the Export Location will contain a consolidated load file in a Consolidated Load Files folder under the stream. The load file(s) generated will be named <stream>-<timestamp>.<extension>, with the timestamp in YYYYMMDDHHMMSS format (for example, export1-20170120134146.csv).
If you use Update Export to replay Volumes, you will also see Volume update load files, which include the word update in the filename (for example, VOL0001-update-20130809201314).
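
When many versions accumulate in a Volume directory, a small script can help sort them. The following Python sketch classifies load file names according to the naming patterns described above. The pattern and the names LoadFileName and parse_load_file_name are assumptions made for illustration; the sketch does not cover other generated files such as overlays or settings files.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class LoadFileName(NamedTuple):
    base: str                      # Volume ID, or stream name for a consolidated file
    kind: str                      # "original", "regenerated", or "update"
    timestamp: Optional[datetime]  # None for the original load file
    extension: str                 # "" when the name has no extension


_PATTERN = re.compile(
    r"^(?P<base>[^-.]+)"      # e.g. VOL0001
    r"(?P<update>-update)?"   # present for Update Export load files
    r"(?:-(?P<ts>\d{14}))?"   # YYYYMMDDHHMMSS
    r"(?:\.(?P<ext>\w+))?$"
)


def parse_load_file_name(name: str) -> LoadFileName:
    """Split a load file name into base, kind, timestamp, and extension."""
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"Unrecognized load file name: {name!r}")
    ts = datetime.strptime(m["ts"], "%Y%m%d%H%M%S") if m["ts"] else None
    if m["update"]:
        kind = "update"
    elif ts is not None:
        kind = "regenerated"
    else:
        kind = "original"
    return LoadFileName(m["base"], kind, ts, m["ext"] or "")


for example in ("VOL0001.csv", "VOL0001-20130809171945.csv",
                "VOL0001-update-20130809201314"):
    print(parse_load_file_name(example))
```

A consolidated load file such as export1-20170120134146.csv has the same shape as a regenerated Volume load file, so the sketch reports it as "regenerated" with the stream name as its base; the containing folder is what distinguishes the two.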

About Family Association across Multiple Export Volumes

Family association is maintained across multiple Export Volumes, which accommodates the situation in which new children of email or eDoc parents are available after Export of the parent. For example, you may Export an email containing a password-protected ZIP file and then reprocess that ZIP file after Export to extract its children and perform another Volume Export. Maintaining family association across Volumes also addresses the situation in which some children of a parent email or eDoc are not added to Project Data until after Export.

Maintaining family association across Volumes applies to new Export Streams, or to existing Export Streams with newly exported parents. It does not apply to previously exported parents, or to existing Export Comparison results (in which case, the newly added children are treated as loose documents). To have an Export Comparison recognize the new children, produce the children in a new Volume and then perform a new Export Comparison.

The DocIDs for new children introduced in subsequent volumes reflect the appropriate parent information and their position in the family based on a numerical suffix (for example, a ZIP that is the last member of a family with Doc ID DOC0000000004 might provide four new family members with Doc IDs such as DOC0000000004.0001, DOC0000000004.0002, DOC0000000004.0003, and DOC0000000004.0004). If page-level numbering is used, the suffixing of the Doc ID value would be based on the parent's EndDoc value (for example, if a parent document exported in a previous volume had a page range of DOC0000000001 - DOC0000000004, the parent document's subsequently exported attachment would have a starting value of DOC0000000004.00001).
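
The suffixing scheme can be sketched in Python as shown below. This is an illustration under stated assumptions: suffixes start at 1, are zero-padded, and are joined to the parent value with a period. The suffix width is left as a parameter because the examples in this section show different widths. The function name child_doc_ids is hypothetical.

```python
def child_doc_ids(parent_value: str, child_count: int,
                  suffix_width: int = 4) -> list[str]:
    """Build suffixed Doc IDs for children produced after their parent.

    parent_value is the parent's Doc ID, or the parent's EndDoc value
    when page-level numbering is used.
    """
    return [f"{parent_value}.{i:0{suffix_width}d}"
            for i in range(1, child_count + 1)]


print(child_doc_ids("DOC0000000004", 4))
```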

In a newly generated Load File, the DocID, AttachmentID, BegDoc, EndDoc, EndAttach, AttachRange, and DocumentRange fields will identify the children introduced in a subsequent volume.

About Including/Excluding Volumes

You can selectively include or exclude a Volume from being part of subsequent Export Volumes in the Export Stream. Excluding a Volume also affects the deduplication calculations in subsequent Export Volumes in the Export Stream.

By default, an Export Volume is included in an Export Stream, and its documents are included in subsequent deduplication processing. This is indicated for the Volume in the Navigation Tree by a checkbox shown in the enabled state.

To exclude a particular Volume from the Export Stream, and from future deduplication processing in the Export Stream, go to Exports and open an eDiscovery Export Stream in the Navigation Tree to view all of the Volumes, then clear the checkbox next to the Volume that you want to exclude. This prevents that Volume from being part of a subsequent Volume Export, or from being part of the deduplication processing for a subsequent Export. Documents in an excluded Volume can still be produced in a subsequent Volume if they still meet the Export criteria. When you change the state of a Volume, a Work Basket task appears to confirm the state change (for example, Changing state to Excluded for <streamName> - VOL<#> volume).
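
The effect of excluding a Volume can be pictured with a toy model in Python. This is not the product's deduplication algorithm. It is a simplified sketch that assumes documents are represented by plain identifiers and that a document is held back from the next Volume only if it already appears in an included earlier Volume. The names Volume and next_volume_candidates are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    name: str
    doc_ids: set[str] = field(default_factory=set)
    included: bool = True  # mirrors the checkbox in the Navigation Tree


def next_volume_candidates(qualifying: set[str],
                           volumes: list[Volume]) -> set[str]:
    """Documents eligible for the next Volume, ignoring excluded Volumes."""
    already_counted: set[str] = set()
    for volume in volumes:
        if volume.included:
            already_counted |= volume.doc_ids
    return qualifying - already_counted


stream = [
    Volume("VOL0001", {"a", "b"}),
    Volume("VOL0002", {"c"}, included=False),  # checkbox cleared
]
# "c" was produced in the excluded VOL0002, so it can be produced again.
print(sorted(next_volume_candidates({"a", "c", "d"}, stream)))
```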

About the Export Settings Text File and Production Report Text File

Regardless of the combination of export load file types you select, an additional text file called <volume>-version-settings.txt is generated at export, at the Export Location, to provide the Digital Reef Version number and a summary of the export settings.

The Export Settings file also includes the Organization name and Project name, as well as the Project ID and Project GUID.

If you update or reproduce a Volume, a new version of this file will reside under the appropriate Volume directory at the Export Area and will include a timestamp in the filename that enables you to track versions (for example, VOL0001-settings-update-20120809171955.txt).

If you generate a new load file, you may also see a settings file (and settings overlay file) generated with a timestamp. These settings files will not reflect changes you made to the generate load file options; they will reflect the settings (from the Settings tab) for the associated volume export (that is, what is persisted in the database).

A production report text file (for example, VOL0001-production-report.txt) provides the exported file counts and sizes for each type of file. You will see variations of this file based on the export-related operation (for example, VOL0001-production-retry-report.txt or VOL0001-production-update-report-20160809172856.txt). The production report counts are not cumulative; the file only contains counts for the current operation.

About Virus Detection Files

Upon export of native files, the virus detection software installed on the system will automatically check the exported native files for viruses and quarantine any native files with viruses. Once the export completes, a count of the native files that were quarantined will be reported in the export volume report (that is, on the Reports tab for the volume), and in the production report text file at the Export Area.

As long as the virus detection software is available on the system, additional files will appear at the Export Area for a volume:

VirusSummary.txt, which contains a summary of the counts from the virus detection software and information about the paths of both Resolved and Unresolved virus removals.
VirusDetail.xml, which provides detailed information generated by the virus detection software. Be sure to open this file in a text editor such as Notepad++ to see all of the expected content.

If the virus detection software is not installed on the system, the following file will appear at the Export Area for a volume:

VirusError.txt, which provides a message to indicate that the virus detection software is not installed on this system, and that virus detection has not been run on the exported native files.
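
A post-export check can use the presence of these files to tell whether virus detection ran for a Volume. The following Python sketch works on a list of file names, so it makes no assumption about where exactly the files sit at the Export Area. The function name and the returned status strings are hypothetical.

```python
from typing import Iterable


def virus_scan_status(file_names: Iterable[str]) -> str:
    """Infer the virus detection outcome from the files present for a Volume."""
    names = set(file_names)
    if "VirusError.txt" in names:
        # Virus detection software was not installed; nothing was scanned.
        return "not_scanned"
    if names & {"VirusSummary.txt", "VirusDetail.xml"}:
        return "scanned"
    return "unknown"


print(virus_scan_status(["VOL0001.csv", "VirusSummary.txt", "VirusDetail.xml"]))
print(virus_scan_status(["VOL0001.csv", "VirusError.txt"]))
```

In practice, the file names could come from a directory listing of the Volume directory (for example, os.listdir), if that is where the files are written in your environment.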

About Duplicate Overlay Files

If you perform multiple exports of an Export Stream and have updated data in one or more of the volumes, you can use the Duplicate Overlays option to request generation of a per-volume overlay file with entries for the updated data. The overlay file is useful when you want to track duplicate file information reported for records related to items such as Custodians in the DuplicateCustodian metadata field. You can inspect information in the generated overlay files for the different volumes.

For example, if VOL1 contains an original document, and duplicates of that file appear in VOL3 and VOL5, you would see entries in the CSV files VOL0001.csv, VOL0003-overlay.csv, and VOL0005-overlay.csv (as long as you selected CSV as an output format). Note that if you use the Separate Duplicates option for an export or load file generation, you can see an overlay file for the duplicates CSV (for example, VOL0003-duplicates-overlay.csv). Overlay manifests are also generated automatically as part of the Export Generate Search Reports option issued for a given volume of an Export Stream.

About the Export Exceptions CSV

Regardless of which combinations of export load file types you select, an additional .CSV file called <volume>-exceptions.csv is generated at export, at the Export Location, to provide more information about export exceptions that may have occurred. It includes columns for the following metadata fields:

DocID
custodian
importpath
osfolder
MailStore
mailfolder
filename
docext
filetype
size
Duplicate
Exception (Y will appear for an Exception)
ExceptionDescCode (one of the following from the table in Export Exceptions)

In the Work Basket, you will see a Warning icon if the Export task completed with exceptions. You can then download the WARNING_DETAILS_REPORT.csv file from the Work Basket.

At the Export Area, you can examine information in the Volume-specific exceptions file, <volume>-exceptions.csv.
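
To get a quick overview of a large exceptions file, you can tally the rows by exception code. The Python sketch below assumes the file is comma-delimited with a header row that uses the column names listed above; verify this against an actual file before relying on it. The sample rows are made up for illustration, and the function name count_exceptions is hypothetical.

```python
import csv
import io
from collections import Counter
from typing import TextIO


def count_exceptions(handle: TextIO) -> Counter:
    """Count rows flagged with Exception = Y, grouped by ExceptionDescCode."""
    counts: Counter = Counter()
    for row in csv.DictReader(handle):
        if (row.get("Exception") or "").strip().upper() == "Y":
            code = (row.get("ExceptionDescCode") or "").strip() or "(blank)"
            counts[code] += 1
    return counts


# Made-up sample with only a few of the documented columns.
SAMPLE = """DocID,custodian,filename,Exception,ExceptionDescCode
DOC0000000001,Smith,a.docx,Y,ENCRYPT
DOC0000000002,Smith,b.pdf,,
DOC0000000003,Jones,c.zip,Y,SKIPPED
DOC0000000004,Jones,d.docx,Y,ENCRYPT
"""

for code, total in count_exceptions(io.StringIO(SAMPLE)).most_common():
    print(code, total)
```

For a real file, pass an open file handle instead of the in-memory sample, for example open(path, newline="").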

The following table lists the possible Export Exceptions that can be reported in the Export metadata field ExceptionDescCode of a load file and Volume exceptions file, along with the associated import (parsing and OCR) warnings/errors, where applicable. Not all exceptions are associated with a parsing status; those that are not are marked NA.

Export Exception Code	Description	Associated Import Warnings/Errors
CONNECTOR_FAILURE	Indicates a Connector failure (for example, the file could not be retrieved using the Connector and location). This failure may also occur if the Connector itself could not be read.	NA
CONNECTOR_READ_ERROR	Indicates that either the document or directory could not be read, or the document or directory has illegal (invalid) characters in the name.	NA
CONVERSION_FAILURE	Indicates a failure at Export during conversion to HTML, MHTML, or HTML/MHTML, depending on what is selected for the Export. This error may also occur for a Special File (that is, a file that is not a directory or regular file).	NA
CORRUPT	Indicates a file identified as corrupted or damaged in some way during the parsing process.	00028 FILE_DAMAGED
ENCRYPT	Indicates an encrypted or password-protected file.	00009 FILE_READ 00027 ENCRYPTED 00029 PROTECTED 0102 PARTITIONS_ENCRYPTED 01215 OCR_IMF_PASSWORD_WARN
INVFILETYPE	Indicates a file identified as having an undetermined file type or unsupported file type during parsing.	00019 FILE_TYPE_UNDETERMINISTIC 00021 FILE_NOT_SUPPORTED (Parsing Library V1 only); 00068 FILE_ID_ONLY for Parsing Library V2 00040 PARTIAL_TEXT_EXTRACTION 00043 UNSUPPORTED_LOTUS_NOTE 00044 UNKNOWN_LOTUS_NOTE 01001 SPECIAL_FILE 01008 LOTUS_NOTES_NOT_INSTALLED 01009 LOTUS_NOTES_NOT_LICENSED
NATIVEFILE_NOT_FOUND	Indicates that the native file could not be found (that is, the document or directory does not exist and the document is considered missing).	01006 CONNECTOR_RETRIEVE_ERROR
NO_TEXT_FOUND	Identifies all files that did not have text extracted during processing. These have a parsingstatus of NODATA.	00005 NODATA
OCR_ERROR	Indicates an OCR error. For a complete list of OCR errors, see the OCR Errors section of the View a Scan Report for a Data Set topic.	01203 OCR_LOAD_IMAGE_ERROR 01204 OCR_RECOGNITION_ERROR 01205 OCR_PREPROCESSING_ERROR 01206 OCR_LOW_CONFIDENCE 01207 OCR_TIMEOUT
OCR_INVIMAGE	Indicates an OCR processing error, most likely because the image was in a format that could not be handled.	01209 OCR_IMF_NOTSUP_ERR 01210 OCR_IMF_TAGMISSING_ERR 01211 OCR_IMF_COMP_ERR 01212 OCR_IMF_IMGFORM_ERR 01213 OCR_IMF_FILEFORMAT_ERR 01214 OCR_IMF_COLOR_ERR 01216 OCR_NO_TXT_WARN 01217 OCR_ZONE_NOTFOUND_ERR 01219 OCR_IMG_RECT_ERR 01220 OCR_IMG_DPI_ERR 01221 OCR_IMG_NOTFOUND_ERR 01222 OCR_IMG_COMPRESSED_ERR 01223 OCR_IMG_BITSPERPIXEL_ERR 01224 OCR_IMG_SIZE_ERR
PARSING_ERROR	Indicates the appropriate import (parsing) error, as identified in the Scan Report. For a complete list of the import errors, see the Warnings and Errors section in the View Data Set Reports topic.	00012 ICU_CONV_ERROR 00015 NO_EMAIL_BODY 00024 SYSTEM_ERROR 00025 PROCESS_ERROR 00026 TIMEOUT 00031 PARSER_LIB_ERROR 00037 PERMISSION_ERROR 00038 FILE_NOT_FOUND 00041 ATTACH_OPEN 00042 ATTACH_SAVE 00046 MISSING_ATTACHMENT 00048 ATTACH_MANIFEST_ERROR 00049 NESTED_DUPLICATE_ARCHIVE 00050 UNSUPPORTED_PDF_FORM 00051 INVALID_TRANSPORT_HEADER 00052 FILE_TYPE_MAP_FAILURE 00053 CHILD_ARCHIVE_EXTRACTION_ERROR 00056 NSF_PARTIAL_EXTRACTION 00057 THIRD_PARTY_SOFTWARE_NOT_INSTALLED 00058 ATTACH_ENCRYPTED 00059 ATTACH_DAMAGED 01005 FILE_DECOMPRESS 01010 ARCHIVE_EXTRACTION_ERROR 01012 CONNECTOR_OPERATION_NOT_SUPPORTED 01013 EWF_FILE_MISSING 01014 EWF_FILE_INVALID_NAME 01015 EWF_FILE_ERROR 01016 NO_PARTITIONS 01017 PARTITIONS_SKIPPED 01018 PARTITION_ERRORS 01023 DISK_IMAGE_EMPTY_FILENAME
PDF_HIGHLIGHT_WARNING	Indicates that search term highlighting failed during generation of a PDF for export. The PDF was still generated, just without highlighting.	NA
SETUP_ERROR	Indicates an installation or setup error (for example, that Lotus Notes or OCR is not installed or licensed).	01019 FUSE_NOT_INSTALLED 01201 OCR_INIT_FAILED 01202 OCR_NOT_LICENSED
SKIPPED	Indicates excluded, or skipped, files, identified with the Excluded category in the Scan Report. This exception means that archive files (.zip, .tar, and .pst) were excluded from the archive extraction process during import.	01000 SKIPPED_FILE 01020 DIRECTORY_SKIPPED
SUCCESS	Indicates success.	00000 SUCCESS 00005 NODATA 00017 FILE_ZERO_LENGTH 00045 MAPPED_LOTUS_NOTE 00047 ARCHIVE_EMPTY 00054 LOTUS_NOTES_ARCHIVE_STUB
SYSTEM_ERROR	Indicates a system error.	01011 DIRECTORY_ACCESS_ERROR 01021 FILE_NAME_INVALID_CHARACTERS 01218 OCR_IMG_NOTENOUGHMEMORY_ERR
UNEXPECTED_CONVERSION_FAILURE	Indicates an unexpected conversion-related issue at Export. For example, if you select PDF for Export, the software assesses the images to see if there are any missing or if there are any image file types not supported for conversion. If you do not also select Generate Remaining Images for the Export, you will see this error.	NA
UNKNOWN	Indicates an unknown error or unexpected error.	00001 FAILURE 00016 FILE_OPEN 01200 OCR_FAILURE

Note: After an Export that includes PDFs, you can search for stored_image::<exists> to confirm that documents had stored images available for use during Export. In general, the stored_image field (one of the Digital Reef Analytic Properties) is populated once images have been generated for Export (Internal) or imported as baseline images as part of an External Image Import (External). The field contains one of the following values: Internal (generated internally), Internal - <timezone> (for a document affected by the Export time zone selected for Export, such as emails or Calendar items), External (imported images), or Error (for a document that failed conversion, for example, if the software could not create a PDF from the provided images during the import of external images).

About the Warning Details CSV for Exports Using Page-Level Numbering

An export that uses page-level numbering will fail if the export encounters missing images (for example, because of a conversion failure). In this case, the Work Basket task will show the following error message:

"Documents that could not be numbered at a page-level were encountered in the export. Please download the errors file for this task for more information."

You can then right-click the task in the Work Basket and click Download to download the errors file, WarningDetails.csv. This file provides a list of document handles and one of the following exceptions:

CONNECTOR_FAILURE
CONNECTOR_READ_ERROR
CONVERSION_FAILURE
NATIVEFILE_NOT_FOUND
UNKNOWN

SKIPPED will also be in the file for documents not eligible for conversion, such as a ZIP attached to an email.

About the Tag Details CSV

Regardless of the combination of export load file types you select, an additional .CSV file called <volume>-TagReasonCodes.csv is generated at export to provide more information about how exported Tags were applied (for example, to a Work Basket Search task). This .CSV file contains the following fields:

SearchID, which is an identifier for each Tag that is part of a Tagging operation.
User, which identifies the user who performed the operation.
Date, which provides the date and time that the operation was performed.
Description, which provides details about the operation. This field is populated for a tagging operation such as a Search Results view or tagging of a family. Example 1: Tag View Searching for money in batch1 occurred in CaseData. Example 2: Tag MAG event occurred in CaseData. When multiple tags are applied during an operation, the same description appears for each SearchID entry (one for each Tag).
Comment, which identifies any comments added during the Tagging operation. When multiple tags are applied during an operation, the same comment appears for each SearchID entry (one for each Tag).

Note: In the appropriate load file, a given document may report one or more Tags in the TagID field and one or more Search IDs in the SearchID field (one for each Tag applied).

About the Project Export Settings

A user with the appropriate permissions can configure Project Export Settings. These Export Settings serve as a file export template and appear in the load file (manifest) after export:

<Volume_Prefix><4-digit value>/<optional output dir>/<5-digit value for folder>/<Document_Prefix><10-digit value>

For an export, the appropriate manifest identifies a volume ID, which consists of a Volume Prefix (by default, VOL) and a four-digit value. The Export Setting is called Volume Prefix.
Each folder in an export can contain up to 2500 files by default. The Export Setting for configuring the maximum size of a folder is called Folder Size.
Within each folder are the exported documents, each named and assigned a document ID, which consists of a Document Prefix, and a 10-digit value. The Export Setting is called Document Prefix. Duplicate files (tracked per custodian) have unique IDs.
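
The template above can be illustrated with a short Python sketch that composes the path for a given document number. Several points are assumptions for the sake of the example: folders are numbered from 1 and filled in sequence up to the Folder Size, the document number runs continuously across folders, and the Document Prefix is DOC. The function name export_document_path is hypothetical, and file extensions are omitted.

```python
import posixpath


def export_document_path(doc_number: int, volume_number: int = 1,
                         folder_size: int = 2500, volume_prefix: str = "VOL",
                         document_prefix: str = "DOC",
                         output_dir: str = "") -> str:
    """Compose <Volume>/<optional output dir>/<folder>/<document ID>."""
    if doc_number < 1 or volume_number < 1 or folder_size < 1:
        raise ValueError("numbers must be positive")
    # Assumption: folders are numbered from 1 and filled in sequence.
    folder_number = (doc_number - 1) // folder_size + 1
    parts = [f"{volume_prefix}{volume_number:04d}"]
    if output_dir:
        parts.append(output_dir)
    parts.append(f"{folder_number:05d}")
    parts.append(f"{document_prefix}{doc_number:010d}")
    return posixpath.join(*parts)


print(export_document_path(1))
print(export_document_path(2501))
```

Under these assumptions, document 2500 is the last entry in folder 00001 and document 2501 is the first entry in folder 00002.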

Export Requirements and Guidelines

If email threading information is desired as part of the export, make sure that email threading for Project Data is enabled to ensure that the contents of an email thread can be identified. The export will not contain email thread information if the data in the view has not been threaded.
Near Duplicate detection and export applies to an Analytic Index only. If you do not have data at the Analytic Index level, the export process will not include options for the export of near duplicates.
You have the option to export all documents in a view with or without export file manifest, export documents meeting a Tag query, or export manifests only, not documents in the view. A user with Organization Administrator privileges can ensure that the correct Tags are supported. A predefined set of System Tags supports typical eDiscovery tagging, but Custom Tags can be created as well.
If an export is set up for many documents, be aware that the Near-Duplicate processing of all documents in a large view of documents can be a time-consuming process. Near Duplicate processing for Export will generate its own Work Basket task.
Any user can apply one or more available Tags to a set of Search results from the Work Basket, or to one or more files in a results window. Users can also Tag an entire email thread, or a particular email message.

Document Changes

Exporting an Export Stream multiple times does not re-export previously exported files; it exports only newly qualified or newly added qualifying files that have not been exported previously.
You can add documents to the next volume of an Export Stream in one of the following ways:
- Tag additional documents with a Tag specified for the Export criteria.
- Add documents to Project Data, which ensures that those documents are evaluated for the next export and will be included if they meet the export criteria.
Removing documents from Project Data does not affect the current Export Stream Volume.
If you change the assignment of documents to Custodians and want to update an Export Stream to reflect the change in Custodian information, you can right-click on an Export Stream and select Update Custodians.