View and Learn about Document Metadata
The system supports a large set of document metadata fields that provide information about documents after indexing, export, or both. Metadata fields vary by file type and, in some cases, are added by the system to provide additional information. A subset of the metadata fields applies to export only and assists an eDiscovery workflow.
The following summarizes how to work with metadata:
- After indexing of data, you can display and search any of the metadata fields that are part of the index.
- The indexing state (representation level) for the data determines whether the index has just system (structural) metadata, file (embedded) and system metadata, or full system/file metadata and content. If you need a valid file MD5 value (filemd5 field) for deduplication purposes, you must minimally have a File Metadata Index. (Analytic metadata also applies to an Analytic Index and can be used for searching.) You can view metadata for any document in the Document Viewer.
- In addition to providing a file's MD5 value at import, Digital Reef also provides a file's SHA-1 value (160 bits, or 20 bytes) in the filesha1 field. (SHA-1 is a Secure Hash Algorithm developed by the United States National Security Agency.)
- The metadata fields you view are determined by the list of Metadata View Fields. The default list of Metadata View Fields includes common properties (such as email and document properties), but you can establish your own Metadata View Fields list.
- Metadata fields fall into one of the following types:
  - Structural metadata fields represent information about the file itself, such as the file location or file size.
  - Embedded metadata fields represent information that comes out of the document content, such as the subject or author. Note that the field content for embedded metadata fields such as filetype, author, and title is preserved in its original case instead of being displayed in all lowercase. Searching for this content remains as is (that is, search is not case-sensitive for tokenized fields that treat each term individually, or for fields that require all content but are not case-sensitive).
  - Analytic metadata fields represent analytic information associated with Project Data and its views, such as Cluster Views, Group Views (Folders), and Tag Views.
- Prior to export, a user with Organization Administrator permissions can use the Export Fields to dictate which metadata fields appear in the export manifest, the order of the fields, and the field names. For example, to accommodate an eDiscovery workflow, the Administrator may want to order key export-only fields ahead of other fields and rename certain fields.
- After export, an Organization Administrator can view the complete set of system metadata fields (all indexed fields and export-only fields) in the appropriate file manifest (Concordance .DAT, .CSV, or EDRM XML). An EDRM XML file manifest additionally provides EDRM-specific fields.
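The filemd5 and filesha1 values mentioned above are standard file digests. The following is a minimal sketch, assuming the digests are computed over the raw bytes of the file and reported as lowercase hexadecimal text; the function name file_digests is hypothetical and is not part of the product. It shows how comparable values could be computed outside the system, for example to spot-check a deduplication result.

```python
import hashlib

def file_digests(data: bytes) -> dict:
    """Return hex MD5 and SHA-1 digests for the given file bytes.

    The dictionary keys mirror the filemd5 and filesha1 field names;
    the function itself is illustrative only.
    """
    return {
        "filemd5": hashlib.md5(data).hexdigest(),
        "filesha1": hashlib.sha1(data).hexdigest(),
    }

digests = file_digests(b"hello")
print(digests["filemd5"])        # 32 hex characters (128 bits)
print(digests["filesha1"])       # 40 hex characters (160 bits, or 20 bytes)
print(len(hashlib.sha1(b"hello").digest()))  # 20
```

Two files with identical bytes produce identical digests, which is why a valid file MD5 value is useful for deduplication.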
About the Searchable Metadata Fields
By default, any term Search is a field Search for content (which uses a default field called contents), and the search is automatically expanded to search a subset of metadata fields.
You can also perform a metadata search for metadata field content only, in which you supply the metadata field name followed by :: (two colons) and the text or value for the field. In addition, you can limit a keyword or a group of keywords to content only by using content::<keyword> or content::(<keyword1> <keyword2> <keyword3>). For example, with Include Metadata enabled, the Standard Search syntax query war OR content::peace expands the search of the keyword war to include a common subset of metadata fields such as subject::war author::war, but restricts the search of the keyword peace to content only. Note that for emails, a content:: search applies to both the email subject and the email body.
You can search indexed metadata fields. Consider the following when performing a metadata search:
- A tokenized metadata field is not case-sensitive and enables you to specify all or just a part of the field data, with or without wildcards. For example, you can search the tokenized author field using a first name (author::jane), a last name (author::jones), or the entire name as a phrase (author::"jane jones").
- An untokenized metadata field requires you to specify all field content or use wildcards that address all field content. Some fields are also case-sensitive, such as entryid and messageid. Some fields, such as patternvalue, are suitable for a literal search in single quotes.
- You can use wildcards in a phrase search for metadata fields that observe standard tokenization (for example, bcc, cc, to, from, author, subject, title, comment, comments), but not the fields that observe path tokenization rules (for breaks on \ or /, such as the reviewfolder field). For example, you could use subject::"advisory boar*" to find all emails with a subject that includes the phrase advisory board or advisory boards (because subject is standard tokenized), but you could not use reviewfolder::"mail*" to find a path with mail because reviewfolder is path tokenized. You could use reviewfolder::mail* instead.
- In general, if the metadata field content has spaces or special characters, you can put the entire field information in quotes. (This does not apply to fields with path information, such as the path-tokenized fields.)
- For metadata fields that take a date, you must use the complete date format yyyy-MM-dd-HH-mm-ss if you are performing a search other than Advanced Search (for example, Freeform Search or Current Results). Use Advanced Search to get help entering complete dates.
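The field::value syntax, the quoting of values that contain spaces, and the complete date format can be sketched in a few lines of Python. This is an illustration only: the helper names metadata_query and search_date are hypothetical, and the sketch only assembles query text that you would then enter in a search. It does not call any product API.

```python
from datetime import datetime

def metadata_query(field: str, value: str) -> str:
    """Build a field::value query, quoting values that contain spaces.

    Do not use this for path-tokenized fields such as reviewfolder,
    because quoting does not apply to fields with path information.
    """
    if " " in value and not value.startswith('"'):
        value = f'"{value}"'
    return f"{field}::{value}"

def search_date(moment: datetime) -> str:
    """Format a datetime in the complete yyyy-MM-dd-HH-mm-ss form."""
    return moment.strftime("%Y-%m-%d-%H-%M-%S")

print(metadata_query("author", "jane"))             # author::jane
print(metadata_query("author", "jane jones"))       # author::"jane jones"
print(metadata_query("subject", "advisory boar*"))  # subject::"advisory boar*"
print(search_date(datetime(2014, 4, 4, 9, 30, 0)))  # 2014-04-04-09-30-00
```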
About the Analytic Metadata Fields
A subset of fields are dedicated to reporting analytic information for documents within Project Data; that is, the information derived from analytic capabilities such as the identification of different types of Project Data views. For example, you can perform Searches on the following view-related Analytic Metadata fields, as described in Use the Standard Syntax for More Advanced Searches:
- tag_view
- custodian_view (identifies the assigned Custodian view name or Unassigned, if no assignment has been made)
- batch_view (identifies the assigned batch view name or Unassigned, if no assignment has been made)
- mediaid_view (identifies the assigned mediaid view name or Unassigned, if no assignment has been made)
- group_view
- cluster_view
- dupe_fingerprint (a value that is computed based on the email deduplication settings for an email; for files that are not email, this is the file MD5 value). The dupe_fingerprint value is available for export by default. See Manage Analytic Settings for the Project for more information about managing the email deduplication strategy.
- export_view (identifies an export stream and volume for a document subject to Export). You can search (for example, using a Freeform Search) using the format export_view::<stream_name>-[<vol_name>].
- export_view_docid
Another Analytic Metadata field called dateprimary provides a primary date value that accommodates dates from different types of source files (for example, documents found on disk versus email messages) in the Date column of document lists derived from Project Data.
Analytic Metadata fields support Email Survivorship, such as export_view_dupe_priority (for example, <export_stream>-VOL0001:0002), and export_view_reason_code (for example, <export_stream>-VOL0001:Search).
If you can view and manage Export information in your Project, an export-related Analytic Metadata field called export_view_docid reports a per-volume document ID to help you learn about a given document's export history (that is, which export streams and volumes are associated with the document). This field uses the format <export_stream_name>-<export_volume_name>-<per-volume-docid> to identify the document ID associated with a particular export volume of an export stream. You can search for a document using this field information (for example, export1-VOL0001-DOC0000000003 would return the document with document ID 0000000003 in export stream export1 and export volume VOL0001). You can use wildcards for the different portions of the format, each separated by a - (for example, export_view_docid::*-*-DOC0000000003). You can use this field in range searches as well.
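As an illustration of the export_view_docid format, the following sketch splits a value into its three portions and builds the wildcard query form shown above. The names ExportDocId, parse_export_view_docid, and docid_wildcard_query are hypothetical. Splitting from the right is an assumption made so that a stream name containing a hyphen would stay intact; the documentation does not state whether stream names can contain hyphens.

```python
from typing import NamedTuple

class ExportDocId(NamedTuple):
    stream: str   # export stream name
    volume: str   # export volume name
    docid: str    # per-volume document ID

def parse_export_view_docid(value: str) -> ExportDocId:
    """Split <export_stream_name>-<export_volume_name>-<per-volume-docid>."""
    # Split on the last two hyphens only (assumption: volume names and
    # document IDs never contain a hyphen, but stream names might).
    parts = value.rsplit("-", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected export_view_docid value: {value!r}")
    return ExportDocId(*parts)

def docid_wildcard_query(docid: str) -> str:
    """Build a query that matches a document ID in any stream and volume."""
    return f"export_view_docid::*-*-{docid}"

parsed = parse_export_view_docid("export1-VOL0001-DOC0000000003")
print(parsed.stream, parsed.volume, parsed.docid)
print(docid_wildcard_query(parsed.docid))
```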
Of the Analytic Metadata fields, the following can appear in a view manifest (generated for a view of Project Data using the Create Manifest option with All Metadata Fields, or in a selected metadata template that includes the fields):
- batch_view
- cluster_view
- cluster_view_plus_similarity
- custodian_view
- dateprimary
- dupe_fingerprint
- export_view
- export_view_docid
- export_view_reason_code
- export_view_dupe_priority
- family_fingerprint
- family_handle
- group_view
- mediaid_view
- tag_view
- thread_handle
For export, the custodian_view Analytic Metadata field can appear in an Export manifest (load file), if it is included in the Export Fields of the Project Export Fields (or template) used for an Export.
About the Export-Only Fields
Some metadata fields apply to Export only. In the metadata tables, the Export-only fields have a column called Export-Only Field set to Yes. Note the following about these Export-only fields:
- Export-only fields are in uppercase/lowercase (mixed case, or bumpy case) format. An example is DocID. All indexed system metadata fields are in lowercase.
- The set of Export-only metadata fields captures key document information for an eDiscovery workflow. These are considered special system metadata fields that may warrant a specific order in the appropriate load file.
- The Export-only fields are not intended for searching and apply only to the appropriate Export load file. Export-only fields are not visible or searchable in the Document Viewer.
Exported .CSV for Tag Details
Regardless of the export file manifest type, an additional .CSV file called <volume>-TagReasonCodes.csv is generated at export to provide more information about how exported Tags were applied (for example, to a Work Basket Search task). This .CSV file contains the following fields:
- SearchID, which is an identifier for each Tag that is part of a Tagging operation.
- User, which identifies the user who performed the operation.
- Date, which provides the date and time that the operation was performed.
- Description, which provides details about the operation (for example, Search for filename:*.xls AND econom* in C1). This field is populated when the operation is a Search, but not when Tagging is applied to a document, email Thread, or individual email message. When multiple tags are applied during an operation, the same description appears for each SearchID entry (one for each Tag).
- Comment, which identifies any comments added during the Tagging operation. When multiple tags are applied during an operation, the same comment appears for each SearchID entry (one for each Tag).
In the appropriate file manifest, a given document may report one or more Tags in the TagID field and one or more Search IDs in the SearchID field (one for each Tag applied).
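As a rough sketch of how the <volume>-TagReasonCodes.csv file could be read downstream, the following Python groups rows by SearchID. It assumes the file is comma-delimited with a header row that uses the field names listed above. The sample rows, the Date format, and the function name rows_by_search_id are made up for illustration and do not come from a real export.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration only; real values come from the export.
SAMPLE_CSV = """SearchID,User,Date,Description,Comment
101,jdoe,4/4/2014 10:15,Search for filename:*.xls AND econom* in C1,First pass
102,jdoe,4/4/2014 10:15,Search for filename:*.xls AND econom* in C1,First pass
"""

def rows_by_search_id(csv_text: str) -> dict:
    """Group TagReasonCodes rows by their SearchID value."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["SearchID"]].append(row)
    return dict(grouped)

grouped = rows_by_search_id(SAMPLE_CSV)
for search_id, rows in grouped.items():
    for row in rows:
        print(search_id, row["User"], row["Description"])
```

In this sample, the two SearchID entries share the same description and comment, which matches the case where multiple Tags are applied in one operation.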
Metadata Fields
The following table lists the metadata fields in alphabetical order and identifies which ones are export-only. Note the following:
- System metadata-only fields are identified in the field description. These are the structural metadata fields that apply when the Index state is set to System Metadata. If a system metadata field has an MD5 value, that value is zeroed.
- Indexed field names (that is, fields populated at import) are displayed in lowercase.
- Field content for embedded metadata fields such as filetype, author, and title is displayed in its original case (for example, file types in the filetype field are displayed in mixed uppercase/lowercase). Searching this field content depends on whether the field makes each term in the field content searchable, or whether all field content must be provided (which means that the field may or may not be case-sensitive).
- Many fields in the following table are tokenized (as indicated), which means they support individual term search and you can search part or all of their content without regard to case, with or without wildcards. If a metadata field is untokenized, all of the information in the field must be accommodated in a search. Some fields are also case-sensitive when searched (for example, entryid and messageid), but many are not; review the field's description to determine this.
- As indicated, a metadata field either has a value or text.
- All date information is shown according to the Project time zone, either the default time zone of Coordinated Universal Time (UTC), or a time zone selected using the Project Preferences. Supporting fields identify the appropriate UTC offset (known as a Greenwich Mean Time offset) and standard time zone, when available. Typically, the offset is populated and the time zone is not populated. When no offset is available to indicate a local time zone, the appropriate offset field displays -0000.
- Exported Date field information is by default stored in the format mm/dd/yyyy, but Excel shows the date without any padding (for example, as 4/4/2014 instead of 04/04/2014). The Excel style is shown in all Export Date field examples.
- When you set up an Export, the date and time format selections affect the load file format of the date-only Export metadata fields (for example, DateCreated), time-only fields (for example, TimeCreated), and the general fields that represent combined dates and times for load files other than EDRM XML. EDRM XML has its own format and does not observe your date/time format.
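The difference between the default padded export date format and the unpadded Excel display style can be shown with a short sketch. The function names padded_export_date and excel_style_date are hypothetical and serve only to illustrate the two renderings of the same date.

```python
from datetime import date

def padded_export_date(d: date) -> str:
    """Render a date in the padded mm/dd/yyyy form."""
    return d.strftime("%m/%d/%Y")

def excel_style_date(d: date) -> str:
    """Render a date without zero padding, as Excel displays it."""
    return f"{d.month}/{d.day}/{d.year}"

sample = date(2014, 4, 4)
print(padded_export_date(sample))  # 04/04/2014
print(excel_style_date(sample))    # 4/4/2014
```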
Note: The software relies on a special field called contents to enable any search for content. This special field is not selectable for viewing metadata or for Export Fields; it is available only through the search syntax content::<keyword> or content::(<keyword1> <keyword2> <keyword3>). See About the Searchable Metadata Fields and Use the Standard Search Syntax for Basic Queries for more information.
Remember that tokenized fields, as described above, can be searched for part or all of their content without regard to case, with or without wildcards.
Metadata Field (after import, or full metadata export) | Export-Only? | Description | Tokenized? | Notes |
---|---|---|---|---|
acls | | | | |
AllPaths | Yes | For a master parent document, this field identifies a semicolon-delimited list of unique reviewfolder field values for all duplicates of the document in the Export Stream. This field applies to all forms of Export Duplicates processing, and observes the current deduplication setting (Global or Custodial). This field reports Windows-compatible path information (that is, with \ folder separators). Example: \cust1\data1\folder1\d1;\cust2\data2\folder2\d2 | | |
altbcc | | For Lotus Notes emails, this field identifies all email recipients blind copied on the email using RFC 822/2822 style email addresses, each separated by a comma or semicolon. This field also applies to Bloomberg users (Bloomberg messages or Instant Bloomberg messages) if two email addresses are defined in the user information. Example: jsmith@mybigartmuseum.org | Yes | |
AltBcc_Address | Yes | For export of Lotus Notes emails, identifies the RFC 822/2822 style email address portion of an email recipient who was blind copied on the email (based on the altbcc field). Since there is a potential performance impact associated with populating fields that break the email address into portions, you may want to omit this and similar _Address and _Name fields in the Project Export Fields template. | | |
AltBcc_Name | Yes | For export of Lotus Notes emails using RFC 822/2822 format, identifies the name portion of an email recipient who was blind copied on the email (based on the altbcc field). This field is not expected to be populated often. Since there is a potential performance impact associated with populating fields that break the email address into portions, you may want to omit this and similar _Address and _Name fields in the Project Export Fields template. | | |
altcc | | For Lotus Notes emails, this field identifies each email recipient who was copied on the email using RFC 822/2822 style email addresses, each separated by a comma or semicolon. This field also applies to Bloomberg users (Bloomberg messages or Instant Bloomberg messages) if two email addresses are defined in the user information. Example: john.c.jones@finance.someco.com, mike.a.mann@finance.someco.com | Yes | |
AltCc_Address | Yes | For export of Lotus Notes emails, identifies the RFC 822/2822 style email address portion of an email recipient who was copied on the email (based on the altcc field). Since there is a potential performance impact associated with populating fields that break the email address into portions, you may want to omit this and similar _Address and _Name fields in the Project Export Fields template. | | |
AltCc_Name | Yes | For export of Lotus Notes emails using RFC 822/2822 format, identifies the name portion of an email recipient who was copied on the email (based on the altcc field). This field is not expected to be populated often. Since there is a potential performance impact associated with populating fields that break the email address into portions, you may want to omit this and similar _Address and _Name fields in the Project Export Fields template. | | |
altfrom | | For Lotus Notes emails, this field identifies the sender of this email using an RFC 822/2822 style email address. For a Lotus Notes email sent on behalf of another person, this field value may identify the intended (impersonated) sender and may not match the altsender field value (the actual sender). This field also applies to Bloomberg users (Bloomberg messages or Instant Bloomberg messages) if two email addresses are defined in the user information. Example: bill.mark@hr.bigco.com | Yes | |
AltFrom_Address | Yes | For export of Lotus Notes emails, identifies the RFC 822/2822 style email address portion of an email sender (based on the altfrom field). Since there is a potential performance impact associated with populating fields that break the email address into portions, you may want to omit this and similar _Address and _Name fields in the Project Export Fields template. | | |
AltFrom_Name | Yes | For export of Lotus Notes emails using RFC 822/2822 format, identifies the name portion of an email sender (based on the altfrom field). This field is not expected to be populated often. Since there is a potential performance impact associated with populating fields that break the email address into portions, you may want to omit this and similar _Address and _Name fields in the Project Export Fields template. | | |
altparticipants | | For Lotus Notes emails, this consolidated field contains one or more RFC 822/2822 style email addresses, each separated by a comma or semicolon. If an email address is not complete, this field contains whatever content was discovered. The terms in the altparticipants field represent a subset of the altto, altfrom, altbcc, altcc | Yes | |
altsender | | For Lotus Notes emails, this field identifies the sender of this email using an RFC 822/2822 style email address (the actual sender, not an impersonated sender). Example: bill.mark@hr.bigco.com | Yes | |
AltSender_Address | Yes | For export of Lotus Notes emails, identifies the RFC 822/2822 style email address portion of an email sender (based on the altsender field). Since there is a potential performance impact associated with populating fields that break the email address into portions, you may want to omit this and similar _Address and _Name fields in the Project Export Fields template. | | |
AltSender_Name | Yes | For export of Lotus Notes emails using RFC 822/2822 format, identifies the name portion of an email sender (based on the altsender field). This field is not expected to be populated often. Since there is a potential performance impact associated with populating fields that break the email address into portions, you may want to omit this and similar _Address and _Name fields in the Project Export Fields template. | | |
altto | | For Lotus Notes emails, identifies each intended recipient of this email using RFC 822/2822 style email addresses, each separated by a comma or semicolon. This field also applies to Bloomberg users (Bloomberg messages or Instant Bloomberg messages) if two email addresses are defined in the user information. Example: jim.freeman@bigcoscience.com | Yes | |
AltTo_Address | Yes | For export of Lotus Notes emails, identifies the RFC 822/2822 style email address portion of an email recipient (based on the altto field). Since there is a potential performance impact associated with populating fields that break the email address into portions, you may want to omit this and similar _Address and _Name fields in the Project Export Fields template. | | |
AltTo_Name | Yes | For export of Lotus Notes emails using RFC 822/2822 format, identifies the name portion of an email recipient (based on the altto field). This field is not expected to be populated often. Since there is a potential performance impact associated with populating fields that break the email address into portions, you may want to omit this and similar _Address and _Name fields in the Project Export Fields template. | | |
ancestordahandle | | For any document that has been copied to document storage, or, in general, any child document (attachment or archive-extracted document), this field includes v and a entries (v is the parent dahandle value and a is the data area name). This field is not case-sensitive for the purposes of search for the parent dahandle value. Example for v: 0000fb2f8044ba69bc3f4754ae68a71ed699c58f and Example for a: BaseGroup1_BaseCollection | | |
ApplicationName | Yes | For export, the source application, such as Microsoft Office Word. | | |
AttachmentID | Yes | For export, a list of attachment document IDs, each separated by a semicolon. | | |
AttachmentRange | Yes | For export, the range of attachment document IDs. | | |
attachments | | Lists the file names of the attachments for a parent email, each separated by a semicolon. | Yes | |
attachmentsmd5 | | A value representing the attachment MD5 for an email. If two documents have the same attachmentsmd5 value, they have the same attachments (that is, the value represents an MD5 of all the combined attachments). This field is not case-sensitive when searched. Example: cd5389643a02e2b39a5c6d8cdab99b2c | | |
attendees | | For an email Calendar item, provides a list of meeting attendees. Example: All Digital Reef | Yes | |
author | | A name identifying the author of the document. Example: Michael Bello | Yes | |
auxfiletype | | For email, an auxiliary file type that provides additional information, such as msg, eml, or emlx. The auxfiletype msg applies to any type of Microsoft Outlook MSG file that is a regular email or a Calendar, Contact, Journal, or Task entry. The auxfiletype eml applies to other types of email, such as Lotus Notes. The auxfiletype emlx applies to Apple Mail 2.0 Messages. The auxiliary file type bloomberg-attachment-archive applies to the attachment archive (application/x-gzip) for a companion Bloomberg Message Dump (XML) file. Example: msg | Yes | |
auxparsingstatus | | Identifies additional parsing information, such as warnings that apply to a file. For example, for the email parent of a Modern Attachment, this field identifies modern_attachment_retrieve_warning when a Modern Attachment could not be retrieved during processing. For an archive file that has been successfully processed by unzip instead of 7zip, this field reports unzip_fallback_processed, and for a record extracted from an archive processed by unzip, this field reports unzip_fallback_extracted. This field is not case-sensitive when searched. Example: modern_attachment_retrieve_warning | | New in 5.4.2.2 |
averagenumberoftermsperpage | | A value representing the average number of terms calculated per page. This field is populated for any document that has information in the pagecount field (for example, a PDF, or a Microsoft Word document). This field uses a padded 5-digit value. Example: 00058 | | |
batch | | A System Metadata field with information that identifies a batch of imported data. When you set up an Import, you can specify a Batch name or number. If you do not specify a Batch name or number for Import, the Data Set name is used. This field is not case-sensitive when searched. Example: batch001 | | |
batch_view | | An Analytic Metadata field identifying the batch view to which the document is assigned, or Unassigned if the document is not assigned to a batch. This field value matches the batch field value unless you have changed the assignment after adding the document to Project Data. If you include this field in the Export Fields used for Export, the load file will identify the document's batch view at the time the Export load file was produced. Example: batch001 | | New in 5.4.0.0 |
bcc | | Lists the email recipients blind copied on an email. You can use the bcc field as part of an email deduplication strategy (set using the Analytic Settings or Analytic Settings template). When you display this field after import, each recipient is separated by a comma or semicolon, depending on the source data (for example, for Microsoft Outlook, donna.lolly@elron.com, sara.shack@elron.com). For Lotus Notes, this field typically contains one or more fully qualified names (for example, CN=John Doe/OU=US/O=someco). For export, each recipient is always separated by a semicolon (for example, donna.lolly@elron.com; sara.shack@elron.com; moira.wren@elron.com). | Yes | |
bcc_identifier | | | Yes | New in 5.4.1.0 |
bcc_name | | | Yes | New in 5.4.1.0 |
BegAttach | Yes | For export, the starting attachment document ID (for example, for an email). Example: DOC0000000011 | | |
BegDoc | Yes | For export, the starting document ID. Example: DOC0000000001 | | |
blocksize | | | | |
bloombergdisclaimer | | For a Bloomberg Message Dump (XML) file, a reference and text field associated with the disclaimerreference (disclaimer notice) of an email. | | |
bloomberglanguage | | For Bloomberg messages, text or a code representing the language or Bloomberg-specific font code page used when forming a given message (such as English, French, or Code 65001). This field is not case-sensitive when searched. Example: English | | |
bloombergtype | | For Bloomberg messages, text representing the Bloomberg message type (CHAT, TMSG, or VMSG). Only CHAT, TMSG, or VMSG messages will contain this field. This field is not case-sensitive when searched. | | |
bloomberguser | | For Bloomberg messages or Instant Bloomberg (IB) messages, detailed information about the user. A message may have many bloomberguser entries. Each section of this field is separated by a \| (pipe character), and some sections may not be populated. All user entries for Instant Bloomberg messages start with ib. For Bloomberg messages from Message Dump files, the user is either identified as a recipient or a type of sender. When identifying a recipient, this field may identify additional recipients to which the message was forwarded. Recipient entries also include a delivery type, such as cc or bcc. When identifying a sender, this field may identify a user who sent a message as part of a Bloomberg shared message group (sharedmessenger), a user whose credentials were referenced to send a message (onbehalfof), or a user who originated a message (origsender). This field is not case-sensitive when searched. Bloomberg Message Example: recipient\|bcc\|JOHN DOE\|712346\|ONE MAIN STREET\| 3712346 \|7666225\|JODOE1@Bloomberg.net\|jodoe@onemainstreet.com\|\|\|\|. Instant Bloomberg Example: ib\|DOEJ1\|JOHN DOE\|\|\|3712346\|7666\|123456\| MYBIGCO\|JODOE1@Bloomberg.net\|jodoe@myco.com. | | |
bloombergversion | | For a Bloomberg Message Dump (XML) file or a Bloomberg IB Dump (XML) file, identifies the appropriate Bloomberg version. This field is not case-sensitive when searched. Message Dump Example: MSGXML 1.6; IB Dump Example: IBXML 1.3. | | |
bytecount | | A value representing the count (in bytes) for a document. Example: 13794 | | |
category | | A status property used by Microsoft Office applications (for example, a categorization of content such as letter, proposal, or resume). Example: newsletter | Yes | |
cc | | Each email recipient who was copied on the email. When you display this field after import, each recipient is separated by a comma or semicolon, depending on the source data (for example, for Microsoft Outlook, don.lol@elron.com, sam.iam@elron.com). For Lotus Notes, this field typically contains one or more fully qualified names (for example, CN=John Doe/OU=US/O=someco). For export, each recipient is separated by a semicolon (for example, dora.lolly@elron.com; sam.stark@elron.com; mo.otoole@elron.com). | Yes | |
cc_identifier | | | Yes | New in 5.4.1.0 |
cc_name | | | Yes | New in 5.4.1.0 |
charcount | | A value representing the number of characters in a document. Example: 7648 | | |
charcountlongestword | | A value representing the number of characters in the longest word of a document. This field helps identify PDFs without searchable text. This field uses a padded 5-digit value. Example: 00009 | | |
checkedby | | A property used by Microsoft Office applications to identify the person or entity responsible for checking the document. | Yes | |
childcount | | A value identifying the number of children directly extracted from a given parent file (for example, Mail Items, Mail Containers, Archives, Compressed Files, or Disk Images). This field supports range searches and is automatically padded for searches (as shown in the Query Executed). The Document Viewer displays the value without padding. Example: 35 | | |
client | | A status property used by Microsoft Office applications to identify the client used. | Yes | |
Clusters | Yes | For export, identifies the cluster to which this document belongs. This export field is disabled by default. Example: CaseData\CaseData-8 | | |
cluster_view | | An Analytic Metadata field that shows the cluster hierarchy (for example, the top-level cluster_view for Project Data, and the cluster to which the document belongs). Opening the folder used to represent the cluster_view shows individual view name and similarity fields with details, as well as the cluster_view for the document. If you search for a cluster view within Project Data, use only the name of the cluster. Example of Metadata (top level): | | |
cluster_view_plus_similarity | | A special Analytic Metadata field that enables the display of the cluster view name plus the similarity field value indicating the document's similarity to the seed document for the cluster. This field is not intended to be searchable. Example: | | |
comment | Text identifying a document comment. Example: This is a serious subject | Yes | ||
comments | For email, text identifying email comments. Example: This is an important email | Yes | ||
company | A property used by Microsoft Office applications to identify a company or Organization. Example: PPPL | Yes | ||
container | A value identifying the document handle of a type of container archive file from which a file was extracted. This field is not case-sensitive when searched. Example: fdecd32e3156bc9b311a4b40fe38258680f3765dbaa942679355e810909eecb4 | |||
containerfolder | For import as well as export, the path internal to a container (e.g., zip or .pst) where the document was found. This field is not case-sensitive when searched. This field does not support phrase search with wildcards. Example: Testing/1.1 | Uses path tokenization rules (for word breaks on \ or /) |
||
contentdescription | MIME protocol header text characterizing the document body content. Example: Mail message body | Yes | ||
contentdisposition | MIME protocol header text accompanying an email message with attachments. This field is not case-sensitive when searched. Example: inline | |||
contentid | A MIME protocol header value representing the contents of an attachment. | |||
contentlanguage | Text for a MIME Content-Language header specifying the natural language of the data. This field is not case-sensitive when searched. Example: en-US | |||
contentmd5 | For files at the Content or Analytic Index level, the contentmd5 is a hash code representing the content of a document, or the subject line and body text of an email. (By contrast, a filemd5 is a hash code based on the entire file, such as a .docx or .msg file.) You can search for content duplicates within a view, or content duplicates of a selected document. Files are a content match if they have matching content MD5 values. Document type and formatting are ignored. For example, a PDF and a Word document used to create that PDF would be a content match. This field is not case-sensitive when searched. Example: 0be5d0edc5e9d870added630f78ce091 | |||
contentransferencoding | As defined by Microsoft, shows the encoding mechanism used when sending an object's contents over the network (for example, binary, 7bit, base64, and 8bit). This field is not case-sensitive when searched. Example: binary | |||
contenttype | A status property used by Microsoft Office applications to identify the content type of the document (for example, text/plain). Can be used to identify an email alternative body. This field is not case-sensitive when searched. Example: text/plain; charset=US-ASCII | |||
conversationindex | For import as well as export, this field applies to Microsoft Outlook messages and provides a value indicating the position of the Outlook message within a given conversation. This field is not case-sensitive when searched. For details on this MAPI field value, see the Microsoft description for PR_CONVERSATION_INDEX. Example: 01B93D016A531E92128B764E31679A71774BAEC1BC25 | |||
conversationindexguid | For import as well as export, this field provides the GUID portion of the conversationindex that can be used to group emails in a downstream tool. This field is not case-sensitive when searched. Example: 1E92128B764E31679A71774BAEC1BC25 | |||
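As an illustration of how the conversationindexguid value relates to conversationindex, the following Python sketch extracts the GUID portion from the hex string. It assumes the header layout described by Microsoft for PR_CONVERSATION_INDEX (1 reserved byte, 5 time bytes, then a 16-byte GUID); the function name is hypothetical and is not part of the product.

```python
def conversation_index_guid(conversation_index_hex: str) -> str:
    """Return the 16-byte GUID portion of a conversationindex value as uppercase hex."""
    raw = bytes.fromhex(conversation_index_hex)
    if len(raw) < 22:
        raise ValueError("a conversation index header is at least 22 bytes")
    # Assumed header layout: byte 0 is reserved, bytes 1-5 hold the time,
    # and bytes 6-21 hold the GUID shared by all messages in the conversation.
    return raw[6:22].hex().upper()


# Uses the conversationindex example value from the table above.
print(conversation_index_guid("01B93D016A531E92128B764E31679A71774BAEC1BC25"))
# prints 1E92128B764E31679A71774BAEC1BC25
```

The result matches the conversationindexguid example shown for the same message, which is why grouping emails on this value in a downstream tool keeps a conversation together.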
createdtime | For NTFS with CIFS, a System Metadata field identifying the file creation time: yyyy-MM-dd-HH-mm-ss. |
|||
custodian | Identifies the name of the Custodian to which this document is assigned at the time of Import. This field may also identify the value partitioned for a disk image. This field is not case-sensitive when searched. Example (regular Custodian): janedoe Example (disk image): partitioned |
|||
custodian_view | An Analytic Metadata field identifying the Custodian view to which the document is assigned, or Unassigned if the document is not assigned to a Custodian. This field value will match the custodian field value unless you have changed the Custodian assignment after adding the document to Project Data. If you include this field in the Export Fields template used for Export, the load file will identify the document's Custodian view at the time the Export load file was produced. Example: johnd | |||
dafilesource | A System Metadata field identifying the source of this file. For an initial import location, matches importpath information. For a subsequent file location, identifies where the file came from. This is an information-only field that cannot be searched, and its format is not always suitable for construction into an actual path. It is not eligible for Export. Example: Group1_FunctionalColl:files/Data/Point3/300/Task 1/my24.txt | |||
dafilestate | A System Metadata field identifying the current state of the file (imported, copied, deleted, encrypted, or converted). This field is not case-sensitive when searched. Example: imported | |||
dafilestatetime | A System Metadata field with a Digital Reef standard timestamp indicating the time of the operation that affected the document state in the format yyyy-MM-dd-HH-mm-ss, where yyyy is the 4-digit year, MM is the 2-digit month, dd is the 2-digit day, HH is the 2-digit hour, mm is the 2-digit minute, and ss is the 2-digit second value. This can be used to sort the locations of a file in a manner other than that used in the metadata index. This field is not case-sensitive. Example: 2020-06-03-16-47-26 | |||
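When timestamp fields such as dafilestatetime are processed outside the product, the yyyy-MM-dd-HH-mm-ss layout can be parsed directly. The following Python sketch is one way to do that; the constant and function names are hypothetical, and the result carries no time zone because the format itself does not include one.

```python
from datetime import datetime

# strptime equivalent of yyyy-MM-dd-HH-mm-ss (24-hour clock, zero-padded parts).
DR_TIMESTAMP_FORMAT = "%Y-%m-%d-%H-%M-%S"


def parse_dr_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2020-06-03-16-47-26 into a naive datetime."""
    return datetime.strptime(value, DR_TIMESTAMP_FORMAT)


# Sort a few example values chronologically.
values = ["2020-06-03-16-47-26", "2011-11-02-16-02-57", "2009-04-27-15-22-00"]
for stamp in sorted(values, key=parse_dr_timestamp):
    print(stamp)
```

Because every component is zero-padded and ordered from year to second, a plain string sort gives the same order; parsing is mainly useful when you need date arithmetic or a different output format.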
daghandle | A value representing the document handle of the parent document representing all documents in a Document Attachment Group (DAG). This field is not case-sensitive when searched. Example: 0be5d0edc5e9d870added630f78ce0914b0cbe244d144de8a22b99994e65aaac | |||
dahandle | A System Metadata field identifying the data area handle where this instance of the file is located. This field is not case-sensitive for the purposes of search for the dahandle value. This field appears under a location entry in the Metadata panel of the Document Viewer. When this field includes v and a entries, v is the dahandle value and a is the data area name. Example: v: 0000fb2f8044ba69bc3f4754ae68a71ed699c58f and a: Group1_BaseCollection | |||
darelativepath | A System Metadata field identifying the file's path relative to the path of the data area. The mount point, data area directory, and this value can be combined to address the document. Example: files/Data/Point3/300/Task 1/my24.txt | |||
date | Identifies an email's date, which is a sent date, if available, or a received date otherwise, in the format yyyy-MM-dd-HH-mm-ss, where yyyy is the 4-digit year, MM is the 2-digit month, dd is the 2-digit day, HH is the 2-digit hour, mm is the 2-digit minute, and ss is the 2-digit second value. Values aside from yyyy are padded when necessary. Example: 2000-04-28-23-21-57 | Date fields are. | ||
dateaccessed | This field has been deprecated. Timestamp for the document’s date accessed (in Coordinated Universal Time), yyyy-MM-dd-HH-mm-ss. Example: 2000-02-28-23-22-57 | |||
DateAccessed | Yes | This field has been deprecated. For export, the date portion of the embedded metadata field dateaccessed (time is reported in another field, TimeAccessed). The default format is mm/dd/yyyy. Example: 2/12/2015 | ||
datebackedup | Timestamp for the document’s backup date (in Coordinated Universal Time), yyyy-MM-dd-HH-mm-ss. Example: 2000-01-22-23-21-57 | |||
DateBackedUp | Yes | For export, the date portion of the embedded metadata field datebackedup (time is reported in another field, TimeBackedUp). The default format is mm/dd/yyyy. Example: 2/12/2015 | ||
datecompleted | A property used by Microsoft Office applications to provide a timestamp for the document’s completion date (in Coordinated Universal Time), yyyy-MM-dd-HH-mm-ss. Example: 2000-04-22-23-21-57 | |||
DateCompleted | Yes | For export, the date portion of the embedded metadata field datecompleted (time is reported in another field, TimeCompleted). The default format is mm/dd/yyyy. Example: 2/12/2015 | ||
datecreated | An embedded metadata field that contains a timestamp for the document's creation date, yyyy-MM-dd-HH-mm-ss. Example: 2009-04-27-15-22-00 | |||
DateCreated | Yes | For export, the date portion of the embedded metadata field datecreated (or, if datecreated is unavailable, the createdtime field). Time is reported in another field, TimeCreated. The default format is mm/dd/yyyy. Example: 10/12/2003 | ||
DateCreateSystem | Yes | For export, the date portion of the System Metadata field createdtime (for NTFS with CIFS), reporting the date on which the file was created (time is reported in another field, TimeCreateSystem). The default format is mm/dd/yyyy. Example: 11/15/2013 | ||
datedue | For an email Task, the due date for a task using the format yyyy-MM-dd-HH-mm-ss. Example: 2005-01-10-12-00-00 | |||
DateDue | Yes | For export, the date portion of the embedded metadata field datedue (time is reported in another field, TimeDue). The default format is mm/dd/yyyy. Example: 2/12/2015 | ||
dateedited | Timestamp for the document's date edited using the format yyyy-MM-dd-HH-mm-ss. Example: 2008-04-18-16-00-00. | |||
DateEdited | Yes | For export, the date portion of the embedded metadata field dateedited (time is reported in another field, TimeEdited). The default format is mm/dd/yyyy. Example: 2/12/2015 | ||
dateended | For an email Calendar item, the date a meeting ended using the format yyyy-MM-dd-HH-mm-ss. For an Instant Bloomberg message, the date a CHAT ended. Example: 2007-05-18-16-00-00 | |||
DateEnded | Yes | For export, the date portion of the embedded metadata field dateended, reporting when an email Calendar item or Bloomberg CHAT ended (time is reported in another field, TimeEnded). The default format is mm/dd/yyyy. Example: 2/12/2015 | ||
DateEWFAcquired | Yes | For export, the date portion of the embedded metadata field ewfdateacquired, reporting the EWF acquisition time (time is reported in another field, TimeEWFAcquired). The default format is mm/dd/yyyy. Example: 2/12/2015 | ||
DateExported | Yes | For export, the date a document was Staged for export, using the fixed format mm/dd/yyyy. If a document is re-staged (for example, because a document was originally a container reference, or because a volume was disabled), then the document's DateExported will be updated with a new DateExported value. Example: 4/11/2014 |
||
DateFlagCompleted | Yes | For export of Microsoft Outlook items, the date portion of the field flagcompleted, reporting the date on which the flagged item was marked by the user as complete. The default format is mm/dd/yyyy. Example: 6/19/2018 | ||
DateFlagDue | Yes | For export of Microsoft Outlook items, the date portion of the field flagdue, reporting the due date set by the user for the flagged item. The default format is mm/dd/yyyy. Example: 6/18/2018 | ||
DateFlagStarted | Yes | For export of Microsoft Outlook items, the date portion of the field flagstarted, reporting the date on which the flag was applied to an item. The default format is mm/dd/yyyy. Example: 6/12/2018 | ||
DateLastAccessed | Yes | For export, the date portion of the NFS/CIFS System Metadata field lastaccesstime, reporting the date on which the document was last accessed (time is reported in another field, TimeLastAccessed). The default format is mm/dd/yyyy. Example: 2/12/2015 | ||
DateLastChanged | Yes | For export, the date portion of the NFS/CIFS System Metadata field lastchangetime, reporting the date on which the document was last changed (time is reported in another field, TimeLastChanged). The default format is mm/dd/yyyy. Example: 2/12/2015 | ||
DateLastMod | Yes | For export, the date portion of the embedded metadata field datemodified (or the field lastmodifiedtime, if no datemodified value is available). This field reports the date on which a document was last modified (time is reported in another field, TimeLastMod). The default format is mm/dd/yyyy. Example: 6/16/2006 | ||
DateLastModSystem | Yes | For export, the date portion of the NFS/CIFS System Metadata field lastmodifiedtime, reporting the last date on which the file was modified (time is reported in another field, TimeLastModSystem). The default format is mm/dd/yyyy. Example: 9/12/2009 | ||
DateLastPrinted | Yes | For export, the date portion of the embedded metadata field dateprinted, reporting when a document was last printed (time is reported in another field, TimeLastPrinted). The default format is mm/dd/yyyy. Example: 11/11/2011 | ||
DateMediaCreated | Yes | For export, the date portion of the embedded metadata field mediacreationtime (time is reported in another field, TimeMediaCreated). The default format is mm/dd/yyyy. Example: 2/12/2015 | ||
datemodified | An embedded metadata field that contains a timestamp for the document date modified, yyyy-MM-dd-HH-mm-ss. Example: 2006-01-26-21-01-00 | |||
dateoffset | Use this field with the date field. It identifies the offset from UTC for a dated email's time using the format sHH:mm, where: s is a + or - sign indicating whether time of day is ahead of (east of) or behind (west of) Coordinated Universal Time, HH: identifies the offset hours, and mm identifies the offset minutes. This is commonly known as the Greenwich Mean Time (GMT) offset (+ or - GMT, for example, -0400). This field is not affected by the time zone selected for the Project or for Export. Example: -0400 | |||
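For downstream processing, a dateoffset value can be converted into a signed duration. The Python sketch below accepts both the compact form shown in the example (-0400) and the colon form implied by the sHH:mm description; the function name is hypothetical, and the sketch does not assume whether the date field itself is stored in local time or UTC.

```python
import re
from datetime import timedelta

# Sign, two-digit hours, optional colon, two-digit minutes.
_OFFSET_RE = re.compile(r"^([+-])(\d{2}):?(\d{2})$")


def parse_dateoffset(value: str) -> timedelta:
    """Convert an offset such as -0400 or +05:30 into a signed timedelta."""
    match = _OFFSET_RE.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognized offset: {value!r}")
    sign, hours, minutes = match.groups()
    delta = timedelta(hours=int(hours), minutes=int(minutes))
    return -delta if sign == "-" else delta


print(parse_dateoffset("-0400"))
```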
dateprimary | This field provides a primary date value that accommodates dates from different types of source files (for example, documents found on disk versus email messages) in the Date column of document lists derived from Project Data. The display format for this field is a full timestamp, yyyy-MM-dd-HH-mm-ss. The value in this field is propagated from parent files to their child files (and the children will have that primary date only, not their own). |
|||
DatePrimary | Yes | For export, the date portion of the dateprimary field information (time is reported in another field, TimePrimary). The default format is mm/dd/yyyy. Example: 1/14/2013 | ||
dateprinted | A property used by Microsoft applications to provide a timestamp for the date printed (in Coordinated Universal Time), yyyy-MM-dd-HH-mm-ss. Example: 2005-06-25-00-56-00. | |||
DateProcessed | Yes | For export, the date portion derived from the datescanned field, reporting when the document was imported (time is reported in another field, TimeProcessed). The default format is mm/dd/yyyy. Example: 1/13/2013 | ||
DateReceived | Yes | For export, |
||
DateRecorded | Yes | For export, the date portion of the embedded metadata field recordeddate (time is reported in another field, TimeRecorded). The default format is mm/dd/yyyy. Example: 2/12/2015 | ||
datescanned | A System Metadata field with a timestamp (in Coordinated Universal Time) for the date scanned, yyyy-MM-dd-HH-mm-ss. Example: 2011-11-02-16-02-57 | |||
DateSent | Yes | For export, |
||
datestarted | For an email Calendar item, the date a meeting started using the format yyyy-MM-dd-HH-mm-ss. For an Instant Bloomberg message, the date a CHAT started. Example: 2007-05-18-14-30-00 | |||
DateStarted | Yes | For export, the date portion of the embedded metadata field datestarted, reporting when an email Calendar item or Bloomberg CHAT started (time is reported in another field, TimeStarted). The default format is mm/dd/yyyy. Example: 2/12/2015 | ||
datetaken | For image files such as JPGs, the date the image was taken using the format yyyy-MM-dd-HH-mm-ss. This field is populated for image files if the date taken information is present. Example: 2021-08-25-10-30-00 | New in 5.2.5.x | ||
DateTaken | Yes | For export, the date portion of the datetaken field information, reporting when an image (such as a JPG) was taken (time is reported in another field, TimeTaken). The default format is mm/dd/yyyy. Example: 8/25/2021 | New in 5.2.5.x | |
datezone | For email, use this field with the date field. When available, it identifies an acronym (for example, EST or EDT) representing a dated email’s time zone in Coordinated Universal Time, UTC (GMT). This field is not case-sensitive when searched. This field is not affected by the time zone selected for the Project or for Export. Example: EDT | |||
deliveredto | Identifies the person to whom this email was delivered. This field is not case-sensitive when searched. | |||
deliveryreportrequested | This field reports Y if an email requested a report for when the email was delivered. If this field is populated it will have a value of Y or N. This field applies to an MSG or an EML. | |||
department | A status property used by Microsoft Office applications to identify the author’s department. This field is not case-sensitive when searched. | |||
depotfile | A Boolean that identifies when a file resides in Document Depot (for example, expanded or uploaded files will have this field set to Y). | |||
description | A property used to describe the document. This field is not case-sensitive when searched. Example: Another notebook entry. | |||
destination | A status property used by Microsoft Office applications to identify a document’s destination. This field is not case-sensitive when searched. | |||
dirchildcount | For a given directory, reports the number of documents or directories found in the directory when the directory was scanned. A value of 0 identifies an empty directory. This field supports range searches and is automatically padded for searches (as shown in the Query Executed). This field is not case-sensitive when searched. The Document Viewer displays the value without padding. Example (partitioned disk image folder): 3 | |||
disclaimerreference | For Bloomberg messages, identifies a value representing a given Bloomberg firm or account disclaimer reference number in an email. This field is not case-sensitive when searched. Example: 112258 | |||
diskpartitions | Populated for a partitioned disk type with multiple partitions, including MBR and GPT. This field provides a semicolon-separated list of the partition types in the raw image (such as linux;extended;linuxswap;bitlocker). Example: ntfs;bitlocker | Yes | ||
diskpartitionstatus | Populated for a partitioned disk type with multiple partitions, including MBR and GPT. This field will have a semicolon-separated list identifying the processing status of each partition as success, error, or skipped (for example, success;skipped;skipped). Example: success;success;success;success;success;success;skipped | Yes | ||
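If you review diskpartitionstatus values from an export or report, a quick tally of the per-partition results can help spot skipped or failed partitions. The following Python sketch counts the semicolon-separated entries; the function name is hypothetical.

```python
from collections import Counter


def summarize_partition_status(value: str) -> Counter:
    """Count how many partitions reported each status (success, error, skipped)."""
    return Counter(part.strip() for part in value.split(";") if part.strip())


# Uses the diskpartitionstatus example value from the table above.
summary = summarize_partition_status(
    "success;success;success;success;success;success;skipped"
)
print(summary["success"], summary["skipped"], summary["error"])
# prints 6 1 0
```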
disposition | A status property used by Microsoft Office applications. This field is not case-sensitive when searched. | |||
division | A status property used by Microsoft Office applications. This field is not case-sensitive when searched. | |||
docannotations | Identifies a document with annotations (Excel_Auto_Filter, Excel_Comments, Excel_Protected_Worksheets, Excel_Protected_Workbook, Excel_Track_Changes, Pdf_Comments, Word_Comments, Word_Revisions (to flag edits, even when track changes is disabled), PowerPoint_Comments, or PowerPoint_Notes). Multiple values are separated by semicolons and a space. You can search for any part of this field (for example, docannotations::word* finds all Word documents with comments or revisions). |
Yes | ||
docclass | Identifies the document classification: Message, Message_Attachment, Message_OLE_Attachment, EDoc (files that are not email, any kind of attachments, or archives, or do not fit into any of the other categories), EDoc_OLE_Attachment, Message_Archive, Archive, Disk_Image (such as EWF), or Directory. This field is not case-sensitive for the purposes of search, but you must either specify the entire name of the class or use wildcards (since the field does not treat each term individually). Example: EDoc | |||
DocDate | Yes | For export, the date that applies to the given type of document, either the date an email was sent, or the datemodified date of an eDoc (or the lastmodifiedtime date of an eDoc, if datemodified is unavailable). (DocTime reports the time information.) The default format is mm/dd/yyyy. Example: 12/24/2018. | ||
docext | Identifies the document extension portion of the filename metadata. This field is not case-sensitive when searched. This field contains the file extension in an EDRM load file based on the text that appears after the last dot in the filename. For example, for my.text.html, html is the file extension. Example: html | |||
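The "text after the last dot" rule for docext can be sketched in a few lines of Python. This is only an illustration of the rule as described; the function name is hypothetical, and how the product treats names without a dot is an assumption here (the sketch returns an empty string).

```python
def doc_extension(filename: str) -> str:
    """Return the text after the last dot in a filename, or '' if there is no dot."""
    _, dot, ext = filename.rpartition(".")
    return ext if dot else ""


print(doc_extension("my.text.html"))
# prints html
```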
DocID | Yes | For export, the value representing the assigned document ID. If page-level numbering is enabled, each page of a document will have its own, incrementally numbered document ID. Family members (a parent and its children) have sequential document IDs. Example: DOC0000000012 | ||
docnum | As used by eDiscovery, a System Metadata field that identifies a three-part number in the format C.V.N, where C = a Data Collection (Data Set) number, unique per Organization; V = a Data Collection Checkpoint Value, unique per Data Collection; and N = a document number, unique within the Data Collection Checkpoint. When searching this field, specify the entire value, since wildcards are not supported for this field. You can also use a range search. Example: docnum::[0.::[30[.101.50000~~3.101.60000]. | |||
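To work with docnum values outside the product, the C.V.N structure can be split into its three numeric parts. The Python sketch below parses a value and checks it against a pair of bounds, assuming each part is compared numerically from left to right; that comparison rule, the class, and the function names are assumptions for illustration and may not match how the product evaluates a range search.

```python
from typing import NamedTuple


class DocNum(NamedTuple):
    collection: int   # C: Data Collection (Data Set) number
    checkpoint: int   # V: Data Collection Checkpoint Value
    number: int       # N: document number within the checkpoint


def parse_docnum(value: str) -> DocNum:
    """Split a C.V.N string into its three integer parts."""
    parts = value.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected C.V.N, got {value!r}")
    c, v, n = (int(part) for part in parts)
    return DocNum(c, v, n)


def in_docnum_range(value: str, low: str, high: str) -> bool:
    """Return True if value lies between low and high, inclusive (assumed ordering)."""
    return parse_docnum(low) <= parse_docnum(value) <= parse_docnum(high)


print(in_docnum_range("3.101.55000", "3.101.50000", "3.101.60000"))
# prints True
```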
documentnumber | A Microsoft Office or WordPerfect property identifying the document number. | |||
DocumentRange | Yes | For export, the range of document IDs (for example, from an email). | ||
DocTime | Yes | For export, the time that applies to the given type of document, either the time an email was sent, or the time portion of the datemodified value of an eDoc (or the time portion of the lastmodifiedtime value of an eDoc, if datemodified is unavailable). (DocDate reports the date information.) The default format is HH:mm:ss. Example: 13:11:33 | ||
dominantlanguage | A language code for the language detected as the dominant language (the largest occurrence of the languages detected) for the document. RFC 3066 and ISO 639-1 define the Language field information, primary language codes such as en for English. Many codes consist of two letters. This field is not case-sensitive when searched. To search this field, specify dominantlanguage:<language_code> (for example, dominantlanguage:ja searches for documents for which Japanese is the dominant language). See the topic Supported Languages for Language Detection for a list of languages that can be detected when language detection is enabled. Example: en | |||
dr_loc | This special Digital Reef field (not searchable on its own) provides data location information, including the Connector, the relative path at that Connector, and the time of the import. When viewing all metadata in the Document Viewer, note that |
|||
drcontaineroffset | This special Digital Reef field (untokenized) is for expanded files stored in ZIP files, as the offset into the ZIP. Example: 000000010.eml | |||
drrelativepath | This special Digital Reef field (untokenized) is for expanded files only, the file location relative to the NFS mount. This location includes the document storage (DSD) location with the project files, the project handle, and the standard darelativepath field information. Example: DSD0199938_1249805270433/project_data_files/0199948//expanded_files/0000b7e6934cb38d16e14b589a9a8cd790b9d753/5280605_0000fa0335fa0fb75bfc4707894429c8255b8f5c//expanded_1/4305_0/000000003.zip | |||
dupe_fingerprint | For email, a value computed according to the Email deduplication Settings, which applies to views of Project Data. For files that are not email, this is always the file MD5 value. Email deduplication Settings are defined as Analytic Settings at the Project level or in an Organization template. The dupe_fingerprint value is available for export by default. This field is not case-sensitive when searched. Example: 00648ad1f073ea686662a2d32554041b | |||
DupeFingerprintSha1 | Yes | For email, a SHA-1 value computed at export according to the Email deduplication Settings and populated in the appropriate export load file. For files that are not email, this is always the file SHA-1 value. Example: fa54f2533583c033633393c745552e73d043f05a | ||
Duplicate | Yes | For export, this field identifies M for the master duplicate in a group of duplicates, Y for an identified (regular) duplicate, and N for a non-duplicate document. | ||
DuplicateCustodian | Yes | For export, a semicolon-delimited list of the Custodians for which this document was a duplicate. This includes the Primary Custodian, if applicable (for example, if the Primary Custodian has multiple duplicates of this document). Note that this field applies to Global (Horizontal) deduplication only (where the software calculates duplicates across all documents in Project Data). Example: Justin;Brett | ||
DuplicateCustodianOther | Yes | For export, a semicolon-delimited list of the Custodians for which this document was a duplicate, except the Primary Custodian. Note that this field applies to Global (Horizontal) deduplication only (where the software calculates duplicates across all documents in Project Data). | ||
Duplicate_Source_DocID | Yes | For export, the document ID of a Duplicate source document. Example: DOC0000000112 | ||
E_MailClient | Yes | For export, the email Parent Type (for example, MS Outlook). | ||
editor | For Web archives, a property used by Microsoft Office applications to identify the document editor. Example: Microsoft Excel 9 | Yes | ||
edocsubject | Text that describes the document subject. Example: Summary Report on Politics in the United States. | Yes | ||
edrmdupeid | This field aids the identification of email duplicates across multiple platforms. It follows the EDRM standard, which involves use of the MD5 hash value of an email Message ID metadata field called EDRM Message Identification Hash (MIH). The calculation is based on the first messageid value in the email header. This field is not case-sensitive when searched. Example: 1D09B1C5A133A77DCA99AF9303BDD303 | New in 5.4.3.0 | ||
emailheader | Includes the entire email header for an email, including custom header fields, if the |
Yes | ||
embeddedchildren | Indicates whether a document has embedded children (documents, images, and/or web image links). For embedded documents, this field contains the value ole. For embedded images (for example, a GIF), this field contains the value image. (By default, embedded images are identified but not extracted.) Note that a given document may have multiple embeddedchildren fields if it supports more than one type (for example, one for ole and one for image). Example: ole |
|||
embeddedlink | Identifies an embedded link to a file (for example, a Microsoft OneDrive file) within an MSG. Example searches: embeddedlink::"https://1drv.ms/b/s!AmWYkEkHMgtSZo_zfsmH2AiMhVo" OR embeddedlink::https\://na01.safelinks.protection.outlook.com/* |
New in 5.1.0.3 | ||
embeddedparent | Identifies the document handle for the parent of an embedded document that was extracted during import (an embedded document is either an EDoc_OLE_Attachment or Message_OLE_Attachment). A parent of an embedded document can be one of the following: EDoc, Message_Attachment, EDoc_OLE_Attachment, or Message_OLE_Attachment. This field also identifies the handle for the parent (scanned image) of a child document from an Image Pro load file. This field is not case-sensitive when searched. Example: f580423a20aa5012a4d355ab496eb0f017ddf2ebbaf0418a919c9954377ddef0 | |||
EndAttach | Yes | For export, the ending Attachment Document ID (for example, for an email). Example: DOC0000000014 | ||
EndDoc | Yes | For export, the ending Document ID. Example: DOC0000000151 | ||
entryid | Populated for files extracted directly from PST files as a 24-byte value (48 characters) in which the first 4 bytes are flags and zeroed, the next 16 bytes are the Provider UID for the PST, and the next 4 bytes are the internal identifier for the entry. This field is case-sensitive for the purposes of search. Example: 00000000331e9d6c9614304e97a4c0bd1859251ce4042000 | |||
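The byte layout described for entryid (4 flag bytes, a 16-byte Provider UID, and a 4-byte internal identifier) can be illustrated with a short Python sketch that splits the 48-character value into its parts. The class and function names are hypothetical.

```python
from typing import NamedTuple


class EntryId(NamedTuple):
    flags: bytes         # first 4 bytes, expected to be zero
    provider_uid: bytes  # next 16 bytes, Provider UID for the PST
    internal_id: bytes   # last 4 bytes, internal identifier for the entry


def parse_entryid(value: str) -> EntryId:
    """Split a 48-character hex entryid into its three components."""
    raw = bytes.fromhex(value)
    if len(raw) != 24:
        raise ValueError("entryid must be 24 bytes (48 hex characters)")
    return EntryId(raw[:4], raw[4:20], raw[20:24])


# Uses the entryid example value from the table above.
entry = parse_entryid("00000000331e9d6c9614304e97a4c0bd1859251ce4042000")
print(entry.provider_uid.hex(), entry.internal_id.hex())
# prints 331e9d6c9614304e97a4c0bd1859251c e4042000
```

Entries extracted from the same PST would be expected to share the provider_uid portion, so comparing that part alone is one way to group entries by source PST.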
ewfcasenumber | For EWF (Expert Witness Compression Format) files (for example, EnCase or SMART), the Case Number, if one was specified during use of an EWF generation utility such as FTK Imager. All EWF segment files for a given raw image contain the same ewfcasenumber value. When the primary EWF segment file is processed, this field is propagated to all children (and their children) extracted from the EWF. Example: 200911231554MMC | Yes | ||
ewfdateacquired | For EWF (Expert Witness Compression Format) files (for example, EnCase or SMART), this field reports the date (in Coordinated Universal Time, as yyyy-MM-dd-HH-mm-ss) on which the EWF files were generated by an EWF generation utility such as FTK Imager. Example: 2009-06-03-14-20-12 | |||
ewfevidencenumber | For EWF (Expert Witness Compression Format) files (for example, EnCase or SMART), the Evidence Number, if one was specified during use of an EWF generation utility such as FTK Imager. All EWF segment files for a given raw image contain the same ewfevidencenumber value. When the primary EWF segment file is processed, this field is propagated to all children (and their children) extracted from the EWF. This field is also populated for Logical Evidence Files (LEF). Example: 101010EVIDENCE0126301263 | Yes | ||
ewfexaminername | For EWF (Expert Witness Compression Format) files (for example, EnCase or SMART), the Examiner Name, if one was specified during use of an EWF generation utility such as FTK Imager. All EWF segment files for a given raw image contain the same ewfexaminername information. This field is also populated for Logical Evidence Files (LEF). Example: John M. Jones | Yes | ||
ewfmd5 | For EWF (Expert Witness Compression Format) files (for example, EnCase or SMART), the MD5 Hash Code of the raw image calculated when the EWF files were generated. All EWF segment files for a given raw image contain the same ewfmd5 value. This field is not case-sensitive when searched. Example: Mon Nov 30 20:45:02 2009 | |||
ewfmediasize | For EWF (Expert Witness Compression Format) files (for example, EnCase or SMART), the size, in bytes, of the EWF raw image. All EWF segment files for a given raw image contain the same ewfmediasize value. This field is also populated for Logical Evidence Files (LEF). This field is not case-sensitive when searched. Example: 1082068 | |||
Exception | Yes | For export, indicates whether a document has an exception, Y or N. | ||
ExceptionDescCode | Yes | For export, an explanation of the exception (for example, CONNECTOR_FAILURE). | ||
exceptiontext | This field appears with a value of Y when the software generates descriptive text for a document without any extracted text (for example, a document that reports parsingstatus: 00005 NODATA or parsingstatus: 00015 NO_EMAIL_BODY). This field will be populated for any document without text, such as container files and images, as well as PDFs or other files that have no data, and emails with no email body. For these documents, you will see the generated text NO TEXT FOUND on the Text tab of the Document Viewer. |
|||
ExportDupePriority | Yes | For duplicates at export, reports the priority of the duplicate based on the execution of Survivorship Model queries. When a document matches more than one of the queries, it is assigned the priority of the query it matched first. | ||
Exported | Yes | Indicates whether a file was expected to be exported. Values for this field are Y or N. Note that a Y in this field does not guarantee that the eligible file was exported successfully, since an unexpected exception such as a Connector failure may prevent the file from actually being exported. | ||
ExportedVolName | Yes | For export, the Volume name of an exported document (for example, V0001). | ||
export_view | An Analytic Metadata field that provides information for an Export Stream. Entries for a given export_view show the stream viewname field, and an export view for the stream name followed by a hyphen and then a volume ID, and additional export view fields such as export_view_reason_code and export_view_docid. Example of export_view with a volume name: exportpriv1-VOL0002 | |||
export_view_docid | An Analytic Metadata field that identifies the DocId for a file in an Export Volume of an Export Stream. |
|||
export_view_dupe_priority | An Analytic Metadata field that identifies the deduplication priority associated with a document subject to export. This field is populated when Survivorship queries are set with a priority for document deduplication at Export (when the Export Duplicates Processing option Remove Duplicates from Export and Load File is set). This field is not intended to be searchable in its entirety. Example: export1-VOL0001:0001 | |||
export_view_enddoc | An Analytic Metadata field that identifies the ending Doc ID of a record exported using page-level numbering. | |||
export_view_reason_code | An Analytic Metadata field that identifies the reason code associated with the exported file. This field is not intended to be searchable in its entirety. Example: export1-VOL0001:SEARCH | |||
extbegattach | The external starting attachment document with a document number (for example, for an email). For a DAT Load file document parent, this field will identify the parent document. | |||
extbegdoc | For a Concordance DAT Load File, identifies the appropriate starting document with a document number (the parent number for a parent, or the child number for an attachment). | |||
extdahandle | Identifies the handle of the data area where an externally processed file is located. This field is not case-sensitive when searched. | |||
extdarelativepath | Identifies the path to an externally processed file in the Document Storage Depot, relative to the path of the export data area. | |||
extdocattachrange | For a Concordance DAT Load File parent document, this field identifies the document numbers of the starting and ending attachments, separated by a - (hyphen). This range starts with the parent document and includes all pages of all child documents (for example, all children with multiple pages of images). | |||
extdocrange | For a Concordance DAT Load File, identifies the document numbers of the starting and ending documents, separated by a - (hyphen). For a parent Image Pro document, this includes all children and any images representing the pages of each multi-page child document. For a parent DAT file document, this is the starting and ending document for the parent itself, which includes the total number of pages for the parent. For any multi-page child document, this includes any images representing the pages of the child document. | |||
extendattach | The external ending attachment document with a document number. For a Load File parent document, this field identifies the last page of the last attachment. | |||
extenddoc | For a Concordance DAT Load File, identifies the ending document with a document number. For a parent document, this is the parent document itself if it is one page, or a number representing the last page of a multi-page parent document. For a child document, this is either the child document itself (if it is a single page) or the image (or document number) representing the last page of the multi-page child document. Example: LFP000013. | |||
extendedfilename | This field appears after import for each document extracted from a file archive. This field contains a nested filename, where the file archive name precedes the document path and filename (using ? as a delimiter). This field applies to the contents of a file archive or nested file archives. This field is available as an export field. To search this field, you can place a portion of the value in quotes, not including any ?, such as the portion appearing after the last ? (for example, for a field value of OfficeFiles.zip?angel/angels-pics.html.1, you can search as follows: extendedfilename:"angel/angels-pics.html.1"). Field Example with multiple ? delimiters: esmall.tar.gz?ersmall.tar?issue3/issuetext.txt | Yes | ||
extendedproperties | For email Contacts and Calendar items, a consolidated list of metadata fields. Upon export, the load file shows the information as a newline-delimited list of fields. | Yes | ||
extnumpages | For a Load File document, identifies the number of pages detected for a multi-page document (for example, the number of pages for an attachment in a Load File Image PDF). | |||
extprocessed | Identifies a file that is a child of an archive processed externally. Example: True | |||
ExtReprocessed | Yes | For export, a field that indicates whether a file has been processed externally and then loaded back into a Project. (External Processing enables you to process OCR Candidates, parent encrypted/protected files, or unsupported format files outside the Project environment and then reload them.) Values for this field are Y or N, depending on whether the extdahandle field is present for the file. (The extdahandle field appears only for a file copied to an export data area for external processing.) Example: Y | ||
family_fingerprint | An Analytic Metadata Field that identifies the dupe_fingerprint value of the parent in a Message Attachment Group (MAG) or Document Attachment Group (DAG). Example: 00648ad1f073ea686662a2d32554041b | |||
family_handle | An Analytic Metadata Field that identifies the appropriate handle value of the family, either the maghandle value for a Message Attachment Group (MAG) or the daghandle value for a Document Attachment Group (DAG). Example: 0be5d0edc5e9d870added630f78ce0914b0cbe244d144de8a22b99994e65aaac | |||
fileattr | A CIFS-only field with a value representing the CIFS file attributes (for example, system, hidden, read only, and archive attributes). | |||
filemd5 | This System Metadata field contains the document's file MD5, a hash code that represents both the content and the embedded metadata for the document. This field is present but not populated with actual file MD5 values in a System Metadata Index (it is all zeros). This field is not case-sensitive when searched. Example: 0be5d0edc5e9d870added630f78ce091 | |||
filemode | For NFS only, a System Metadata field with a value for permissions. This field is not case-sensitive when searched. Example: 34295 | |||
filename | A System Metadata field identifying the name of the document. In general, you can search for any part of this field. For a directory, this is the relative path of the directory. This field is not case-sensitive when searched. Example: 003.tif | Yes | ||
filesha1 | A System Metadata field with a 40-digit value representing the document's calculated SHA-1 value. (SHA-1 is a Secure Hash Algorithm developed by the United States National Security Agency.) This field is not populated in a System Metadata Index. This field is not case-sensitive when searched. Example: d763963b81036337d3f435233f73bb7344a3f6b3 | |||
filesha256 | A System Metadata field with a value representing the document's calculated SHA-256 value. (SHA-256 is a Secure Hash Algorithm developed by the United States National Security Agency for cryptographic security.) This field is not populated in a System Metadata Index. This field is not case-sensitive when searched. Example: c767767fe79dd34ba05f3560f57403613aa014e24c7937a4510add7116039357 | New in 5.4.3.0 | ||
filesubtype | In Parsing Library V2 Projects, identifies additional information for the file type, such as the version for Microsoft Excel, PowerPoint, and Word documents. Possible values include the following:
Excel 2.0, Excel 2007, Excel 3.0, Excel 4.0, Excel 5.0, Excel 97/2003, Microsoft Graph 97/2003, Microsoft Works (4), |
Yes | New in 5.2.5.x. | |
filetype | Text representing the proper name for a given document file type. | Yes | ||
flagcompleted | For items from Microsoft Outlook, the date on which the flagged item was marked by the user as complete. This field uses the format yyyy-MM-dd-HH-mm-ss. Example: 2018-06-07-19-00-00 | |||
flagdue | For items from Microsoft Outlook, the due date set by the user for the flagged item. This field uses the format yyyy-MM-dd-HH-mm-ss. Example: 2018-06-08-19-57-58 | |||
flagstarted | For items from Microsoft Outlook, the date on which the flag was applied to an item. This field uses the format yyyy-MM-dd-HH-mm-ss. Example: 2018-06-07-18-30-00 | |||
forwardto | Identifies the person to whom the email was forwarded. | Yes | ||
from | Identifies the sender of this email or the first user in an Instant Bloomberg message. For an email sent on behalf of another person, this field value may identify the intended (impersonated) sender and may not match the sender field value (the actual sender). For Microsoft Outlook, this field may contain both a name and address (for example, Jim Brown <jbrown@mynetwork.com>). For Lotus Notes, this field typically contains one or more fully qualified names (for example, CN=John Doe/OU=US/O=someco). | Yes | ||
from_identifier | | Yes | New in 5.4.1.0 | |
from_name | | Yes | New in 5.4.1.0 | |
fullparticipants | This consolidated field applies to common email types and formats, and contains all information from the following fields: altbcc and bcc, altcc and cc, altfrom and from, altsender and sender, altto and to. If an email address is not complete, this field contains whatever content was discovered. In the Document Viewer, a separate entry appears for each unique email address. A manifest with this field contains a semicolon-delimited list of unique values. To search for a value in this field, you can use wildcards. Search Example: fullparticipants::"all someco*" Manifest Field Content Example: Bill Smith <bsmith@someco.com>;All Someco <Everyone@someco.com> | Yes | ||
fullpath | Identifies the full path for the file. This field is not visible in the user interface; however, it is captured in the manifest at export. | |||
greeting | For Bloomberg messages, identifies the Bloomberg greeting information (email signature). Example: bob.white@bigco.com | Yes | ||
group | A status property used by Microsoft Office applications. | Yes | ||
group_view | An Analytic Metadata field that identifies a manually created Folder (one that you populate with files). This field appears in the metadata display for a document in a Folder and is searchable, but it does not appear in a manifest. | |||
handle | A System Metadata field with a value representing the document handle. Every document in an imported set/collection of documents is assigned a unique document handle, which is used to identify the document. This field is not case-sensitive when searched. Example: 0be5d0edc5e9d870added630f78ce0914b0cbe244d144de8a22b99994e65aaac | |||
HasMSReviewInfo | Confirms the presence or absence of MS Review Information in one or more of these fields: msreviewcycle, msreviewemailsubject, msreviewauthoraddr, msreviewauthorname, and msreviewprevcycle. Values are Y or N. Example: Y | |||
hastransportheader | For MSGs, indicates whether a valid Message Transport Header is present for a document. Values are Y or N. If you see a value of N in this field, the Transport Header for the MSG is either not present (in which case, the emailheader field, when enabled, would not be populated) or is invalid (in which case, the emailheader field, when enabled, might be populated with some information). Example: Y | |||
hiddendata | Identifies detected hidden data (Excel_Hidden_Columns, Excel_Hidden_Rows, Excel_Hidden_Worksheets, Excel_VeryHidden_Worksheets, PowerPoint_Hidden_Slides, or Word_Hidden_Text). Multiple values are separated by a semicolon and a space. You can search for any part of the field content (for example, hiddendata::excel* finds all Excel documents with hidden content). Example: Excel_Hidden_Rows | Yes | ||
hiddenslidecount | A status property used by Microsoft Office applications to identify the number of hidden slides in the document. Example: 6 | |||
image_exception | This Analytic Metadata field contains true for a document in Project Data that is considered an Image Exception because it matches the configured Image Exception queries (default queries find different types of hidden content, document annotations, and track changes flags). Such documents may warrant special handling before any production versions are generated. Example: true | |||
importance | Identifies the email (or Calendar item, Task, or Journal entry) importance (High, Medium, Low). This field is not case-sensitive when searched. Example: High | |||
ImportArea | Yes | For export, a value (GUID) representing the Connector import data area. Example: 0000eac31a6b270880bb31c6781368d967ba0cd0 | ||
ImportDevice | Yes | For export, text identifying the Connector type and import mount location. Example: nfs://192.167.1.10/MyImports/MyData | ||
importpath | A System Metadata field identifying the import location label and/or the method of import and archive information, if applicable. To search this field, start a standard syntax search with importpath::*\: (to represent the Data Area) followed by the relative path information, either the entire relative path or a portion of it, depending on what you want to find. Remember to use the \ character to escape any colons (:) and spaces in the relative path, and use wildcards as needed. For example, to search for the file <DataArea>:johnb/Endpoint1/104/Task 1/doc24.txt, type importpath::*\:johnb/Endpoint1/104/Task\ 1/doc24.txt. You could also type importpath::*\:johnb/Endpoint1* to search for all files at Endpoint1. | |||
ImportPathInfo | Yes | An export-only field that provides the import path information without the handle. Example: pst\mythreads1\allthreads1.pst:Inbox\threadsdata\000000009.msg | ||
ImportURL | Yes | For export, combines the import mount location and import path (importpath metadata field) as a source URI. If the file is extracted from an archive, it is identified as such. | ||
inode | For NFS only, a System Metadata field (untokenized) with the inode identifier of the file. Example: 2047539 | |||
inreplyto | For email, the MessageID to which this message served as a reply. Example: <BD1F491BE5FEB942A95DC075D99F15CA12E8739CAA@exch-be-01-prod.hq.anetworks.com> | Yes | ||
isattach | This field currently applies to the Short Message Format (SMF) of Cellebrite and indicates whether an item is an attachment (True or False). This field is not case-sensitive when searched. Example: True | New in 5.4.1.0 | ||
journalemailhandle | For Microsoft Exchange journaled emails, contains the handle value of the journal wrapper email for all members in the original email family of one of these wrappers. This field is not case-sensitive when searched. | |||
journalemailparent | For Microsoft Exchange journaled emails, the top-level emails extracted from the wrapper will populate this field with the original parent handle value when the Split Journaled Emails option is enabled. This field is not case-sensitive when searched. | |||
journalemailtype | For Microsoft Exchange journaled emails, this field always identifies the type of parent journaled email | |||
keywords | A property used by Microsoft Office applications to provide a list of keywords, typically separated by commas. Example: war, hate, mutiny | Yes | ||
KFT | Yes | For export, indicates whether the MD5 hash code matches an MD5 hash code for a file type in the National Software Reference Library (NSRL), a database provided by the National Institute of Standards and Technology (NIST). NIST files are often system-related files that provide no value for a Project. Values: Y or N | ||
kftdesc | For a known file type in the reference library (NSRL) from NIST, identifies the source. (A file may have multiple sources, but only one is captured here.) This field is not case-sensitive when searched. Example: IBM ThinkPad 240 | |||
language | A language code identifying a language detected in a document. A document may report multiple languages. The codes follow RFC 3066 and ISO 639-1, which define primary language codes (many with two letters), such as en for English. This field is not case-sensitive when searched. To perform a search of this field, specify language:<language_code> (for example, language:zh* searches for documents with traditional or simplified Chinese). See the topic Supported Languages for Language Detection for a list of languages that can be detected when language detection is enabled, along with their codes. Example: en | |||
lastaccesstime | For NFS/CIFS, a System Metadata field identifying the last time the file was read or accessed, in the format yyyy-MM-dd-HH-mm-ss. | |||
lastchangetime | For NFS only, a System Metadata field identifying the last time the inode information (e.g., owner or group) changed for this file, in the format yyyy-MM-dd-HH-mm-ss. Example: 2011-05-02-15-05-30 | |||
lastmodifiedtime | For NFS/CIFS, a System Metadata field identifying the last time the file was modified, in the format yyyy-MM-dd-HH-mm-ss. | |||
linecount | A value representing the number of lines in the document. Example: 63 | |||
linksdirty | A status property (for custom links that are dirty) used by Microsoft Office applications. This field is not case-sensitive when searched. Example: no | |||
loadfiledocsource |
|
Yes | ||
location | For an email Calendar item, reports the meeting/appointment location. This field may also represent a CIFS location or an Instant Bloomberg Room ID. Example 1: Conference Room 1 Example 2: cifsdata_right:003.tif Example 3 (IB): PCHAT-0x10000055551A1 | Yes | ||
maghandle | Identifies the document handle of the parent email message to which a file is attached. You can search using the maghandle to quickly find all documents in a Message Attachment Group (MAG), such as maghandle::1111. An embedded document parent that is part of a MAG inherits the maghandle. Otherwise, the embedded document parent and its children are part of a DAG and inherit the daghandle instead. A document has either a maghandle or a daghandle, but not both. This field is not case-sensitive when searched. Example: e14ea7b2d289e72c2b4035f92540354fe67c280f817041ae8966d549ac1d6dc1 | |||
mailcategories | For items from Microsoft Outlook, a semicolon-delimited list of named categories. In Outlook, categories are used to group and identify sets of items. Example: mycategory1;mycategory2;mycategory3 | Yes | ||
mailcontainer | Identifies the document handle of the top-level mail container serving as the originating mailstore for a file, such as a PST, NSF, or MBOX container. This field is not case-sensitive when searched. Example: bc5e20d30ddd02dfe3ad836c618f0902181588bd739b4f3e875a0dc490faf38b | |||
mailcontainererror | Identifies a mail container that either has local access protection (it is encrypted/protected), generates an error, or is an empty archive. The text in this field makes it clear if you are not authorized to access this container, or if the container is empty. Example: Archive Empty | |||
mailflags | For MSG emails, this field can contain a single flag or a semicolon-delimited list of flags. When this field contains multiple flags, you can search using a wildcard or by placing the set of flags within quotes. This field is not case-sensitive when searched. Field Content Example, multiple values: MSGFLAG_UNSENT; MSGFLAG_HASATTACH; MSGFLAG_EVERREAD | |||
mailfolder | For import and export, identifies the appropriate folder (for example, Inbox) in the mailstore for a container file such as a PST. For an email in a PST, this could be Top of Personal Folders, which represents the top mailbox folder name. Although the field information may appear after import as a Linux-compatible path, it is always exported as a Windows-compatible path (that is, any / folder separators in the path are changed to \). This field is not case-sensitive when searched. This field does not support phrase search with wildcards. When searching this field, be sure to use the \ character to escape any colons (:) or spaces in the path. If you search for a Windows-compatible path, remember to escape the \ character (for example, use \\). Example: Inbox | Uses path tokenization rules (for word breaks on \ or /) | ||
mailfolderpath | For import and export, this field identifies the relative path to the mail NSF/PST container and the mail folder name. This field is populated for any file with a document class of Message as well as its children. It uses the format <rel_path>:<mail_folder> for messages (other than loose messages and messages from email archives that do not have folders, such as Bloomberg archives). Example (Import, for a Linux-style path): Smith/personal_folder.pst:Inbox. For loose messages and messages from email archives that do not have folders, such as Bloomberg archives, mailfolderpath does not include the GUID/colon. Loose files also do not include the separator/container filename. The field information may appear after import as a Linux-compatible path, but is always exported as a Windows-compatible path (any / folder separators in the path are changed to \). Example (Export, Windows-compatible path): Smith\personal_folder.pst:Inbox. This field is not case-sensitive for the purposes of search and does not support phrase search with wildcards. When searching this field, use the \ character to escape any colons (:) or spaces in the path (for example, mailfolderpath::nsf/MG_PWC_Internet.nsf\:Inbox). To search for a Windows-compatible path, escape the \ character (e.g., use \\). Field Content Example: mail/nsf/MG_PWC_Internet.nsf:Inbox | Uses path tokenization rules (for word breaks on \ or /) | ||
mailpriority | An email priority for both MSGs and EMLs, such as High, Normal, or Low. This field is not case-sensitive when searched. Example: High | Yes | ||
mailstop | A status property used by Microsoft Office applications to identify a mail stop location. This field is not case-sensitive when searched. | |||
MailStore | Yes | For export, the name of the mail store (email container) as it appears on disk. | ||
manager | A property used by Microsoft Office applications to identify the manager of the document. Example: JDoe | Yes | ||
markuphistory | Applies to Microsoft Office documents and PDFs. | Yes | ||
Matter | Yes | For export, the Project/Matter identifier, as derived from the Matter Number in the Project Index Settings (if set). Example: 100 | ||
mboxfromline | For an email extracted from an MBox, the From line information from the MBox header. This field is tokenized and supports wildcard searching. Example: From kjain@myco.com Fri Oct 19 18:58:13 2007 | Yes | ||
mediacreationtime | For an identified audio, video, or image file, the appropriate creation time, yyyy-MM-dd-HH-mm-ss. | |||
mediaencoder | For an identified audio, video, or image file, the media encoder used. Example: Windows Media Video 9 | |||
mediaformat | For an identified audio, video, or image file, this field identifies the media format used. Example: Digital Camera | |||
mediaid | An ID for the media (for example, PST), derived from the eDiscovery Project Index settings that specify the directory hierarchy position relative to the importpath metadata field. This field is not case-sensitive when searched. Example: Jane Smith | |||
mediaid_view | An Analytic Metadata field identifying the mediaid view to which the document is assigned, or Unassigned if the document is not assigned to a mediaid value. This field value will match the mediaid field value unless you have changed the assignment after adding the document to Project Data. If you include this field in the Export Fields used for Export, the load file will identify the document's mediaid view at the time the Export load file was produced. Example: Jane Smith | New in 5.4.0.0 | ||
messageid | For email, a unique ID (alphanumeric value) that identifies an email message. You can use the message ID as part of an email deduplication strategy (set using the Project Analytic Settings or Organization Analytic Settings template). This field may also identify a Bloomberg-specific message identifier for a Bloomberg message. This field is case-sensitive for the purposes of search. Email example: <SPARKLIST-2571986-190368-2000.02.24-11.02.48--mike.barnett#uk.pwcglobal.com@list4.internet.com> | |||
mimeversion | The MIME version for the document. This field is not case-sensitive when searched. | |||
mmclipcount | For a Microsoft Office document, a value representing the number of multimedia clips in the document. Example: 4 | |||
modernattachmentlinks | For parent emails of Modern Attachments, this field provides links to the attachments stored in Microsoft OneDrive and SharePoint. This field supports both MSGs and EMLs. This field is not case-sensitive when searched. To find documents with this field, you can use a modernattachmentlinks::<exists> search. Example: https://1drv.ms/b/s!AeWYkEkHEgtSZo_zfseH3AiEkVe | New in 5.4.2.2 | ||
modifiedby | Text identifying an author who last modified the document. Example: William Morrison | Yes | ||
msgclass | For the file types of email, MS Outlook |
|||
msgsource | Identifies the appropriate source classification for an email: Lotus_Notes (for emails and other items extracted from Lotus Notes NSF files), Outlook (for emails and other items extracted from Microsoft PST/OST files), Bloomberg (for Bloomberg messages extracted from Bloomberg Message Dump XML files), Bloomberg_IB (for Instant Bloomberg messages), MBox (for emails extracted from RFC 822 Mailboxes), Exchange (for email and other items from an Exchange Server), msg (for loose MSGs), and eml (for loose EMLs). | |||
msoutlookmessageclass | Identifies the appropriate Microsoft Outlook MAPI message class. Most sent and received messages are part of the interpersonal message (IPM) message class and have an IPM subclass with an IPM identifier, such as IPM.Note for a note message, IPM.Contact for a contact message, and IPM.Schedule.Meeting.Request for a meeting request. You may also see Report items, such as REPORT.IPM.NOTE.DR for a normal delivery report or REPORT.IPM.NOTE.NDR for a nondelivery report. | |||
msoutlooktextsources |
|
New in 5.2.5.x | ||
msreviewauthoraddr | A Document property providing the email address from which this Microsoft Office document was sent as an Outlook attachment. The information in this field is derived from the Microsoft _AuthorEmail field. This field supports up to 256 characters, including the terminating NULL character. Example: jsmith@someco.com | Yes | ||
msreviewauthorname | A Document property providing the display name of the email address from which this Microsoft Office document was sent as an Outlook attachment. The information in this field is derived from the Microsoft _AuthorEmail field. This field supports up to 256 characters, including the terminating NULL character. Example: Joe Smith | Yes | ||
msreviewcycle | A Document property providing the unique identifier of a Microsoft Office document sent for review as an Outlook email attachment. The information in this field is derived from the Microsoft _AdHocReviewCycleID field. Example: 3274586009 | |||
msreviewemailsubject | A Document property providing the subject of the email from which this Microsoft Office document was sent as an Outlook attachment. The information in this field is derived from the Microsoft _EmailSubject field. This field supports up to 256 characters, including the terminating NULL character. Example: Mandatory Email Review | Yes | ||
msreviewprevcycle | A Document property providing the original review cycle identifier of a Microsoft Office document previously sent and then resent for review as an Outlook email attachment. The information in this field is derived from the Microsoft _PreviousAdHocReviewCycleID field. Example: 7112356893 | |||
NativeLink | Yes | For export, the path to the renamed Native file. By default, this field uses a prefix of DR\. The path is Windows-compatible (using \ to separate folders). The format of this field is configurable in the Export dialog, so it may contain the full path of the Volume directory, including a base path, Volume label and Volume #, document ID prefix, and starting ID (using the appropriate Pad size), and then the file extension. | ||
NativeLinkBytes | Yes | For export, the size (in bytes) of the file produced by the export (that is, the size of the Native (or HTML-formatted email) file whose path is identified in the NativeLink field). Example: 1567 | ||
NativeLinkExt | Yes | For export, the file extension of the Native (or HTML-formatted email) file whose path is identified in the NativeLink field. Example: htm | ||
NearDupe | Yes | For export, whether this document is a Near Duplicate of another document, Y or N. This field can also contain ASSOC if the document is part of a Message Attachment Group (MAG), part of which is a near dupe of another document. | ||
NearDupePivotDocID | Yes | For export, the document ID of a Near Duplicate source document. This field is populated for any Near Duplicate of the source document as well as the source document itself. | ||
NearDupePivotSimilarity | Yes | For export, the calculated Near Duplicate Similarity value, that is, the calculated similarity of the document to the pivot document (for example, 1). The similarity value will be between 0 and 1. | ||
NearDupePivotSource | Yes | For export, identifies whether this document is a Near Duplicate source document, Y or N. | ||
notescount | This field identifies the number of notes in the document. Example: 10 | |||
nsfform | For items extracted from Lotus Notes NSF files, this field identifies the appropriate Form field type. For emails, the form can be Document, Response, Memo, Reply, NonDelivery Report, Return Receipt, Delivery Report, Reply with History, ArchiveStub, or Phone Message. For Calendar items, the form can be Appointment, Notice, or (ReplyNotice). Other types include Task, JournalEntry, and Person (which is a vCard contact). If the Form field type is not present, it is listed as Unavailable. Unsupported or unknown form types will generate the appropriate parsingstatus, either Unsupported Lotus Note (error code 00043) or Unknown Lotus Note (error code 00044). Example: Memo | Yes | ||
nsftemplate | For Lotus Notes NSF files, this field identifies the name of the template used when creating the NSF file or Unavailable if the template information does not exist. This field helps characterize NSF data. Example: StdR7Mail. | Yes | ||
nsfunavailableformdetails | For Lotus Notes NSF files for which the Lotus Notes Form type is not present, this field identifies the additional information derived by the Digital Reef software. If the Digital Reef software can derive enough information (for example, MIME content) to enable processing of the file as an EML, the parsing status Mapped Lotus Note Form (code 00045) is returned. If the Digital Reef software can determine that the type is a Script, the parsing status Unsupported Lotus Note (error code 00043) is returned. | Yes | ||
NumAttach | Yes | For export, the number of attachments to the parent email or other item. | ||
numemailparticipants | For email, a Digital Reef property identifying the number of unique email participants, based on the to, from, bcc, cc, and sender fields, as well as the Lotus Notes altto, altfrom, altbcc, altcc, and altsender fields. You can search this field to help pinpoint email sent exclusively from one person to another. This field supports range searches and is automatically padded for searches. Example 1: numemailparticipants::[1~~100] Example 2: from:: joe ross AND to::jane jones AND numemailparticipants::2 | |||
numemailrecipients | For email, a Digital Reef property identifying the number of unique email recipients, based on the to, bcc, and cc fields, as well as the Lotus Notes altto, altbcc, and altcc fields. You can search this field to help pinpoint email sent exclusively to certain recipients. This field supports range searches and is automatically padded for searches. Example 1: numemailrecipients::[1~~50] Example 2: to:: John Smith/Marketing/Bigco OR to::Mike Jones/Sales/Bigco OR altto::john.smith@bigco.com AND numemailrecipients::[1~~3] Example 3: altto::jane.doe@someco.com AND altcc::bob.smith@someco.com AND to::CN=John Doe/OU=US/O=someco AND numemailrecipients::3 | |||
ocraverageconfidencelevel | A value in the range 0-100 indicating the average OCR Confidence level calculated for a document subject to OCR processing. A value of 0 is the lowest confidence level and a value of 100 is the highest confidence level. This field uses a padded 5-digit value (for example, 00010 is Confidence level 10) to support range searching. For example, to search for documents whose average OCR Confidence level is in the inclusive range 20-60, you would specify ocraverageconfidencelevel::[00020~~00060]. | |||
ocrlowestconfidencelevel | A value in the range 0-100 indicating the lowest OCR Confidence level calculated for any page in a document. This field uses a padded 5-digit value (for example, 00010 is Confidence level 10) to support range searching. For example, to search for documents whose lowest OCR Confidence level is in the inclusive range 0-10, you would specify ocrlowestconfidencelevel::[00000~~00010]. | |||
ocrpath | This field identifies the external OCR path, which includes the Data Area information and the document handle for a given OCR-processed file. This field is not available for Export. | |||
ocrstatus | Text and value indicating the status of an OCR operation, either OCR (to indicate success), EXTOCR (for external OCR processing from a load file), or an exception (OCRFAIL, OCRTIMEOUT, CONVERTFAIL, NOTXT, or UNINITIALIZED). A metadata search of a value and/or text in this field is not case-sensitive. A document with a status of EXTOCR is not subject to OCR processing again during indexing. Example: OCR | Yes | ||
ocrsuspectwords | This field identifies the number of words the OCR software flags as potentially suspect (that is, they might not be valid words) after OCR processing. Example: 36 | |||
ocrtotalpages | This field identifies the total number of pages after OCR processing. Example: 120 | |||
ocrvalidpages | This field identifies the number of pages considered valid pages after OCR processing. Example: 106 | |||
ocrwords | This field identifies the number of extracted words after OCR processing. Example: 10269 | |||
office | A status property used by Microsoft Office applications. This field is not case-sensitive when searched. | |||
OLEChildID | Yes | For export, the document ID of each child embedded document of a given parent. Each ID is separated by a semicolon. | ||
OLEParentID | Yes | For export, the document ID of the parent document for a given child embedded document. | ||
OrderByDate | Yes | For export, the email Message Attachment Group Order (MAGORDER) value. | ||
origdocext | For import and export, this field is populated when the file's doc extension does not match its file type (e.g., for a doc.zip, which should have been an exe, this field will contain exe to indicate what the extension should have been, and docext will contain zip, which indicates the way it was seen on disk). This field is not case-sensitive when searched. Example: html | |||
origparsingstatus | For import as well as export, this field is populated for a file that has been reprocessed or externally processed, to report the original parsing status after import. This field is also populated after OCR processing. (Note that for export, only the parsing status text, not the code, is provided.) Example: 00029 PROTECTED | Yes | ||
osfolder | For import and export, a System Metadata field with the OS path to the document (if not part of a container). It is essentially the import path minus the folders used for custodian and mediaid. Although the field information may appear after import as a Linux-compatible path, it is always exported as a Windows-compatible path (that is, any / folder separators in the path are changed to \). This field is not case-sensitive when searched. This field does not support phrase search with wildcards. When searching this field, be sure to use the \ character to escape any colons (:) or spaces in the path. Example (after import): tests/SearchTests | Uses path tokenization rules (for word breaks on \ or /) |
||
owner | Text identifying the owner of the document. | Yes | ||
ownergroup |
|
|||
owneruser |
|
|||
pagecount | A value representing the number of pages in the document. This field can be populated for Microsoft Office documents and PDFs. Example: 10 | |||
PageCount | Yes | If page-level numbering is selected for an eDiscovery Export Stream, this field reports the number of pages produced for each document in the export. | ||
paragraphcount | This field identifies the document paragraph count. Example: 17. | |||
parent | A value representing an email parent. This field is not case-sensitive when searched. Example: e14ea7b2d289e72c2b4035f92540354fe67c280f817041ae8966d549ac1d6dc1 | |||
ParentContainer | Yes | For export, the document ID of the parent container, or blank if NA. | ||
ParentContainerType | Yes | For export, the device type indicating the email archive (for example, application/msoutlook). | ||
ParentID | Yes | For export, the document ID of each parent. | ||
parseduration | Identifies the amount of time (in milliseconds) it took to parse the document, which can aid in troubleshooting. You can search this field using a regular field search with a value, or using a range search. Example: 2000 | New in 4.3.11.0 | ||
parserversion | This field identifies the release number. Example: R5.2.0.0. | |||
parsingstatus | A System Metadata field with a 5-digit value and text indicating the status of a parsing operation, such as 00000 SUCCESS or a brief description of an exception, such as 00005 NODATA. Both the code and the text appear in the metadata display in the Document Viewer and in a generated manifest that includes the field. A metadata search of a value and/or text in this field is not case-sensitive. (Note that in an export load file, this field will show only the parsing status text, not the code.) Example of field content displayed in the Document Viewer or in a view manifest with the field: 00000 SUCCESS | Yes | ||
participantdomains | For email, text identifying the domain portion of any email participant (for example, company.com). For Lotus Notes, this field will include all unique sending or receiving domains identified in the from, altfrom, bcc, altbcc, cc , altcc, |
Yes | ||
participants | This consolidated field applies to common email types. For Microsoft Outlook, it contains a list of SMTP style email addresses representing each email participant, separated by a semicolon. For Lotus Notes, this field typically contains one or more fully qualified names (for example, CN=John Doe/OU=US/O=someco). If an email address is not complete, this field contains whatever content was discovered. The terms in the participants field therefore typically represent a subset of the to, from, bcc, cc, and sender fields. This field supports wildcards for searching. Example: list*@internet.com;mike.cole@*global.com | Yes | ||
participants_identifier | This field currently applies to the Short Message Format (SMF) of Cellebrite, and provides the unique identifiers of all participants who sent or received this message. Sample values include phone numbers, email addresses, and usernames. Example: +15555555555; +19999999999 | Yes | New in 5.4.1.0 | |
participants_name | This field currently applies to the Short Message Format (SMF) of Cellebrite, and provides the names of all participants who sent or received this message. Example: Jim Smith; Jane Allen | Yes | New in 5.4.1.0 | |
password | For a document subject to password cracking as part of reprocessing, or a BitLocker-encrypted disk partition subject to password cracking or processing/reprocessing using a Container Key file, this field contains either the password used to open, parse, and render the document, or the 48-digit key or password used to decrypt and process the partition (the password or key discovered first is the one that appears in the password field). Multiple BitLocker-encrypted partitions will yield multiple password field entries, where each entry reports the password or key used. The password field entries appear for a parent EWF, Virtual Hard Drive (VHD), or file representing a decrypted partition (for example, from an LEF with a BitLocker-encrypted partition). For users with Organization Administrator permissions only, the password appears in the Metadata portion of the Document Viewer |
|||
pattern | This field applies to a document with Pattern matches to enabled Project Patterns, as long as the document is part of a new Data Set import, an update of Patterns, or subject to reprocessing. In the Document Viewer, a pattern field entry appears for each enabled Pattern (for example, one for Pattern unc, one for Pattern uri, and one for Pattern email). In a manifest, this field contains a semicolon-delimited list of Pattern names for the matching, enabled Patterns (once per enabled Pattern). To search this field, use the format pattern::<patternname> and type the Pattern name in either lowercase or uppercase format, since the software always uses lowercase format, regardless of which case you use. Search Example: pattern::email and Manifest Field Content Example: unc;uri;email | New in 4.3.11.0 | ||
patternvalue | This field applies to a document with Pattern matches to Project Patterns that are both enabled and have values stored, as long as the document is part of a new Data Set import, an update of Patterns, or subject to reprocessing. In the Document Viewer, a patternvalue field entry appears for each value stored (for example, one for an email address, one for a UNC path, and one for a given URL). In a manifest, this field contains a semicolon-delimited list of unique, matching values. To search for a value in this field, use a literal search in which you place the full value within single quotes. Search Example: patternvalue::'bsmith@someco.com' Manifest Field Content Example: \\server\; http://www.state.gov/s/ct/; bsmith@someco.com |
New in 4.3.11.0 | ||
PDFLink | Yes | The path to a PDF converted file, if the option was selected on export. This field uses a prefix of DR\. The path is Windows-compatible (using \ to separate folders), and the PDF converted file always ends in .pdf (e.g., DR\00001\DOC00000001.pdf). The format of this field is configurable in the Export dialog, so it may contain the full path of the Volume directory, including a base path, Volume label and Volume #, document ID prefix, and starting ID (using the appropriate Pad size), and then the file extension. | ||
PDFLinkBytes | Yes | For export with PDF versions requested, the size (in bytes) of the PDF produced by the export (that is, the size of the file whose path is identified in the PDFLink field). Example: 1799 | ||
processedstatus | This field reports the status of a reprocessed file that was loaded back to the system from an external location (export data area). Possible values are UPDATED_ORIGINAL, EXT_REPROCESSED, or EXT_ORIGINAL. If the load operation includes the Update Native option, UPDATED_ORIGINAL indicates that the original native file at the export data area was updated when the file was loaded back to the system. If the load operation does not include the Update Native option, EXT_REPROCESSED indicates that the native file was externally reprocessed at the export data area (but the original native was not updated when loaded back to the system). EXT_ORIGINAL is reported for children of reprocessed native files. | Yes | ||
ProducedCustodian | Yes | For export, a semicolon-delimited list of the Custodians for which this document was produced. | ||
project | A status property used by Microsoft Office applications to identify the project associated with the document. This field is not case-sensitive when searched. | |||
ProjectMatterName | Yes | For export, the Project or Matter Name derived from the Matter Name (in the Project Index Settings). If you do not set a Matter Name in the Index Settings, the Project Name (assigned during Project creation) is used. Example: Proj1 | ||
publisher | Identifies the entity responsible for making the document available (a person, an Organization, or an application such as Acrobat Distiller 8.0.0). Example: Mac OS X 10.4 Quartz PDFContext | Yes | ||
rcvddomains | For all email, identifies the domain part of an email (received). For Lotus Notes, this field will also include unique received domains in the altbcc, altcc, and altto fields as well as the bcc, cc, and to fields. Multiple domains are separated by a semicolon. Example: uk.pwcglobal.com | Yes | ||
readreceiptrequested | This field reports Y if an email requested a read receipt. If this field is populated, it has a value of Y or N. This field applies to an MSG or an EML. | |||
received | For email, this field identifies the received date of the email in the format YYYY-MM-DD-HH-mm-ss. YYYY is the 4-digit year, MM is the 2-digit month, DD is the 2-digit day, HH is the 2-digit hour, mm is the 2-digit minute, and ss is the 2-digit second value. Values aside from YYYY are padded when necessary. Example: 2000-02-24-18-10-29 | |||
receivedexists | This field identifies whether an email had its own received value. If so, this field will be populated and contain Y. If the email had no received value of its own, this field will not be populated. Example: Y | |||
receivedoffset | This field identifies the offset from UTC for an email's received time using the format sHHmm, where: s is a + or - sign indicating whether time of day is ahead of (east of) or behind (west of) Coordinated Universal Time, HH identifies the offset hours, and mm identifies the offset minutes. This is commonly known as the Greenwich Mean Time (GMT) offset (+ or - GMT, for example, -0400). This field is not affected by the time zone selected for the Project or for Export. | |||
receivedzone | When available for email, an acronym (for example, EST or EDT) representing an email’s received time zone in Coordinated Universal Time, UTC (GMT). This field is not case-sensitive when searched. This field is not affected by the time zone selected for the Project or for Export. Example: EDT | |||
receiveheader | This field identifies the email receive header information. Example: from list4.internet.com(207.250.144.10) by tee1.uk.pw.com via smap (4.1) id xma001821; Thu, 24 Feb 2000 18:12:05 +0000 (GMT) | |||
recipient | In general, this field is intended to identify an email recipient. This field currently applies to the Short Message Format (SMF) of Cellebrite and identifies a recipient. | Yes | ||
recipient_identifier | This field currently applies to the Short Message Format (SMF) of Cellebrite, and provides the unique identifier of a recipient of this message. Sample values include phone numbers, email addresses, and usernames. Example: +16666666666 | Yes | New in 5.4.1.0 | |
recipient_name | This field currently applies to the Short Message Format (SMF) of Cellebrite, and provides the name of a recipient of this message. Example: John Doe | Yes | New in 5.4.1.0 | |
recipientsmd5 | A value representing the recipients MD5 for an email. Example: 057ec6575a2fa91ddbf7721be6b7cda0 | |||
recordedby | Identifies the entity or program responsible for recording the file (for example, video data). This field is not case-sensitive when searched. | |||
recordeddate | This field identifies the document’s recorded date in the format yyyy-MM-dd-HH-mm-ss. Example: 2009-04-28-15-37-28 | |||
reference | A status property used by Microsoft Office applications to identify a document reference. This field is not case-sensitive when searched. | |||
references | Identifies a list of references for an email. This field is not case-sensitive when searched. Example: <BD1F491BE5FEB942A95DC075D99F15CA41765FD488@exch-be-01-prod.hq.anetworks.com> <BD1F491BE5FEB942A95DC075D99F15CA41765FD48D@exch-be-01-prod.hq.anetworks.com> <BD1F491BE5FEB942A95DC075D99F15CA41765FD48E@exch-be-01-prod.hq.anetworks.com> | |||
Responsive | Yes | Identifies why the document is part of the export, either SEARCH (tagged documents and MAGS), ND_ASSOC (documents near-duped), ND_ASSOC_MAG (MAG part of which is near-duped), or TR_ASSOC (thread associated to a tagged MAG). | ||
returnpath | For RFC-822, identifies a path (for example, address) back to the originator of the message. This field is not case-sensitive when searched. Example: spambayes-bounces@python.org | |||
reviewfolder | For import and export, a metadata field with path information in a format that facilitates folder creation in a downstream review tool (for example, Relativity). All members of an email family (MAG) or all members of an email mailbox will have the same reviewfolder information. Although the field information appears after import as a Linux-compatible path, it is always exported as a Windows-compatible path (that is, any / folder separators in the path are changed to \). This field is not case-sensitive when searched. This field does not support phrase search with wildcards. For example, reviewfolder::"*jjones*" is not valid, but reviewfolder::*jjones* is valid. In general, use this search without quotes for any single part of a path with or without wildcards, or for any consecutive parts of a path without wildcards (for example, reviewfolder::jjones/email is valid). If you need to use wildcards to find more parts of a given path, you can use importpath instead to search across the directories in the path. Example of reviewfolder field contents (after import): /D1/jjones/email/jjones.pst/Inbox | Uses path tokenization rules (for word breaks on \ or /) |
||
revision | A status property used by Microsoft Office applications to identify the document revision number (for example, 2). This field is not case-sensitive when searched. Example: 13 | |||
scale | A status property used by Microsoft Office applications (for example, for images), identifies yes or no. This field is not case-sensitive when searched. | |||
sdcontrol |
|
|||
SearchID | Yes | For export, one or more IDs (separated by a semicolon). Each ID represents an individual Tag applied (for example, to a Work Basket Search task). The <volume>-TagReasonCodes.csv contains an entry for each Search ID. Each entry identifies the User who performed the operation, the Date and time of the operation, the associated Work Basket Description, and Comments applied during the Tagging operation. | ||
searchterms | Yes | For a document that is part of an Export set up to Generate Search Reports based on a list of submitted search queries, this field provides a semicolon-separated list of the matching search terms/queries. Example: demo;newsletter;(of the) | ||
sender | Identifies the sender of this email (the actual sender, not an impersonated sender). For Microsoft Outlook, this field may contain both a name and address (for example, Jim Brown <jbrown@mynetwork.com>). For Lotus Notes, this field typically contains one or more fully qualified names (for example, CN=John Doe/OU=US/O=someco). | Yes | ||
Sender_Address | Yes | For export, identifies the address portion of an email sender (based on the sender field). Since there is a potential performance impact associated with populating fields that break the email address into portions, you may want to omit this and similar _Address and _Name fields in the Project Export Fields template. | ||
Sender_Name | Yes | For export, identifies the name portion of an email sender (based on the sender field). Since there is a potential performance impact associated with populating fields that break the email address into portions, you may want to omit this and similar _Address and _Name fields in the Project Export Fields template. | ||
sensitivity | For email Calendar items, Tasks, or Journal entries, identifies the sensitivity of the item. For example, for Outlook, this could be Normal, Personal, Private, or Confidential. Example: Confidential | Yes | ||
sent | For email, this field identifies the sent date of the email, yyyy-MM-dd-HH-mm-ss, where yyyy is the 4-digit year, MM is the 2-digit month, dd is the 2-digit day, HH is the 2-digit hour, mm is the 2-digit minute, and ss is the 2-digit second value. Values aside from yyyy are padded when necessary. |
|||
sentdomains | For all email, identifies the domain part of the sent email. For Lotus Notes, this field will also include all unique sent domains in the altfrom or altsender field as well as the from or sender field. Multiple domains are separated by a semicolon. Example: yahoo.com | Yes | ||
sentexists | Indicates whether an email had its own sent value. If so, this field will be populated and contain Y. If the email had no sent value of its own, this field will not be populated. Example: Y | |||
sentoffset | Identifies the offset from UTC for an email's sent time using the format sHHmm, where: s is a + or - sign indicating whether time of day is ahead of (east of) or behind (west of) Coordinated Universal Time, HH identifies the offset hours, and mm identifies the offset minutes. This is commonly known as the Greenwich Mean Time (GMT) offset (+ or - GMT, for example, -0400). This field is not affected by the time zone selected for the Project or for Export. | |||
sentzone | When available, an acronym (for example, EST or EDT) representing an email’s sent time zone in Coordinated Universal Time, UTC (GMT). This field is not affected by the time zone selected for the Project or for Export. This field is not case-sensitive when searched. Example: PDT | |||
size | A System Metadata field (untokenized) that identifies the size of the document. The default unit of measure is Bytes. Example: 61604 B | |||
slidecount | This field identifies the count of slides in the document (a Microsoft PowerPoint document). Example: 20 | |||
source | A reference to a resource from which this document resource is derived, or an application, such as Microsoft Excel. This field is not case-sensitive when searched. Example: Microsoft Word 10.0 | |||
status | A status property that identifies information about the document status, if applicable. This field is not case-sensitive when searched. | |||
stored_image | Identifies whether a document has stored images available for use during Export/Production. This field is populated once images have either been generated for Export (Internal) or imported as baseline images as part of an External Image Import or Load File Import (External). It contains one of the following values: Internal (generated internally), Internal - <timezone> (for a document affected by the Export time zone selected for Export, such as emails or Calendar items), External (imported images from an External Image Import or Load File Import), or Error (for a document that failed conversion, for example, if the software could not create a PDF from the provided images during export of external images). Example: External | |||
subject | Text that describes the email subject. (For a document subject, see the edocsubject field instead.) For Instant Bloomberg Messages, the subject identifies the chat Room ID, which includes whether the Room Type is PCHAT (Persistent CHAT in which the chat room stays open even when the last person leaves) or CHAT (normal CHAT in which the chat room closes after the last person leaves). Example 1: RE: Release 1.0 Branch Bugfixes Example 2 (IB): PCHAT-0x10000055551A1 | Yes | ||
TagID | Yes | For export, one or more tags that were applied to the document separated by a semicolon (;). Example: Responsive; Privileged; Hot Doc | ||
tag_view | An Analytic Metadata field for the name of the Tag view. Example: tagpriv | |||
TextLink | Yes | Within the export hierarchy, the path to a produced text file. By default, this field uses a prefix of DR\. The path is Windows-compatible (using \ to separate folders), and the text file always ends in .txt (e.g., DR\00001\DOC00000001.txt). The format of this field is configurable in the Export dialog, so it may contain the full path of the Volume directory, including a base path, Volume label and Volume #, document ID prefix, and starting ID (using the appropriate Pad size), and then the file extension. | ||
TextLinkBytes | Yes | For export with Text versions requested, the size (in bytes) of the Text file version produced by the export (that is, the size of the file whose path is identified in the TextLink field). Example: 8578 | ||
thread_handle | An Analytic Metadata field for the handle of the Email Thread. Example: 56af58583b40c04aec18a7ce541010f1cf39426834be403e8b186691954e633a | |||
ThreadGroupID | Yes | For export to Relativity, . |
||
ThreadGroupIndent | Yes | For export to Relativity, this field can be sorted to organize the emails in a thread by date and thread response hierarchy. This field can be used with the Relativity Thread GroupIndent to create an Indented List Relativity View. Example: 2 | ||
ThreadGroupSort | Yes | For export to Relativity, this numeric field indicates an email’s depth in a thread response hierarchy. This field can be used to control the indent level of an Indented List Relativity View sorted by ThreadGroupSort. Example: DOC0000000158.1.1.1 | ||
ThreadID | Yes | For export, the document ID followed by a period and numeric sequence representing an email in a thread. Example: DOC0000000158.0000 | ||
ThreadIDOrphanRef | Yes | For export, the list of document IDs for Orphan threads separated by a semicolon. | ||
ThreadIDParent | Yes | For export, identifies the status of an email in a thread, either Y (for parent), N, Child, or Orphan. | ||
ThreadIDParentRef | Yes | For export, the document ID for the parent of an email thread. | ||
ThreadInclusive | Yes | For export, |
||
threadindex | For email, represents the threadindex extracted from the imported email. This field is not case-sensitive when searched. Example: AcnqEFLMwQAjntPKTeqIvsKy/XcjmgAKoVrgABOMc3AAAFRW4AAABpZAAAARPEAAAA +eAAAAI+WwAAAyGYA= | |||
threadtopic | For email, a value indicating the thread topic ID. Example: Release 1.0 Branch Bugfixes | Yes | ||
TimeAccessed | Yes | This field has been deprecated. For export, the time portion of the embedded metadata field dateaccessed, reporting when the document was accessed. The default format is HH:mm:ss. Example: 13:16:33 | ||
TimeBackedUp | Yes | For export, the time portion of the embedded metadata field datebackedup, reporting when the document was backed up. The default format is HH:mm:ss. Example: 13:12:33 | ||
TimeCompleted | Yes | For export, the time portion of the embedded metadata field datecompleted, reporting when the document was completed. The default format is HH:mm:ss. Example: 3:11:33 | ||
TimeCreated | Yes | For export, the time portion of the embedded metadata field datecreated (or the field createdtime, if no datecreated value is available), reporting when the document was created. The default format is HH:mm:ss. Example: 19:16:33 | ||
TimeCreateSystem | Yes | For export, the time portion of the System Metadata field createdtime (for NTFS with CIFS), reporting when the file was created. The default format is HH:mm:ss. Example: 20:19:25 | ||
TimeDue | Yes | For export, the time portion of the embedded metadata field datedue, reporting when the document was due. The default format is HH:mm:ss. Example: 13:10:23 | ||
TimeEdited | Yes | For export, the time portion of the embedded metadata field dateedited, reporting when the document was edited. The default format is HH:mm:ss. Example: 12:12:33 | ||
TimeEnded | Yes | For export, the time portion of the embedded metadata field dateended, reporting when an email Calendar item or Bloomberg CHAT ended. The default format is HH:mm:ss. Example: 11:12:43 | ||
TimeEWFAcquired | Yes | For export, the time portion of the embedded metadata field ewfdateacquired, reporting the EWF acquisition time. The default format is HH:mm:ss. Example: 13:12:33 | ||
TimeFlagCompleted | Yes | For export of Microsoft Outlook items, the time portion of the field flagcompleted, reporting when the flagged item was marked by the user as complete. The default format is HH:mm:ss. Example: 10:12:43 | ||
TimeFlagDue | Yes | For export of Microsoft Outlook items, the time portion of the field flagdue, reporting when the due date was set by the user for the flagged item. The default format is HH:mm:ss. Example: 19:10:43 | ||
TimeFlagStarted | Yes | For export of Microsoft Outlook items, the time portion of the field flagstarted, reporting when the flag was applied to an item. The default format is HH:mm:ss. Example: 11:42:59 | ||
TimeLastAccessed | Yes | For export, the time portion of the NFS/CIFS System metadata field lastaccesstime, reporting when the document was last accessed. The default format is HH:mm:ss. Example: 13:12:33 | ||
TimeLastChanged | Yes | For export, the time portion of the NFS/CIFS System metadata field lastchangetime, reporting when the document was last changed. The default format is HH:mm:ss. Example: 14:12:43 | ||
TimeLastMod | Yes | For export, the time portion of the embedded metadata field datemodified (or the field lastmodifiedtime, if no datemodified value is available). This field reports when the document was last modified. The default format is HH:mm:ss. Example: 13:15:33 | ||
TimeLastModSystem | Yes | For export, the time portion of the NFS/CIFS System Metadata field lastmodifiedtime, reporting the last time the file was modified. The default format is HH:mm:ss. Example: 20:04:27 | ||
TimeLastPrinted | Yes | For export, the time portion of the embedded metadata field dateprinted, reporting when the document was last printed. The default format is HH:mm:ss. Example: 22:15:26 | ||
TimeMediaCreated | Yes | For export, the time portion of the embedded metadata field mediacreationtime, reporting the media creation time. The default format is HH:mm:ss. Example: 12:42:33 | ||
TimePrimary | Yes | For export, the time portion of the dateprimary field information. The default format is HH:mm:ss. Example: 10:09:12 | ||
TimeProcessed | Yes | For export, the time portion derived from the datescanned field, reporting when the document was last processed. The default format is HH:mm:ss. The export-only field DateProcessed contains the date portion. Example: 12:12:20 | ||
TimeReceived | Yes | For export, for documents that have a receivedexists value, the time portion of the received field, reporting when an email was received. The default format is HH:mm:ss. Example: 18:19:25 | ||
TimeRecorded | Yes | For export, the time portion of the recordeddate field, reporting when the document was recorded. The default format is HH:mm:ss. Example: 11:12:33 | ||
TimeSent | Yes | For export, for documents that have a sentexists value, the time portion of the sent field, reporting when an email was sent. The default format is HH:mm:ss. Example: 13:10:20 | ||
TimeStarted | Yes | For export, the time portion of the datestarted field, reporting when an email Calendar item or Bloomberg CHAT started. The default format is HH:mm:ss. Example: 13:12:33 | ||
TimeTaken | Yes | For export, the time portion of the datetaken field, reporting when the image (such as a JPG) was taken. The default format is HH:mm:ss. Example: 10:30:00 | New in 5.2.5.x | |
TimeZone | Yes | For export of email, identifies the time zone for the exported dates. Example: UTC | ||
TimeZoneOffset | Yes | For export, the hour offset of the selected export time zone (as identified in the TimeZone field). If the document is an email, and that email has sent field metadata, the offset reflects any daylight saving time rules in effect at that sent time for the selected time zone. The offset is a whole number. Example: -6 | ||
title | Text identifying the title of the document. Example: Experimental Proposal | Yes | ||
to | For email, identifies each intended recipient of this email. For Microsoft Outlook, this field may contain both a name and address (for example, mike.barnett@uk.pwcglobal.com, mark.cole@uk.pwcglobal.com). For Lotus Notes, this field typically contains one or more fully qualified names (for example, CN=John Doe/OU=US/O=someco). When you display this field after import, each recipient is separated by a comma or semicolon, depending on the source. For export, each recipient is always separated by a semicolon (for example, jason.smith@uk.pwcglobal.com; bill.stark@uk.pwcglobal.com; mona.wall@uk.pwcglobal.com). | Yes | ||
to_identifier | |
Yes | New in 5.4.1.0 | |
to_name | |
Yes | New in 5.4.1.0 | |
trackchanges | Identifies Microsoft Word documents that have the Track Changes feature enabled. This field is populated and set to true when track changes is detected in Word documents, and you can search for trackchanges::true. For Parsing Library V2, trackchanges is supported for .docx files only (that is, XML-based Word documents), not .doc files (Word documents that are not XML-based). Note: Excel Track Changes is reported by the docannotations field. Example: true | |||
typist | A status property used by Microsoft Office applications to identify the document typist. This field is not case-sensitive when searched. | |||
unid | Populated for files extracted directly from Lotus Notes NSF files. The unid is a 16-byte value (32 characters) in which the first 8 bytes represent the file (database) component and the second 8 bytes represent the Notes component (essentially, both components are internal timestamps). This field is not case-sensitive when searched. Example: 615e68637a552aaa8525755b00517535 | |||
virus | If Virus Detection is enabled as an Index Setting, this field identifies the name of a virus detected for a document. Note that if a virus is detected for an attachment, both the attachment and its parent report the virus detected in this field. Example (for an EXE): Win.Worm.Runouce-377 | Yes | ||
VolName | Yes | For export, identifies the volume in which this document ID was assigned (not necessarily produced). It may not always match the value of the ExportedVolName field. | ||
wordcount | A value representing the document word count. This field typically applies to Microsoft Office documents. Example: 5456 |
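The unid row above describes a 32-character value whose first 8 bytes are the file (database) component and whose second 8 bytes are the Notes component. The following is a minimal sketch of splitting an exported unid value into those two halves. The function name split_unid is hypothetical and is not part of the product. The sketch assumes the value is plain hexadecimal text, as in the documented example.

```python
import string


def split_unid(unid: str) -> tuple[str, str]:
    """Split a 32-character unid into (file component, Notes component)."""
    # Normalize case: the field is documented as not case-sensitive when searched.
    value = unid.strip().lower()
    if len(value) != 32:
        raise ValueError(f"expected 32 characters, got {len(value)}")
    # Assumption: the value is hexadecimal text, as in the documented example.
    if not all(ch in string.hexdigits for ch in value):
        raise ValueError("unid contains non-hexadecimal characters")
    # 8 bytes = 16 hex characters per component.
    return value[:16], value[16:]


file_part, notes_part = split_unid("615e68637a552aaa8525755b00517535")
print(file_part)   # 615e68637a552aaa
print(notes_part)  # 8525755b00517535
```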
Short Message Format Fields
The following table lists the Short Message Format (SMF) fields, which were introduced in the 5.4.1.0 release to support Cellebrite.
Metadata Field (after import, or full metadata export) | Export-Only Field | Tokenized? | Description | Notes |
---|---|---|---|---|
smf_channel_name | Yes | A Short Message Format (SMF) field that identifies the channel name for MS Teams Data. Example: Tier 1 Management Team | New in 5.4.3.0 | |
smf_chatid | Example (Cellebrite Chat Message): ab3ac5c6-d7d7-4066-b03d-ec4a2aaedce3 Example (Instant Message): 8be502b9-1ab6-46e2-8b1c-235722049b61-CC3ABEBD553470250C4F3C7E85264D07 Example (Teams Chat): 19:711c7237a7b241b3bc26e3a5f7122137@thread.v1 | |||
smf_datatype | A Short Message Format (SMF) field that identifies the data type or format of an item such as a Cellebrite |
|||
smf_deleted | A Short Message Format (SMF) field that identifies whether an object (for example, a Cellebrite object) was Deleted or is Intact. This field is not case-sensitive when searched. Example: Intact | |||
SMF_DeletionDate | Yes | For export, a Short Message Format (SMF) field that identifies the date portion of the smf_deletiondatetime field. The default format is mm/dd/yyyy. Example: 3/10/2022 | ||
smf_deletiondatetime | A Short Message Format (SMF) field that identifies the full timestamp on which a file was deleted. The format is yyyy-MM-dd-HH-mm-ss. Example: 2023-07-23-08-20-08 | |||
SMF_DeletionTime | Yes | For export, a Short Message Format (SMF) field that identifies the time portion of the smf_deletiondatetime field. The default format is HH:mm:ss. Example: 20:11:23 | ||
smf_direction | A Short Message Format (SMF) field that identifies the direction of the communication (for example, Outgoing, Incoming, or Missed). This field is not case-sensitive when searched. Example: Incoming | |||
smf_event_attendees | Yes | A Short Message Format (SMF) field that identifies the list of attendees (participants) for an event. Example: John Doe <John.Doe@someco.com>; Jane Smith <Jane.Smith@someco.com>; Bill Jones <Bill.Jones@someco.com> | ||
smf_event_availability | A Short Message Format (SMF) field that identifies the user availability during the event (for example, Busy or Free). This field is not case-sensitive when searched. Example: Free | |||
smf_event_category | A Short Message Format (SMF) field that identifies the type of event for a Cellebrite_CalendarEntry (for example, US Holidays or Birthdays). This field is not case-sensitive when searched. Example: US Holidays | |||
smf_event_details | Yes | A Short Message Format (SMF) field that identifies the user-provided details for an event (for example, a link to a site or information about the event). | ||
smf_event_location | Yes | A Short Message Format (SMF) field that identifies the location for a calendar event, as provided by the user during event creation (for example, a conference room or street address). Example: New York, NY, United States | ||
smf_event_repeatday | A Short Message Format (SMF) field that provides information about the day(s) of the week that this event repeats, if any. This field is not case-sensitive when searched. Example: Monday, Second | |||
smf_event_repeatinterval | A Short Message Format (SMF) field that identifies a value representing how many times the RepeatRule interval must repeat between events. Example: 3 | |||
smf_event_repeatrule | A Short Message Format (SMF) field that identifies how frequently this event repeats (for example, Weekly, Monthly, or Yearly). This field is not case-sensitive when searched. Example: Yearly | |||
smf_fileid | A Short Message Format (SMF) field that identifies the standalone file ID that corresponds to an attachment or voicemail. This field is not case-sensitive when searched. Example: 33af6421-7d31-333b-cd33-03bbcac313e3 | |||
smf_filepath | A Short Message Format (SMF) field that identifies the path for a given file. Example: /mobilephone/containers/Application/myapp/Documents/5aaf5f12-9163-4780-a895-5e10d14f6943/123.pic | Uses path tokenization rules (for word breaks on \ or /) | ||
smf_filesystem | A Short Message Format (SMF) field that identifies the ID of the file system in which this file exists. This field is not case-sensitive when searched. Example: iPhone | |||
smf_folder | A Short Message Format (SMF) field that identifies the name of the phone source folder containing the message (for example, Recents, Sent, Inbox, or Drafts), if applicable. This field is not case-sensitive when searched. Example: Inbox | Uses path tokenization rules (for word breaks on \ or /) | ||
smf_iscarved | A Short Message Format (SMF) field that identifies whether an item has been carved (that is, reassembled from fragments in the absence of file system metadata). Values are Y or N. This field is not case-sensitive when searched. Example: N | |||
smf_isinfected | A Short Message Format (SMF) field that identifies whether an item has been infected with a virus (Y or N). This field is not case-sensitive when searched. Example: N | |||
smf_message_body | Yes | A Short Message Format (SMF) field that strictly contains the body of a message (that is, the actual text without any header values). This field applies to Cellebrite and Teams messages. Example: This month's issue is about Controlling Project Scope | New in 5.4.1.1 | |
smf_message_deleted | A Short Message Format (SMF) field that identifies whether a message was Deleted, is Intact, or is Unknown. This field is not case-sensitive when searched. Example: Intact | |||
smf_message_status | A Short Message Format (SMF) field that identifies the message status (for example, Sent, Read, or Delivered). This field is not case-sensitive when searched. Example: Sent | |||
smf_message_type | A Short Message Format (SMF) field that identifies the service (carrier) through which this message was sent or received (for example, AppMessage, IMessage, MMS, or SMS). This field is not case-sensitive when searched. Example: MMS | |||
smf_overflow | Yes | A Short Message Format (SMF) field that identifies any additional fields discovered in the XML that do not currently map to an existing field. This field can also identify models, which can contain multiple fields or other models, and will then appear indented. For each overflow field, this field contains the name of an overflow field followed by a colon and the field value, and each field entry is new line delimited. Therefore, you will see this format for overflow fields: <field1>:<value> <field2>:<value> <field3>:<value> | ||
smf_platform | A Short Message Format (SMF) field that identifies the device platform this message was sent from (for example, PC, Mobile, or Web). This field is not case-sensitive when searched. Example: Mobile | |||
smf_relatedfilepaths | A Short Message Format (SMF) field that provides file path information (for example, related to voicemails) for RelatedNodes and Related Models items in sections of a source Cellebrite XML file. Example: /mobile/Library/Voicemail/144.amr | New in 5.4.3.0. Uses path tokenization rules (for word breaks on \ or /) | ||
smf_source_application | A Short Message Format (SMF) field that identifies the source application from which this message was sent or received (for example, Native Messages, Discord, WhatsApp, or Snapchat). This field is not case-sensitive when searched. Example: Native Messages | |||
smf_sourcefile | Yes | A Short Message Format (SMF) field that identifies the data area handle (for location) and name of the source XML file. Example: 00005c555c00cc0b1a413a9ea3132b1cf339be3e:Bob_Jones_iPhone_Messages_01.xml | ||
smf_sourcefilename | Yes | A Short Message Format (SMF) field that identifies the name of the source XML file for this record. Example: Bob_Jones_iPhone_Messages_01.xml | ||
smf_sourcefilepath | A Short Message Format (SMF) field that identifies the full internal path to the source XML file. Example: <DataArea>: 2023-03-23_Collection/Backup/Bob_Jones_iPhone_Messages_01.xml | Uses path tokenization rules (for word breaks on \ or /) | ||
smf_sourceindex | A Short Message Format (SMF) field that identifies the source index value. This field is not case-sensitive when searched. Example: 266422 | |||
smf_sourcemodels | A Short Message Format (SMF) field that provides one or more sets of information for the source models field in a source Cellebrite XML file. Each set of this information consists of name and value pairs in the format <key1>: <value1>, <key2>: <value2>, ... Type: <type>, ID: <guid>, where each set is separated by a semicolon. The name and value pairs can include information such as the Direction (Outgoing or Incoming) and Type (for example, InstantMessage) in the context of an ID (GUID value). This field is not case-sensitive when searched. Example: Direction: Outgoing, Type: Instant Message, ID: a203f0ad-3875-425c-8ce5-daecf8828061; Direction: Incoming, Type: Instant Message, ID: c113e3ac-3344-313d-7bc3-bacce7613033 | New in 5.4.3.0 | ||
smf_sourcerecordid | A Short Message Format (SMF) field that identifies a GUID (for example, from Cellebrite). This field is not case-sensitive when searched. Example: b6044fc7-a84d-4e36-a962-1cf67a3bc111 | |||
smf_tags | A Short Message Format (SMF) field that identifies a Tag indicating the file purpose or location (for example, Archives, Audio, Configuration, Image). This field is not case-sensitive when searched. Example: Audio | |||
smf_team_name | A Short Message Format (SMF) field that identifies the name of a source chat, channel, or subchannel in Teams. This field is not case-sensitive when searched. Example: channel1 | New in 5.5.0.0 | ||
smf_threaddeleted | A Short Message Format (SMF) field that identifies whether an email thread has been Deleted or is Intact. This field is not case-sensitive when searched. Example: Intact | |||
SMF_ThreadLastActivityDate | Yes | For export, a Short Message Format (SMF) field that identifies the date portion of the smf_threadlastactivitydatetime field. The default format is mm/dd/yyyy. Example: 3/30/2023 | ||
smf_threadlastactivitydatetime | A Short Message Format (SMF) field that identifies the full timestamp on which activity ended for this thread. The format is yyyy-MM-dd-HH-mm-ss. Example: 2023-07-21-09-30-05 | |||
SMF_ThreadLastActivityTime | Yes | For export, a Short Message Format (SMF) field that identifies the time portion of the smf_threadlastactivitydatetime field. The default format is HH:mm:ss. Example: 20:11:22 | ||
smf_threadname | Yes | A Short Message Format (SMF) field that identifies the name of this chat, if applicable. Example: 13133313233@s.whatsapp.net | ||
smf_threadsource | A Short Message Format (SMF) field that identifies the application on which this chat thread took place (for example, Snapchat, Discord, WhatsApp, ooVoo, Native Messages, and Facebook Messenger). This field is not case-sensitive when searched. Example: Native Messages | |||
SMF_ThreadStartDate | Yes | For export, a Short Message Format (SMF) field that identifies the date portion of the smf_threadstartdatetime field. The default format is mm/dd/yyyy. Example: 2/14/2022 | ||
smf_threadstartdatetime | A Short Message Format (SMF) field that identifies the full timestamp on which activity started for this thread. The format is yyyy-MM-dd-HH-mm-ss. Example: 2023-07-24-00-33-08 | |||
SMF_ThreadStartTime | Yes | For export, a Short Message Format (SMF) field that identifies the time portion of the smf_threadstartdatetime field. The default format is HH:mm:ss. Example: 22:15:26 |
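Two of the value formats in the table above lend themselves to simple post-processing of exported metadata: the yyyy-MM-dd-HH-mm-ss timestamp used by the smf_*datetime fields, and the semicolon-separated sets of name and value pairs in smf_sourcemodels. The following is a minimal sketch, not the product's own logic. The function names are hypothetical. The date output assumes no zero padding (matching the documented example 3/10/2022), and the smf_sourcemodels parser assumes that values contain no commas or semicolons.

```python
from datetime import datetime

# Python strptime equivalent of the documented yyyy-MM-dd-HH-mm-ss format.
SMF_DATETIME_FORMAT = "%Y-%m-%d-%H-%M-%S"


def split_smf_datetime(value: str) -> tuple[str, str]:
    """Return (date portion, time portion) for an smf_*datetime value."""
    dt = datetime.strptime(value, SMF_DATETIME_FORMAT)
    # Assumption: month and day are not zero padded, as in "3/10/2022".
    date_part = f"{dt.month}/{dt.day}/{dt.year}"
    time_part = dt.strftime("%H:%M:%S")  # documented default HH:mm:ss
    return date_part, time_part


def parse_sourcemodels(value: str) -> list[dict[str, str]]:
    """Parse smf_sourcemodels into one dict per semicolon-separated set."""
    models = []
    for entry in value.split(";"):
        pairs = {}
        for pair in entry.split(","):
            key, sep, val = pair.partition(":")
            if sep:  # skip fragments that have no "key: value" shape
                pairs[key.strip()] = val.strip()
        if pairs:
            models.append(pairs)
    return models


print(split_smf_datetime("2023-07-23-08-20-08"))
# ('7/23/2023', '08:20:08')

sample = (
    "Direction: Outgoing, Type: Instant Message, "
    "ID: a203f0ad-3875-425c-8ce5-daecf8828061; "
    "Direction: Incoming, Type: Instant Message, "
    "ID: c113e3ac-3344-313d-7bc3-bacce7613033"
)
for model in parse_sourcemodels(sample):
    print(model["Direction"], model["ID"])
```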
Export Exceptions
The following table summarizes the Export Exceptions that can be reported in the Export metadata field ExceptionDescCode. See the Export Exceptions section of the Export Overview topic for a more detailed list of these exceptions along with their associated import (parsing and OCR) warnings/errors, where applicable.
Export Exception Code | Description |
---|---|
CONNECTOR_FAILURE | Indicates a Connector failure (for example, the file could not be retrieved using the Connector and location). This failure may also occur if the Connector itself could not be read. |
CONNECTOR_READ_ERROR | Indicates that either the document or directory could not be read, or the document or directory has illegal (invalid) characters in the name. |
CONVERSION_FAILURE | Indicates a failure during conversion to HTML, MHTML, or HTML/MHTML, depending on what is selected for the eDiscovery Export. This error may also occur for a Special File (that is, a file that is not a directory or regular file). |
CORRUPT | Indicates a file identified as corrupted or damaged in some way during the parsing process. |
ENCRYPT | Indicates an encrypted or password-protected file. |
INVFILETYPE | Indicates a file identified as having an undetermined file type or unsupported file type during parsing. |
NATIVEFILE_NOT_FOUND | Indicates that the native file could not be found (that is, the document or directory does not exist and the document is considered missing). |
NO_TEXT_FOUND | Identifies all files that did not have text extracted during processing. These have a parsingstatus of NODATA. |
OCR_ERROR | Indicates an OCR error. For a list of these errors, see the section on OCR Errors in View a Scan Report. |
OCR_INVIMAGE | Indicates an OCR processing error, most likely because the image was in a format that could not be handled. |
PARSING_ERROR | Indicates the appropriate import (parsing) error, as identified in the Scan Report. For a list of the import errors, see the Warnings and Errors table in View Data Set Reports. |
PDF_HIGHLIGHT_WARNING | Indicates that search term highlighting failed during generation of a PDF for export. The PDF was still generated, just without highlighting. |
SETUP_ERROR | Indicates an installation or setup error (for example, that Lotus Notes or OCR is not installed or licensed). |
SKIPPED | Indicates excluded, or skipped, files, identified with the Excluded category in the Scan Report. This exception means that archive files (.zip, .tar, and .pst) were excluded from the archive extraction process during import. |
SUCCESS | Indicates success. |
SYSTEM_ERROR | Indicates a system error. |
UNEXPECTED_CONVERSION_FAILURE | Indicates an unexpected conversion-related issue. For example, if you select PDF for Export, the software assesses the images to see if there are any missing or if there are any image file types not supported for conversion. If you do not also select |
UNKNOWN | Indicates an unknown error or unexpected error. |
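When reviewing an export, it can help to tally the ExceptionDescCode values and flag anything other than SUCCESS. The following is a minimal sketch of such a tally. It assumes the ExceptionDescCode values have already been read from the export output into a list of strings with one code per document. The function name summarize_exceptions is hypothetical and is not part of the product.

```python
from collections import Counter

# Codes taken from the Export Exceptions table above.
KNOWN_CODES = {
    "CONNECTOR_FAILURE", "CONNECTOR_READ_ERROR", "CONVERSION_FAILURE",
    "CORRUPT", "ENCRYPT", "INVFILETYPE", "NATIVEFILE_NOT_FOUND",
    "NO_TEXT_FOUND", "OCR_ERROR", "OCR_INVIMAGE", "PARSING_ERROR",
    "PDF_HIGHLIGHT_WARNING", "SETUP_ERROR", "SKIPPED", "SUCCESS",
    "SYSTEM_ERROR", "UNEXPECTED_CONVERSION_FAILURE", "UNKNOWN",
}


def summarize_exceptions(codes):
    """Return (counts per code, non-SUCCESS total, codes not in the table)."""
    # Assumption: each input string holds exactly one code per document.
    counts = Counter(code.strip() for code in codes)
    non_success = sum(n for code, n in counts.items() if code != "SUCCESS")
    unrecognized = sorted(set(counts) - KNOWN_CODES)
    return counts, non_success, unrecognized


counts, non_success, unrecognized = summarize_exceptions(
    ["SUCCESS", "SUCCESS", "ENCRYPT", "OCR_ERROR", "BOGUS"]
)
print(counts["SUCCESS"])  # 2
print(non_success)        # 3
print(unrecognized)       # ['BOGUS']
```

Note that not every non-SUCCESS code is a failure: per the table, PDF_HIGHLIGHT_WARNING means the PDF was still generated, and SKIPPED marks files excluded during import.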