Supported File Types for Analysis

The Digital Reef software identifies a large set of file types when data is added to a Project. This topic lists the file types that are identified and/or supported during parsing.

Note: Digital Reef now restricts import and reprocessing of data to Projects that use Parsing Library V2. All new or migrated Digital Reef Projects use Parsing Library V2, which is identified in the Project Index Settings. Any legacy Parsing Library V1 Projects are expected to be migrated to V2. When writing queries that include file types in new or migrated Projects, use the Parsing Library V2 file type name.

You can view all of the file types discovered during indexing of a selected Data Set from the Document Types Report on the Reports tab. Digital Reef assigns its own file types to certain files, such as container files and specific types of email, as indicated in the first table. For more information, see Supported Container Files and Supported Emails. You can view the file type for a given document from the Metadata view of the Document Viewer, in the filetype field.

File Types Assigned by Digital Reef

This table lists the file types assigned by Digital Reef to provide more precise information for different types of container files and email. In general, when Parsing Library V2 can identify one of these file types, you will see the V2 name; otherwise, you will see the DR name.

Digital Reef Assigned File Type Additional Information
application/arj An ARJ compressed file. Parsing Library V2 supports this file type, so the appropriate parsing status will be reported for it. This type was identified, but not supported for parsing with Parsing Library V1.
application/7zip A 7ZIP compressed file.
application/bloomberg-ib-dump Bloomberg IB Compliance Dump Format in XML (Instant Bloomberg)
application/bloomberg-message-dump Bloomberg Message Compliance Dump Format in XML
application/lotusnotes An IBM Lotus Notes NSF email archive. Note: Digital Reef supports Lotus Notes versions up to and including Lotus Notes 9.0.1.
application/msoutlook Applies to Microsoft Outlook PST/OST 97/2000/XP, Microsoft Outlook PST/OST 2003/2007
application/msoutlook-mac Applies to a Microsoft Outlook for Mac Archive (OLM)
application/x-bzip2 A BZIP2 compressed file. When Parsing Library V2 is able to report a BZIP2 file, the V2 name, bzip2 Archive, is used; otherwise, the DR name is reported. Parsing Library V1 reported a BZIP2 file with the DR name, as shown.
application/x-compress Applies to UNIX Compress, .COM Files
application/x-gzip UNIX GZip. For a Bloomberg attachment archive, application/x-gzip supports an auxfiletype of bloomberg-attachment-archive.
application/x-lzh-compressed Applies to LZH Compress, Self-Extracting LZH
application/x-rar-compressed .RAR File (compressed)
application/x-tar UNIX TAR File
application/zip .ZIP or .ZIPX (or JAR file)
Cellebrite iPhone Backup Source Cellebrite iPhone Backup data in XML
diskimage/ad1 An AD1 file, Forensic Toolkit (FTK) Imager Logical Image. In general, identified by Digital Reef, but not supported for parsing.
diskimage/bitlocker An LEF with BitLocker-encrypted partitions (extracted as Unallocated Clusters files with a diskimage/bitlocker file type).
diskimage/ewf An Expert Witness Compression Format File (for example, for EnCase, E01)
diskimage/fat For MS-DOS and vfat file systems
diskimage/gptpartitions GUID Partition Table (GPT) partitioned disk images
diskimage/hfsplus HFS+ file system
diskimage/iso9660 ISO 9660 image files. Parsing Library V2 reports ISO image files as ISO Disk Image. Parsing Library V1 reported this type using the DR name.
diskimage/lef Applies to Logical Evidence Files (for example, L01). Identified as a Disk Image Container file and supported for parsing.
diskimage/linux Linux (unpartitioned) images, including ext2, ext3, and ext4 file systems
diskimage/mbrpartitions Master Boot Record (MBR) partitioned disk images
diskimage/ntfs NTFS file system
email Generally used to represent identified email messages, including regular Microsoft Outlook email, as well as Lotus Notes, MBOX messages, Bloomberg, and Bloomberg Instant Messages. This file type supports auxiliary file types of msg, eml, and emlx (for Apple 2.0 Mail Messages).
MBOX A Unix MBOX email archive. Parsing Library V2 reports this type as Sendmail MBOX. Parsing Library V1 reported this type as mbox(rfc-822 mailbox).
Microsoft Cabinet Archive A Microsoft Cabinet archive file (CAB). Parsing Library V2 reports Microsoft Cabinet Archive for this type of file. As of 5.4.1.0, Microsoft Cabinet Archive files are treated as binary files. Text is no longer extracted from these files. Parsing Library V1 reported Microsoft Cabinet file as the file type.
NIST Generally used to represent NIST files as of 5.4.0.0.
text/csv Comma-separated values

Note: For searching based on legacy configurations, the DR list of types for the disk image category continues to include application/ewf (legacy ewf files only, as diskimage/ewf is the current type used for EWF files), and, for the email archive category, container(assentor), which is for legacy CA Message Manager only (now obsolete).
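The renames called out in the table above can be collected into a small lookup when updating queries or scripts written against legacy names. The following is a minimal sketch in Python, not part of the Digital Reef software: the names LEGACY_TO_V2 and to_v2_name are hypothetical, the case-insensitive matching is an assumption, and only the pairs stated in this topic are listed.

```python
# Legacy (V1 or DR-assigned) file type names and the Parsing Library V2
# names that this topic says are reported instead. Only pairs stated in
# the table above are included; this is not a complete mapping.
LEGACY_TO_V2 = {
    "application/x-bzip2": "bzip2 Archive",
    "diskimage/iso9660": "ISO Disk Image",
    "mbox(rfc-822 mailbox)": "Sendmail MBOX",
    "Microsoft Cabinet file": "Microsoft Cabinet Archive",
}

# Lower-cased keys so that lookups ignore case (an assumption of this sketch).
_LOOKUP = {name.lower(): v2 for name, v2 in LEGACY_TO_V2.items()}


def to_v2_name(filetype: str) -> str:
    """Return the V2 name for a legacy file type, or the input unchanged."""
    return _LOOKUP.get(filetype.strip().lower(), filetype)
```

Because the DR name can still be reported when Parsing Library V2 cannot identify the file, a query may need to match both the legacy name and the V2 name rather than only the translated one.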

Supported File Types for Parsing Library Version V2

The following table lists the supported formats for Parsing Library Version V2. Many are supported for full content extraction. File types supported for file identification only but not content extraction are identified with an asterisk (*) in the table; these file types will generally report a parsingstatus such as 00068 FILE_ID_ONLY.

Files that are not recognized by the V2 library are reported with a file type of Unknown format and a parsingstatus of 00019 FILE_TYPE_UNDETERMINISTIC. The table does not include directory, which is also a supplied file type.

The File Type Shown in UI column shows what actually appears in the UI for a given type of file. If you see more than one entry in this column, it means that either entry might appear, depending on the individual parsing situation.
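Because either entry may appear for the same kind of file, a filter on file type usually needs to accept all of the listed names. The following Python sketch illustrates the idea on exported metadata; the record layout (a dict with a filetype key) and the names UI_NAMES and select_by_type are hypothetical, and the name sets are a small subset copied from the table in this topic.

```python
# UI names that this topic lists for the same kind of file (a subset).
UI_NAMES = {
    "7-Zip": {"application/7zip", "7-Zip Archive"},
    "Bzip2": {"bzip2 Archive", "application/x-bzip2"},
    "GNU Zip": {"application/x-gzip", "GNU Zip Archive (GZIP)"},
    "Zip": {"application/zip", "Zip Archive"},
}


def select_by_type(records, type_of_file):
    """Return the records whose filetype is any UI name listed for the type."""
    names = UI_NAMES[type_of_file]
    return [r for r in records if r.get("filetype") in names]


# Hypothetical sample records for illustration only.
docs = [
    {"name": "a.7z", "filetype": "application/7zip"},
    {"name": "b.7z", "filetype": "7-Zip Archive"},
    {"name": "c.zip", "filetype": "Zip Archive"},
]
seven_zip_docs = select_by_type(docs, "7-Zip")
```

Here seven_zip_docs contains both a.7z and b.7z, even though the two records carry different file type strings.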

Note: Although the table lists common extensions for the various file types, keep in mind that extensions are not always reliable and do not necessarily reflect the true file type of a file.
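To illustrate why an extension alone is not a reliable indicator, the following Python sketch checks the leading bytes of a file against a few widely documented format signatures. It is an illustration only and does not describe how the Parsing Library identifies files; the names SIGNATURES and sniff are hypothetical.

```python
# Widely documented leading byte signatures for a few formats.
SIGNATURES = [
    (b"PK\x03\x04", "zip"),
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"7z\xbc\xaf\x27\x1c", "7z"),
    (b"Rar!\x1a\x07", "rar"),
    (b"%PDF-", "pdf"),
]


def sniff(header: bytes):
    """Return a format label based on the first bytes of a file, or None."""
    for magic, kind in SIGNATURES:
        if header.startswith(magic):
            return kind
    return None


# A file named report.pdf whose content actually begins with a ZIP header.
detected = sniff(b"PK\x03\x04\x14\x00\x00\x00")
```

In this example detected is "zip", so the .pdf extension would not reflect the true file type.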

Type of File Supported File Type Shown in UI File ID only Versions Supported Common Extensions Additional Notes/ DR Notes
3GP ThreeGP *   3GP  
7-Zip application/7zip, 7-Zip Archive     7Z
7-Zip Self-extracting archive, SFX 7ZIP Archive (SFX)     EXE  
ACE ACE Archive     ACE  
Adobe Flash Adobe Shockwave Flash     SWF  
Adobe Flash Video Adobe Flash Video *   FLV  
Adobe Illustrator Adobe Illustrator     AI ID and metadata only. If Adobe Illustrator files are saved with PDF options, they may be identified as Adobe PDF.
Adobe InDesign Adobe InDesign   1.x-7.x INDD ID and metadata only
Adobe PDF Adobe PDF   1.0 – 1.7 (Extension 3, 5) (Acrobat 1 - 11) PDF  
Adobe PDF XFA Forms Adobe PDF     PDF  
Adobe Photoshop Adobe Photoshop Image   8.x, 9.x, 10.0 (CS 1-3) PSD
Adobe Photoshop Large Document Format (PSB) Adobe Photoshop Large Document Format (PSB)     PSB  
Adobe PostScript Adobe PostScript *   PS  
AFP MODCA (Advanced Function Presentation MO:DCA) AFP MODCA
AmiPro for Windows AmiPro     AMI, SAM  
ANSI Text Text   7-bit, 8-bit TXT  
Apple Disk Image Apple Disk Image   Mac OSX 32/64 only DMG DR classifies this as a Disk Image for reporting (e.g., docclass::disk_image)
Apple Double Apple Double *     Apple Double files can have any extension.
Apple Executable Apple Executable *   BIN  
Apple iBook Apple iBook     IBOOK  
Apple iWork Keynote Apple Keynote, Apple iWork   ’09 (5.x) KEY  
Apple iWork Keynote Apple Keynote, Apple iWork   6, 7 KEY  
Apple iWork Numbers Apple Numbers, Apple iWork   ’09 (2.x) NUMBERS  
Apple iWork Numbers Apple Numbers, Apple iWork   3, 4 NUMBERS  
Apple iWork Pages Apple Pages, Apple iWork   4.x PAGES  
Apple Mail email     EMLX
Apple PLIST Binary File Apple PLIST Binary File     PLIST  
ARJ application/arj, ARJ Archive     ARJ
ASCII Text Text   7-bit, 8-bit TXT  
Audio Interchange File Format Audio Interchange File Format     AIFF  
Audio Video Interleave (AVI) AVI Video     AVI ID and metadata only
AutoCAD 2018 AutoCAD 2018   AutoCAD 2018+    
AutoCAD Drawing AutoDesk AutoCad   12, 13, 14, 2000, 2002, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2012, 2013, 2014, 2015, 2016, 2017 DWG  
AutoCAD Drawing Exchange Format AutoCAD Drawing Exchange Format     DXF  
AutoDesk Design Web Format AutoDesk Design Web Format *   dwf  
BIN HEX Encoded BinHex Encoded File *   HBX, HEX, HQX  
BitTorrent Metafile BitTorrent Metafile *   TORRENT  
Brooktrout Fax Image Brooktrout Fax Image        
Bzip2 bzip2 Archive, application/x-bzip2     BZ2, TBZ2
CALS Raster CALS Raster   Type 1 CAL  
Canon Camera Raw Image 2 CR2 Image     CR2 ID and metadata only
Canon Camera Raw Image 3 CR3 Image     CR3 ID and metadata only
Canon Raw CIFF Image file CRW Image     CRW  
Comma Separated Values text/csv     CSV  
Computer Graphics Metafile CGM Image     CGM  
Corel Draw Image Corel Draw Image *   CDR  
Corel Presentation Corel Presentation     SHW  
Dassault Systemes CATIA CAD Dassault Systemes CATIA *   CATIA  
Dassault Systemes SolidWorks SolidWorks     SLDASM, SLDDRW, SLDPRT  
dBASE file dBASE Database   3,4    
dBASE III file dBASE Database   3,4    
dBASE IV file dBASE Database   4    
DCX DCX Image     DCX  
DICOM Medical Image DICOM Image     DCM  
Domino DXL message/ Domino XML Domino XML     DXL
DVD Information File DVD Information File *   IFO, BUP  
DVD Video Object DVD Video Object     VOB ID and metadata only
Email Message email     EML, EMLX, P7M Signed Email Messages (.p7m files) will report a filetype of email as of 5.4.2.0
Encapsulated PostScript EPS (with TIFF Header) *   EPS  
Encapsulated PostScript with Preview EPS (with Preview) Image     EPS  
Encoded mail message, MHT Microsoft HTML Archive   MHT MHT  
Encoded mail message, Multipart Alternative email   Multipart Alternative    
Encoded mail message, Multipart Digest email   Multipart Digest    
Encoded mail message, Multipart Mixed email   Multipart Mixed    
Encoded mail message, Multipart News Group email   Multipart News Group    
Encoded mail message, Multipart Signed email   Multipart Signed    
Encoded mail message, TNEF WinMail Data File   TNEF WINMAIL.DAT  
EOT Font EOT Font *      
EPUB Book EPUB     EPUB  
ESTSoft ALZip ALZ Archive     ALZ  
ESTSoft EGG EGG Archive     EGG  
Eudora   Classic 1-7 OSE MBX Would be considered a type of email.
FlashPix FlashPix Image     FPX  
Flexible Image Transport System FITS image *   FITS  
Framework Spreadsheet Generic Spreadsheet/Other   III FW3  
Framework WP FW3     FW3  
Fuji Xerox Docuworks Fuji Xerox Docuworks *      
GEM Raster GEM Raster     IMG  
GNU Zip application/x-gzip, GNU Zip Archive (GZIP)   0.1, 1.0 GZ
Graphics Interchange Format (GIF) GIF Image   87a, 89a, Animated GFA, GIF, GIFF ID and metadata only
Hancom Office HanCell Hancom Hancell   2010, 2014 CELL  
Hancom Office HanShow     2010,2014 SHOW  
Hangul Hangul   v3 HWP  
Hangul Hangul   96-2014 HWP  
HFS Partition HFS Partition        
HP TRIM email rendition       VMBX  
HTML (Text Only) HTML     HTM, HTML  
HTML (Codes Revealed) HTML     HTM, HTML  
HTML (Metadata Only) HTML     HTM, HTML  
IBM DCA RFT DCA RFT     RFT, TXT, DCA  
IBM DCA/FFT Fast Find Index     RFT, FFT
IBM DisplayWrite DisplayWrite4   4 RFT, DCA, DW4, DOC  
IBM DisplayWrite DisplayWrite5   5 RFT, DCA, DW4, DOC  
IBM Lotus Symphony Document     1.x, 3.x ODT  
IBM Lotus Symphony Spreadsheet Generic Spreadsheet/Other   1.x, 3.x SXS, SX, ODS  
IMNET COLD IMNET COLD     IMT  
IMNET Group 4 Image IMNET Group 4 Image     IMT  
Initial Graphics Exchange Specification Initial Graphics Exchange Specification     IGS  
Interchange File Format Interchange File Format     IFF  
Intergraph-Microstation CAD Intergraph/Microstation CAD (DGN)     DGN  
ISO Disk Image ISO Disk Image     ISO DR classifies this as a Disk Image for reporting (e.g., docclass::disk_image)
ISYS Index ISYS Index *   IXA, IXB, IXC  
Java Archive application/zip     JAR  
Java Class Java Class *   CLASS  
JEDMICS C4 Image JEDMICS C4 Image        
Joint Photographic Experts Group (JPEG) JPEG Image     JPEG, JPG, JPE, JIF ID and metadata only
JPEG2000 JPEG2000 Image     J2P, J2C, JPF  
JT Open CAD Jupiter Tessellation *   JT  
JungUm JungUm     GUL ID and metadata only
JustSystems Ichitaro JustSystems Ichitaro   5, 6, 8+ JTD, JBW, JTT  
LibreOffice Document Open Document Format   3, 4, 5 ODT  
LibreOffice Presentation Open Document Format   3, 4, 5 ODP
LibreOffice Spreadsheet Open Document Format   3, 4, 5 ODS  
Linux Executable and Linkable Format Linux Executable *   ELF  
Log File Log File     LOG  
Lotus 1-2-3 Generic Spreadsheet/Other   through Millennium 9.6 WK, WKS, WK3, WK4  
Lotus Manuscript Manuscript   1.0, 2.x MANU, MNU, MAN  
Lotus Notes application/lotusnotes, Lotus Notes     NSF In the Document Viewer, the HTML tab displays a code to indicate that the HTML view is unavailable.
Lotus WordPro Lotus WordPro *   LWP  
LZH application/x-lzh-compressed, LZH Archive     LZH
Macintosh PICT Image Macintosh PICT Image     PICT  
MacPaint 1BPP Image MacPaint Image     MAC  
Mass 11 Mass11   8 M11  
MathCAD MathCAD *   MCD, XMCD  
Material Exchange Format Material Exchange Format (MXF) *   MXF
Microsoft Access file Microsoft Access   97-2016 MDB, ACCDB  
Microsoft Cabinet Microsoft Cabinet Archive     CAB As of 5.4.1.0, Microsoft Cabinet Archive files are treated as binary files. Text is no longer extracted from these files.
Microsoft Document Imaging Microsoft Document Imaging     MDI ID and metadata only
Microsoft Excel for Windows Microsoft Excel   2.0-7.0 (Excel 95) XLS  
Microsoft Excel for Windows Microsoft Excel   1997 (v.8) - 2010, 2013, 2016 XLS, XLSX, XML  
Microsoft Excel for Windows Microsoft Excel   2007 - 2016 (Binary) XLSB  
Microsoft Excel for Mac Microsoft Excel   1, 1.5, 2.2, 3.0, 4.0, 5.0 XLS  
Microsoft Excel for Mac Microsoft Excel   8.0- 15.0 XLS, XLSX, XML  
Microsoft Excel for Office 365 Microsoft Excel     XLS, XLSX  
Microsoft Excel Full-Text Microsoft Excel Full-Text        
Microsoft HTML Help Microsoft Compiled Help   1.0, 1.1a, 1.3, 1.32, 1.33MAML CHM  
Microsoft Office Binder Microsoft Binder     OBD  
Microsoft OneNote Microsoft OneNote   2007, 2010, 2013, 2016 ONE  
Microsoft OneNote TOC Microsoft OneNote TOC * 2007, 2010, 2013, 2016 ONETOC  
Microsoft Outlook email, Microsoft Outlook Email Message   97-2016 MSG
Microsoft Outlook application/msoutlook, Microsoft PST/OST   97-2016 PST, OST
Microsoft Outlook for Mac, OLK15message Microsoft Outlook for Mac OLK15message   2011 OLK14
Microsoft Outlook for Mac, OLK15msgsource Microsoft Outlook for Mac OLK15msgsource   2016 OLK15msgsource  
Microsoft Outlook for Mac Archive, Mac OLM application/msoutlook-mac     OLM If this type cannot be correctly identified, you will see application/zip. In this case, it is not a mail archive and cannot generate a mail container error.
Microsoft Outlook Express Outlook Express DBX File *   DBX  
Microsoft Outlook Forms Template       OFT  
Microsoft Outlook MSO Object Outlook MSO object     MSO  
Microsoft Paint MSP 1BPP Image Microsoft Paint MSP Image     MSP  
Microsoft PowerPoint for Windows Microsoft PowerPoint * 3.0-4.0 PPT These older version files report a 00009 FILE_READ error, but note that they are File ID only, and reprocessing them will have no effect.
Microsoft PowerPoint for Windows Microsoft PowerPoint   97-2013, 2016 PPT, PPTX  
Microsoft PowerPoint for Mac Microsoft PowerPoint   1-4, 98, 2001, v.X, 2004, 2008, 2011 PPT, PPTX
Microsoft PowerPoint for Office 365 Microsoft PowerPoint     PPT, PPTX  
Microsoft Project Microsoft Project   98-2003 MPP  
Microsoft Project Microsoft Project   2007, 2010, 2016 MPP, MPX  
Microsoft Publisher Microsoft Publisher     PUB ID and metadata only
Microsoft Windows Binary Microsoft Windows Binary *      
Microsoft Windows Bitmap Microsoft Bitmap     BMP ID and metadata only
Microsoft Windows Clipboard Microsoft Windows Clipboard     CLIP, CLP  
Microsoft Windows DLL Microsoft Windows Executable *   DLL  
Microsoft Windows Executable Microsoft Windows Executable *   EXE, COM, SYS  
Microsoft Windows Installer Microsoft Windows Installer *   MSI  
Microsoft Windows Shortcut Microsoft Windows Shortcut *   LNK  
Microsoft Windows Movie Maker Microsoft Windows Movie Maker *   MSWMM  
Microsoft Word for DOS Microsoft Word for DOS   4.0-6.0 DOC  
Microsoft Word for Mac Microsoft Word for Mac   1-5, 5.1, 6 DOC, DOCX  
Microsoft Word for Mac Microsoft Word for Mac   98, 2001, v. X, 2004, 2008, 2010, 2011, 2016 DOC, DOCX, XML  
Microsoft Word for Windows Microsoft Word for Windows   1.x, 2.x DOC  
Microsoft Word for Windows Microsoft Word for Windows   6.x, 95, 97, 2000, 2002, 2003, 2007, 2010, 2013, 2016 DOC, DOCX, XML  
Microsoft Works for DOS Microsoft Works   2 WPS  
Microsoft Works for Windows Microsoft Works   3, 4, 6, 7 WPS  
Microsoft XPS Microsoft Open XML Paper Spec     XPS, OXPS  
Microsoft Visio Microsoft Visio * 3.0 VSD  
Microsoft Visio Microsoft Visio   4.0 - 2013 VSD  
Microsoft Visio Microsoft Visio   10.0 - 2013, 2016 VDX, VSDX  
MO:DCA Files DCA RFT        
MPEG Video MPEG Video     MPG ID and metadata only
MPEG-4 Video MPEG-4 Video     MPG4 ID and metadata only
MPEG-1 Audio Layer 3 MPEG Audio Layer3   ID3v1, ID3v2 MP3 ID and metadata only
MPEG-2 Audio Layer 3 MPEG Audio Layer3   ID3v1, ID3v2 MP3 ID and metadata only
MultiMate MultiMate   through 4.0 DOX  
MultiMate Advantage MultiMate     DOX  
Musical Instrument Digital Interface (MIDI) MIDI Audio * Standard MID, MIDI, SMF  
NCR Image NCR Image     NCR  
Netpbm Netpbm (PPM, PBM, PGM, PNM)     PPM, PBM, PGM
OGG FLAC Audio OGG FLAC Audio     FLAC ID and metadata only
OGG Vorbis Audio OGG Vorbis Audio     OGG ID and metadata only
OpenAccess II (OAII) OAII        
OpenOffice Calc Open Document Format   1.1 - 4.x ODS  
OpenOffice Impress Open Document Format   1.x, 2.x, 3.x, 4.x ODP  
OpenOffice Writer Open Document Format   1.1 -4.x ODT  
PaintShop Pro Image PaintShop Pro Image     PSP  
Paradox Database File Paradox Database *   DB  
Parasolid Model Part Parasolid Model Part     X_T  
Password Protected Office File Password Protected Office File       This file type identifies any Microsoft Office document that is password protected.
PCL (Printer Command Language) PCL     PCL
PCX (PaintBrush Bitmap Image file) PCX Image     PCX
Photoshop Image PhotoShop Image     PSD
Portable Network Graphics Format (PNG) Portable Network Graphics Image   1.0, 1.1, 1.2 PNG ID and metadata only
Pro/ENGINEER Assembly Pro/ENGINEER Assembly *   ASM  
Pro/ENGINEER Drawing Pro/ENGINEER Drawing *   DRW  
Pro/ENGINEER Drawing Form Pro/ENGINEER Drawing Form *   FRM  
Pro/ENGINEER Model Part Pro/ENGINEER Model Part *   PRT  
Process Monitor Log Process Monitor Log
Professional Write for DOS ProWrite   1, 2 PW, PW1, PW2  
Professional Write Plus for Windows ProWrite   1 PW  
Progressive JPEG JPEG Image     JPEG, JPG ID and metadata only
Q&A Write QA3   3, 4 (Classic), 5 QA, QA3  
QuarkXpress QuarkXpress *   QXx, QCx  
Quattro Pro Quattro Pro Spreadsheet     QPW  
QuickBooks Backup QuickBooks Backup *   QBB  
QuickBooks for Windows QuickBooks for Windows *   QBW  
QuickTime Apple QuickTime * 1.x-X MOV  
Real Media Real Media     RM ID and metadata only
RedHat Package Manager RedHat Package Manager Archive     RPM  
Rich Text Format Rich Text Format   1.0, 1.3, 1.5, 1.6, 1.7, 1.8, 1.9.1 RTF  
Roshal Archive application/x-rar-compressed, Roshal Archive (RAR)   1.5, 2.0, 2.9, 5 RAR
Roshal Archive (Multi-part) application/x-rar-compressed, RAR Multi-Part File     RAR
Roshal Archive, Self-extracting archive, SFX RAR Archive (SFX)     EXE  
Scalable Vector Graphic SVG Image     SVG  
SciTex CT (Scitex Continuous Tone) SciTex CT     CT
Self-extracting .exe Microsoft Windows Executable     EXE  
Sendmail "mbox" Sendmail MBOX     MBOX  
SGI Image (Silicon Graphics Image) SGI Image        
SGML Text SGML     SGML  
Signature Signature        
Source Source Code        
SQLite Database SQLite DB *      
StarOffice Calc     8, 9 SXC, SXS, ODS  
StarOffice Impress 3     8, 9 SXI, SDI, SDP  
StarOffice Writer     8, 9 SXW, SDW  
StarView Metafile StarView Metafile (SVM)     SVM  
STEP 3D CAD STEP 3D CAD     STP, IFC  
Stereolithography CAD (Binary) Stereolithography CAD (Binary) *   STL  
Stereolithography CAD (Text) Stereolithography CAD (Text)     STL  
StuffIt Stuffit Archive *   SIT  
StuffIt Self Extracting Archive StuffIt Self Extracting Archive *   SEA, EXE  
StuffIt X StuffitX Archive *   SITX  
Sun Raster Image Sun Raster Image     RAS
Symbian Executable Symbian executable *      
SysInternals ProcMon Logs   *   PML  
Tagged Image File Format (TIFF) Tagged Image File Format   Revision 3.0-5.0 TIF, TIFF  
Targa TGA Image     TGA  
Thunderbird     1, 1.5, 2.x, 3.x MBOX Would be considered a type of email.
Transcript Transcript        
TrueType Font TrueType Font *   TTF  
Unicode MBCS Text (MBCS)        
Unicode UTF8 Text UTF8        
Unicode UTF16LE          
Unicode UTF16BE          
Unicode UCS2LE Text UCS2        
Unicode UCS2BE Text UCS2        
Unicode UCS4LE Text (UCS4)        
Unicode UCS4BE Text (UCS4)        
Uniplex Uniplex        
UNIX AR Archive Unix AR Archive *   A  
UNIX cpio UNIX cpio Archive     CPIO  
UNIX Tar application/x-tar, Tape Archive (TAR)     TAR
UNIX Compress Archive application/x-compress, UNIX Compress Archive     Z
UUEncode Uuencode     UUE  
vCalendar vCal     VCS  
vCard vCard   2.1 VCF  
Visual Studio SUO file Visual Studio SUO file *   SUO  
Wang IWP WangWP     IWP  
Wang WP Plus WangWPplus     IWP  
Waveform Audio File Format (WAVE) Wave Audio     WAV, AIFF ID and metadata only
Wavefront OBJ Wavefront OBJ     OBJ  
WebP WEBP Image   0.4.2 WEBP ID and metadata only
Windows Cursor Microsoft Windows Cursor     CUR  
Windows Icon Windows Icon     ICO  
Windows Enhanced Meta File Microsoft Windows Enhanced Metafile     EMF  
Windows Media Audio Microsoft Windows Media   WMT 4.0, WMA 2, 7, 8, 9 WMA ID and metadata only
Windows Media Video Microsoft Windows Media   WMV 7, 9 WMV ID and metadata only
Windows Meta File Microsoft Windows Metafile     WMF  
Windows Resource File Windows Resource File *   RES  
Windows Thumbs.db Thumbs.db, Windows Thumbnail Cache *   DB There are two possible values that could be reported for this. In the Document Viewer, the HTML tab displays a code to indicate that the HTML view is unavailable.
Windows Write Microsoft Windows Write     WRI  
WinWord Microsoft Word for Windows   6 DOC  
Wireless Bitmap Image Wireless Bitmap Image     WBMP  
WordPerfect 4.2 WordPerfect42   4.2 WPF  
WordPerfect for DOS WordPerfect   3,4,5,6 WPD  
WordPerfect for Macintosh WordPerfect   1.0-1.0.7, 2.0, 2.1, 3.0, 3.1, 3.5, 3.5e WPD  
WordPerfect for Mac 1 WordPerfect for Mac 1        
WordPerfect for Windows WordPerfect   5.1-12.0, X3, X4, X5, X6, X7, X8 WPD, WP5  
Word Perfect Graphic Word Perfect Graphic        
WordStar 5.0 WordStar5   5 WS5  
WordStar 2000 for DOS WordStar2000   01/03/11 WS2, DOC  
WordStar for DOS WordStar   3.x-7 WS, WSx  
WordStar for Windows WordStar   1 WSD  
WordStar for Windows WordStar * 2 WSD  
X-Windows xbitmap X-Windows xbitmap     XBM  
X-Windows pixmap X-Windows pixmap     XPM  
XML XML (document)   Document File    
XML     Record View    
XWindows Dump XWindows Dump     DMP
XXEncode XXEncoded File     XXE  
XYwrite XYwrite   I-III+, 4.0, Windows XY  
XZ XZ Archive     XZ  
Zip application/zip, Zip Archive   PKZip, WinZip ZIP
Zip (Multi-part) application/zip, Zip Archive     ZIP
Zip Self-extracting archive, SFX ZIP Archive (SFX)     EXE  

The following table summarizes two special file types for Parsing Library V2.

Special File Type Associated Parsing Status Additional Notes
Corrupt File 00009 FILE_READ This special file type is for a file that the Parsing Library V2 software tries to identify but that fails the identification process for some reason, causing the file to be classified as a corrupt file. Files that can be identified and assigned their respective file type but are damaged in some way report 00028 FILE_DAMAGED instead.
Empty File 00017 FILE_ZERO_LENGTH This special file type is for a file that the Parsing Library V2 software tries to identify but determines is an empty file with 0 bytes of data.
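The parsing statuses mentioned for Parsing Library V2 in this topic can be summarized in a lookup for use when reviewing exported metadata. This is a minimal Python sketch: the names PARSING_STATUS and describe_status are hypothetical, the input is assumed to be a string such as "00068 FILE_ID_ONLY", and only the statuses named in this topic are included.

```python
# Parsing statuses named in this topic for Parsing Library V2.
# Other statuses exist in the product but are not described here.
PARSING_STATUS = {
    # Also reported for the File ID only PowerPoint 3.0-4.0 files noted in the table.
    "00009": ("FILE_READ", "Corrupt File: identification failed"),
    "00017": ("FILE_ZERO_LENGTH", "Empty File: 0 bytes of data"),
    "00019": ("FILE_TYPE_UNDETERMINISTIC", "Unknown format: not recognized by the V2 library"),
    "00028": ("FILE_DAMAGED", "identified and typed, but damaged in some way"),
    "00068": ("FILE_ID_ONLY", "identified only; no content extraction"),
}


def describe_status(parsingstatus: str) -> str:
    """Summarize a parsingstatus value using its leading numeric code."""
    parts = parsingstatus.split()
    code = parts[0] if parts else ""
    if code in PARSING_STATUS:
        name, meaning = PARSING_STATUS[code]
        return f"{code} {name}: {meaning}"
    return f"{parsingstatus!r} is not described in this topic"
```

For example, describe_status("00017 FILE_ZERO_LENGTH") returns a summary for the Empty File case, while an unlisted code falls through to the final message.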

 

Supported File Types for Legacy Parsing Library Version V1

The following table provides a list of the standard file types that were supported by the legacy Parsing Library, Parsing Library Version V1. Many supported full content extraction. File types that were supported for file identification only but not content extraction are identified with an asterisk (*) in the table; these file types were associated with a parsingstatus of 00021 FILE_NOT_SUPPORTED.

Note: This V1 table is a list of standard file types supported by V1 and is not intended to identify the actual string, or proper name, displayed by the software in the file type report information and document metadata.

Supported V1 file types that were modified by Digital Reef to provide more information are identified in the last column. Not included is directory, which is also a file type.

Note: Even after migration of V1 Projects to V2, you may still see V1 file types reported in cases where there is no V2 equivalent, or when there is a complicating factor regarding behavior.

Supported File Type File ID Only Modified by Digital Reef/Notes
.ARC File *  
.COM File *  
7z Archive File   Yes, as application/7zip
Adobe Acrobat (PDF)    
Adobe Illustrator    
Adobe Illustrator 9    
Adobe Indesign    
Adobe Indesign Interchange    
Adobe Photoshop    
Adobe Photoshop Large Document Format *  
Advanced System Format    
Ami    
Ami [Clip] *  
Ami Pro Snapshot    
Ami Professional    
Ami Professional Draw    
AOL Messenger Log File *  
Apache Office 3.x Calc (ODF 1.2)    
Apache Office 3.x Draw (ODF 1.2)    
Apache Office 3.x Impress (ODF 1.2)    
Apache Office 3.x Writer (ODF 1.2)    
Apache Office 4.x Calc (ODF 1.2)    
Apache Office 4.x Draw (ODF 1.2)    
Apache Office 4.x Impress (ODF 1.2)    
Apache Office 4.x Writer (ODF 1.2)    
AppleDouble    
Apple iWork Keynote File   (includes Apple iWork 9 for iPad support)
Apple iWork 2013/2014 Keynote  
Apple iWork 2013/2014 Numbers File *
Apple iWork 2013/2014 Pages File *
Apple iWork Keynote File Preview   (includes Apple iWork 9 for iPad support)
Apple iWork Numbers File   (includes Apple iWork 9 for iPad support)
Apple iWork Numbers File Preview   (includes Apple iWork 9 for iPad support)
Apple iWork Pages File   (includes Apple iWork 9 for iPad support)
Apple iWork Pages File Preview   (includes Apple iWork 9 for iPad support)
Apple Mail 2.0 Message (EMLX)   Yes, email file type with auxfiletype emlx
Arehangeul *  
ASCII Text    
AutoCAD 2.5 Drawing    
AutoCAD 2.6 Drawing    
AutoCAD 2004    
AutoCAD 2007/2008/2009 Drawing    
AutoCAD 2010/2011/2012 Drawing    
AutoCAD 2013/2014/2015 Drawing    
AutoCAD Drawing - Unknown Version *  
AutoCAD Drawing 9    
AutoCAD Drawing 10    
AutoCAD Drawing 12    
AutoCAD Drawing 13    
AutoCAD Drawing 14    
AutoCAD Drawing 2000    
AutoCAD DXB *  
AutoCAD DXF (ASCII)    
AutoCAD DXF (Binary)    
AutoDesk DWF *  
AutoDesk DWF Archive File    
AutoShade (RND)    
AvantGo HTML *  
Bentley Microstation DGN    
BinHex Encoded (Continued Part)    
BinHex Encoded (Text)    
Calendar (Text)    
CALS Raster File Format    
Candy 4 *  
CCITT Group 3 (Fax)    
CEO Decision Base    
CEO Spreadsheet    
CEO Word    
CEO Write    
CGM Graphic Metafile    
Chicago WordPad    
Clear Signed S/MIME (Secure/MIME)    
Compact HTML (CHTML) *  
Computer Graphics Metafile    
Corel Draw 10 *  
Corel Draw 11 *  
Corel Draw 12 *  
Corel Draw 2.0    
Corel Draw 3.0    
Corel Draw 4.0    
Corel Draw 5.0    
Corel Draw 6.0    
Corel Draw 7.0    
Corel Draw 8.0    
Corel Draw 9.0    
Corel Draw X4-X7  
Corel Draw X4-X7 Template  
CorelDraw ClipArt    
Corel Presentations 7.0 - 12.0 / X3-5    
Corel Presentations X4 - X7  
DataEase 4.x    
DBase III    
DBase IV/V    
DCX    
DEC DX 3.0 and below    
DEC DX 3.1    
Desktop Services Store    
Digital Imaging and Communications in Medicine (DICOM) File *  
Domino XML Document    
DRM protected Microsoft Excel *  
DRM protected Microsoft Excel 2007/2008 *  
DRM protected Microsoft PowerPoint *  
DRM protected Microsoft PowerPoint 2007/2008 *  
DRM protected Microsoft Word *  
DRM protected Microsoft Word 2007/2008 *  
DRM protected Unknown *  
Dual PowerPoint 95/97    
eFax Document *  
Embedded Graphic    
Enable Spreadsheet    
Enable Word Processor 3.x    
Enable Word Processor 4.x    
Enhanced Windows Metafile    
Envoy *  
Envoy 7 *  
Escher    
Europa Fulcrum *  
Excel 2.x Chart    
Excel 2000 Save As... HTML    
Excel 2013 Add-in Macro    
Excel 3.0 Chart    
Excel 4.0 Chart    
Excel 5.0/7.0 Chart *  
Excel Macro Enabled    
Excel Template 2013    
Excel Template Macro Enabled 2013    
Executable    
Export Image *  
Extensible Metadata Platform *  
File Identification: None *  
File sealed by Oracle IRM *  
First Choice (Database)    
First Choice (SpreadSheet)    
First Choice WP    
Flexiondoc v4.0 (XML)    
Flexiondoc v5.0 (XML)    
Flexiondoc v5.1 (XML)    
Flexiondoc v5.2 (XML)    
Flexiondoc v5.3 (XML)    
Flexiondoc v5.4 (XML)    
Flexiondoc v5.5 (XML) *  
Flexiondoc v5.6 (XML)    
Flexiondoc v5.7 (XML)    
FrameMaker   MIFF only
FrameMaker 3.0   MIFF only
FrameMaker 3.0 Japanese   MIFF only
FrameMaker 4.0   MIFF only
FrameMaker 4.0 Japanese   MIFF only
FrameMaker 5.0   MIFF only
FrameMaker 5.0 Japanese   MIFF only
FrameMaker 5.5   MIFF only
FrameMaker 6.0   MIFF only
FrameMaker 6.0 Japanese   MIFF only
FrameMaker Graphic   MIFF only
Framework III    
Freelance    
Freelance 96 and for Windows    
FulText Document Format    
GDSF Bitmap    
GEM Bitmap (IMG)    
GEM File    
Generic DXL    
Generic Password Protected Microsoft Office 2007 Document *  
Generic WKS    
GIF    
GNU Zip    
Graphics Data Format    
Hana *  
Hanako 1.x *  
Hanako 2.x *  
Handheld Device Markup Language (HDML) *  
Hangul 2002 - 2010 Word Processor    
Hangul 97 Word Processor    
Harvard Graphics 2.x Chart    
Harvard Graphics 3.x Chart    
Harvard Graphics for Windows    
Harvard Graphics 98 *  
Harvard Graphics 3.0 Presentation    
HP Gallery    
HP Plotter Graphic Language    
IBM DCA/FFT   Final Form Text
IBM DCA/RFT    
IBM DisplayWrite 2 or 3    
IBM DisplayWrite 4    
IBM DisplayWrite 5    
IBM Picture Interchange Format    
IBM/Lotus Symphony Document (ODF 1.1)    
IBM/Lotus Symphony Presentation (ODF 1.1)    
IBM/Lotus Symphony Spreadsheet (ODF 1.1)    
IBM Writing Assistant    
Ichitaro 3.x *  
Ichitaro 4.x/5.x/6.x    
Ichitaro 8.x-13.x/2004-2014    
ID3 Ver 1.x    
ID3 Ver 2.x    
IGES Drawing File Format    
Interchange Format *  
Interleaf ASCII Format *  
Interleaf bitmap ver 18    
Interleaf bitmap ver 20    
Interleaf Japanese Format *  
Internet HTML    
Internet HTML (Unicode)
Internet Mail Message    
Internet News Message    
ISO Base Media File *  
Java Class File *  
JBIG2 Bitmap    
JPEG 2000    
JPEG 2000 jpf Extension    
JPEG 2000 mj2 Extension *  
JPEG File Interchange    
JustWrite 1.0    
JustWrite 2.0    
Kingsoft Office Spreadsheet File    
Kingsoft Office Writer File    
Kodak FlashPix    
Kodak Photo CD    
Legacy    
Legacy [Clip] *  
Libre Office 3.x Calc (ODF 1.2)    
Libre Office 3.x Draw (ODF 1.2)    
Libre Office 3.x Impress (ODF 1.2)    
Libre Office 3.x Writer (ODF 1.2)    
Libre Office 4.x Calc (ODF 1.2)    
Libre Office 4.x Draw (ODF 1.2)    
Libre Office 4.x Impress (ODF 1.2)    
Libre Office 4.x Writer (ODF 1.2)    
Lotus 1-2-3 97 Edition    
Lotus 1-2-3 98 Edition    
Lotus 1-2-3 for OS/2 Chart    
Lotus 1-2-3 OS/2 Release 2    
Lotus 1-2-3 Release 1    
Lotus 1-2-3 Release 2    
Lotus 1-2-3 Release 3    
Lotus 1-2-3 Win Release 4/5    
Lotus Data Interchange Format *  
Lotus Manuscript 1    
Lotus Manuscript 2    
Lotus Notes Database (NSF)    
Lotus Notes Database R6.x *  
Lotus PIC    
Lotus screen snapshot (Text and BMP)    
WordPro 97/Millennium    
LZH Compress   Identified and parsed as application/x-lzh-compressed
Mac Excel4 Workbook    
Mac PowerPoint 3.0 (Mac and MacB3)    
Mac PowerPoint 4.0 (Mac and MacB4)    
Mac PowerPoint 4.0 (extracted from docfile)    
Mac Word 3.0 *  
Mac Word 4.0    
Mac Word 5.0    
Mac Word 97 *  
Mac WordPerfect 1.0    
Mac WordPerfect 2.0    
Mac WordPerfect 3.0    
Mac Works 2.0 (DB)    
Mac Works 2.0 (SS)    
Mac Works 2.0 WP    
Macintosh Bitmap Embedding    
Macintosh Paint    
Macintosh Picture    
Macintosh Picture ver.1    
Macintosh Picture ver.2    
Macromedia Director *  
Macromedia Flash 10 *  
Macromedia Flash 4-8    
Macromedia Flash 6    
Macromedia Flash 9 *  
MacWrite II    
Mail Archive DXL    
Mail Message DXL    
Mail Rule DXL    
Mass 11    
Mass 11 Vax    
Matsu 4 *  
Matsu 5 *  
mbox (RFC-822 mailbox)    
MHTML    
Micrografx Designer    
Micrografx Designer (7) *  
Micrografx Graphics Format    
Microsoft Access    
Microsoft Access 2000/2002/2003 *  
Microsoft Access 2007/2010/2013 *  
Microsoft Access 2007/2010/2013 Template File    
Microsoft Access 95/97 *  
Microsoft Access Snapshot File *  
Microsoft Access Web Database *  
Microsoft Cabinet File   Identified and parsed as an archive (part of docclass::archive)
Microsoft Digital Video Recording    
Microsoft Excel    
Microsoft Excel 2000    
Microsoft Excel 2002    
Microsoft Excel 2003    
Microsoft Excel 2007 Excel Add-in Macro File    
Microsoft Excel 2007/2008    
Microsoft Excel 2007/2008 Binary    
Microsoft Excel 2007/2008 Macro Enabled Template    
Microsoft Excel 2007/2008 Macro Enabled Workbook    
Microsoft Excel 2007/2008 Template    
Microsoft Excel 2010 Binary    
Microsoft Excel 2010 Excel Add-in Macro File    
Microsoft Excel 2010 Macro Enabled Template    
Microsoft Excel 2010 Macro Enabled Workbook    
Microsoft Excel 2010 Template    
Microsoft Excel 2010 Workbook    
Microsoft Excel 2013    
Microsoft Excel 2013 Binary    
Microsoft Excel 2016 Binary  
Microsoft Excel 2016 Excel Add-in Macro File  
Microsoft Excel 2016 Macro Enabled Template  
Microsoft Excel 2016 Macro Enabled Workbook  
Microsoft Excel 2016 Template  
Microsoft Excel 2016 Workbook  
Microsoft Excel 3    
Microsoft Excel 4    
Microsoft Excel 4 (MAC)    
Microsoft Excel 5    
Microsoft Excel 5 (MAC)    
Microsoft Excel 97/98/2004    
Microsoft Excel XML 2003    
Microsoft Excel XML 2007-2016  
Microsoft Exchange Database * For EDB file identification, applies to Microsoft Exchange Database files or other Microsoft ESE Database files.
Microsoft InfoPath File *  
Microsoft Live Messenger Log File    
Microsoft Multiplan 4.x    
Microsoft Office Binder    
Microsoft Office Theme File *  
Microsoft OneNote File   Text only
Microsoft OneNote Package   Text only
Microsoft OneNote SOAP/HTTP File *
Microsoft OneNote Table of Contents File    
Microsoft Outlook PST/OST 2003/2007/2010/2013    
Microsoft Outlook PST/OST 97/2000/XP    
Microsoft Outlook for Mac 2011   Not treated as a mail archive, so no support for individual email extraction.
Microsoft Outlook PAB *  
Microsoft Pocket Word    
Microsoft PowerPoint 2 *  
Microsoft PowerPoint 97-2004    
Microsoft PowerPoint 2000/2003    
Microsoft PowerPoint 2007/2008    
Microsoft PowerPoint 2007/2008 Macro Enabled Presentation    
Microsoft PowerPoint 2007/2008 Macro Enabled Slideshow    
Microsoft PowerPoint 2007/2008 Macro Enabled Template    
Microsoft PowerPoint 2007/2008 Slideshow    
Microsoft PowerPoint 2007/2008 Template    
Microsoft PowerPoint 2010/2011    
Microsoft PowerPoint 2010 Macro Enabled Presentation    
Microsoft PowerPoint 2010 Macro Enabled Slideshow    
Microsoft PowerPoint 2010 Macro Enabled Template    
Microsoft PowerPoint 2010 Slideshow    
Microsoft PowerPoint 2010 Template    
Microsoft PowerPoint 2013    
Microsoft PowerPoint 2016  
Microsoft PowerPoint 2016 Macro Enabled Presentation  
Microsoft PowerPoint 2016 Macro Enabled Slideshow  
Microsoft PowerPoint 2016 Macro Enabled Template  
Microsoft PowerPoint 2016 Slideshow  
Microsoft PowerPoint 2016 Template  
Microsoft Project 2000/2002/2003    
Microsoft Project 2002    
Microsoft Project 2007    
Microsoft Project 2010    
Microsoft Project 2013 *
Microsoft Project 98    
Microsoft PST/OST 97-2007    
Microsoft PST/OST 2003/2007/2010    
Microsoft Publisher 2000/2003 *  
Microsoft Publisher 2007 *  
Microsoft Rich Text Format    
Microsoft Visio 2003/2007/2010    
Microsoft Visio 2013    
Microsoft Visio 2013 Macro Enabled Drawing    
Microsoft Visio 2013 Macro Enabled Stencil    
Microsoft Visio 2013 Macro Enabled Template    
Microsoft Visio 2013 Stencil    
Microsoft Visio 2013 Template    
Microsoft Visio XML *  
Microsoft Windows Explorer Command File *  
Microsoft Windows Write    
Microsoft Word 2000    
Microsoft Word 2002    
Microsoft Word 2003/2004    
Microsoft Word 2007/2008    
Microsoft Word 2007/2008 Macro Enabled Document    
Microsoft Word 2007/2008 Macro Enabled Template    
Microsoft Word 2007/2008 Template    
Microsoft Word 2010    
Microsoft Word 2010 Macro Enabled Document    
Microsoft Word 2010 Macro Enabled Template    
Microsoft Word 2010 Template    
Microsoft Word 2013    
Microsoft Word 2013 Macro Enabled Document  
Microsoft Word 2013 Macro Enabled Template    
Microsoft Word 2013 Template    
Microsoft Word 2016  
Microsoft Word 2016 Macro Enabled Document  
Microsoft Word 2016 Macro Enabled Template  
Microsoft Word 2016 Template  
Microsoft Word 4    
Microsoft Word 5    
Microsoft Word 6    
Microsoft Word 97/98    
Microsoft Word Picture    
Microsoft Word XML 2003    
Microsoft Word XML 2007-2016  
Microsoft Works (MAC) 2.0    
Microsoft Works (Windows)    
Microsoft Works (Windows) 3    
Microsoft Works (Windows) 4    
Microsoft Works 1.0    
Microsoft Works 2.0    
Microsoft Works 2000 *  
Microsoft XML Paper Specification    
MIDI File *  
Mime File    
MIFF 3.0    
Mosaic Twin    
MPEG Layer3 *  
MPEG Layer3 ID3 Ver 1.x    
MPEG Layer3 ID3 Ver 2.x    
MPEG-1 audio - Layer 1    
MPEG-1 audio - Layer 2    
MPEG-1 audio - Layer 3    
MPEG-1 video *  
MPEG-2 audio - Layer 1    
MPEG-2 audio - Layer 2    
MPEG-2 audio - Layer 3    
MPEG-2 video *  
MPEG-4 file    
MPEG-7 file    
MS Excel 3.0 Workbook *  
MS Excel 4.0 Workbook *  
MS Excel Mac 4.0 Workbook *  
MS Office 15 (2013) Word - Macro Enabled XML Format    
MS Office 15 (2013) Word Template - Macro Enabled XML Format    
MS Works Database    
MS Works Spreadsheet    
MS Works/Win Database    
MS Works/Win DB 3    
MS Works/Win DB 4    
MS Works/Win Spreadsheet    
MS Works/Win SS 3    
MS Works/Win SS 4    
MultiMate 3.6    
MultiMate 4.0    
MultiMate Advantage II    
MultiMate Note    
Navy DIF    
OASIS OpenDocument v1.0 (XML) *  
Office 4.x Calc (ODF 1.2)    
OfficeWriter    
Open Office 1.x Calc    
Open Office 1.x Draw    
Open Office 1.x Impress    
Open Office 1.x Writer    
Open Office 2.x Calc (ODF 1.1)    
Open Office 2.x Draw (ODF 1.1)    
Open Office 2.x Impress (ODF 1.1)    
Open Office 2.x Writer (ODF 1.1)    
Open Office 3.x Calc (ODF 1.2)    
Open Office 3.x Draw (ODF 1.2)    
Open Office 3.x Impress (ODF 1.2)    
Open Office 3.x Writer (ODF 1.2)    
Oracle Multimedia Internal Raster Format *  
Oracle Open Office 3.x Calc (ODF 1.2)    
Oracle Open Office 3.x Draw (ODF 1.2)    
Oracle Open Office 3.x Impress (ODF 1.2)    
Oracle Open Office 3.x Writer (ODF 1.2)    
OS/2 Bitmap    
OS/2 PM Metafile    
OS/2 v.2 Bitmap *  
OS/2 Warp Bitmap    
Outlook Appointment   Yes
Outlook Appointment Form Template   Yes
Outlook Clear Signed Email    
Outlook Clear Sign Email Form Template    
Outlook Contact   Yes
Outlook Contact Form Template   Yes
Outlook Distribution List    
Outlook Distribution List Form Template    
Outlook Email   Yes
Outlook Email Form Template   Yes
Outlook Journal   Yes
Outlook Journal Form Template   Yes
Outlook Mail Message   Yes
Outlook News Message   Yes
Outlook Non Delivery Report    
Outlook Non Delivery Report Form Template    
Outlook Opaque Signed Email Form Template    
Outlook Opaque Signed Email Opaque    
Outlook Post    
Outlook Post Form Template    
Outlook Sticky Note   Yes
Outlook Sticky Note Form Template   Yes
Outlook Task   Yes
Outlook Task Form Template   Yes
P1 Japan *  
Paintbrush    
Paint Shop Pro Format    
Paradox Version 2/3    
Paradox Version 3.5    
Paradox Version 4    
Password Protected Microsoft Excel 2007/2008    
Password Protected Microsoft Excel 2007/2008 Binary    
Password Protected Microsoft Excel 2010    
Password Protected Microsoft Excel 2010 Binary    
Password Protected Microsoft Excel 2010-2016  
Password Protected Microsoft Excel 2010-2016 Binary  
Password Protected Microsoft PowerPoint 2007/2008    
Password Protected Microsoft PowerPoint 2010-2016  
Password Protected Microsoft PowerPoint 2010    
Password Protected Microsoft Word 2007/2008    
Password Protected Microsoft Word 2010    
Password Protected Microsoft Word 2010-2016  
Password Protected Quattro Pro Win 9.0/X3-X5    
PbM (Portable Bitmap)    
PC File 5.0 - Letter    
PCX    
PDF MacBinary Header *  
PDFI   PDF Image
Perfect Works (for Windows, Picture)    
PFS: First Choice 2.0    
PFS: First Choice 3.0    
PFS: Write A    
PFS: Write B    
PFS: Professional Plan    
PgM (Portable Graymap)    
PKZip    
Pocket Word - Pocket PC *  
Portable Network Graphic Format (PNG)    
Post Script (EPS and Post Script)    
PowerPoint 2000    
PowerPoint 2000 Save As... HTML    
PowerPoint 2013 Macro Enabled    
PowerPoint 2013 Slideshow File    
PowerPoint 2013 Template    
PowerPoint 2013 Template Macro Enabled    
PowerPoint 2013 Slideshow Macro Enabled    
PowerPoint 3.0    
PowerPoint 4.0    
PowerPoint 4.0 (extracted from docfile)    
PowerPoint 7.0    
PpM (Portable Pixmap)    
Pro Write Plus [Clip] *  
Professional Write 1    
Professional Write 2    
Professional Write PLUS    
Progressive JPEG    
PST Fields File    
Q&A Database    
Q&A Write    
Q&A Write 3.0    
QuarkXPress 5.0 For Windows *  
Quattro    
Quattro Pro    
Quattro Pro 10.0 for Windows    
Quattro Pro 11.0 for Windows    
Quattro Pro 12.0 for Windows    
Quattro Pro 4.0 *  
Quattro Pro 5.0    
Quattro Pro 6.0 for Windows    
Quattro Pro Win 7.0 Graph    
Quattro Pro Win 7.0 Notebook    
Quattro Pro 8.0 for Windows    
Quattro Pro for Windows    
Quattro Pro Win X4    
Quattro Pro Win X5    
Quattro Pro Win X6    
Quattro Pro Win X7  
Quattro Pro Windows Japan *  
QuickBooks Backup *  
QuickFinder *  
Quicktime Movie    
QXD Mac 3.0 *  
QXD Mac 3.1 *  
QXD Mac 3.2 *  
QXD Mac 3.3 *  
QXD Win 3.3 *  
QXD Mac 4.x *  
QXD Win 4.x *  
R:Base File 1 *  
R:Base File 3 *  
Rainbow *  
RAR (RAR and EXE)   Yes, as application/x-rar-compressed
RBase    
RBase 5000    
Real Audio / Real Video *  
Reflex 2.0 Database    
Resource Interchange File Format *  
Rich Text Format Japan *  
S/MIME (Secure/MIME)    
Samna    
Samsung Jungum File *  
Scalable Vector Graphics File    
Self extracting 7z Archive File    
Self extracting LZH File    
Self extracting PKZip File    
Smart DataBase    
Smart Spreadsheet    
SmartWare II    
Sprint    
StarOffice Calc 6 and 7    
StarOffice Draw 6 and 7    
StarOffice Draw 8    
StarOffice Impress 6 and 7    
StarOffice Writer 6 and 7    
StarOffice 8 Calc    
StarOffice 8 Impress    
StarOffice 8 Writer    
StarOffice 9 Calc (ODF 1.2)    
StarOffice 9 Draw (ODF 1.2)    
StarOffice 9 Impress (ODF 1.2)    
StarOffice 9 Writer (ODF 1.2)    
StarOffice Calc    
StarOffice Impress    
StarOffice Writer    
StarView Metafile *  
Strict Open XML 2013 Document *  
Strict Open XML 2013 Presentation *  
Strict Open XML 2013 Spreadsheet *  
Strict Open XML 2016 Document  
Strict Open XML 2016 Presentation  
Strict Open XML 2016 Spreadsheet  
StuffIt *  
SunRaster Format    
SuperCalc 5    
Symphony 1.0    
Tagged Image File Format (TIFF)    
Tar   Yes, as application/x-tar
Text - 7-Bit Text   Also applies to Bloomberg Message Dump Text files
Text - Arabic (ANSI 1256)    
Text - Arabic (ASMO-708)    
Text - Arabic (DOS 720)    
Text - Arabic (ISO 8859-6)    
Text - Arabic (Mac)    
Text - Baltic (ANSI 1257)    
Text - Baltic (ISO 8859-4)    
Text - C Europe (ANSI 1250)    
Text - C Europe (ISO 8859-2)    
Text - C Europe (Mac)    
Text - C Europe (DOS 852)    
Text - Chinese (Big 5)    
Text - Chinese (GB)    
Text - Chinese Simplified    
Text - Chinese Traditional    
Text - Cyrillic (ANSI 1251) (1251 and Windows)    
Text - Cyrillic (DOS 855)    
Text - Cyrillic (ISO 8859-5)    
Text - Cyrillic (KOI8-R) (KOI8 and KOI8-R)    
Text - Cyrillic (Mac)    
Text - EBCDIC 273    
Text - EBCDIC 277    
Text - EBCDIC 278    
Text - EBCDIC 285    
Text - EBCDIC 37    
Text - EBCDIC 500    
Text - EBCDIC 870    
Text - EBCDIC Text    
Text - French (EBCDIC 297)    
Text - Greek (ANSI 1253)    
Text - Greek (ISO 8859-7)    
Text - Greek (Mac)    
Text - Hebrew (7-bit)    
Text - Hebrew (ANSI 1255)    
Text - Hebrew (DOS OEM 862)    
Text - Hebrew (IBM PC8)    
Text - Hebrew (ISO 8859-8)    
Text - Hebrew (VAX E0)    
Text - Icelandic (EBCDIC 871)    
Text - Italian (EBCDIC 280)    
Text - Japanese (EUC)    
Text - Japanese (JIS)    
Text - Japanese (Mac)    
Text - Japanese (ShiftJIS)    
Text - Japanese JIS    
Text - Korean (ANSI 1361 Johab)    
Text - Korean (ANSI 949)    
Text - Korean (Hangul)    
Text - Russian (DOS OEM 866)    
Text - Spanish (EBCDIC 284)    
Text - Thai (Windows ANSI 874)    
Text - Turkish (ANSI 1254)    
Text - Turkish (DOS OEM 857)    
Text - Turkish (EBCDIC 1026)    
Text - Turkish (ISO 8859-9)    
Text - Turkish (Mac)    
Text - Unknown (DOS Latin 2)    
Text - Vietnamese (ANSI 1258)    
Text - Western (ANSI 1252)    
Text - Western (ISO 8859-1)    
Text - Western (Mac)    
Text (ANSI 7-bit)    
Text (ANSI 8-bit)    
Text (ASCII 7-bit)    
Text (ASCII 8-bit)    
Text (Latin2)    
Text (MAC 7-bit)    
Text (MAC 8-bit)    
Text (Unicode)    
Text (UTF-8)    
Text Mail    
TotalWord    
Transport-Neutral Encapsulation Format (TNEF)    
Trillian Text Log File    
Trillian XML Log File *  
TrueType (MAC) Font File *  
TrueType Font Collection File *  
TrueType Font File *  
Truevision TARGA    
UNIX Compress   Yes, as application/x-compress
Unknown format   This format is used when the software cannot determine the file type.
UUE Encoded (Continued Part)    
UUE Encoded (Text)    
vCalendar    
vCard    
Visio 2003/2007    
Visio 2000    
Visio 3.x *  
Visio 4.0    
Visio 5.0    
Visio 6.x    
Volkswriter    
VP-Planner    
W P Presentations    
Wang IWP    
Web Clipping Application (WCA) HTML *  
WebP Image *
Windows 98/2000 Bitmap    
Windows Excel Bitmap    
Windows Bitmap    
Windows Clipboard File *  
Windows Compiled Help File *  
Windows Cursor    
Windows DIB    
Windows Help File *  
Windows Icon    
Windows Media Audio    
Windows Media Player Playlist *  
Windows Media Video    
Windows Meta File    
Windows shortcut *  
Windows Sound    
Windows Video    
Windows Thumbnail Cache    
Windows Works SS *  
WinWord Bitmap    
Wireless Bitmap    
Wireless HTML *  
WOFF Font File  
WOFF2 Font File  
Word 2000 Save As... HTML    
Word for Windows 1.0    
Word for Windows 1.2 J *  
Word for Windows 1.x *  
Word for Windows 2.0    
Word for Windows 2.0 (OLE)    
Word for Windows 5.0 J *  
Word for Windows 6 Meta Picture    
Word for Windows 6.0    
Word for Windows 7.0    
Word for Windows Meta File    
WORDMARC    
WordPerfect 10 Graphic (WPG)    
WordPerfect 2 Graphic (WPG)    
WordPerfect 4    
WordPerfect 5 Europa    
WordPerfect 5.0    
WordPerfect 5.1    
WordPerfect 5.1 Japan    
WordPerfect 6.0    
WordPerfect 6.1 - 12.0 / X3-7  
WordPerfect 7.0    
WordPerfect 8.0    
WordPerfect Encrypted    
WordPerfect Graphic 7.0 - 12.0/X3-5  
WordPerfect Graphic (WPG)    
WordPerfect Informs 1.0 *  
WordPro 96 *
WordPro 97/Millennium  
WordStar 2000    
WordStar 4 and below    
Wordstar 5    
Wordstar 5.5 *  
Wordstar 6    
Wordstar 7    
Wordstar for Windows    
WP/Novell Unknown Format *  
WPG1 Embedded Bitmap (WPG)    
WPG2 Embedded Bitmap (WPG)    
WPS+    
XHTML Basic *  
XML    
XML With Doctype HTML    
X-Windows Bitmap    
X-Windows Dump    
X-Windows Pixmap    
XXE Encoded Data (Continued Part)    
XXE Encoded Data (Text)    
XyWrite / Nota Bene (Write and Signature)    
Yahoo! Instant Messenger    
YEnc Encoded Data (Continued Part)    
YEnc Encoded Data (Text)    

 

 

Unsupported Character Encodings

The third-party parsing library used by Digital Reef does not currently support the encodings listed below.

Note: This limitation applies to all 4.3.x and 5.x releases.

  • KOI8-R
  • GB18030
  • ISO-2022-JP
  • ISO-2022-KR
  • ISO-2022-CN

Documents that use these encodings do not render properly in the Digital Reef Document Viewer, and the converted HTML and text versions produced at export do not render as expected either.
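If you want to spot affected documents before adding them to a Data Set, a small pre-screening script can help. The following is a minimal sketch in Python, not part of Digital Reef: the helper names (normalize_charset, is_unsupported, declared_charsets) are hypothetical, and it only inspects the charset that a MIME message declares in its headers, which may differ from the encoding actually used.

```python
from email import message_from_string

# Encodings this topic lists as unsupported by the parsing library.
UNSUPPORTED_ENCODINGS = frozenset(
    {"KOI8-R", "GB18030", "ISO-2022-JP", "ISO-2022-KR", "ISO-2022-CN"}
)


def normalize_charset(name):
    """Upper-case a charset label and use hyphens, e.g. 'iso_2022_jp' -> 'ISO-2022-JP'."""
    return name.strip().upper().replace("_", "-")


def is_unsupported(charset):
    """Return True if the charset label matches one of the listed encodings."""
    return normalize_charset(charset) in UNSUPPORTED_ENCODINGS


def declared_charsets(raw_message):
    """Collect the charsets declared by every part of a MIME message (hypothetical helper)."""
    msg = message_from_string(raw_message)
    found = set()
    for part in msg.walk():
        charset = part.get_content_charset()  # None when no charset is declared
        if charset:
            found.add(normalize_charset(charset))
    return found


if __name__ == "__main__":
    sample = "Content-Type: text/plain; charset=ISO-2022-JP\n\nhello"
    flagged = sorted(c for c in declared_charsets(sample) if is_unsupported(c))
    print(flagged)
```

Messages flagged this way are candidates for the rendering problems described above; the check does not change how Digital Reef parses them.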

MIME Support

Supported MIME formats include the following:

  • EML
  • MHT (Web Archive)
  • NWS (Newsgroup single-part and multi-part)
  • Simple Text Mail (defined in RFC 2822)
  • TNEF Format

Supported MIME Encodings

Supported MIME encodings include the following:

  • base64 (defined in RFC 1521)
  • binary (defined in RFC 1521)
  • binhex (defined in RFC 1741)
  • btoa
  • quoted-printable (defined in RFC 1521)
  • utf-7 (defined in RFC 2152)
  • uue
  • xxe
  • yenc

Message body encodings supported:

  • Text
  • HTML
  • RTF
  • TNEF
  • Text/enriched (defined in RFC 1523)
  • Text/richtext (defined in RFC 1341)
  • Embedded mail message (defined in RFC 822)
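
To make the transfer encodings listed above more concrete, the sketch below decodes a two-part message whose parts use base64 and quoted-printable. It relies on Python's standard email package purely for illustration; it is not the parsing library that Digital Reef uses, and the sample message and the decode_parts helper are invented for this example.

```python
from email import message_from_string

# Invented sample: one base64 part and one quoted-printable part.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="sep"\n'
    "\n"
    "--sep\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "SGVsbG8=\n"
    "--sep\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "caf=C3=A9\n"
    "--sep--\n"
)


def decode_parts(raw_message):
    """Return (transfer_encoding, decoded_text) for each leaf part of the message."""
    msg = message_from_string(raw_message)
    results = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no body of their own
        encoding = part.get("Content-Transfer-Encoding", "7bit").lower()
        payload = part.get_payload(decode=True)  # bytes after undoing base64 or quoted-printable
        charset = part.get_content_charset() or "ascii"
        results.append((encoding, payload.decode(charset, errors="replace")))
    return results


if __name__ == "__main__":
    for encoding, text in decode_parts(RAW):
        print(encoding, repr(text.strip()))
```

The same loop handles 7bit and 8bit parts unchanged, because get_payload(decode=True) only transforms the body when a Content-Transfer-Encoding header calls for it.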

Note: After you add a Data Set, you can view the Data Set Reports tab, which includes a Scan Summary and Warnings and Errors report. These reports indicate how the parsing process handled the target files.

See also: