Obsolescence Notice

The version described in this document is obsolete and should no longer be used for new applications.

The links to the Javadoc in this document are non-functional because the package names have been changed in TrueZIP 7. The Javadoc for TrueZIP 6.8.4 is now available for download via Maven Central.

Introduction to TrueZIP 6

TrueZIP is a Java-based Virtual File System (VFS) that enables transparent, multi-threaded read/write access to archive files (ZIP, TAR etc.) as if they were directories. Its primary features are:

Easy To Use
TrueZIP features drop-in replacements for the java.io.File|FileInputStream|FileOutputStream classes. If you know how to deal with these classes, you can instantly use TrueZIP - though you should really read the tutorial in order to avoid some common pitfalls and power up your client applications with some advanced, yet easy to use I/O tricks.
Thread-Safe
Multiple threads can read and write the same archive file at the same time. Where restrictions apply, TrueZIP enforces them to prevent client applications from corrupting archive files.
Unlimited Nesting
Client applications can create and access nested archives up to a virtually unlimited nesting level: "outer.zip/inner.tar.gz/nuts.jar/META-INF/LICENSE.TXT" is a perfectly valid path name.
Strong Cryptography
TrueZIP's Random Access Encryption Specification (RAES) features 256 bit AES encryption in CTR block mode with SHA-256 authentication. Note that RAES can be applied to any data payload - not just ZIP files.
Extensible
TrueZIP can support virtually any archive type via its pluggable archive driver architecture. TrueZIP ships with archive drivers for ZIP, TAR and many relatives (JAR, TAR.GZ, TAR.BZ2, TZP, ...).
Robust
TrueZIP ships with numerous assertions and comprehensive unit tests to ensure maximum reliability. Companies like JBoss, Vignette and many more rely on TrueZIP to deploy applications and content.
Fast
Despite the startup time of the JVM and despite the overhead required to implement a VFS, TrueZIP's nzip utility main class unzips the sources for JDK1.5.0_09 even slightly faster than 7-Zip - details to follow in a future article on this web site.
Business Friendly
Covered by the Apache Software License, Version 2.0.
Pure Java
No native code incorporated.

Motivation

Although the ZIP File Format Specification originated as a proprietary, de-facto standard by PKWARE Inc., ZIP files are ubiquitous on the Internet and thus on many platforms today. The Java SE API provides the package java.util.zip with classes like ZipInputStream, ZipOutputStream and ZipFile for convenient access to ZIP compatible files. However, using the package java.util.zip has some limitations/disadvantages:

  • Before JSE 7, the API always used UTF-8 for entry names and comments instead of CP437 (a.k.a. IBM437, the genuine IBM-PC character set), which is used by the de-facto standard PKZIP from PKWARE. As a result, you couldn't reliably exchange ZIP files containing international entry names such as "Motörhead.xml" with PKZIP compatible tools.
  • Before JSE 7, you could not read or write ZIP64 archive files.
  • You need to use an additional API (the package java.util.zip) if your application needs to support ZIP compatible files in addition to ordinary files and directories. This adds dependencies and complexity, which means more work and more potential bugs.
  • You can either read or completely write ZIP compatible files, but you cannot update individual entries.
  • The classes do not support the concept of a directory which you could create, modify, list, rename, recursively copy or delete.
  • You cannot browse a ZIP archive file with a JFileChooser or FileSystemView.
  • There is no GUI tree class to browse and edit file systems.
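
To illustrate the point about updating individual entries, here is a minimal sketch that uses only the package java.util.zip. The class name ZipRewriteSketch and its helper methods are made up for this example and are not part of TrueZIP or the JDK. Replacing a single entry means reading every entry of the old archive and writing all of them into a completely new one:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class for illustration only.
final class ZipRewriteSketch {

    /** Builds a small ZIP archive in memory from alternating names and contents. */
    static byte[] createZip(String... namesAndContents) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (int i = 0; i < namesAndContents.length; i += 2) {
                zip.putNextEntry(new ZipEntry(namesAndContents[i]));
                zip.write(namesAndContents[i + 1].getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    /** "Updates" one entry by copying all entries into a brand new archive. */
    static byte[] replaceEntry(byte[] zipBytes, String name, String newContent)
            throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes));
             ZipOutputStream out = new ZipOutputStream(bytes)) {
            byte[] buffer = new byte[8192];
            for (ZipEntry entry; (entry = in.getNextEntry()) != null; ) {
                out.putNextEntry(new ZipEntry(entry.getName()));
                if (entry.getName().equals(name)) {
                    // The one entry we actually wanted to change.
                    out.write(newContent.getBytes(StandardCharsets.UTF_8));
                } else {
                    // Every other entry has to be copied as well.
                    for (int n; (n = in.read(buffer)) != -1; ) {
                        out.write(buffer, 0, n);
                    }
                }
                out.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    /** Returns the content of the named entry, or null if it does not exist. */
    static String readEntry(byte[] zipBytes, String name) throws IOException {
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            byte[] buffer = new byte[8192];
            for (ZipEntry entry; (entry = in.getNextEntry()) != null; ) {
                if (entry.getName().equals(name)) {
                    ByteArrayOutputStream content = new ByteArrayOutputStream();
                    for (int n; (n = in.read(buffer)) != -1; ) {
                        content.write(buffer, 0, n);
                    }
                    return content.toString("UTF-8");
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        byte[] zip = createZip("readme", "old text", "other", "untouched");
        byte[] updated = replaceEntry(zip, "readme", "new text");
        System.out.println(readEntry(updated, "readme"));
        System.out.println(readEntry(updated, "other"));
    }
}
```

The sketch works in memory to stay self-contained. With real files, the same approach additionally requires a temporary file and a rename step.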

Resolution

The TrueZIP API provides drop-in replacements for the well-known classes File, FileInputStream and FileOutputStream. This concept makes TrueZIP very simple to use: To archive-enable a client application, you only need to add a few import statements for the package de.schlichtherle.io and add some type casts where required.

(Note that the type casts are an artifact of requiring only J2SE 1.4.2 for TrueZIP 6. For the future TrueZIP 7, you will not need any type casts any more, but it will require JSE 5.)

Now you can simply address archive files like directories in a path name. For example, the path name "archive.zip/readme" addresses the archive entry "readme" within the ZIP file "archive.zip". Note that file name suffixes are fully configurable, and that TrueZIP automatically detects false positives and falls back to treating them like ordinary files or directories. This works recursively, so an archive file may even be enclosed in another archive file, as in "outer.zip/inner.zip/readme".
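
To make the addressing scheme more concrete, the following sketch splits a relative path name at every member whose name ends with a configured archive suffix. It is not TrueZIP's actual implementation: the class ArchivePathSketch and its methods are hypothetical, and the sketch only looks at names, so it cannot detect false positives the way TrueZIP does by inspecting the files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class for illustration only.
final class ArchivePathSketch {

    /**
     * Splits a relative path name with '/' separators into parts, where every
     * part except possibly the last one ends with an archive file name.
     */
    static List<String> split(String path, Set<String> suffixes) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String name : path.split("/")) {
            if (current.length() > 0) {
                current.append('/');
            }
            current.append(name);
            if (hasArchiveSuffix(name, suffixes)) {
                // An archive boundary: what follows lives inside this archive.
                parts.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }

    /** Tests the name against the configured suffixes, ignoring case. */
    static boolean hasArchiveSuffix(String name, Set<String> suffixes) {
        String lower = name.toLowerCase(Locale.ROOT);
        for (String suffix : suffixes) {
            if (lower.endsWith("." + suffix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> suffixes = new HashSet<>(Arrays.asList("zip", "jar", "tar.gz"));
        System.out.println(split("outer.zip/inner.zip/readme", suffixes));
        System.out.println(split("dir/plain.txt", suffixes));
    }
}
```

Each part but the last names an archive file relative to its enclosing archive (or to the real file system for the first part), and the last part names the entry inside the innermost archive.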

When required, TrueZIP commits any changes to archive files automatically. A JVM shutdown hook ensures that no changes get lost even if the JVM terminates due to a throwable. Optionally, a client application may call this operation manually using File.umount(), e.g. in order to catch any exceptions from this operation.

TrueZIP features an archive driver interface in order to support virtually any archive type. TrueZIP 6 ships with archive drivers for ZIP, TAR and many relatives (JAR, TAR.GZ, TAR.BZ2, ODF, TZP etc.). Support for even more archive types is expected to be added over time (contributors wanted: 7z, ARJ or RAR anyone?).

Even if a client application never accesses an archive file, using TrueZIP is still beneficial because of enhanced and fast I/O methods in the File class such as recursive renaming, deleting or copying. These operations use asynchronous I/O and hence provide performance equivalent to the methods transferTo(...) and transferFrom(...) of the class java.nio.channels.FileChannel, although they operate on plain InputStream and OutputStream objects.
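
This document does not show how TrueZIP implements its asynchronous copying, so the following is only a sketch of the general idea under my own assumptions: one thread reads from the InputStream while the calling thread writes to the OutputStream, so that reading and writing can overlap. The class AsyncCopySketch, the queue capacity and the buffer size are all made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper class for illustration only.
final class AsyncCopySketch {

    /** Marker object which tells the writer that the reader is done. */
    private static final byte[] END = new byte[0];

    /** Copies in to out, reading in a background thread. Closes neither stream. */
    static void copy(final InputStream in, OutputStream out) throws IOException {
        final BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(4);
        final AtomicReference<IOException> readError = new AtomicReference<>();

        Thread reader = new Thread(() -> {
            try {
                try {
                    byte[] buffer = new byte[8192];
                    for (int n; (n = in.read(buffer)) != -1; ) {
                        if (n > 0) {
                            queue.put(Arrays.copyOf(buffer, n)); // blocks if full
                        }
                    }
                } catch (IOException e) {
                    readError.set(e); // reported by the writer below
                }
                queue.put(END);
            } catch (InterruptedException e) {
                // The writer gave up, so there is nothing left to do.
            }
        });
        reader.start();

        try {
            for (byte[] chunk; (chunk = queue.take()) != END; ) {
                out.write(chunk);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new InterruptedIOException("copy interrupted");
        } finally {
            reader.interrupt(); // releases a reader blocked on a full queue
        }
        IOException e = readError.get();
        if (e != null) {
            throw e;
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100000];
        Arrays.fill(data, (byte) 42);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(data), out);
        System.out.println(Arrays.equals(data, out.toByteArray()));
    }
}
```

The bounded queue keeps memory use limited when the reader is faster than the writer. A production implementation would also reuse buffers and deal with a reader that is blocked inside read(), which this sketch does not attempt.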

Finally, a client application may just require a simple to use, yet secure file format for encrypted and authenticated data which supports transparent random read access just as if it were reading plain data using a RandomAccessFile-like interface. This is provided by the package de.schlichtherle.crypto.io.
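
The reason why AES in CTR block mode allows random read access can be sketched with the standard javax.crypto API alone. This is not the RAES file format and not the API of de.schlichtherle.crypto.io: the class CtrRandomAccessSketch is hypothetical, it leaves out the SHA-256 authentication entirely, and it assumes that the counter is incremented as a 128 bit big-endian number, which is how the JDK's "AES/CTR/NoPadding" transformation behaves. Because each block of the key stream depends only on the key and the counter value, decryption can start at any block without processing the data before it:

```java
import java.math.BigInteger;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper class for illustration only; not the RAES format.
final class CtrRandomAccessSketch {

    private static final int BLOCK = 16; // AES block size in bytes

    /** Encrypts the whole payload sequentially, starting at counter value iv. */
    static byte[] encrypt(byte[] key, byte[] iv, byte[] plain)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(iv));
        return cipher.doFinal(plain);
    }

    /** Decrypts length bytes at offset without decrypting anything before it. */
    static byte[] decryptRange(byte[] key, byte[] iv, byte[] cipherText,
                               int offset, int length)
            throws GeneralSecurityException {
        int firstBlock = offset / BLOCK; // block which contains the offset
        int skip = offset % BLOCK;       // unwanted bytes within that block
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(counterFor(iv, firstBlock)));
        byte[] plain = cipher.doFinal(cipherText, firstBlock * BLOCK, skip + length);
        return Arrays.copyOfRange(plain, skip, skip + length);
    }

    /** Computes iv + blockIndex as a 128 bit big-endian counter value. */
    static byte[] counterFor(byte[] iv, long blockIndex) {
        byte[] raw = new BigInteger(1, iv)
                .add(BigInteger.valueOf(blockIndex))
                .toByteArray();
        byte[] counter = new byte[BLOCK];
        // Right-align the value and drop anything beyond 128 bits.
        int n = Math.min(raw.length, BLOCK);
        System.arraycopy(raw, raw.length - n, counter, BLOCK - n, n);
        return counter;
    }

    public static void main(String[] args) throws GeneralSecurityException {
        byte[] key = new byte[32]; // 256 bit demo key; never use a fixed key
        byte[] iv = new byte[BLOCK];
        byte[] plain = new byte[100];
        for (int i = 0; i < plain.length; i++) {
            plain[i] = (byte) i;
        }
        byte[] cipherText = encrypt(key, iv, plain);
        byte[] range = decryptRange(key, iv, cipherText, 37, 20);
        System.out.println(Arrays.equals(range, Arrays.copyOfRange(plain, 37, 57)));
    }
}
```

The all-zero key and counter in main() are for demonstration only. Reusing a key and counter pair for different payloads breaks the confidentiality of CTR mode.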

Features

  • The low level ZIP API in the package de.schlichtherle.util.zip supports ZIP64 extensions. This is completely transparent, so that any client application can read or write ZIP archive files of more than 4GB size. By default, ZIP64 extensions are written to a ZIP file only if they are required, so that maximum interoperability with third party tools is retained. ZIP64 extensions may also be enforced (see the Release Notes for TrueZIP 6.7).
  • Random access to archive files using drop-in replacements for the classes File|FileInputStream|FileOutputStream.
  • Browse archive files using the drop-in replacements for JFileChooser and FileSystemView.
  • Browse and edit the file system with a custom JTree, including archive files up to any nesting level.
  • Fully transparent access to RAES encrypted ZIP files using AES encryption with up to 256 bit key length.
  • Properly supports nested archive files, i.e. new FileInputStream("dir/outer.zip/anotherdir/inner.zip/readme.txt") is perfectly OK.
  • Asynchronous I/O for ordinary copying of InputStreams to OutputStreams, delivering equivalent performance to the java.nio package.
  • Lots of utility methods for handling File objects (cat(), catFrom(), catTo(), copyFrom(), copyTo(), archiveCopyFrom(), archiveCopyTo() etc.).
  • Supports IBM437 and any other character set encodings for ZIP compatible files with international entry names and comments.
  • Configurable file suffixes for ZIP compatible files (.zip, .jar, .tzp, .tar, .tbz2, whatever).
  • Provides an enhanced ZIP API which is fully interoperable with the genuine java.util.zip package.
  • In general, classes are thread safe with documented exceptions.
  • Requires only J2SE 1.4.2 but benefits from new features in JSE 5 and JSE 6 by detecting classes reflectively (e.g. java.io.Console for password prompting if running headless).
  • Optimized for maximum performance and minimum memory footprint (in order of priority).
  • Lots of unit tests for maximum reliability.

Benefits

  • Treats ZIP entries like files or directories, i.e. you can create, modify or delete each of them individually.
  • One API less to get a headache from (java.util.zip).
  • Minimal integration effort: Some import statements are usually all that is required to make a client application archive-enabled.
  • Provides fully transparent access to RAES encrypted ZIP files for maximum security.
  • Eases dealing with nested ZIP compatible files by treating them as subdirectories of another ZIP compatible file.
  • Offers enhanced file operations like cat(), copy*(), archiveCopy*() etc.

Limitations / Caveats

  • TrueZIP's various copy methods do not provide automatic path name completion! So if you are going to copy the source file foo to the destination directory bar, then you have to specify the full destination path bar/foo.
  • The RAES encryption is not compatible with WinZip's encryption scheme. This is because of security issues with WinZip's encryption scheme. For more information, please refer to the news section on RAES.
  • TrueZIP may seem to behave erratically if it is loaded and used by multiple class loaders. For each archive file which has been presented to TrueZIP (via the de.schlichtherle.io.File* classes), TrueZIP associates some internal state with it. This data is held in static maps. Either make sure that multiple class loader instances never access the same archive file or prevent the loading of multiple instances by assigning a shared parent class loader which loads all TrueZIP classes first.
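
As an illustration of the first caveat, the destination path can be completed with a small helper before calling a copy method. The sketch uses only java.io.File, and the class CopyTargetSketch is made up for this example. Since TrueZIP's File class is described as a drop-in replacement for java.io.File, the same idiom should carry over, but this has not been verified against the TrueZIP 6 API here.

```java
import java.io.File;

// Hypothetical helper class for illustration only.
final class CopyTargetSketch {

    /** Returns the full destination path for copying src into the directory dstDir. */
    static File completeTarget(File src, File dstDir) {
        // Append the source file's base name to the destination directory.
        return new File(dstDir, src.getName());
    }

    public static void main(String[] args) {
        File target = completeTarget(new File("foo"), new File("bar"));
        System.out.println(target.getPath());
    }
}
```

On a system with "/" as the separator character, this yields the path bar/foo, which is the value the copy methods expect as the destination.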

Documentation

Getting Started

First and foremost, please read the tutorial for TrueZIP 6. It explains the concept of TrueZIP and covers typical use cases.

Second, I recommend playing around with the nzip utility main class. A separate tutorial covers this utility.