TrueZIP Kernel Usage

Abstract

This article showcases the TrueZIP Kernel API in contrast to the TrueZIP File* API.

Design

In contrast to the modules TrueZIP Path or TrueZIP File*, the API of this module is more complex to use, yet very extensible because it relies on Dependency Injection for the resolution of file system drivers and I/O entry pools. It also provides service locators for file system driver providers and I/O entry pool providers in order to enable applications to resolve these dependencies easily at runtime.

TrueZIP Kernel versus TrueZIP File*

The API of the TrueZIP Kernel module differs considerably from the API of the TrueZIP File* module: while the File* API primarily aims to "make simple things easy", the Kernel API primarily aims to "make complex things possible".

This means that while the Kernel API offers more features than the File* API, it also requires more coding to get a job done. So when is it appropriate to code against the Kernel API instead of the File* API?

  • If you don't want to rely on the service locator pattern which is employed by the File* API in order to enumerate the set of file system drivers which are available on the class path at run-time.
  • If the type of the top level file system your application needs to access is not a platform file system (addressed by the "file" scheme), but any other file system type, e.g. a web service (addressed by the "http" scheme).
  • If you need to access file system specific entry properties, e.g. the comment of an entry in a ZIP file.
  • If you want to reduce the number of kernel calls required to query a set of properties for the same file system entry.

Basic Operations

Suppose you'd like to imitate the functionality of the cat(1) Unix command line utility. This utility simply concatenates the contents of each path name given as a parameter and writes them to the standard output. For example, the shell command...

$ cat fileA fileB

...would print the contents of fileA and then fileB to standard output.
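For reference, this behaviour can be sketched with the JDK alone. This is only a baseline, assuming that every parameter names a plain file on the platform file system; it cannot look into archive files. The class name PlainCat is made up for this illustration and is not part of TrueZIP.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical baseline, not part of TrueZIP: cat(1) using only the JDK.
public class PlainCat {

    /** Copies each named file, in the given order, to the output stream. */
    static void cat(OutputStream out, String... pathNames) throws IOException {
        for (String pathName : pathNames) {
            // Files.copy reads the whole file and leaves the output open.
            Files.copy(Paths.get(pathName), out);
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        cat(System.out, args);
    }
}
```

A path name such as archive.zip/entry would fail here with an I/O error, because the JDK treats archive.zip as a regular file rather than as a directory.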

Here's how to copy the contents of the path name parameter resource to the standard output using the API of the module TrueZIP File*:

/**
 * Copies the contents of the parameter resource to the standard output.
 * <p>
 * The set of archive file suffixes detected by this method is determined
 * by the {@linkplain TConfig#getArchiveDetector default archive detector}
 * and the respective file system driver providers on the class path.
 *
 * @param  resource the path name string of the resource to copy.
 * @throws IOException if accessing the resource results in an I/O error.
 */
static void pathCat(String resource) throws IOException {
    new TFile(resource).output(System.out);
}

Suppose that the modules TrueZIP Driver ZIP (Maven artifactId truezip-driver-zip) and TrueZIP Driver File (truezip-driver-file) are present on the class path at run-time. Suppose further that the current directory is the root directory, that a file named file exists in it and that archive.zip is a valid ZIP file. Then the following path names could be addressed with the code above:

  • file
  • /file
  • archive.zip/entry
  • /archive.zip/entry

Basically any valid path name could be addressed with this code. If an application needs to address URIs, too, then the following code could be used instead:

/**
 * Copies the contents of the parameter resource to the standard output.
 * <p>
 * The set of archive file suffixes detected by this method is determined
 * by the {@linkplain TConfig#getArchiveDetector default archive detector}
 * and the respective file system driver providers on the class path.
 *
 * @param  resource the URI string of the resource to copy.
 *         The URI must be file-based, i.e. the top level file system
 *         scheme must be {@code file}.
 * @throws IOException if accessing the resource results in an I/O error.
 * @throws URISyntaxException if {@code resource} does not
 *         conform to the syntax constraints for {@link URI}s.
 */
static void uriCat(String resource) throws IOException, URISyntaxException {
    URI uri = new URI(resource);
    TFile file = uri.isAbsolute() ? new TFile(uri) : new TFile(resource);
    file.output(System.out);
}

Using the same configuration as before, the following URIs could now be addressed with the code above:

  • file
  • /file
  • file:/file
  • archive.zip/entry
  • /archive.zip/entry
  • zip:file:/archive.zip!/entry

This adds a lot of flexibility, but the File* API still has one limitation: All URIs must be file-based, i.e. the scheme of their top level file system must be file. In particular, it's not possible to access a resource via HTTP(S).
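The branch in uriCat depends on whether java.net.URI considers the parameter absolute, i.e. whether it has a scheme. The following JDK-only sketch shows how the sample strings above are classified. The class and method names are made up for this illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of TrueZIP: classifies resource strings
// the same way the uriCat method decides between its two constructors.
public class UriKind {

    /** Returns true if the string parses as a URI which has a scheme. */
    static boolean hasScheme(String resource) throws URISyntaxException {
        return new URI(resource).isAbsolute();
    }

    public static void main(String[] args) throws URISyntaxException {
        String[] samples = {
            "file", "/file", "file:/file",
            "archive.zip/entry", "/archive.zip/entry",
            "zip:file:/archive.zip!/entry",
        };
        for (String sample : samples) {
            String kind = hasScheme(sample) ? "absolute URI" : "path name";
            System.out.println(sample + " -> " + kind);
        }
    }
}
```

Only file:/file and zip:file:/archive.zip!/entry have a scheme, so only these two would be passed to the URI constructor; the other four strings are treated as ordinary path names.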

Now for comparison, let's implement the same functionality using the API of the module TrueZIP Kernel.

/**
 * Copies the contents of the parameter resource to the standard output.
 *
 * @param  resource the URI string of the resource to copy.
 * @throws IOException if accessing the resource results in an I/O error.
 * @throws URISyntaxException if {@code resource} does not
 *         conform to the syntax constraints for {@link URI}s.
 */
static void cat(String resource)
throws IOException, URISyntaxException {
    // Get a manager for the life cycle of controllers for federated
    // file systems.
    FsManager manager = FsManagerLocator.SINGLETON.get();
    try {
        // Search the class path for the set of all supported file system
        // drivers and build a composite driver from it.
        FsCompositeDriver driver =
                new FsSimpleCompositeDriver(FsDriverLocator.SINGLETON);
        // Resolve the source socket.
        // Note that an absolute URI is required, so we may need to use the
        // File class for transformation from a normal path name.
        // Using the File class rather than the TFile class implies that
        // the caller cannot specify an archive file in a path name.
        // To overcome this limitation, you should use a TFile instead.
        // Unfortunately, this would introduce a cyclic dependency on the
        // module TrueZIP File*, so it's not an option for this sample.
        URI uri = new URI(resource);
        uri = uri.isAbsolute() ? uri : new File(resource).toURI();
        FsPath path = FsPath.create(uri, FsUriModifier.CANONICALIZE);
        InputSocket<?> socket = manager
                .getController(     path.getMountPoint(), driver)
                .getInputSocket(    path.getEntryName(),
                                    BitField.noneOf(FsInputOption.class));
        // Copy the data.
        // For this small example, we could skip the call to in.close() or
        // use Streams.copy(in, out), but this would not be correct if this
        // were not just the end of the application.
        InputStream in = socket.newInputStream();
        try {
            Streams.cat(in, System.out); // copy the data
        } finally {
            in.close(); // ALWAYS close the stream!
        }
    } finally {
        // Commit all unsynchronized changes to the contents of federated
        // file systems, if any were accessed, and clean up temporary files
        // used for caching.
        manager.sync(FsSyncOptions.UMOUNT);
    }
}

Using the same configuration as before, this code could access the same URIs as before. In addition, it could access any URI scheme for which a file system driver is present on the class path at run-time. For example, if the module TrueZIP Driver HTTP(S) (truezip-driver-http) is present on the class path at run-time, even http(s)-based URIs could be accessed.
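The step which turns the parameter into an absolute URI can be tried in isolation with the JDK alone. The following sketch repeats the two lines from the cat method and prints the scheme of the result, which is the part of the URI that has to be matched by an available driver. The class name AbsoluteUri and the URL http://example.com/index.html are assumptions made for this illustration.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of TrueZIP: isolates the URI
// normalization step used by the Kernel based cat method.
public class AbsoluteUri {

    /** Returns the given URI if it has a scheme, else a file: URI. */
    static URI toAbsoluteUri(String resource) throws URISyntaxException {
        URI uri = new URI(resource);
        // A relative reference is resolved against the current directory
        // of the platform file system by the java.io.File class.
        return uri.isAbsolute() ? uri : new File(resource).toURI();
    }

    public static void main(String[] args) throws URISyntaxException {
        String[] samples = {
            "archive.zip/entry",
            "file:/file",
            "http://example.com/index.html", // made-up URL
        };
        for (String sample : samples) {
            URI uri = toAbsoluteUri(sample);
            System.out.println(uri.getScheme() + " <- " + sample);
        }
    }
}
```

Note that archive.zip/entry ends up with the file scheme here, which matches the remark in the code comments above: without a TFile, an archive file in a plain path name is not detected.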