Virtual File Systems (VFS) in Java

Many different solutions to persistence follow a similar structure: a hierarchical organization into files and folders. It appears sensible to access these representations using a common application interface.

Initially, I implemented my own small library, which could access files on the hard disk and keep them in memory. I have since found the Apache Commons VFS library (http://commons.apache.org/vfs/), which offers similar functionality and more, and I am now considering migrating my libraries to it. That said, I also think a common API for file access should be included in one of the coming releases of Java.
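As a rough illustration of what such a common interface could look like, the sketch below backs one small interface once by memory and once by the hard disk. The names SimpleFileStore, MemoryStore and DiskStore are hypothetical; they are taken neither from my library nor from Commons VFS, and paths are simply treated as relative strings.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical common interface: callers do not care where the bytes live.
interface SimpleFileStore {
    void write(String path, String content) throws IOException;
    String read(String path) throws IOException;
    boolean exists(String path);
}

// Keeps all "files" in a map; folders are implied by the '/' in the key.
class MemoryStore implements SimpleFileStore {
    private final Map<String, String> files = new TreeMap<>();

    public void write(String path, String content) { files.put(path, content); }

    public String read(String path) throws IOException {
        String content = files.get(path);
        if (content == null) throw new IOException("No such file: " + path);
        return content;
    }

    public boolean exists(String path) { return files.containsKey(path); }
}

// Maps the same relative paths onto a root directory on the hard disk.
class DiskStore implements SimpleFileStore {
    private final Path root;

    DiskStore(Path root) { this.root = root; }

    public void write(String path, String content) throws IOException {
        Path target = root.resolve(path);
        Files.createDirectories(target.getParent()); // create missing folders
        Files.write(target, content.getBytes(StandardCharsets.UTF_8));
    }

    public String read(String path) throws IOException {
        return new String(Files.readAllBytes(root.resolve(path)), StandardCharsets.UTF_8);
    }

    public boolean exists(String path) { return Files.exists(root.resolve(path)); }
}

class StoreDemo {
    // Works against the interface only, so either backend can be passed in.
    static String roundTrip(SimpleFileStore store) throws IOException {
        store.write("notes/todo.txt", "migrate to VFS");
        return store.read("notes/todo.txt");
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(new MemoryStore()));
        System.out.println(roundTrip(new DiskStore(Files.createTempDirectory("vfs-demo"))));
    }
}
```

A library like Commons VFS covers far more than this (many more back ends, streams, folder listings), which is the main reason a migration looks attractive.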

However, getting the VFS library ready to plug into a Maven project is not trivial. It can be achieved with the following steps:

  • Check out the current trunk using ‘svn co http://svn.apache.org/repos/asf/commons/proper/vfs/trunk vfs’ (I tested with revision 1032448).
  • In the checked-out files, go to the directory ‘core’.
  • Test and package the current release using ‘mvn compile package’ (it was 2.1-SNAPSHOT when I tested, far higher than the version 1.0 available on public Maven repositories or the 2.0-SNAPSHOT version on the Apache snapshots repository).
  • Now you can upload the created commons-vfs-2.1-SNAPSHOT.jar from the folder /core/target to your local Maven repository (the pom file is located at /core/pom.xml).

You can start using the library by adding the following dependency to your pom.xml:

<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-vfs</artifactId>
  <version>2.1-SNAPSHOT</version>
</dependency>

The created archive also contains a MANIFEST.MF file with the exported packages etc. required for an OSGi environment.

Resources

Examples of using Apache Commons VFS

Usage Example for SFTP

A GUI written on top of Commons VFS

Java Serialization

There are a number of different ways in which objects or object graphs can be serialized and deserialized in Java. One major distinguishing feature is the intrusiveness of the employed framework. Some, like XStream, can work with any Java object without modification. Others, like JDO, bean serialization frameworks or the standard Java serialization mechanism, require you to annotate or write the Java objects in a certain way.
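To make the intrusiveness of the standard mechanism concrete, here is a minimal sketch using only the JDK. The classes PlainPoint and MarkedPoint are made up for illustration; the only difference between them is the Serializable marker interface.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Plain class, written without any serialization framework in mind.
class PlainPoint {
    int x = 1, y = 2;
}

// Same data, but opted in to the standard mechanism via the marker interface.
class MarkedPoint implements Serializable {
    private static final long serialVersionUID = 1L;
    int x = 1, y = 2;
}

class IntrusivenessDemo {
    // Returns true if the standard Java mechanism accepts the object.
    static boolean canSerialize(Object o) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            // Thrown for classes that do not implement Serializable.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canSerialize(new PlainPoint()));  // false
        System.out.println(canSerialize(new MarkedPoint())); // true
    }
}
```

With the standard mechanism, a class you cannot modify (for instance one from a third-party library) cannot be made serializable this way, which is where less intrusive frameworks become interesting.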

Here are just a few results of a quick web survey:

  • The standard Java serialization mechanism is rather slow.
  • It is better to use a framework like Kryo, which provides much better performance. EDIT: There seem to be some issues with Kryo, mainly: (1) objects appearing multiple times in the object graph are serialized once for every occurrence and (2) Kryo requires a no-args constructor to be present in the class (see this blog post).
  • XStream seems to be a popular and reliable framework to serialize Java objects to XML. However, it can be expected to create larger objects and take more time than tools such as Kryo, which can choose their own serialization format.
  • JDO seems to be doing a lot of things automatically. For instance, the updates on fields of persisted objects are tracked in the background. This leaves me with the feeling that complex things might be made a bit too easy by this framework. Also, the tutorial on the DataNucleus website shows that additional configuration files have to be maintained in order to persist the objects. However, JDO is a well-defined standard and interacts with a great choice of data stores.
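For comparison with the first Kryo issue mentioned above, the following sketch shows how the standard Java mechanism treats an object that appears twice in an object graph: it is written once and later occurrences become back-references, so identity survives the round trip. GraphNode and SharedReferenceDemo are hypothetical names, and the example says nothing about how Kryo or the other frameworks behave.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class GraphNode implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    GraphNode(String name) { this.name = name; }
}

class SharedReferenceDemo {
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Puts the same object into a list twice and checks whether both
    // positions still point to one object after a round trip.
    @SuppressWarnings("unchecked")
    static boolean sharedIdentitySurvives() throws IOException, ClassNotFoundException {
        GraphNode shared = new GraphNode("shared");
        List<GraphNode> graph = new ArrayList<>();
        graph.add(shared);
        graph.add(shared);
        List<GraphNode> copy = (List<GraphNode>) deserialize(serialize(graph));
        return copy.get(0) == copy.get(1);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sharedIdentitySurvives()); // true
    }
}
```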

Resources

A nice comparison of the serialization performance of different frameworks (just scroll down a bit) or a newer version of the same benchmark