A Scoped Temporary Directory and Recursive File Deletion in Java

In my current project, I came across the need for creating temporary files. Since I want to be cool, I decided to use Java’s NIO with the Path API instead of the old File API.

Anyway, for creating temporary files, I could use the method createTempFile of the class java.nio.file.Files. Since temporary files should go into temporary directories (to group them by purpose), this class also provides the method createTempDirectory. Equipped with all of this goodness, I am ready to go. However, there are two things that I would like to have for maximum convenience:

  1. I want the temporary directory to be managed with Java 7's try-with-resources statement, i.e., to have a scope…
  2. …at whose end it should be automatically deleted (together with whatever files and directories I created inside it).

That would allow me to handle temporary files and folders in a way that ensures that nothing useless is left lying around, without having to explicitly delete files and folders when I am done. Implementing both of the above features is fairly easy. However, there may be some minor issues with stuff I saw on the web, so I will just outline my approach here.

It should be noted that none of the stuff below has been tested. I cannot guarantee any fitness for use and will not assume any responsibility for anything. Ever.

Deleting Folders with Everything Inside

I can delete files and folders in Java with the method delete (again of class java.nio.file.Files). However, that will fail if I try to delete a folder which still contains a file or another folder. Thus, I need to delete recursively. Since I mainly want to delete temporary folders and files, I want my code not to be fail-fast: If something goes wrong (e.g., a file cannot be deleted), it should continue and try to delete as many files and folders as possible, and only at the very end throw an exception. This exception should contain as much information as possible about what could not be deleted and why. This makes my situation a bit different from other solutions, such as [1], which is the basis for my solution. I implement it as follows:
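Something like the following could do the trick. This is only a minimal, untested sketch of the idea, based on Files.walkFileTree; the class name RecursiveDelete is made up, and aggregating the individual errors as suppressed exceptions of one top-level exception is just one possible design:

import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public final class RecursiveDelete {

  /** Delete root and everything inside it; continue on errors and only
   * throw a single, aggregated exception at the very end. */
  public static void delete(final Path root) throws IOException {
    final IOException[] failure = new IOException[1]; // collects all errors

    Files.walkFileTree(root, new SimpleFileVisitor<Path>() {

      private void tryDelete(final Path path) {
        try {
          Files.delete(path);
        } catch (final IOException error) {
          if (failure[0] == null) {
            failure[0] = new IOException("Could not delete everything under " + root);
          }
          failure[0].addSuppressed(error); // remember the error, but keep going
        }
      }

      @Override
      public FileVisitResult visitFile(final Path file, final BasicFileAttributes attrs) {
        this.tryDelete(file);
        return FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult visitFileFailed(final Path file, final IOException exc) {
        this.tryDelete(file); // try anyway; a persistent failure gets recorded
        return FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult postVisitDirectory(final Path dir, final IOException exc) {
        this.tryDelete(dir); // delete the directory after its contents
        return FileVisitResult.CONTINUE;
      }
    });

    if (failure[0] != null) {
      throw failure[0]; // one exception carrying all deletion failures
    }
  }

  private RecursiveDelete() { /* no instances */ }
}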

Besides not being fail-fast, this method differs from my inspirational source [1] in another way: It does not employ any operating-system-specific commands (such as rm or rd) but relies only on the Java API. The reasons for this are as follows:

  1. The most important one may be that the Path API is not “platform-bound”. It could be implemented for zip-file systems as well, or for memory file systems. If you had such a path and passed it to a platform-native command such as rd, all hell could break loose (because that path may map to anything, I think).
  2. From an OS command, you may not get detailed error information.
  3. Calling an OS command or running a process for each deletion may actually be slower (or faster, who knows, but I like deterministic behavior).
  4. Apropos deterministic: using OS-dependent commands may introduce slight differences in behavior on different OSes.
  5. Also, you may only be able to implement OS-specific deletion for a small subset of OSes, such as Linux/Unix and Windows, anyway. Plus, I am not sure how reliable the detection of the OS from within Java is.
  6. And you will actually somewhat depend on the versions of these OSes. Maybe the parameters of the commands change in an ever-so-subtle way in the future, or already differ amongst the many versions of Windows, for example – at least I wouldn’t want to check or maintain that.
  7. Finally, I think a Java program should stay within the boundaries of the virtual machine for as long as possible. I would only call outside processes if absolutely necessary. This is just a general design idea: Reduce outside dependencies, rely on the security settings, configuration, and implementation of the JVM.

Of course, there are also some good reasons for using OS-specific commands. For instance, they can probably better deal with the underlying file systems, better understand stuff such as hard and soft links (although these are maybe not so likely to be in a temporary directory), and they actually may be faster. All in all, which way to go depends on the preferences of the programmer.

Temporary Directory Managed with try-with-resources

Java 7's try-with-resources statement allows us to put an instance of AutoCloseable into a construct similar to a try…finally which ensures that its close method is called under all circumstances (unless the JVM is terminated ungracefully before). For I/O-related stuff, we would implement its sub-interface Closeable.

We can use this mechanism and the above code to implement a temporary folder which, along with its contents, is deleted at the end of the try-with-resources scope as follows:
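We need a class that creates a temporary directory in its constructor, hands out the corresponding Path, and recursively deletes the directory in its close() method. A minimal sketch (the name TempDir and its details are illustrative; it reuses the RecursiveDelete sketch from above):

import java.io.Closeable;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public final class TempDir implements Closeable {

  private final Path path; // the temporary directory managed by this object

  public TempDir() throws IOException {
    this.path = Files.createTempDirectory(null); // no name prefix needed
  }

  public Path getPath() {
    return this.path;
  }

  @Override
  public void close() throws IOException {
    RecursiveDelete.delete(this.path); // delete the directory and its contents
  }
}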

How to Use

This temporary folder can now be used as follows:
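For instance like this (a small sketch matching the output below):

import java.nio.file.Files;
import java.nio.file.Path;

public class TempDirExample {
  public static void main(String[] args) throws Exception {
    Path dir = null, file = null;
    try (TempDir temp = new TempDir()) {
      System.out.println("Inside try-with-resources scope");
      dir = temp.getPath();
      System.out.println("Temp path is: " + dir);
      file = Files.createFile(dir.resolve("test.txt")); // a file inside the temp dir
      System.out.println(dir + " exists: " + Files.exists(dir));
      System.out.println(file + " exists: " + Files.exists(file));
    }
    System.out.println("After try-with-resources scope");
    System.out.println(dir + " exists: " + Files.exists(dir));
    System.out.println(file + " exists: " + Files.exists(file));
  }
}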

Running the above program (under Linux) should result in something like:

Inside try-with-resources scope
Temp path is: /tmp/7939536986162702754
/tmp/7939536986162702754 exists: true
/tmp/7939536986162702754/test.txt exists: true
After try-with-resources scope
/tmp/7939536986162702754 exists: false
/tmp/7939536986162702754/test.txt exists: false

This means that a temporary folder was created. We then created a file inside that folder. Once the closing curly brace of the try-with-resources was reached, both were automatically deleted.

P.S. A reason why I used NIO with the Path API is that it is not bound to real files on disk. One day, I may be able to use a file system in memory for some of the temporary files. In that case, my delete method would still work all the same.


Meeting of the German Research Network China on 2014-09-11 in Shanghai

Yesterday I attended the meeting of the German Research Network in the residency of the German Consul General in Shanghai at Yongfu Lu 151, Shanghai [上海市永福路151号]. The meeting was opened by the new Consul for Cultural and Press Affairs, as well as Scientific Cooperation, Marcel Viëtor. The main event was a presentation given by Nobel laureate Prof. Dr. Erwin Neher on the discovery of ion channels. This presentation was remarkable in that I, as someone with little understanding of biology, could follow and understand it. Thus, I want to outline it below.

Prof. Neher started with the history of the field, beginning with early experiments regarding electricity in the bodies of animals, e.g., frog legs. He then outlined the remarkable contributions of Ramón y Cajal in the early 1900s, who already provided precise sketches of the nerve cells comprising our brain. In the 1860s, Julius Bernstein discovered electric action potentials in muscles, and he formulated his membrane theory in 1902. In the 1950s, Hodgkin and Huxley discovered the nerve impulse: the membrane of a nerve cell becomes permeable for Na+ and K+ ions during the action potential change. In the late 1960s, ion channels were introduced into artificial membranes to experiment with such effects.

Prof. Neher wanted to investigate whether this was also the case in natural cells, or whether other mechanisms were responsible for their dynamic electrical behavior. The main problem was that this requires measuring the electrical flow of such a cell at a resolution of one picoampere, but the most exact measuring devices at that time were way too imprecise for that, as they exhibited noise at a much higher intensity. One of the most important contributions of Prof. Neher and his team was the development of a more precise method to measure electrical current in cells, the Patch Clamp Technique. Instead of poking a small electrode into a cell, in this method a part of the cell’s membrane is sucked into a glass pipette. The advantage is that there is little or no water between the measurement electrode and the cell, since such liquid is responsible for much of the noise in the measurement data. With the new, more precise measurement, it became possible to show that proteins in the cell membrane act as ion channels that can open and close.

This discovery led to several subsequent surprising findings. For instance, ion channel proteins fulfill important tasks in many different cell types in our body, ranging from nerve cells, heart cells, lymph and liver cells, to light receptors in the eyes and even cells in plants. It turned out that many medical substances actually dock at the ion channels to achieve their respective effects, by, e.g., blocking them (as is the case with local anesthetics). Knowing the importance of ion channels, the third surprise is maybe not so surprising: Defects in the ion channels can lead to many diseases. This is the case in cardiac dysrhythmia, epilepsy, type-II diabetes, cystic fibrosis, and myasthenia gravis. Defects in the ion channels can be caused by defects in the genetic material which, when expressed, produce incorrectly shaped or otherwise faulty channel proteins.

Finally, ion channels seem to also be responsible for our temperature sensation. There are channel proteins which become active in different temperature ranges: some become permeable at low temperatures, others permit ions to pass at high temperatures. These channels can also be activated by certain chemicals. Interestingly, the channels corresponding to low temperatures can be activated by mint and menthol chemicals, while those corresponding to high temperatures are triggered by capsaicin, a chemical common in many spices. This would be a simple explanation for why spicy food feels hot and why menthol is perceived as cold.

After a short and interesting discussion following Prof. Neher’s talk, the meeting concluded (and I had to catch my train back to Hefei).

This meeting was quite nice. It was my first chance to see a Nobel laureate and it was truly impressive. Also, the ambience was great. The residency of the Consul General is a beautiful building near the city center, built in the 1940s and recently restored. I took a few snapshots before the meeting and attached them below.

All photos were taken on 2014-09-10 at the Residency of the German Consul General in Shanghai at Yongfu Lu 151, Shanghai [永福路151号, 上海]:

  1. The front gate of the Residency.
  2. The heritage architecture plaque at the front gate.
  3. The entrance of the Residency building.
  4. The Residency building itself (several views). The building has recently been reconstructed, but still keeps its historical windows, design, and structure.
  5. A small park in the Residency (several views).
  6. Some nice windows of the Residency building (several views).
  7. The meeting room in the Residency building where the meeting of the German Research Network took place.
  8. A photo of the attendees of the meeting.

New Article: An Open Source Framework for the Traveling Salesman Problem

I am a researcher in the field of (metaheuristic) optimization, which means that I am developing and analyzing optimization algorithms, methods to find good solutions for hard problems. A hard problem is one for which we cannot obtain the optimal (best possible) solution within feasible time. Instead, heuristic and metaheuristic algorithms are used to obtain acceptably good solutions within an acceptable runtime. The goal of research in optimization is to find the best such methods for given problems or problem classes. This, of course, entails that we somehow need to compare optimization algorithms, which turns out to actually be quite complicated. Below I list some of the issues:

  1. For one, finding good optimization algorithms is actually a multi-objective problem, since we want algorithms which can find solutions that are as good as possible as fast as possible.
  2. Second, many optimization algorithms are actually anytime algorithms: They have an “approximate” solution at any time during their execution, and as time progresses, the approximation quality should improve. An Evolutionary Algorithm (EA), for instance, starts with a set of randomly created (bad) solutions and refines them iteratively, (hopefully) constructing better and better solutions. The same is true for local search algorithms. Branch-and-bound algorithms are anytime algorithms, too. By definition, an anytime algorithm does not have just a single, final result, but creates many solutions along its path and can be run either for a short or a very, very long time. If there are no “final results”, comparing two algorithms means we somehow need to compare progress over runtime.
  3. This leads to the third problem: How can we define runtime? Is it the number of created candidate solutions? Or the actual CPU time that has elapsed? Neither of these two measures is really satisfying, an issue I have already discussed in an older post.
  4. Fourth, comparison must be done in a statistically sound way, which means that experiments need to be repeated, say, 30 times, robust statistics must be calculated, and several different benchmark problem instances must be used.
  5. Fifth, several different statistics can be computed based on the gathered data. It is necessary to somehow aggregate them into meaningful conclusions.
  6. All of that – running experiments several times on several benchmark instances, evaluating the large amount of generated data, drawing figures, etc. – is quite a lot of (cumbersome) work. It needs tool support.

In the current issue (August 2014) of the IEEE Computational Intelligence Magazine (CIM), there is a paper titled “Benchmarking Optimization Algorithms: An Open Source Framework for the Traveling Salesman Problem.” In this paper, my colleagues and I present some ideas to tackle the problems listed above. Our TSP Suite is an automated experimentation framework for the TSP, which aids researchers in implementing and unit testing their algorithms, running experiments in an automated fashion in parallel or distributed on a cluster, and evaluating the generated log data, resulting in stand-alone reports similar in structure to Bachelor’s theses. The work presented in the paper is still preliminary – there are lots of things that need to be improved (in particular on the statistics side) and there are many more issues to explore – but I think we are on the right track.

The abstract of our article is as follows:

The article proposes an experimentation procedure for evaluating and comparing optimization algorithms. It is argued that end-of-run results alone do not give sufficient information about an algorithm’s performance, so the proposed approach analyzes the algorithm’s progress over time. Comparisons of performance curves in diagrams can be formalized by comparing the areas under them. Algorithms can be ranked according to a performance metric. Rankings based on different metrics can then be aggregated into a global ranking, which provides a quick overview of the quality of algorithms in comparison. An open source software framework based on the Traveling Salesman Problem (TSP), called the TSP Suite, is presented as an application of this experimental procedure. The framework can support researchers in implementing and unit testing TSP solvers and running experiments in a parallel and distributed fashion. It also has an evaluator component, which implements the proposed evaluation process and produces detailed reports. Comprehensive experiments have been conducted using the TSP Suite to benchmark several local search and evolutionary computation methods. The results show that the tested pure global optimization algorithms are outperformed by local search. The best results, however, come from hybrid algorithms.

Thomas Weise, Raymond Chiong, Ke Tang, Jörg Lässig, Shigeyoshi Tsutsui, Wenxiang Chen, Zbigniew Michalewicz, and Xin Yao. Benchmarking Optimization Algorithms: An Open Source Framework for the Traveling Salesman Problem. IEEE Computational Intelligence Magazine (CIM), 9(3):40-52, August 2014. doi:10.1109/MCI.2014.2326101

Of course, our work is not the first one on this topic. Some pioneering and outstanding contributions, to name a few, are:


Using Java’s try-with-resources for Hierarchical and Parallel APIs

During my work on the TSP Suite (soon to appear in IEEE CIM), I came up with an – as I believe – nice approach to construct hierarchical APIs by (mis-)using Java 7's try-with-resources statement and the related AutoCloseable interface. Since I have used this approach in several different locations in my programs and found it quite useful, I want to share it here. Before I do that, please let me point out that I do not know whether this idea already exists somewhere or whether it actually may be an anti-pattern. The complete source code of the examples and example APIs used in this post is provided at its end.

AutoCloseable and try-with-resources

First of all, let me briefly outline what the interface AutoCloseable and the try-with-resources concept are. A program will usually open resources, such as sockets for information exchange via the network or streams to read and write data. Such resources must eventually be closed once we are finished with them. For example, a FileOutputStream must be closed after we are finished writing our data to a file.

Resources, Exceptions, and try-finally

However, many resources may throw Exceptions: The aforementioned FileOutputStream, for example, may throw an IOException if we are trying to write more data to it than the free capacity of the underlying file system permits. Still, we need to make sure that the resources are properly closed even in case of errors. This can be done with a try-finally-based approach, such as the one illustrated below:
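A minimal sketch of this pattern with two resources (the file names are just examples):

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class TryFinallyExample {
  public static void main(String[] args) throws IOException {
    FileInputStream in = new FileInputStream("in.dat");
    try {
      FileOutputStream out = new FileOutputStream("out.dat");
      try {
        out.write(in.read()); // use both resources
      } finally {
        out.close(); // executed even if write() throws
      }
    } finally {
      in.close(); // executed even if anything above throws
    }
  }
}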

As you can see, we put the calls to the close() methods into finally blocks, which are executed even in case of Exceptions being thrown. However, we need to apply this try-finally pattern for every resource. There is the danger that we sometimes may forget it. Also, it makes the code a bit longer.

Resources, Exceptions, and try-with-resources

Since Java 7, classes representing resources that need to be closed should implement the AutoCloseable interface. (Neatly, AutoCloseable has been introduced as a super-interface of Closeable, which is already implemented by many resource classes.) Any implementation of that interface can be embedded in a try-with-resources statement, which will invoke the close() method automatically. A Java compiler (such as the one of Eclipse) can even be instructed to issue warnings or errors if a resource object is not properly embedded into such a statement. The Java 7 version of the example above now looks like:
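The same sketch, now with try-with-resources:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class TryWithResourcesExample {
  public static void main(String[] args) throws IOException {
    try (FileInputStream in = new FileInputStream("in.dat");
         FileOutputStream out = new FileOutputStream("out.dat")) {
      out.write(in.read()); // close() is invoked automatically on both resources
    }
  }
}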

This makes the code shorter and less error-prone.

How we can build simple APIs with that: Example

We can use this mechanism to build simple, hierarchical APIs. I will show this based on an example where we can relatively easily see why this could make sense: XMLStreamWriter.

XML Output with XMLStreamWriter

The XMLStreamWriter is a lightweight output API for XML documents. It is somewhat similar to a character stream Writer, with additional methods to write XML elements and attributes (or an “inverse” version of SAX, for output, if you want). It is used like this:
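For instance (the element names and content are just examples):

import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class StaxExample {
  public static void main(String[] args) throws XMLStreamException {
    StringWriter buffer = new StringWriter();
    XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(buffer);
    xml.writeStartDocument();
    xml.writeStartElement("library");        // <library>
    xml.writeStartElement("book");           //   <book ...>
    xml.writeAttribute("year", "2014");
    xml.writeCharacters("An example text.");
    xml.writeEndElement();                   //   </book>
    xml.writeEndElement();                   // </library>
    xml.writeEndDocument();
    xml.close();
    System.out.println(buffer);
  }
}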

Executing the above program will lead to output looking like this (without line breaks and indentation):
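With the element names assumed in the sketch above, something like:

<?xml version="1.0" ?><library><book year="2014">An example text.</book></library>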

As you can see, each writeStartElement() is matched with a writeEndElement() (similar to how resource construction is matched with calls to close() in our previous examples). Elements can be arbitrarily nested, and it is up to the programmer to keep these opening and closing calls properly matched.

XML Output with a Hierarchical API using try-with-resources

How about this: Let the method for opening an element return an instance of AutoCloseable, which can be handled with the try-with-resources approach. Its (implicitly invoked) close() method would replace the call to writeEndElement(). This would have a couple of advantages: foremost, it reduces code size, and second, it makes it impossible to mismatch element opening and closing calls, because forgetting an element closing or adding a superfluous one would result in a missing or excess curly brace (“}”), and the compiler does not appreciate that.

I did a quick and dirty implementation of this idea in the package org.optimizationBenchmarking.utils.io.xml of the attached sources for this example. The resulting code allows us to re-write the previous example as:
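I cannot reproduce that package here, so the following self-contained sketch (the class name XmlElement and its methods are made up, and much simpler than the real thing) conveys the principle by wrapping an XMLStreamWriter so that close() writes the end tag; the re-written example sits in its main method:

import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class XmlElement implements AutoCloseable {

  private final XMLStreamWriter out; // the underlying stream writer

  XmlElement(XMLStreamWriter out, String name) throws XMLStreamException {
    this.out = out;
    out.writeStartElement(name); // creating the object writes the start tag
  }

  XmlElement element(String name) throws XMLStreamException {
    return new XmlElement(this.out, name); // open a nested element
  }

  void attribute(String name, String value) throws XMLStreamException {
    this.out.writeAttribute(name, value);
  }

  void text(String text) throws XMLStreamException {
    this.out.writeCharacters(text);
  }

  @Override
  public void close() throws XMLStreamException {
    this.out.writeEndElement(); // the end tag is written implicitly
  }
}

public class HierarchicalXmlExample {
  public static void main(String[] args) throws XMLStreamException {
    StringWriter buffer = new StringWriter();
    XMLStreamWriter raw = XMLOutputFactory.newInstance().createXMLStreamWriter(buffer);
    raw.writeStartDocument();
    try (XmlElement library = new XmlElement(raw, "library")) {
      try (XmlElement book = library.element("book")) {
        book.attribute("year", "2014");
        book.text("An example text.");
      } // </book> is written here
    }   // </library> is written here
    raw.writeEndDocument();
    raw.close();
    System.out.println(buffer);
  }
}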

The code has the same behavior, i.e., it directly writes the XML data to the underlying stream and does not construct any fancy data structure in memory. Compared to the XMLStreamWriter, all the explicit calls to element-closing methods have disappeared (and became implicitly invoked at the corresponding curly braces). There is one difference from the previous example: Each opened element is guaranteed to be closed, properly matched to its opening, even in case of Exceptions.

Under the hood, there now is an additional internal linked list of the AutoCloseable implementations, implicitly managed by the try-with-resources statements. However, since XMLStreamWriter probably also maintains some kind of element stack internally, this probably does not even cause additional memory consumption.

Advantages

This kind of API has a set of advantages:

  1. It reduces code complexity.
  2. It reduces the probability of coding errors.
  3. It can better represent the structure of method calls, since automated code formatting can add indentation representing the nesting level of API calls. This is not (or at least hardly) possible with separate open/close method calls, as in the case of XMLStreamWriter.
  4. It does not necessarily need more memory.
  5. It automatically provides the robustness of try-finally, without the need to add any additional code.
  6. It looks as if you are working on complex objects, but it may actually be maintained by very lightweight data structures. My XML example API somehow has a DOM-like look and feel, but it actually works like an XMLStreamWriter.
  7. It may be implemented to support parallel processing without much overhead, as I will discuss later in this post.

Fields of Application

I imagine two potential fields of application of this idea: First, for producing clear and simple structured output APIs, like the XML example above.

Hierarchical, Abstract Document Output APIs

In the TSP Suite, for instance, I have a more elaborate abstract API to produce documents, which is implemented for XHTML and LaTeX. If we can do this with XML, then we can also make an API for general structured text documents, where different opened elements can have different children. A section may have a section body, a section body may contain figures, equations, or text, a text may contain emphasized sub-text, etc. All of this can easily be facilitated by hierarchically nesting different implementations of AutoCloseable, while it is virtually impossible to do this properly and in an easy-to-use style with a single output object (in the style of XMLStreamWriter). I will make a new release of this API for the next version of the TSP Suite, but the currently available source code of that project already has a working version.

Hierarchical Builder Pattern

The second field of application is to create more complex builder patterns. The builder pattern allows you to avoid having very complex constructors. Say you have an immutable object with 20 fields. Now you could make a 20-parameter constructor for that object, or a builder class where you can set the fields with separate method calls and, once you are finished setting the fields, obtain the constructed object.

Sometimes, your goal is not just to build a flat data structure with many parameters, but maybe a hierarchical one. As a simple example, take bibliography records as used in BibTeX. One such record, say, for a conference paper, may contain a list of authors (each having a personal and a family name), a title, and features of the proceedings containing the paper, such as editors, a book title, a location, and a publisher. Creating an object with all of that information may be complex. By using the try-with-resources idea, we can construct hierarchical builders and get code such as:
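The following is a compact, self-contained sketch of such a hierarchical builder (all names are made up and much simpler than the real bibliography API): each nested builder is an AutoCloseable which, upon closing, freezes its data into an immutable object and hands it to its parent builder.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class Author {
  final String personalName;
  final String familyName;

  Author(String personalName, String familyName) {
    this.personalName = personalName;
    this.familyName = familyName;
  }
}

final class Paper {
  final String title;
  final List<Author> authors;

  Paper(String title, List<Author> authors) {
    this.title = title;
    this.authors = Collections.unmodifiableList(new ArrayList<>(authors));
  }
}

final class PaperBuilder implements AutoCloseable {
  private String title;
  private final List<Author> authors = new ArrayList<>();
  private Paper result;

  void setTitle(String title) { this.title = title; }

  AuthorBuilder author() { return new AuthorBuilder(this); }

  void addAuthor(Author author) { this.authors.add(author); }

  Paper getResult() { return this.result; }

  @Override
  public void close() { // freeze everything into the immutable result
    this.result = new Paper(this.title, this.authors);
  }
}

final class AuthorBuilder implements AutoCloseable {
  private final PaperBuilder owner;
  private String personalName;
  private String familyName;

  AuthorBuilder(PaperBuilder owner) { this.owner = owner; }

  void setPersonalName(String name) { this.personalName = name; }

  void setFamilyName(String name) { this.familyName = name; }

  @Override
  public void close() { // hand the finished, immutable author to the parent
    this.owner.addAuthor(new Author(this.personalName, this.familyName));
  }
}

public class BuilderExample {
  public static void main(String[] args) {
    PaperBuilder builder = new PaperBuilder();
    try (PaperBuilder paper = builder) {
      paper.setTitle("An Example Paper");
      try (AuthorBuilder author = paper.author()) {
        author.setPersonalName("Ada");
        author.setFamilyName("Lovelace");
      } // the Author object is created and added here
    }   // the immutable Paper is assembled here
    Paper result = builder.getResult();
    System.out.println(result.title + " by " + result.authors.size() + " author(s)");
  }
}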

OK, without context, it may be hard to understand or appreciate this example, but the key point is: I want to construct a complex (immutable) data structure with nested immutable objects, and I can use try-with-resources and AutoCloseable to implement a hierarchically nested builder architecture for doing that.

Difference to Closures and Lambda Expressions

There is a certain similarity of this idea to closures. Java has always supported some kind of “explicit” closures: you can define anonymous local classes inside a class or method, which may access fields or variables of the enclosing class. This concept is a very popular way (actually, probably the default method) to implement event listeners for GUI components. A closure is a function which can access data of the enclosing entity, exactly like such a local class. In Java 8, lambda expressions were introduced, an (I think) simplified syntax for a special case of local anonymous classes.

Anonymous local classes and lambda expressions may look a bit similar to what I suggest here, but they are a different thing: As the name implies, those code pieces go into classes, whereas in my case, the code is just normal code. Also, the code of local classes or lambda expressions is usually passive: it is invoked from an outside activity, say an enclosing method providing it with input, or an event source. This is different from my situation, where normal code actively uses an API.

Still, the same effects that I list here could probably be realized with closures or lambda expressions. Using these devices may even be more suitable or intuitive for achieving the parallel output that I will discuss in the next section. Yet, it would come at the cost of more (probably many more) anonymous classes being created, whereas my method probably only introduces moderate overhead.

Parallel Single-Document Output

One final and, in my opinion, very interesting argument for using the nested AutoCloseable approach to construct structured APIs is that it inherently supports parallelism. In case of the bibliography/builder example above, this seems quite natural: we can easily imagine that the “page-setting” code could run in a different thread. We can build a complex object in parallel.

But we can even create parallel output to a single document, say an XML document. With that I mean: You can have ten threads which all write to the same, single XML stream in parallel, and the final result will still be valid XML, with all output properly ordered. This is a thing the XMLStreamWriter API cannot provide: invoking its writeStartElement() methods in parallel from different threads is a recipe for trouble. Not so in case of an API managed with AutoCloseables.

Basic Idea

The idea is as follows (a condensed code sketch follows after the list):

  1. Each element of the API (say the XMLElement objects in my example) maintains an internal reference to its output destination, let’s say of the type Writer, to which it writes its output.
  2. Each element furthermore manages a queue of open child elements.
  3. It is allowed to have multiple open child elements. In case of the XML example, you could write to multiple elements in parallel. Every time a new element is opened, it will be added to the end of its parent’s child queue.
  4. When a new child element is opened, two cases are distinguished:
    1. The parent currently has no other open child element. In this case, the internal output destination reference (say, the Writer) of the child element will be set to reference the same Writer as the parent element.

      This is the default situation in the non-parallel scenario (and the previous examples), where each element has at most one open child element. After a child element is closed, the next one can be opened and written to. Exactly as is the case with XMLStreamWriter. All output goes (sequentially) to the same Writer, and no additional overhead (except for the single additional field per element) occurs.

    2. However, if another child element is already open, the internal output reference (Writer) of the new child element will be a new, independent buffer in memory, say a CharArrayWriter.

      It is easy to see that all child elements can be written to at the same time and in parallel, since all the data goes to different locations. No synchronization is required for writing, only during the opening and closing phases.

  5. Whenever a child element is closed, the parent’s queue is checked from the start.
    1. If the head of the queue is a closed child element, there are two cases:
      1. If the internal output reference of (the Writer used by) the closed child element at the head of the queue is the same as the one of the parent, then nothing needs to be done.
      2. If the internal Writer of the closed child element at the head of the queue is a buffer, then the contents of this buffer are appended to the parent’s Writer.

      The head of the queue is removed and we continue with the next child in the queue. This sequential processing ensures that the outputs are arranged in the same order in which the elements were opened.

    2. If the head of the queue is an open child element, the processing of the queue is stopped here. (Once that element is closed, the queue will be processed again.)
  6. Of course, you can only close an element after all of its child elements are closed, but this bit of synchronization is not too complex.
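A condensed sketch of the steps above (illustrative names and deliberately simplified synchronization; this is not the actual HierarchicalText code):

import java.io.CharArrayWriter;
import java.io.IOException;
import java.io.Writer;
import java.util.ArrayDeque;

class Element implements AutoCloseable {

  private final Element parent;                // null for the root element
  private final Writer out;                    // this element's output destination
  private final ArrayDeque<Element> children = new ArrayDeque<>();
  private volatile boolean closed;

  Element(Writer out) {
    this(null, out);
  }

  private Element(Element parent, Writer out) {
    this.parent = parent;
    this.out = out;
  }

  /** Step 4: a new child shares our Writer if no other child is open;
   * otherwise it writes into its own in-memory buffer. */
  synchronized Element open() {
    Writer destination = this.children.isEmpty() ? this.out : new CharArrayWriter();
    Element child = new Element(this, destination);
    this.children.addLast(child); // step 3: append to the child queue
    return child;
  }

  void write(String text) throws IOException {
    this.out.write(text); // no lock needed: open siblings use different destinations
  }

  @Override
  public void close() throws IOException {
    this.closed = true;
    if (this.parent != null) {
      this.parent.flushQueue(); // step 5: let the parent process its queue
    }
  }

  /** Pop closed children from the head of the queue, appending buffered
   * output to our Writer in the order in which the children were opened. */
  private synchronized void flushQueue() throws IOException {
    while (!this.children.isEmpty() && this.children.peekFirst().closed) {
      Element done = this.children.removeFirst();
      if (done.out != this.out) { // the child wrote into a buffer: append it now
        this.out.write(((CharArrayWriter) done.out).toCharArray());
      } // otherwise it shared our Writer and its output is already in place
    }
  }
}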

I implemented a basic form of hierarchical API using this queueing of child elements in a class called org.optimizationBenchmarking.utils.hierarchy.HierarchicalFSM in the example sources. The capability of parallel text output by using the internal references to output destinations or memory buffers is implemented, as an example, in the class org.optimizationBenchmarking.utils.hierarchy.HierarchicalText. The classes used in my XML example are based on HierarchicalText. The bibliography example is a hierarchical builder pattern and does not write output to a stream, so it is based on HierarchicalFSM directly.

Use Case: ForkJoinPool

Following this concept, we can create APIs for writing any kind of stuff to a single stream, in parallel. These APIs should have almost no additional costs if you just want to use them sequentially. Moreover, this concept fits quite well with another novelty in Java 7: ForkJoinPool. A ForkJoinPool, as discussed in this tutorial, allows you to submit jobs which, in turn, can submit sub-jobs, and so on.

Each such task could write its output to a different element of the hierarchical API and close that element once its own output is written and its sub-tasks are finished. Apart from this, the programmer herself does not need to deal with parallelism too much. The output to the document will automatically be parallelized, with the level of parallelization depending on the number of available processors. Due to the queue-based sequencing of the output, the resulting documents will look exactly the same as if they were written in a single thread.
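As a sketch of how this could look (again with made-up names, building on the Element class sketched above):

import java.io.IOException;
import java.io.PrintWriter;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class SectionTask extends RecursiveAction {

  private final Element parent; // the Element class sketched above
  private final int depth;

  SectionTask(Element parent, int depth) {
    this.parent = parent;
    this.depth = depth;
  }

  @Override
  protected void compute() {
    try (Element section = this.parent.open()) { // our own output scope
      section.write("<section level=\"" + this.depth + "\">");
      if (this.depth > 0) { // fork two sub-sections and wait for them
        invokeAll(new SectionTask(section, this.depth - 1),
            new SectionTask(section, this.depth - 1));
      }
      section.write("</section>");
    } catch (IOException error) {
      throw new RuntimeException(error);
    }
  }
}

public class ParallelOutputExample {
  public static void main(String[] args) {
    PrintWriter out = new PrintWriter(System.out);
    Element root = new Element(out); // the root shares the real output destination
    new ForkJoinPool().invoke(new SectionTask(root, 3));
    out.flush(); // all children of root have been flushed at this point
  }
}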

This form of parallel output may be quite nice in cases where contents for different parts of the output document require you to do independent, complex calculations. This is the case in our TSP Suite: The output documents contain several statistics and figures which are computed independently from each other. If we can compute and output all of them in parallel, the throughput may increase significantly.

Experiment: Does parallel output provide speedup?

Of course, parallelization causes some overhead. Buffers need to be allocated and re-sized as contents are written to them in parallel. Then we have to pay for using ForkJoinPool and implementing the output “generators” as instances of ForkJoinTask (such as RecursiveActions) instead of simple procedures. (This is similar to using closures directly, but closures would require us to put each nested output scope into a class, whereas here we can nest several try-with-resources statements into one task.) Anyway, we also need to use synchronization whenever a new task is started or one terminates. Finally, the documentation of ForkJoinTask states that “Tasks should also not perform blocking IO, and should ideally access variables that are completely independent of those accessed by other running tasks.” … but we are doing IO.

Thus, I conducted a small experiment to investigate whether the throughput of writing to a sequential document can actually be increased by parallelizing the output with the API implemented, as an example, for XML based on my little HierarchicalText class.

Experimental Setup

For this purpose, I wrote some code creating random example document structures of fixed sizes s∈{10,100,…,10000} in memory and serializing them with one of the following setups:

  1. Java’s XMLStreamWriter in the default, serial fashion (I will refer to this setup as X),
  2. my hacked-together XML API in a serial fashion (referred to as S),
  3. my XML API in parallel, using ForkJoinPools with the single-parameter constructor for different degrees of parallelism, ranging from 1 to Runtime.getRuntime().availableProcessors() (in my case, 8). I will refer to these setups as P#, where # stands for the number of processors.
  4. my XML API in a parallel fashion, using ForkJoinPools constructed with asyncMode=true. These setups are referred to as A1 to A8.

I measured the performance for two kinds of output destinations: A Writer simply discarding all of its output, i.e., not performing any blocking operations, and a FileWriter writing to a temporary file. Furthermore, I wanted to investigate how the computational requirements to create the output data influence the performance. For this purpose, I simulate a “computation” for any attribute or element text by performing loops of bogus mathematical operations. I did this for loops with d∈{0,10,100,…,10000} iterations.

I tried to follow the advice from my post on measuring runtimes by conducting 500 dry runs before the actual measurement, to let the Java JIT compiler do its potential work, and by conducting several runs per configuration and several runs per document for each output destination. In particular, for each document, I performed 3 runs and took the shortest measured runtime. Each setting of s and d is repeated like this for 15 different (random) documents, and the median runtime is used.

The implementation of this performance test is given in the package examples.org.optimizationBenchmarking.utils.io.xml.performance in the attached sources. I measured the results on my Ubuntu computer with the OpenJDK 64-Bit Server VM for Java 1.7. The results and figures are contained in the above package in the download, as file results.txt. Since actual runtimes are not important, I normalized them in each of the figures. In particular, for each figure and setting of the “computational effort to obtain output data” d, I picked the fastest setup and divided all measured runtimes of all setups by its runtime, for this particular value of d.

Experimental Results

Figure 1 illustrates the relative performance of the different settings for the discarding Writer and documents with 10 elements, for different values of d. In the case where no computation or only few calculations are necessary to obtain the data to be written, i.e., where the output is directly available, both XMLStreamWriter (X) and my API (S) used in a serial fashion are significantly faster than the parallel versions. However, if more computations become necessary to create the output (as would be the case in my TSP Suite), the parallel versions can save up to 25% compared to the serial output. Of course, the setups P1 and A1 cannot do that, as they pay the overhead for using ForkJoinPool without actually being parallel.

Figure 1: The output time requirements of XML APIs for 10-element XML documents to a Writer discarding its output, over different attribute- and element text access times. For small access times, serial output is faster; parallel output becomes faster as the computational requirements to obtain the strings to write increase.

Figure 2, illustrating the same output routine for 1000-element documents, confirms this finding, with two differences: First, for these larger documents, parallelism seems to already pay off for small computational requirements to “generate” output data and time savings of 50% seem to be possible. Second, it seems to stop paying off for large computational requirements and all setups except S have the same performance. I have no real idea why this would happen. It may be specific to my example implementation, be caused by some JIT behavior, or by some under-the-hood behavior of ForkJoinPool, maybe related to thread allocation and disposal, or other magic.

Figure 2: The output time requirements of XML APIs for 1000-element XML documents to a Writer discarding its output, over different attribute- and element text access times. For small access times, serial output is faster; parallel output is faster when the computational requirements increase but, oddly, for very high requirements this advantage disappears.

When writing 10-element documents to a temporary file (illustrated in Figure 3), we can observe a behavior similar to when writing to a discarding Writer. The speedups provided by the parallel version seem to be more consistent, which probably results from the more extensive use of buffers in memory.

Figure 3: The output time requirements of XML APIs for 10-element XML documents to a FileWriter, over different attribute- and element text access times. For small access times, serial output is faster; for larger access times, parallel output is faster.

For 1000-element documents (see Figure 4), we can obtain nice speedups and may sometimes be able to write a document in half of the time. (Again, with the same odd behavior of performance collapse for large computations.)

Figure 4: The output time requirements of XML APIs for 1000-element XML documents to a FileWriter, over different attribute- and element text access times. Parallel output is faster, except for very large access times, where all setups behave roughly the same.

All in all, we find that parallel output to the same underlying stream may, at least in some cases, provide a significant speedup. Of course, we have to pay for synchronization and other overhead when using ForkJoinPool, plus we violate its contract by actually doing blocking I/O. As a result, the speedup is not equal to the number of used processors.

Then again, my API was hacked together, and the way I create random documents may be biased in some way. It should furthermore be noted that I did not do many repetitions for a statistically sound evaluation. On a different machine, you may get entirely different results. Also, my API does not support anything fancy, which is likely the reason why it is a bit faster than XMLStreamWriter when used in a serial way. So these results are just something in between a sanity test and a rough proof-of-concept.

Conclusions

In this post, I have outlined a concept for hierarchical APIs based on Java 7's try-with-resources statement and the AutoCloseable interface. I have argued that, at least to me, it seems to be robust, both in terms of reducing potential sources of coding errors and in the presence of exceptions being thrown. Furthermore, without much effort, it can be used to parallelize output to sequential streams, which seems to provide a performance improvement – at least in some preliminary experiments. If implemented properly, I could even imagine that this could speed up the way a web or application server creates single-stream contents.

Of course, I am not a software engineer. I know nothing about frameworks out there, I know nothing about fancy concepts and patterns. So this idea may actually be an old hat. Or discouraged by experts for some reason. Anyway, I think it is cute.

Example Source Code

The current version of the source code belonging to this article is contained in this ZIP archive. I put it under the LGPL version 3 license. I do not assume any responsibility for whether it works or not. This holds, for instance, for my XML output API – it is just an example and not intended for any productive application at all.

The source archive contains two root packages:

  1. org.optimizationBenchmarking.utils is the utility package for the next version of TSP Suite. It contains examples for potential base classes to build a hierarchical API using AutoCloseable and try-with-resources in sub-package hierarchy, my example XML API in sub-package io.xml, and the example bibliography API in sub-package bibliography.
  2. Package examples holds examples for everything we have discussed above. If you are looking for examples of a class you found in package org.optimizationBenchmarking.utils.x.y, then you will find it in package examples.org.optimizationBenchmarking.utils.x.y, if there are any.

The source archive also contains a heap of other utility classes, which, in one way or another, are used by my code. You may ignore them.

Download: hierarchicalAPI.zip


Deadline Extension for SSCI’14 (and our special session)

Good news for everyone who was a bit late to submit their paper to SSCI 2014: The deadline has been extended to July 15, 2014. The same, of course, automatically holds for our Special Session on Benchmarking and Testing for Production and Logistic Optimization discussed on this blog. Its CfP has already been updated. We are looking forward to receiving your submissions ^_^
