Archive for category java

CHARTER Surveillance Use Case – Industrial Evaluation

This month, Luminis has started development of a surveillance use case. The purpose of the case is industrial assessment and validation of tools and technologies developed in the “Critical and High Assurance Requirements Transformed through Engineering Rigor” project (CHARTER). The ultimate goal of CHARTER is to ease, accelerate, and reduce the cost of certifying embedded systems. The CHARTER tool suite employs real-time Java, Model Driven Development (MDD), rule-based compilation and formal verification. The coming series of articles will describe evaluation experiences in the surveillance use case.

The CHARTER project includes user partners from four key industries: aerospace, automotive, surveillance and medical, each of which develops embedded systems that require high assurance or formal certification in order to meet business or governmental requirements. The four user partners will each validate the CHARTER tools and methodology using industrial applications and actual development scenarios, which will provide feedback for the project and ensure the tools and technologies perform as expected, and deliver the expected improvements in embedded systems development. As part of the evaluation process, metrics will be used to quantify industrial experiences in terms of development effort, cost savings, verification time, etc., to document for others the benefits achieved.

The CHARTER project was established to improve the software development process for developing critical embedded systems. Critical embedded software systems assist, accelerate, and control various aspects of society and are common in cars, aircraft, medical instruments and major industrial and utility plants. These systems are critical to human life and need to be held to the highest standards of performance through formal certification procedures. Improving the quality and robustness of these systems is paramount to their widespread adoption.


GraniteDS and AIR for mobile

In this article I will briefly show how to resolve some obstacles I came across when I developed my first application with AIR for mobile and GraniteDS. The most noteworthy reason for using AIR to create mobile applications is of course the multi-platform deployment from a single codebase. Furthermore, with Granite you are able to expose the services of an existing Java backend to a mobile platform without significant changes to the backend. This offers great potential for enterprises that are struggling with the fragmented mobile market and don’t want to completely rewrite their existing Java backend.

I will assume you have some familiarity with AIR for mobile and Granite. It’s mostly the same as for Flex but there are some things you have to take into account.

1. Get the right version of Granite

The latest version of Granite is 2.2.1 GA. However, in the most recent version of AIR and Flex, Adobe made some changes in the API that break backwards compatibility for some features of Granite. Therefore this release of Granite won’t work with the newest SDK. Refer to this post on the Granite forum for more info on how to make these changes yourself. If you don’t want to be bothered with building Granite yourself, just download this version of GraniteDS to get started immediately.

2. Connecting to the right server

A problem for AIR applications in general (both desktop and mobile) is setting the right server settings. It is quite simple when running within the browser: The server address is changed with a single reconfiguration of the swf-file and all clients are using the new address on a browser refresh. With AIR it is a bit different and you’re not always at liberty to ‘hardcode’ the service settings in the AIR distribution package.
Granite offers a method for dynamic server configuration using the server initializer component:

Tide.getInstance().addComponentWithFactory(
  "serviceInitializer",
  DefaultServiceInitializer, {
    contextRoot: "/my-app",
    serverName: "10.0.0.1",
    serverPort: "8080"
  });

Note that once a connection is made, it is not possible to reconnect with another configuration, because the service initializer is only used once. You have to restart the application to enable the new connection settings, or reset Tide’s RemoteObject. Unfortunately Tide’s API doesn’t support this reset. I came up with a small workaround which requires you to extend the Ejb class with an extra reset method:

public class Ejb extends org.granite.tide.ejb.Ejb {
  /**
   * Reset the Tide Connection to allow new server settings
   */
  public function resetConnection():void {
    if (_ro) {
      _ro.disconnect();
    }
    _ro = null;
  }
 
}

This method resets Tide’s RemoteObject so the next remote call will force a reinitialization using the current settings of serviceInitializer. Refer to Granite’s issue tracker or forum thread for more details.

3. Automatic logout

Applications running on mobile platforms are always susceptible to unpredictable interruptions, for example when a phone call or text message is received. Mobile AIR applications provide a deactivate event which is dispatched when the application is halted in some way. The application I wrote used Tide’s Identity class for user login. Therefore I added an event handler that automatically logs out the user and pushes the LoginView on top of the navigator stack:

private function deactivateHandler(event:Event):void {
  if (identity.loggedIn) {
    identity.logout();
  }
  navigator.popAll(); // Purge the navigator history to disable back button usage
  navigator.pushView(LoginView);
}

4. Build, build, build!

This isn’t directly related to Granite or AIR for mobile, but since both can be used for enterprise-scale applications I thought I’d mention it briefly: make sure you have a proper build script. I’ve got an example from Chris Black which provides a good starting point. I’ve only added the metadata compiler options required for Tide and, of course, a reference to the Granite libraries and the generated ActionScript classes.

 
 <mxmlc ... >
 
  <!-- .... -->
 
  <!-- location of generated as classes with gas3 -->
  <source-path path-element="${gen.src.dir}" />
 
  <compiler.library-path dir="${basedir}/libs" append="true">
    <include name="granite-essentials.swc" />   
    <include name="granite.swc" />
  </compiler.library-path>
 
  <keep-as3-metadata name="Bindable" />
  <keep-as3-metadata name="ChangeEvent" />
  <keep-as3-metadata name="Destroy" />
  <keep-as3-metadata name="Id" />
  <keep-as3-metadata name="In" />
  <keep-as3-metadata name="Inject" />
  <keep-as3-metadata name="Managed" />
  <keep-as3-metadata name="ManagedEvent" />
  <keep-as3-metadata name="Name" />
  <keep-as3-metadata name="NonCommittingChangeEvent" />
  <keep-as3-metadata name="Observer" />
  <keep-as3-metadata name="Out" />
  <keep-as3-metadata name="PostConstruct" />
  <keep-as3-metadata name="Transient" />
  <keep-as3-metadata name="Version" />
 
</mxmlc>

One plus one

I can imagine you are thinking: ‘Everyone could have figured that out!’ And I totally agree, because that’s exactly what this article is about. With some experience with Flex, a developer can write a mobile application on top of a Java EE backend. It doesn’t take much to utilize an existing backend from a mobile platform. Since the latest release of AIR, the performance on iOS and Android is pretty good, and together with the Granite Enterprise Platform the barrier to bringing an enterprise application to a mobile platform has become much lower.


New version of TopThreads JConsole plugin

Some time ago, I created the “TopThreads” plugin for JConsole, which helps you determine why your Java application is causing such a high CPU load by showing the busiest threads in your application and giving you the opportunity to inspect thread stack traces at the same time. It turned out to be quite useful, and from the responses I got, I can tell people still find it useful today. A few days ago, I released a new version of this plugin, with one very useful new feature: CPU usage.

Top thread?

If you’ve used the TopThreads plugin, you’ve probably seen this before: suddenly, a thread that is not supposed to be very busy pops up at the top of the table with usage figures in the 90s. You wonder WTF is going on and why this thread is taking so much CPU power, until you realize that this figure is only relative to the rest of the application threads. And if the application is hardly doing anything, threads that do a little more than nothing might get alarmingly high figures (and a red color). After I ran into this pitfall a few times, I decided I needed to know absolute usage figures too.
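To make the pitfall concrete, here is a small worked example. The numbers and thread roles are made up for illustration, and this is not the plugin's actual code: in a one-second interval on a single core, one thread uses 9 ms of CPU time and another uses 1 ms.

```java
public class RelativeVersusAbsolute {

    /** Share of one thread in the CPU time consumed by all application threads together. */
    public static double relativePercentage(long threadCpuMillis, long allThreadsCpuMillis) {
        return 100.0 * threadCpuMillis / allThreadsCpuMillis;
    }

    /** Share of one thread in the CPU time that was available during the interval. */
    public static double absolutePercentage(long threadCpuMillis, long intervalMillis, int processors) {
        return 100.0 * threadCpuMillis / ((double) intervalMillis * processors);
    }

    public static void main(String[] args) {
        long busyThread = 9;   // hypothetical: CPU milliseconds used by the "busy" thread
        long otherThread = 1;  // hypothetical: CPU milliseconds used by the only other thread
        long interval = 1000;  // length of the measurement interval in milliseconds

        System.out.println("relative: " + relativePercentage(busyThread, busyThread + otherThread) + "%");
        System.out.println("absolute: " + absolutePercentage(busyThread, interval, 1) + "%");
    }
}
```

The relative figure comes out at roughly 90%, while the same thread uses less than 1% of the available CPU time, which is exactly why a relative figure alone can be alarming for no reason.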

process line in topthreads plugin

If you enable this feature (settings -> show process cpu usage too), the top row of the table shows the CPU usage of the process as a whole. This is simply the sum of the CPU usage of all threads. The percentage shown in this row, however, is the percentage of the CPU that this process is using, which should be approximately the same value that a process viewer like top, Activity Monitor or the Task Manager would report. Although this is not always the case (more about that in a minute), it’s at least a good indication of whether the process is busy or idle. And even though it may not always be as accurate as I would like it to be, it has proved sufficient to help me avoid confusion.
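As a rough illustration of "the sum of the CPU usage of all threads", the sketch below samples the standard `ThreadMXBean` twice and turns the summed difference into a percentage. It is only a sketch under assumptions: it measures the JVM it runs in rather than a remote one, the class and method names are invented for this example, and dividing by the number of processors is one possible normalization, not necessarily the one the plugin uses.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

public class ProcessCpuSketch {

    /** Snapshot of the CPU time (in nanoseconds) consumed so far by each live thread. */
    public static Map<Long, Long> sampleThreadCpuTimes(ThreadMXBean threads) {
        Map<Long, Long> sample = new HashMap<>();
        for (long id : threads.getAllThreadIds()) {
            long cpuNanos = threads.getThreadCpuTime(id); // -1 if the thread has died or measurement is disabled
            if (cpuNanos >= 0) {
                sample.put(id, cpuNanos);
            }
        }
        return sample;
    }

    /** Sum of the per-thread CPU time differences between two samples. */
    public static long totalCpuDelta(Map<Long, Long> before, Map<Long, Long> after) {
        long total = 0;
        for (Map.Entry<Long, Long> entry : after.entrySet()) {
            // A thread that is missing from the first sample was started in between.
            long previous = before.getOrDefault(entry.getKey(), 0L);
            total += Math.max(0, entry.getValue() - previous);
        }
        return total;
    }

    /** CPU time used, as a percentage of the wall-clock time available on all processors. */
    public static double cpuPercentage(long cpuDeltaNanos, long elapsedNanos, int processors) {
        return 100.0 * cpuDeltaNanos / ((double) elapsedNanos * processors);
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            System.out.println("Thread CPU time measurement is not available on this JVM");
            return;
        }
        Map<Long, Long> before = sampleThreadCpuTimes(threads);
        long start = System.nanoTime();
        Thread.sleep(1000);
        Map<Long, Long> after = sampleThreadCpuTimes(threads);
        long elapsed = System.nanoTime() - start;

        int processors = Runtime.getRuntime().availableProcessors();
        System.out.printf("process CPU usage: %.1f%%%n",
                cpuPercentage(totalCpuDelta(before, after), elapsed, processors));
    }
}
```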

The usual suspect: the garbage collector

In normal situations (whatever that may be… ;-) ), the CPU usage figure is approximately the same as the figures other tools report. However, especially when the process is very busy, the CPU usage shown is far too low. After some testing, I’m rather confident that this is mainly caused by the garbage collector. As it turns out, TopThreads does not get information for all the JVM threads, which can easily be verified by comparing a thread dump with the thread listing in JConsole. For example, threads that never appear in JConsole (neither in the TopThreads tab nor in the JConsole thread view) are the “Low Memory Detector”, the compiler threads (HotSpot), “Signal Dispatcher”, “Surrogate Locker Thread (CMS)” and the garbage collector threads (the mark-and-sweep thread and the parallel GC threads). I can imagine that some of these threads put a lot of load on the CPU when the application is very busy. And one thing is for sure: the CPU cycles taken by these threads are not counted in the totals that the TopThreads plugin computes, simply because it doesn’t know about them.

Despite these shortcomings, I find the new feature quite useful myself. Let me know what you think.

Other improvements in this release:

  • the initial poll time is no longer fixed at 10 seconds, but depends on the (initial) number of threads. For small apps, the updates will be much more frequent.
  • there are more preferences to set, and these have been moved to a separate settings dialog. Settings are stored using the Java Preferences API.
  • improved stack trace panel behaviour, including automatic scroll to the top.
  • better handling of security exceptions that might occur when connecting to a remote VM.

Please let me know what you think, feedback is always welcome!


Getting to know CouchDB 1/x

Some time ago, I started looking at CouchDB. Getting to know a technology is, IMO, always best done by thinking up some kind of project. So first, I started thinking about what to do with it. Now that version 1.0.0 has been released, I thought it would be good to (finally) blog about my findings so far.

The project I came up with is the following: A simple (OSGi) log listener that stores its log messages into a CouchDB database.

At Luminis we use OSGi a lot. It is a highly modular framework (written in Java) that allows you to combine several modules (called ‘bundles’) that each expose or make use of functionality provided by other bundles. Careful combination of several bundles will result (almost like ‘emerging behavior’) in a complete application that suits your needs. The functionality provided by a particular bundle is best exposed through interfaces, enabling you to switch implementations without breaking the contract with other bundles that use the exposed functionality.

One of the compendium services that are available is the LogService. It allows any service to log the things it considers of any importance. It is then up to the LogService implementation where these log statements end up. Logging is usually sent to standard out or a file, which is fine in most cases. Sometimes however, logging onto the device itself and retrieving the log file for inspection is not an easy task. Take for example an Android device. Thanks to Aaron Miller’s effort, we are now able to run CouchDB on Android. Android apps (EZDroid apps) will now be able to store their logging data in this local instance. This data can then be easily replicated to another instance for inspection.

Also, for limited devices (where there is, for example, limited storage available) logging to a file is simply not an option.

Therefore I thought of the following two scenarios:

  • CouchDB is installed locally on the device/server. Access to CouchDB is then guaranteed and no logging is missed. A drawback might be that the OSGi application would require a local installation of CouchDB (for those who consider that a drawback!).
  • CouchDB is installed on a remote instance. Access to the CouchDB instance might be interrupted due to network instability. Then, some log messages might get lost. Since most applications require a working internet connection, I think we could live with this.

The target of the project would be to store a JSON representation of the actually logged messages into CouchDB. This can easily be accomplished by using a LogListener.

In both cases, the OSGi LogListener that ‘lives’ inside the OSGi application, receives all LogEntries, that contain the messages that are being sent to the LogService (and some meta-data). All it then has to do is convert it to JSON and create a new document in some database at the CouchDB server. If one ever wanted to inspect the log messages, a single call to the CouchDB server would initiate a replication of the database to your local CouchDB instance. Then, you could peruse the log messages offline at your leisure.

Installation and basic usage of CouchDB are not covered here; there are excellent descriptions already available on the Wiki, Halorgium’s GitHub and the O’Reilly Free Book.

To test the setup described above, I created such a LogListener implementation. I use json-simple to convert the LogEntry into JSON and end up with a JSON Object such as the following:

{
  "message": "This is a test message",
  "time": 1269783246909,
  "level": "LOG_DEBUG",
  "serviceReference": {},
  "bundle": {
    "id": 5,
    "lastModified": 1269783246510,
    "location": "file:bundle/net.luminis.log.couchdb-1.0.0.jar",
    "symbolicName": "net.luminis.log.couchdb"
  }
}

A unique serverId/ instanceId could also be added to be able to distinguish between server instances if you decide to send the logging to a central server.
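The conversion step can be sketched without any dependencies. Note the assumptions: the actual implementation uses json-simple and an OSGi LogEntry, whereas this sketch builds the string by hand from plain values, covers only a subset of the fields shown above, and uses class and method names that are invented for the example.

```java
public class LogEntryJsonSketch {

    /** Escapes the characters that may not appear unescaped inside a JSON string. */
    public static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must be written as a numeric escape.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    /** Builds a JSON document from a few log entry fields (hypothetical subset). */
    public static String toJson(String message, long time, String level,
                                long bundleId, String symbolicName) {
        return "{"
                + "\"message\":\"" + escape(message) + "\","
                + "\"time\":" + time + ","
                + "\"level\":\"" + escape(level) + "\","
                + "\"bundle\":{"
                + "\"id\":" + bundleId + ","
                + "\"symbolicName\":\"" + escape(symbolicName) + "\""
                + "}}";
    }

    public static void main(String[] args) {
        System.out.println(toJson("This is a test message", 1269783246909L,
                "LOG_DEBUG", 5, "net.luminis.log.couchdb"));
    }
}
```

Escaping matters here because log messages routinely contain quotes and line breaks (stack traces, for instance), and an unescaped one would make CouchDB reject the document.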

I POST this to the configured server. Because I do not submit an ‘_id’ in the JSON string, CouchDB generates one for me. To keep things lightweight, the HTTP (1.0) POST itself is done by opening a raw socket to the CouchDB server:

POST /db HTTP/1.0
Content-Length: xxx
Content-Type: application/json

{ ... the data here ... }

The response is also a valid JSON response which can be inspected for success (along with the HTTP response code, of course).
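A minimal sketch of such a raw-socket POST is shown below. It is not the original implementation: the class and method names are invented, the database path `/logs` is a placeholder, and it assumes a CouchDB instance listening on its default port 5984 that accepts unauthenticated writes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class RawHttpPostSketch {

    /** Builds a complete HTTP/1.0 POST request for the given path and JSON body. */
    public static byte[] buildRequest(String path, String json) {
        byte[] body = json.getBytes(StandardCharsets.UTF_8);
        String head = "POST " + path + " HTTP/1.0\r\n"
                + "Content-Length: " + body.length + "\r\n" // length in bytes, not characters
                + "Content-Type: application/json\r\n"
                + "\r\n";
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);
        byte[] request = new byte[headBytes.length + body.length];
        System.arraycopy(headBytes, 0, request, 0, headBytes.length);
        System.arraycopy(body, 0, request, headBytes.length, body.length);
        return request;
    }

    /** Extracts the status code from a response that starts with e.g. "HTTP/1.0 201 Created". */
    public static int statusCode(String response) {
        String statusLine = response.split("\r\n", 2)[0];
        return Integer.parseInt(statusLine.split(" ")[1]);
    }

    /** Sends the request and returns the raw response (headers and body). */
    public static String post(String host, int port, String path, String json) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(path, json));
            out.flush();

            // With HTTP/1.0 the server closes the connection after responding, so read until EOF.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                response.write(buffer, 0, read);
            }
            return new String(response.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder host, port and database; adjust to your own CouchDB instance.
        String response = post("localhost", 5984, "/logs", "{\"message\":\"This is a test message\"}");
        System.out.println("status: " + statusCode(response));
    }
}
```

The Content-Length is computed from the encoded bytes rather than from the string length, since the two differ as soon as a log message contains non-ASCII characters.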

In my current implementation, I can choose between two modes: either each message is sent to the CouchDB instance one at a time, or I send all messages in bulk every x seconds. The latter is probably the best choice for remote CouchDB instances.
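One way to implement the bulk mode is to buffer the JSON documents and periodically send them to CouchDB's `_bulk_docs` endpoint, which accepts a body of the form `{"docs": [...]}`. Whether the original implementation uses this endpoint is an assumption on my part, and the buffer class below is a hypothetical sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkBufferSketch {

    private final List<String> pending = new ArrayList<>();

    /** Queues one JSON document (already serialized) for the next bulk request. */
    public synchronized void add(String jsonDocument) {
        pending.add(jsonDocument);
    }

    /**
     * Returns the request body for POST /db/_bulk_docs and empties the buffer,
     * or null when there is nothing to send.
     */
    public synchronized String drainToBulkBody() {
        if (pending.isEmpty()) {
            return null;
        }
        String body = "{\"docs\":[" + String.join(",", pending) + "]}";
        pending.clear();
        return body;
    }

    public static void main(String[] args) {
        BulkBufferSketch buffer = new BulkBufferSketch();
        buffer.add("{\"message\":\"first\"}");
        buffer.add("{\"message\":\"second\"}");
        System.out.println(buffer.drainToBulkBody());
    }
}
```

A timer (for example a `ScheduledExecutorService`) could call `drainToBulkBody()` every x seconds and POST the result; both methods are synchronized because log entries may arrive on other threads while the buffer is being drained.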


FitNesse and OSGi

As a demonstrator for a customer, I recently built a set of fixtures that allow FitNesse acceptance tests to talk to an OSGi framework. This code is by no means production quality, but merely intended to show the concept and explain the challenges.

I will not explain the details of the acceptance tests here. However, if there’s one point I would like to get across, it’s that your fixtures should be as narrow as possible, so that they can easily accommodate implementation changes. Study the different UserAdmin fixtures for more details. Also, I assume some familiarity with OSGi.

FitNesse and OSGi. Why?

Of course it’s fun, but there is also some real benefit to be gained here. While the industry well understands the need for unit and integration testing, also in a modular context, it becomes more complex to create the necessary link between business and code. Yes, using a modular architecture we can behave in a more agile fashion, but all that agility is no good if the business doesn’t hop on the train and explain well what it needs. FitNesse allows the business to explain its goals in business lingo, while forcing the specification to be precise enough to be executable: if a concept cannot be explained by simple scenarios, something is wrong, but that’s a different story.

The modular nature of OSGi means that behavior of an application is more emergent than deterministic, making it harder to reason about its correctness: we can prove that our code and bundles are correct (unit tests), that everything works together as it should (integration tests), and that it looks right (user interface tests). However, proving that the business rules (which may well be one of those emergent properties) are handled correctly in a given setting, is another can of worms: we need to connect our acceptance tests to the OSGi framework.

The big picture

FitNesse and OSGi - overview

The solution presented below uses a special ‘fixtures’ bundle, which can be deployed alongside other bundles in your framework. This bundle exposes an interface (in our case, through an HttpServlet), which is used by a set of connectors, which in turn are used by FitNesse.

The details

FitNesse and OSGi - detailed

The ingredients are two parts connector code, one part boiler plate, and one part genuine OSGi-aware fixtures.

The connectors

Starting at the level closest to FitNesse, we find a set of fixtures that FitNesse can use. For us, these contain merely boiler plate code.

public void removeUser(String name) throws Exception {
    doRemoteCall(buildRemoteCall("UserAdmin", name), Void.class);
}

This code instructs our RemoteInvoker to do some call to the outside world. For more details, see RemoteInvoker.java in the UserAdminRemoteFixtures project.

The fixture bundle

Moving one step closer to our service, and into the OSGi framework, we find a FixtureServlet, whose task it is to receive calls from the RemoteInvoker, and turn them into actual method calls on the fixtures.
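The core of that task, turning a received fixture name, method name and arguments into a method call, can be sketched with plain reflection. This is a hypothetical illustration, not the code of the actual FixtureServlet: the servlet and HTTP parsing are left out, only String parameters are supported, and all names below are invented.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FixtureDispatcherSketch {

    private final Map<String, Object> fixtures = new HashMap<>();

    /** Makes a fixture object reachable under the given name. */
    public void register(String name, Object fixture) {
        fixtures.put(name, fixture);
    }

    /** Invokes a public method that takes only String parameters on the named fixture. */
    public Object invoke(String fixtureName, String methodName, String... arguments) throws Exception {
        Object fixture = fixtures.get(fixtureName);
        if (fixture == null) {
            throw new IllegalArgumentException("Unknown fixture: " + fixtureName);
        }
        Class<?>[] parameterTypes = new Class<?>[arguments.length];
        Arrays.fill(parameterTypes, String.class);
        Method method = fixture.getClass().getMethod(methodName, parameterTypes);
        return method.invoke(fixture, (Object[]) arguments);
    }

    /** Hypothetical stand-in for a real, OSGi-aware fixture. */
    public static class UserFixture {
        private final List<String> users = new ArrayList<>();

        public void addUser(String name) {
            users.add(name);
        }

        public int userCount() {
            return users.size();
        }
    }

    public static void main(String[] args) throws Exception {
        FixtureDispatcherSketch dispatcher = new FixtureDispatcherSketch();
        dispatcher.register("UserAdmin", new UserFixture());
        dispatcher.invoke("UserAdmin", "addUser", "alice");
        System.out.println("users: " + dispatcher.invoke("UserAdmin", "userCount"));
    }
}
```

Looking methods up by name is also why both sides of the fixtures must agree on function naming: a mismatch only shows up at runtime as a NoSuchMethodException.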

The fixtures, then, are almost regular OSGi aware objects. I chose to use the Apache Felix Dependency Manager for the dependency management of the fixtures. So, for our UserAdmin fixture, the dependencies are

manager.add(createService()
    .setInterface(UserAdminListener.class.getName(), null)
    .setImplementation(userAdmin)
    .add(createServiceDependency()
        .setService(UserAdmin.class)
        .setRequired(true)));

Here, we state that we have some instance of a fixture userAdmin that registers itself as a UserAdminListener and needs a UserAdmin. How straightforward is that?

The final step takes us to the actual fixture,

public class UserAdminFixture implements UserAdminListener {
	private volatile UserAdmin m_userAdmin;
...
	public void addUser(String name) {
		m_usersCreatedInLastCall = 0;
		m_userAdmin.createRole(name, Role.USER);
	}
...
}

which is just another component using the UserAdmin service.

Putting it all together

All we now need to do is deploy the fixture bundle in our project, and instruct FitNesse to use the remote connector. The zip file at the bottom of this post contains two shell scripts to do exactly that.

Future work

As I stated at the top of this story, this is by no means production quality code, but the concepts stand as they are. Given the way FitNesse works, the connectors do not need much extra work, perhaps support for collections. However, we could use

  • a way to reduce the boiler plate code,
  • a way to ensure that both sides of the fixtures use the same function naming, and
  • better integration, for instance by only firing up a framework once a FitNesse suite is started.

Let’s play with it!

I have built a zip file containing everything you need to get started, including a set of scenarios that can run with both a homebrew implementation of a User Admin, and the actual Apache Felix User Admin. A Readme gives you more information on getting it all up and running.


Presentations from the AgileMDD knowledge session – 30 March 2010

On 30 March, luminis and ArchitecIT jointly organized a knowledge session on model-driven development. Speakers from various organizations shared their practical experiences with MDD on the basis of five different themes. The following organizations were represented among the participants: ArchitecIT, Delphino Consultancy, luminis, Ministerie van Defensie, Nedap, Nuon, PANalytical, Radboud Universiteit Nijmegen, Sogeti, Tennet and Thales.

The presentations from that evening are now available online and can be downloaded below.

agilemdd_logo


In practice, software development still involves a lot of communication (hand-over moments), and most development tasks consist of a great deal of manual work. With an MDD approach, development tasks can be automated and communication between the people involved can be improved. It is important, however, to know how MDD can best be applied and what the most common pitfalls are. In our AgileMDD philosophy, model-driven development should above all be pragmatic and goal-oriented. That way, the development capacity already present in the organization can be put to smarter use.


Programme of the knowledge session on 30 March:

Would you like to stay informed about upcoming AgileMDD sessions, or are you interested in tailored advice? Contact Inge Dokter (inge.dokter@luminis.nl) or call 026-3653470.

Discussion results of the previous MDD knowledge session


Swing & OSGi — please play nice!

In a recent blog post, Peter Karich showed how to create a pluggable Swing application using OSGi. While this works fine for smaller examples, you might run into more serious issues once your application starts to grow.

Plugging Swing: it leaks?

Let’s start with an application not unlike the one from aforementioned blog; it uses a window as host, and has a pluggable menu, and a pluggable table.

SWING_OSGI_Pluggable components600

You can find the code we used at the end of this entry (or, for the impatient, here).

Using this pluggable system, we could end up with several curious situations. For instance, you might have a mixed look and feel in your application.

SWING_OSGI_wrong_menus

Or worse, you might end up with a UI that (sometimes) fails to start, and spits a stacktrace your way.

swing_osgi_NPE

It leaks, but why?

Our host and all components are stored in separate bundles, meaning we don’t have full control over the order in which actions are performed (more about that later). However, we do know there are orders of execution that are less than ideal; let’s force one of those.

The project contains an Ant script to make things easier. From the root of the extracted project, run

$ ant run1

This starts the framework, installing the necessary bundles, but does not start them (note that this step uses Pax Runner, and therefore needs internet access). We can now start our bundles in the order we like.

A tale of two look-and-feels

After starting the framework, wait for the “Welcome to Felix” message, and run

     [java] Welcome to Felix
     [java] ================
     [java]
start 2
start 1

The mixed look and feel arises because the look and feel is a static concept in Swing. The menu bundle creates its JMenu (see Menu.java, ln 30) before the host sets its look and feel (Host.java, ln 51), and keeps that look and feel even when the host bundle changes it later.

Tables, ScrollPanes and NPEs

The NullPointerException above is a different story, but it goes back to the same staticness of Swing too. To force this situation, start only bundle 4.

     [java] Welcome to Felix
     [java] ================
     [java]
start 4
     [java] Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
     [java]      at net.luminis.swingosgi.part1.scrolltable.impl.TableComponent$1.run(TableComponent.java:31)
     [java]      at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:209)
     [java]      at java.awt.EventQueue.dispatchEvent(EventQueue.java:633)
     [java]      at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:296)
     [java]      at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:211)
     [java]      at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:201)
     [java]      at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:196)
     [java]      at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:188)
     [java]      at java.awt.EventDispatchThread.run(EventDispatchThread.java:122)

Let’s take a look at the line where this NPE happens:

JScrollPane scrollPane = new JScrollPane(table);
scrollPane.getColumnHeader().setBackground(Color.blue);
m_panel.add(scrollPane);

We know that the ColumnHeader is null. This is because it is JTable’s responsibility to create the header, but this is only done once the table knows it is part of an AWT hierarchy. The following lines come from the 1.5 JDK on a Mac; configureEnclosingScrollPane() creates the column header. The addNotify method comes from Component and signals exactly that event: being added to an AWT container.

public void addNotify() {
  super.addNotify();
  configureEnclosingScrollPane();
}

Order, order!

So, the static nature of Swing and the dynamic nature of OSGi seem to hurt each other seriously here.

One way to get the application right is by fixing the order in which Swing components can be created. By starting bundle 1 first in our application, we at least fix the look and feel. Getting the scrolling table to run correctly is an entirely different story.

Regarding order, a few possible solutions spring to mind immediately:

  1. Put all UI stuff in one bundle
  2. Use OSGi bundle start levels

Sure, all UI in a single bundle will give you the control necessary, but it also defeats the purpose. OSGi start levels can at least solve the ordering issues, but will not get you out of the NullPointerException and might have more impact than you desire.

What order?

As we have seen, absolute order does not solve our problem. How about separating creation and initialization? Still, we need to impose some order, or at least some hierarchy.

SWING_OSGI_Application composition600

We represent each Swing component by an OSGi service, and leverage the OSGi service dependency resolution to build up our hierarchy; this way, we know the host service will be started last.

  1. Resolve services: Once the host bundle starts, we know all components are locked and loaded; the host can now start setting up Swing’s static elements like the look and feel.
  2. Create components: Component creation ripples downward: the host gets its direct children, adding them to its container, and in the process triggering the children to get their child components.
  3. Initialize components: Once the component creation is done, the host instructs each component to initialize; we can now be certain that all components are part of the AWT hierarchy.

To reach this situation, we introduce a new OSGi service that wraps the component.

SWING_OSGI_Using component provider600

All components are handled by a service implementing ComponentProvider; notice how methods are required to be called on the EventDispatchThread, making sure that all components are created on the EDT, while retaining the order necessary.

public interface ComponentProvider {
 /**
 * Constant to identify ComponentProvider services.
 */
 public static final String COMPONENT_ID_KEY = "component.id";
 
 /**
 * This function should always be called from the EDT. The implementor
 * may assume that this function is called once and before {@link #addedToContainer()}
 *
 * @return the implementors (Swing) component which it provides.
 */
 public JComponent getComponent();
 
 /**
 * Triggered when the component is added to a container. The implementation
 * can validate some stuff. This function must be called on the EDT.
 * Implementors may assume this function is called after {@link #getComponent()}.
 */
 public void addedToContainer();
}

The getComponent function is analogous to the create step above; the addedToContainer triggers the initialize.
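How a host might drive these two calls can be sketched as follows. This is an illustration under assumptions, not the project's Host class: the nested interface only mirrors the two methods shown above, the OSGi service lookup is replaced by a plain list, the container is never made visible, and the recording provider is a made-up helper to show the call order.

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.JComponent;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.SwingUtilities;

public class HostAssemblySketch {

    /** Mirrors the two operations of the ComponentProvider interface shown above. */
    public interface ComponentProvider {
        JComponent getComponent();
        void addedToContainer();
    }

    /** To be called on the EDT: first creates and adds all components, then initializes them. */
    public static void assemble(JPanel container, List<ComponentProvider> providers) {
        for (ComponentProvider provider : providers) {
            container.add(provider.getComponent()); // create step
        }
        for (ComponentProvider provider : providers) {
            provider.addedToContainer();            // initialize step
        }
    }

    /** Hypothetical provider that records the order in which it is called. */
    public static ComponentProvider recordingProvider(String name, List<String> calls) {
        return new ComponentProvider() {
            public JComponent getComponent() {
                calls.add("create " + name);
                return new JLabel(name);
            }

            public void addedToContainer() {
                calls.add("init " + name);
            }
        };
    }

    public static void main(String[] args) throws Exception {
        List<String> calls = new ArrayList<>();
        List<ComponentProvider> providers = new ArrayList<>();
        providers.add(recordingProvider("menu", calls));
        providers.add(recordingProvider("table", calls));

        // All Swing work happens on the event dispatch thread.
        SwingUtilities.invokeAndWait(() -> assemble(new JPanel(), providers));
        System.out.println(calls);
    }
}
```

Splitting the work into two loops is the point of the design: no component is asked to initialize before every component has been created and added to its container.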

Let’s try that out!

To check that this actually works OK, run

$ ant run2

from the root of the project, and start the bundles in any order you like. The UI will only show up once all required components are available; notice that the Table and the ScrollPane component can be used interchangeably.

Is it all good?

For the most part, yes. You do give up some flexibility: the UI is assembled at runtime, but it is no longer possible to (easily) plug components into a running system without special provisions. Then again, how often do you deploy new Swing-based functionality to a running application?

In the example application, we use ServiceTrackers to keep track of the components needed by the host. In a real system, you should consider using some dependency management mechanism; we have used the Apache Felix Dependency Manager in the past.

The project and the story

The project mentioned above is available as a zipped Eclipse project. You can directly import this into Eclipse, or just unzip it and run the Ant build file.

To run the examples, you will need Apache Ant. Also, since we use Pax Runner, you will need an internet connection.

The presentation we gave about this at Devoxx 09 is at SlideShare.


ApacheCon US 2009 – Celebrating a decade of open source leadership

The Apache Software Foundation celebrated its 10th anniversary last week at ApacheCon US in Oakland, California. The conference, which lasted from November 2nd to 6th, consisted of many different types of events, ranging from full-day trainings to lightning talks, and from a hackathon to technical and marketing sessions. On Friday, there was a full-day track about OSGi, where all OSGi-related Apache projects like Felix, ACE, Sling and Tuscany were present. The big announcement of the conference was that Subversion wanted to join Apache. In fact, during the event, just like with any other project, there was a vote to accept Subversion into the incubator. As with many projects, this triggered some discussion, debating the merits of doing a release during incubation, even though this is a project with many seasoned Apache committers on board.

A conference like no other

Apache is probably the strongest brand in the open source space, but the conference itself focuses strongly on content. Here you will see no sponsored talks by commercial vendors and no sales people trying to sell you anything; it’s all about the code, the community and collaborating with each other. In that sense it’s quite different from most other conferences, and if you like meeting fellow developers and having discussions with them, this is a great place to visit. Many events facilitate discussion, and power and internet connectivity are available everywhere.

What open source is all about

Brian Behlendorf summarized the three main cultural elements of Apache quite well:

  • write good code and debate it to the bone
  • be humble
  • collaborate

In essence, Apache is a meritocracy in which only individuals can become members. It’s sometimes also described as a do-ocracy, as projects are driven by contributions: if you want something done, just do it. Another important aspect is that everything done on the Apache projects is discussed and archived on the mailing lists. All discussions, code diffs and decisions must be recorded there.

Presenting Apache ACE

Tuesday evening’s “birds of a feather” session featured a discussion about Apache ACE, where questions mostly centered on the use cases for ACE and possible integrations with other OSGi components. One of the conclusions was that there are probably three different phases of deployment:

  1. Using Apache Felix File Install, which allows you to drop components in a local folder to have them installed.
  2. Using Apache Felix Karaf’s provisioning components, which allow you to define features which basically group components and allow you to define dependencies on other features.
  3. Using Apache ACE, which allows you to group components and automatically deploy them to many remote systems.

Friday’s OSGi track started with an introduction to OSGi and moved into more advanced topics during the day. The Apache ACE talk was well received, with several people expressing an interest in using it and contributing to it.

Final thoughts

Summarizing the week, Floris and I had a great time talking to many interesting people and learning about various projects. ApacheCon is a great conference, and I’m already looking forward to the next one.


Surprising effects of a toString

In Java courses I always warn students to be careful with the toString() method, and the
topic also comes up in code reviews from time to time. Some people are inclined to do
"something" with the result of a toString() call (other than printing or logging it), and it
sometimes takes effort to convince them that this is rather risky.

As far as I am concerned, the reasoning is clear: the javadoc of Object does state that the
result should be a "concise but informative representation" of the object (one that is easy
for a person to read), but that leaves a lot of room for variation. As a programmer you
therefore cannot rely on what toString returns being usable for further processing, nor
should you base any decision on it.

This is sometimes dismissed as a theoretical argument. After all, if you implement toString
yourself, you know exactly what it returns, so why not use that knowledge? If you are being
really careful, you specify the return value precisely in the javadoc, and nobody can
blame you.

What this overlooks is that third parties do not expect those specific semantics. One day
someone else will modify the code or extend it with a derived class, and that person is not
going to study the javadoc of toString(), because he already knows it. Without realizing
it, he breaks the (modified!) contract, and trouble follows. And let's be honest: strictly
speaking, defining the toString result more precisely was already a change to the
contract. This violates an important principle of good coding style: good code contains no
surprises.

Caller unknown
Another reason to be careful with the implementation of the toString method is that it is
part of the Object contract, so you never know who or what will call it, or when. I found
out how badly this can get out of hand a while ago, when I was trying to solve a
performance problem.

It concerned an application running on JBoss. The testers had the feeling that the
application became progressively slower under heavy load, more so than they would expect
from the load alone. I had investigated all sorts of things with a profiler and fixed some
smaller issues, but the complaint remained.

At some point, after monitoring for a while with JConsole and clicking around the thread
view more or less absent-mindedly, my eye fell on a stack trace in which the toString
method of some Queue class was being called. That was suspicious, especially since quite
long queues could build up in that application under heavy load. Surely it couldn't be
that…

It was. The queue contained several thousand to tens of thousands of elements, and the
toString produced a string several megabytes in size (not exactly "concise"). The call came
from the toString of a worker-thread class; presumably someone had once added the queue's
toString there to make debugging easier. And the surprise was that this method was not
called from application code, but from JBoss.

Somewhere deep in JBoss's transaction and lock management, toString() is called on a thread
that is placed in a wait queue; again for debugging and/or monitoring purposes. The busier
the application, the more locks, the more often toString was called, and the more time
JBoss spent computing Strings that were never actually used instead of executing
application code.

It was the kind of find you always hope for when investigating performance problems. The
change literally took two seconds, and the performance improvement was substantial.

The moral of the story is clear: don't do anything fancy in toString, because you never
know who calls it. And that applies not only to (classes derived from) infrastructural
elements such as Thread but, precisely because toString is part of the Object contract, to
all classes.
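As an illustration of that moral, here is a minimal sketch of a worker class whose toString() stays small no matter how long its queue grows. The class and method names (QueueWorker, enqueue, describeVerbose) are hypothetical and are not taken from the application in the story; the point is only the difference between printing the queue contents and printing its size.

```java
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical worker class, used only to illustrate the point of the story.
class QueueWorker {
    private final String name;
    private final Queue<String> pending = new LinkedBlockingQueue<>();

    QueueWorker(String name) {
        this.name = name;
    }

    void enqueue(String job) {
        pending.add(job);
    }

    // Risky variant: the output (and the time to build it) grows with the queue.
    // Kept as an explicit method so that nobody triggers it by accident.
    String describeVerbose() {
        return "QueueWorker[" + name + ", pending=" + pending + "]";
    }

    // Safer variant: reports only the queue size, so the output stays short.
    @Override
    public String toString() {
        return "QueueWorker[" + name + ", pendingCount=" + pending.size() + "]";
    }
}
```

With this split, a framework that calls toString() for its own logging only pays for a short string, while the expensive dump remains available to whoever asks for it explicitly.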


ScheduledThreadPoolExecutor horribly broken

A while ago, I considered using the Java 5 ThreadPoolExecutor class for executing
remote calls asynchronously. The application I was working on needs to perform remote
calls on large numbers of devices, and as remote calls can take quite a while, you
don’t want these remote calls to wait on each other. Moreover, as remote calls might fail,
e.g. due to network problems, a retry mechanism was also needed. The
ScheduledThreadPoolExecutor, a subclass of ThreadPoolExecutor, additionally allows you to
schedule a task after a certain delay, which offers a simple and elegant solution for the
retry mechanism: when a remote call fails, I only had to reschedule it with a
suitable (increasing) delay and the scheduler would take care of it.

Because ThreadPoolExecutor provides a protected hook method afterExecute()
that is called after execution of the task, I didn’t have to pollute my task
implementation with retry logic, but could clearly separate these concerns. In the
afterExecute() method of the (subclassed) ScheduledThreadPoolExecutor, I could ask the
task whether it had succeeded, and if not I could simply reschedule it. And all this
in just a few lines of code:

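@Override
protected void afterExecute(Runnable task, Throwable exception) {
     super.afterExecute(task, exception);
     // Ask the task whether the remote call failed; if so, retry after 10 seconds.
     if (((RemoteCallerTask) task).failed()) {
         super.schedule(task, 10, TimeUnit.SECONDS);
     }
}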

When I first tested it, I got a ClassCastException. My first guess was that it might
have something to do with different class loaders, but when I ran it in a debugger it
turned out to be something that really surprised me: the Runnable task that was passed
to this method was not the do-a-remote-call task that I had passed to the executor, but
something of a completely different type
(ScheduledThreadPoolExecutor$ScheduledFutureTask).
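The difference can be reproduced with a small probe. The sketch below is not the original application code; HookProbe and its method names are made up for illustration. It runs the same Runnable once through a plain ThreadPoolExecutor (via execute()) and once through a ScheduledThreadPoolExecutor (via schedule()), and records what each executor hands to afterExecute().

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical probe class: records the Runnable that afterExecute() receives.
class HookProbe {

    static Runnable seenByPlainPool(Runnable task) {
        final AtomicReference<Runnable> seen = new AtomicReference<>();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>()) {
            @Override
            protected void afterExecute(Runnable r, Throwable t) {
                super.afterExecute(r, t);
                seen.set(r);
            }
        };
        pool.execute(task);
        shutdownAndWait(pool);
        return seen.get();
    }

    static Runnable seenByScheduledPool(Runnable task) {
        final AtomicReference<Runnable> seen = new AtomicReference<>();
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1) {
            @Override
            protected void afterExecute(Runnable r, Throwable t) {
                super.afterExecute(r, t);
                seen.set(r);
            }
        };
        pool.schedule(task, 10, TimeUnit.MILLISECONDS);
        shutdownAndWait(pool);
        return seen.get();
    }

    // Lets already submitted work finish, then waits for the pool to terminate.
    private static void shutdownAndWait(ExecutorService pool) {
        pool.shutdown();
        try {
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Runnable task = () -> System.out.println("pretend remote call");
        System.out.println("plain pool passed the task itself: "
                + (seenByPlainPool(task) == task));
        System.out.println("scheduled pool passed: "
                + seenByScheduledPool(task).getClass().getName());
    }
}
```

Note that the plain pool only passes the original object when the task is handed over with execute(); submit() wraps the task in a future as well.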

Maybe I misinterpreted the documentation? I went back to the ThreadPoolExecutor
javadoc. It talked about “methods that are called before and after execution of each
task”, and the parameter description claimed the Runnable parameter to be “the runnable
that has completed”. This seemed to match my expectations: you execute a Runnable task
and that is what is passed to afterExecute. That would be the only sensible definition
of an after-execution hook, wouldn’t it? As the source code is the best documentation, I
checked the ThreadPoolExecutor source, which confirmed what I was expecting: the task
that is run is passed to the beforeExecute() and afterExecute() hooks.

A little study of the ScheduledThreadPoolExecutor source revealed why I got a
ClassCastException: it wraps the (user-supplied) task in a ScheduledFutureTask object
before passing it to the base class (which puts it in the task queue). One of the
reasons why this wrapper is needed is that the ScheduledThreadPoolExecutor uses a
DelayQueue to store the tasks, and elements of this queue must implement Delayed
(i.e. have a method that returns the delay). This type of queue sorts tasks based on
the delay: the shortest delay comes first. When taking an element from the queue, it blocks
until the delay of the first task has passed. Using this type of queue makes the
implementation of the ScheduledThreadPoolExecutor quite simple: it wraps the task in a
wrapper that implements getDelay() and puts these wrappers in the queue.
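The DelayQueue behaviour described here can be seen in isolation with a small sketch. DelayedLabel below is a hypothetical wrapper that is only loosely analogous to the executor's internal ScheduledFutureTask; it is not the JDK's implementation. Elements are added in arbitrary order, and take() hands them out in order of their delay, blocking until the head of the queue is due.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper: carries a label and the moment at which it becomes due.
class DelayedLabel implements Delayed {
    final String label;
    private final long dueNanos;

    DelayedLabel(String label, long delayMillis) {
        this.label = label;
        this.dueNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
    }

    // Remaining delay; zero or negative means the element may be taken.
    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(dueNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    // Orders elements so that the one with the shortest remaining delay comes first.
    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                other.getDelay(TimeUnit.NANOSECONDS));
    }
}

class DelayQueueDemo {
    // Adds three elements out of order and returns the labels in the order taken.
    static List<String> drainInDelayOrder() {
        DelayQueue<DelayedLabel> queue = new DelayQueue<>();
        queue.add(new DelayedLabel("due after 300 ms", 300));
        queue.add(new DelayedLabel("due after 100 ms", 100));
        queue.add(new DelayedLabel("due after 200 ms", 200));

        List<String> order = new ArrayList<>();
        try {
            while (!queue.isEmpty()) {
                order.add(queue.take().label); // blocks until the head is due
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(drainInDelayOrder());
    }
}
```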

Although I can appreciate the beauty of using a DelayQueue in combination with a normal
ThreadPoolExecutor, I don’t think it is the right solution. The point is that it breaks
one of the fundamental principles of OO programming, namely that derived classes
should respect the contract defined by their base class(es) and/or interfaces. The contract
that the base class ThreadPoolExecutor defines is that it will call the hook methods
with your task as a parameter. ScheduledThreadPoolExecutor breaks this contract, as it
does not adhere to what its base class has promised.

This breach of contract shouldn’t be taken lightly. It makes code that uses these
executors fragile: if for some reason someone decides to use the other class as the
implementation of the general executor service, existing code might break. Put
differently: in order to make your code robust, you would have to take into account
which executor implementation was chosen, at different points in your code. This
violates the principles of encapsulation and abstraction: code should never depend on
implementation types, only on interfaces.
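One way to keep retry logic independent of the chosen executor implementation is to put it in a decorator around the task, so that nothing depends on what afterExecute() receives. The sketch below is only one possible approach and is not the solution used in the application described here; RetryingCall, its parameters and the linear back-off are assumptions made for the example. It relies solely on the ScheduledExecutorService interface.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical decorator: retries a call with an increasing delay, without
// using the afterExecute() hook and without casting to a task type.
class RetryingCall implements Runnable {
    private final ScheduledExecutorService scheduler;
    private final BooleanSupplier call; // assumed to return true on success
    private final int maxAttempts;
    private final long baseDelayMillis;
    private final AtomicInteger attempts = new AtomicInteger();
    private final CountDownLatch finished = new CountDownLatch(1);
    private volatile boolean succeeded;

    RetryingCall(ScheduledExecutorService scheduler, BooleanSupplier call,
                 int maxAttempts, long baseDelayMillis) {
        this.scheduler = scheduler;
        this.call = call;
        this.maxAttempts = maxAttempts;
        this.baseDelayMillis = baseDelayMillis;
    }

    @Override
    public void run() {
        int attempt = attempts.incrementAndGet();
        boolean ok;
        try {
            ok = call.getAsBoolean();
        } catch (RuntimeException e) {
            ok = false; // an exception counts as a failed attempt
        }
        if (ok || attempt >= maxAttempts) {
            succeeded = ok;
            finished.countDown();
        } else {
            // Linear back-off: wait longer after each failed attempt.
            scheduler.schedule(this, baseDelayMillis * attempt, TimeUnit.MILLISECONDS);
        }
    }

    int attempts() {
        return attempts.get();
    }

    // Waits for the final outcome; true only if an attempt succeeded in time.
    boolean awaitSuccess(long timeoutMillis) {
        try {
            return finished.await(timeoutMillis, TimeUnit.MILLISECONDS) && succeeded;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A caller would create the decorator around the remote call and hand it to any ScheduledExecutorService with execute(); whether that service wraps tasks internally no longer matters to the retry logic.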

I was pretty disappointed that even in the concurrency API, such fundamental
mistakes can be found. Moreover, it appeared this is not the only break of contract, and that fixing this properly doesn’t seem to have any priority, but more on that later.
