
6 Scala resources for Java programmers

During my Basic Scala and Wicket talk at the London Wicket Event I showed some good Scala starting points for Java programmers. Here they are, clickable and all.

If you want to attend one of the Wicket user group meetings in London, just visit the jWeekend site and register there. It’s really cool to attend: there is a good atmosphere, and there are nice, smart people everywhere…

Scala Homepage

The Scala home on the web.

Contains reference manuals, tutorials, news, specifications.

First Steps to Scala

When you don’t know anything about Scala, start here.

Covers the interpreter, variables, methods, loops, arrays, lists, tuples, sets, maps, classes, singletons, traits, mixins.

Scala for Java refugees

Series of 6 great articles covering a lot of Scala.

Aimed at Java developers.

Scala Wiki

An ever growing collection of resources on Scala.

FAQ, code samples, design patterns, Scala job openings

Scala for Java programmers

Multiple articles covering a feature by feature comparison of Scala and Java.

Scala Mailing Lists

Official mailing lists

Subscribe: send an empty message to scala-subscribe@listes.epfl.ch

Why many Java performance tests are wrong

Getting performance statistics right can be hard

A lot of ‘performance tests’ have been posted online lately. Many of them are implemented and executed in a way that completely ignores the inner workings of the Java VM. In this post you can find some basic knowledge to improve your performance testing. Remember, I am not a professional performance tester, so put your tips in the comments!

An example

For example, some days ago a ‘performance test’ on while loops, iterators and for loops was posted. This test is wrong and inaccurate. I will use this test as an example, but there are many other tests that suffer from the same problems.

So, let’s execute this test for the first time. It tests the relative performance of some loop constructs on the Java VM. The first results:

Iterator – Elapsed time in milliseconds: 78
For – Elapsed time in milliseconds: 28
While – Elapsed time in milliseconds: 30

All right, looks interesting. Let’s change the test a bit. When I reshuffle the code, putting the Iterator test at the end, I get:

For – Elapsed time in milliseconds: 37
While – Elapsed time in milliseconds: 28
Iterator – Elapsed time in milliseconds: 30

Hey, suddenly the For loop is the slowest! That’s weird!

So, when I run the test again, the results should be the same, right?

For – Elapsed time in milliseconds: 37
While – Elapsed time in milliseconds: 32
Iterator – Elapsed time in milliseconds: 33

And now the While loop is a lot slower! Why is that?

Getting valid test results is not that easy!

The example above shows that obtaining valid test results can be hard. You have to know something about the Java VM to get more accurate numbers, and you have to prepare a good test environment.

Some tips and tricks

  • Quit all other applications. It is a no-brainer, but many people test with music players, RSS-feed readers and word processors still running. Background processes can reduce the resources available to your program in an unpredictable way. For example, when you have a limited amount of memory available, your system may start swapping memory content to disk. This not only has a negative effect on your test results, it also makes them non-reproducible.
  • Use a dedicated system. Even better than testing on your developer system is to use a dedicated testing system. Do a clean install of the operating system and the minimum number of tools needed. Make sure the system stays as clean as possible. If you make an image of the system, you can restore it to a previously known state.
  • Repeat your tests. A single test result is worthless without knowing if it is accurate (as you have seen in the example above). Therefore, to draw any conclusions from a test, repeat it and use the average result. When the numbers vary too much from run to run, your test is wrong: something in it is not predictable or consistent. Try to fix your test first.
  • Investigate memory usage. If your code under test is memory intensive, the amount of available memory will have a large impact on your test results. Increase the amount of memory available (buy more memory) or fix your program under test.
  • Investigate CPU usage. If your code under test is CPU intensive, try to determine which part of your test uses the most CPU time. If the CPU graphs fluctuate a lot, try to determine the root cause. For example, garbage collection, thread-locking or dependencies on external systems can have a big impact.
  • Investigate dependencies on external systems. If your application does not seem to be CPU-bound or memory intensive, try looking into thread-locking or dependencies on external systems (network connections, database servers, etc.).
  • Watch out for thread-locking. It can have a big impact, to the extent that running your test on multiple cores will decrease performance. Threads that are waiting on each other are really bad for performance.
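As a minimal sketch of the "repeat your tests" tip, the snippet below runs a task several times and reports the average together with the standard deviation, so you can see how much the numbers vary from run to run. The class `RepeatedTiming`, its method names and the placeholder workload are all made up for this illustration; how much variation counts as "too much" is something you have to decide for your own test.

```java
// Hypothetical helper for illustration; none of these names come from a library.
final class RepeatedTiming {

    // Arithmetic mean of the samples.
    static double mean(long[] samples) {
        double sum = 0;
        for (long s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    // Sample standard deviation (n - 1 in the denominator); 0 for fewer than two samples.
    static double stdDev(long[] samples) {
        if (samples.length < 2) {
            return 0.0;
        }
        double m = mean(samples);
        double squares = 0;
        for (long s : samples) {
            squares += (s - m) * (s - m);
        }
        return Math.sqrt(squares / (samples.length - 1));
    }

    // Runs the task `runs` times and returns the elapsed time of each run in nanoseconds.
    static long[] measure(Runnable task, int runs) {
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsed[i] = System.nanoTime() - start;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Placeholder workload; replace it with the code under test.
        Runnable task = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum == 42) {
                System.out.println(sum); // use the result so the loop is not obviously dead code
            }
        };
        long[] samples = measure(task, 20);
        System.out.printf("mean = %.0f ns, std dev = %.0f ns%n", mean(samples), stdDev(samples));
    }
}
```

A standard deviation that is large compared to the mean is a hint that something in the test is not predictable yet.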

The Java HotSpot compiler

The Java HotSpot compiler kicks in when it sees a ‘hot spot’ in your code. It is therefore quite common that your code will run faster over time! So, you should adapt your testing methods.

The HotSpot compiler compiles in the background, eating away CPU cycles. So when the compiler is busy, your program is temporarily slower. But after compiling some hot spots, your program will suddenly run faster!

When you make a graph of the throughput of your application over time, you can see when the HotSpot compiler is active:

Figure: throughput of a running application over time

The warm up period shows the time the HotSpot compiler needs to get your application up to speed.

Do not draw conclusions from the performance statistics during the warm up time!

  • Execute your test, measure the throughput until it stabilizes. The statistics you get during the warm up time should be discarded.
  • Make sure you know how long the warm up time is for your test scenario. We use a warm up time of 10-15 minutes, which is enough for our needs. But test this yourself! It takes time for the JVM to detect the hot spots and compile the running code.
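One possible way to decide when the throughput has "stabilized" is sketched below: look for the first stretch of consecutive samples that stay close to each other, and discard everything before it. The class `WarmUpDetector`, the window size, the tolerance and the sample numbers are assumptions made up for this example, not a standard criterion, so tune them for your own scenario.

```java
// Hypothetical helper for illustration; the stability criterion is an assumption.
final class WarmUpDetector {

    // Returns the first index at which `window` consecutive samples differ by at most
    // `tolerance` relative to their mean, or -1 if the series never settles.
    static int firstStableIndex(double[] throughput, int window, double tolerance) {
        for (int i = 0; i + window <= throughput.length; i++) {
            double min = throughput[i];
            double max = throughput[i];
            double sum = 0;
            for (int j = i; j < i + window; j++) {
                min = Math.min(min, throughput[j]);
                max = Math.max(max, throughput[j]);
                sum += throughput[j];
            }
            double mean = sum / window;
            if (mean > 0 && (max - min) / mean <= tolerance) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Made-up throughput samples (operations per second), one per measuring interval.
        double[] samples = {120, 340, 610, 880, 990, 1005, 998, 1002, 1001};
        int start = firstStableIndex(samples, 3, 0.05);
        if (start < 0) {
            System.out.println("Throughput never stabilized; keep the test running longer.");
        } else {
            System.out.println("Discard the first " + start + " samples as warm up.");
        }
    }
}
```

With the made-up samples above, the first four intervals are still climbing, so only the samples from index 4 onwards would be used for statistics.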

Update

From Dries Buytaert I received a link to a paper called Statistically rigorous Java performance evaluation. I highly recommend reading it when you want to know more about measuring Java performance.

Remember, I am not a professional performance tester, so put your tips in the comments!

76 Events: statistics about Hibernate 3.2.2

Frameworks are growing with every release. Classes are changed, removed and added. In this series I zoom in on some well-known projects and analyze their class names with completely meaningless statistics. This is the analysis of Hibernate 3.2.2.

To get these statistics, I wrote a script that analyzed all classes. They get chopped up on word boundaries, so for ContextAwareFactoryBean the words Context, Aware, Factory and Bean are counted. From the output I generated a Class Cloud.
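The original script is not shown here, so the following is only a sketch of how such a word count could be done in Java. The class `ClassNameWords` and the regular expression are assumptions: the expression splits on camel-case boundaries and tries to keep acronyms such as CGLIB together, and each occurrence of a word is counted once.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not the script used for the statistics.
final class ClassNameWords {

    // Split before an upper-case letter that follows a lower-case letter or digit,
    // and before the last capital of an acronym that is followed by a lower-case letter.
    private static final String BOUNDARY =
            "(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])";

    // Chops a simple class name into its words.
    static List<String> split(String simpleClassName) {
        return Arrays.asList(simpleClassName.split(BOUNDARY));
    }

    // Counts how often each word occurs across all class names (sorted by word).
    static Map<String, Integer> count(List<String> classNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : classNames) {
            for (String word : split(name)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints [Context, Aware, Factory, Bean]
        System.out.println(split("ContextAwareFactoryBean"));
        // prints {Auto=2, Event=2, Flush=2, Listener=1}
        System.out.println(count(Arrays.asList("AutoFlushEvent", "AutoFlushEventListener")));
    }
}
```

Sorting the resulting map by count and rendering the words at a size proportional to their count would give the input for a class cloud.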

What is Hibernate?

Hibernate is a free, open source Java package that makes it easy to work with relational databases. Hibernate makes it seem as if your database contains plain Java objects like you use every day, without having to worry about how to get them out of (or back into) mysterious database tables.

Hibernate listens very carefully

What immediately caught my eye was the number of classes with Event (76) or Listener (52) in their name. There are many events in Hibernate that can be caught. These events can have some related classes, like:

  • the event itself (for example the AutoFlushEvent)
  • an interface (the AutoFlushEventListener)
  • a default implementation (the DefaultAutoFlushEventListener)

It almost surprised me there was no AbstractAutoFlushEventListener or an AbstractAutoFlushEventListenerFactory!
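The naming pattern of event, listener interface and default implementation can be sketched with simplified stand-ins. The types below (`ExampleEvent`, `ExampleEventListener`, `DefaultExampleEventListener`) are invented for this illustration and are not Hibernate's real classes or signatures.

```java
// Simplified stand-ins for the naming pattern; these are not Hibernate's real types.

// The event itself: a plain object carrying information about what happened.
final class ExampleEvent {
    private final String source;

    ExampleEvent(String source) {
        this.source = source;
    }

    String getSource() {
        return source;
    }
}

// The listener interface: one callback for this kind of event.
interface ExampleEventListener {
    void onExample(ExampleEvent event);
}

// The default implementation that is used unless a custom listener replaces it.
class DefaultExampleEventListener implements ExampleEventListener {
    private int handled = 0;

    @Override
    public void onExample(ExampleEvent event) {
        handled++; // a real listener would do the actual work here
    }

    int getHandled() {
        return handled;
    }
}

final class EventPatternDemo {
    public static void main(String[] args) {
        DefaultExampleEventListener listener = new DefaultExampleEventListener();
        listener.onExample(new ExampleEvent("session-1"));
        System.out.println("Events handled: " + listener.getHandled());
    }
}
```

Three types per event explains why the words Event and Listener pile up so quickly in the statistics.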

Factories

There are 57 factories, which is quite a lot outside of an industrial park. A lot of stuff can be created using factories, for example the BasicProxyFactory, the CGLIBProxyFactory, the CacheFactory, the ClassicQueryTranslatorFactory and the MapProxyFactory. I would guess this is the most popular design pattern within the Java world. Factories are everywhere.

Types and Collections

Luckily, there is also a lot of stuff directly related to the goal of Hibernate. There are 85 classes with Type in their name, and 54 have something to do with a Collection.

There are basic types like the FloatType and IntegerType, and advanced types like the OrderedMapType and the OrderedSetType. On the collection side there are classes like PersistentCollection and BasicCollectionLoader. These class names look quite good!

Class Cloud (click to enlarge)

Hibernate 3.2.2 statistics Class Cloud

Top 10 of partial class names

  • Type: 85
  • Event: 76
  • Factory: 57
  • Collection: 54
  • Cache: 53
  • Exception: 53
  • Query: 53
  • Listener: 52
  • Entity: 47
  • SQL: 39

Longest class name

The longest class name of Hibernate is the CollectionFilterKeyParameterSpecification, with 41 characters!

The API documentation describes this class:

A specialized ParameterSpecification impl for dealing with a collection-key as part of a collection filter compilation.

I thought API documentation was meant to clarify? At least the above contains some pointers (like the ParameterSpecification).

Stay tuned for more useless statistics for other well known projects! If you have suggestions for which projects you want to see, please let me know in the comments!

37 Factories: statistics about Tomcat 6

Frameworks are growing with every release. Classes are changed, removed and added. In this series I zoom in on some well-known projects and analyze their class names with completely meaningless statistics. This is the analysis of Tomcat 6.

To get these statistics, I wrote a script that analyzed all classes. They get chopped up on word boundaries, so for ContextAwareFactoryBean the words Context, Aware, Factory and Bean are counted. From the output I generated a Class Cloud.

Tomcat is definitely Context-aware

Tomcat consists of 956 classes. Of those, 55 classes contain the word Context, a percentage of 5.75%. Remember that every live object has its own ‘context’, and the word instantly loses a lot of meaning. The API documentation gives some clarity on the subject:

Context is a Container that represents a servlet context, and therefore an individual web application, in the Catalina servlet engine.

Factories? Sure!

Every self-respecting framework has factories. It nicely keeps the newcomers out and makes it ‘easier’ for the experts. Well, Tomcat has 37 of them. Just like Spring, Tomcat has a BeanFactory. The API documentation says just enough to leave me confused:

BeanFactory: Object factory for any Resource conforming to the JavaBean spec.

The AbstractObjectCreationFactory

It was love at first sight with the AbstractObjectCreationFactory. It all comes together in this single class. Every project should have this… until now you created all your abstract objects yourself, but this is not needed anymore! All those years!

Unfortunately, I had it wrong. This is an abstract implementation of the ObjectCreationFactory interface. Further investigation revealed that this is part of the Digester package.

The Digester package provides for rules-based processing of arbitrary XML documents.

I have no idea why this is in Tomcat. A leftover? Bad refactoring? Who knows?

Class Cloud (click to enlarge)

Tomcat 6 statistics Class Cloud

Top 10 of partial class names

  • Context: 55
  • Factory: 37
  • Task: 33
  • Ast: 33
  • Channel: 33
  • Rule: 32
  • Base: 31
  • Constants: 30
  • Handler: 30
  • Jsp: 27

Longest class name

The grand prize goes to: MbeansDescriptorsIntrospectionSource, with 36 characters!

The API documentation does not contain a description of this class, so I have no idea what it does…

Stay tuned for more useless statistics for other well known projects! If you have suggestions for which projects you want to see, please let me know in the comments!

8.47% is a Resource: statistics about Wicket

Frameworks are growing with every release. Classes are changed, removed and added. In this series I zoom in on some well-known projects and analyze their class names with completely meaningless statistics. Now up: Wicket 1.3.3!

To get these statistics, I wrote a script that analyzed all classes. They get chopped up on word boundaries, so for ContextAwareFactoryBean the words Context, Aware, Factory and Bean are counted. From the output I generated a class cloud.

Wicket likes Requests for Pages or Resources

8.47% of the total number of classes in Wicket contain the word Resource. In absolute numbers, 68 classes (of a total of 802) have this word in their name. This is quite a low percentage for the top result, compared to the Spring statistics where the top result Bean is present in 12.31% of all classes.

From the top ten, it’s easy to spot that Wicket is a web framework, with Pages, Web, Ajax and Resources in many class names.

Wicket loves Wicket

One remarkable thing about the Wicket statistics: about 4.24% of all classes have the string ‘Wicket’ in their name. Relative to the number of Spring classes that have Spring in their name, this is 2.3 times as much!

In general, the class names of Wicket are well distributed. This probably stems from the fact that Wicket is specialized in one subject and is not a general purpose platform.

Class Cloud (click to enlarge)

Wicket 1.3.3 statistics Class Cloud

Top 10 of partial class names

Wicket 1.3.3 statistics bar chart

  • Resource: 68
  • Request: 62
  • Page: 56
  • Abstract: 42
  • Target: 37
  • Stream: 36
  • Validator: 36
  • Web: 36
  • Component: 34
  • Ajax: 34

Longest class name

The grand prize goes to: BookmarkablePageRequestTargetUrlCodingStrategy, with 46 characters!

The BookmarkablePageRequestTargetUrlCodingStrategy encodes and decodes mounts for a single bookmarkable page class (source).

Stay tuned for more useless statistics for other well known projects! If you have suggestions for which projects you want to see, please let me know in the comments!