Thursday, March 12, 2009

Performance tips for Clojure

[This has been edited since I was wrong about some things]

Clojure is very fast.
It is also extremely easy to write slow Clojure code.

So here are 5 tips that will help you optimize Clojure code if you are suffering from performance problems.
I'm not going to add this as a point, since it has been written all over the Internet, but it is nonetheless the most important thing to keep in mind.
Make it work, then make it right, then make it fast.
Premature optimization is the root of all evil.
Any other famous frase?

Improving the algorithm is almost always better. No matter how well you micro-optimize an algorithm that runs in exponential time, performance is still going to suck. Some of the tips presented here however, deal with algorithmic properties of some Clojure functions and structures.

To measure speed bottlenecks, you need a profiler. A very good profiler is the Yourkit Java Profiler. Unfortunately it is closed-source and not cheap. They release free early-access versions of the software that you can download. As of the time of this writing, a new version just came out and there is no early-access program.

The profiler that I have been using is JVisualVM. It comes with recent JVM's. I don't like it quite as much as Yourkit, but it's very nice and more than enough.

Follow the following steps only if you have exhausted all possible algorithmic improvements.

1. Profile. Edit one thing. Profile again.


Profile your code with your chosen profiler. Identify the bottleneck. Formulate a theory of what is to blame, fix it, and test it. Don't try two things at once. Try one improvement at a time. This is a general optimization strategy that will serve you well if you don't know it already.

2. Beware of reflection


Use (set! *warn-on-reflection* true) so that Clojure warns you about reflection.

Look at the following function:

(defn str-split [s regex]
(vec (. s split regex)))

Clojure doesn't know if 's' is a string, so it will do an actual search to find a matching method which name is "split" on whatever class 's' belongs to. This will be horribly slow.
It is easily fixed this way:


(defn str-split [#^String s regex]
(vec (. s split regex)))


This new version tells Clojure that str-split expects 's' to be a string.

3. Understand the Clojure data structures' performance characteristics.
[Turns out I didn't understand them. I'll add something later]

4. Iterating is tricky.

When you are writing functions with side-effects, you will often resort to the good-old 'for' loop. The closest equivalent in Clojure is:

(dotimes [i 100])

This is macro-expanded to something like this


(let* [limit (clojure.core/int 100)]
(loop [i (int 0)]
(when (< i limit)
(recur (Unchecked-inc i)))))

..which will be as fast as Java.

If you need a more powerful 'for' loop, it is a very simple macro to write =).

5. Use java primitives.

If you need to do fast arithmetic operations, using encapsulated Integers, Doubles, etc.. is not good enough. You need Java primitives like int,double, etc..

Integer operators do overflow checking to convert Integers and Longs to BigIntegers. You can get around this check by using unchecked operators: unchecked-add, unchecked-dec... etcetera

The usual operators +,*,inc,/, etc.. do not do checking for floats and doubles, and as long as you use primitives they will be fast.

6. When all else fails...
... use Java classes. A Java array will be faster than a Clojure vector. A Java map will be faster than a Clojure map. Use this as a last resort. Clojure's data structures are very fast and should almost always be enough.

This are my tips so far. I may edit this post in the future to add more points if I come up with something or if people that read this suggest something else. I am pretty ignorant with regard to the performance of STM. Maybe someone can add a point about it.

3 comments:

scgilardi said...

rest is an operation on seqs. A list is a seq, but a vector is not. When map is given a vector, it first creates a seq on it. The rest operation on that seq is O(1). In some quick micro-benchmarking, the performance of iterating identity many thousands of times over a list or vector of a few hundred items gave very similar performance. I'm interested in what you see in the code or in a test that suggests that map performs much better over a list than over a vector.

Sergio said...

(def ls (list (range 1000000)))
(def vc (vec (range 1000000)))

(time (dorun (map (fn [x]) ls)))
"Elapsed time: 3.184435 msecs"

(time (dorun (map (fn [x]) vc)))
"Elapsed time: 498.535303 msecs"

I was based on something like that. Probably the extra time taken is from converting the vector to a seq. That would still make it O(n) right? It would just be for another reason.

Maybe you are getting similar performance because you didn't use dorun and map returns a lazy seq.

Thanks, I'll edit the post.

Meikel Brandmeyer said...

You create a list of one single element, namely the said range. Try to remove the list or use list* in your benchmark and you will get similar results to the vector. (Of course mapping over one element is faster than mapping over one million)

Creating a seq on something is O(1), not O(n). The collection is not turned somehow into a seq, but the seq is an abstract view on the collection.