Sergio's devlog: 2009

Monday, August 24, 2009

Original Futurama cast is comming back!!

About a month ago, I sent this email to the casting company that was calling for new voice actors for futurama:

Please, do tell those suit-wearing cheap monkeys at Fox to use the original cast.

Sincerely, the Internet.

About 20 minutes ago, I got this response:

Dear Futurama Fan,

Thank you for voicing your strong support for Futurama and the original voice cast. We appreciate the time you took to fire off a thoughtful and/or concerned and/or homicidal e-mail message to our casting director, Scott Muller. (Rest assured, not a single one of these e-mails reached the actual decision-makers at 20th Century Fox Television. For future reference, please note that Scott Muller is in fact one of Futurama's biggest fans, and was instrumental in bringing about the return of the cast. Please hoist a bottle of Olde Fortran for Scott!)

Speaking of the cast returning... good news, everyone! The cast is returning! All of our series regulars are back in action for Season 6. Billy West, Katey Sagal, John DiMaggio, Maurice LaMarche, Tress MacNeille, Lauren Tom, Phil LaMarr, and David Herman are all on board and have already begun delivering their customary virtuoso performances. The all-new Futurama episodes are slated to air on Comedy Central beginning in June, 2010.

We are extremely grateful for the outpouring of support for the series. Your loyalty over the years has kept the show going through its original four seasons on FOX, its subsequent reincarnations on Adult Swim and DVD, and now its Bendiferous return to life on Comedy Central. See you in the (near) future!

Sincerely,

David X. Cohen & Matt Groening

Good news indeed!

Saturday, April 18, 2009

Clojure optimization of constants

(I actually read about this trick on the clojure user group, but I can't remember where.)

This is a neat trick that eliminates boxing/unboxing overhead when you are using Integer and all you need is int.
(This of course works for the other java primitives)

Let's do something stupid like summing up two Integers ten million times.


(def times 10000000)

(def w 640)
(def h 480)

(time (dotimes [i times]
       (+ w h)))

This little test took around 950ms on my machine.
The w,h values are unboxed from and boxed into Integer classes in order to be used.

Now consider this:



(defmacro w [] `(int 640))
(defmacro h [] `(int 480))

(time (dotimes [i times]
       (+ (w) (h))))

This is a very smart macro trick that I read on the Clojure user group. What it does is it substitutes (w) with a java primitive at compile time.
On my machine, the benchmark took around 430ms.
The problem is that it is ugly, but that can be fixed easily.



(defmacro const-int [name val]
 `(defmacro ~name [] (list 'int ~val)))

(const-int w 640)
(const-int h 480)

(time (dotimes [i times]
       (+ (w) (h))))

Still around 430ms. There is no performace cost because of the way macros are evaluated.
const-int is a macro that creates a macro so our definition looks prettier. The only caveat is you still have to use (w) instead of w.

Sunday, April 5, 2009

pyCave: Porting to mac.

So for the past couple of days I have been porting pyCave to my mac. Yesterday was the day I thought the work was pretty much finished (It's not).

I did it by starting with a GL demo written from scratch, and I went up from there adding features until I had a working copy of pyCave. Obviously, there was a lot of copy-pasting, but the overall writing involved a lot of redesign that ended up touching the whole program. The diff file was huge.

This wasn't the original plan. The 'project' was called 'glui'. It was only a module to build GL user interfaces. It eventually morphed into a new version of pycave =). There was only a single commit. Bad idea. But the code is the product of an intense non-stop coding session.

One of the first changes was adding an opengl extension checking mechanism. I found a nifty thing that pyOpenGL had. Every extension module has a member called EXTENSION_NAME. I used that to check for extensions in a very concise manner.
Little did I know that the nifty feature is actually only featured in the latest releases of pyOpenGL 3 (ubuntu beta releases don 't have it).

So i'm thinking that I'll include a subset of the latest pyOpenGL (3.0.0) so that I won't get emails from ubuntu users with Nvidia cards saying that the game doesn't run.

The game runs almost perfectly on my Macbook with Python2.6. I was going to write about the hell I went through compiling pygame for Python2.6 on a mac (it's practically impossible), but it's a long story. I had to do it because I had screwed up the default Leopard python instalation thanks to a very stupid bash command.
Today I fixed my pretty apple-provided Python and tried to run pyCave (practically all my users will do the same).

It actually crashes.

The 2.5 and 2.6 versions have the same PyOpengl (release candidate 1). It is a very strange bug and I think my frankenstein python2.5 instalation is to blame. I am not sure.
A friend of mine just got a brand new Macbook (lower-end) and I would love for him to let me borrow it for a half hour to do some testing.

I am very optimistic that my 6 year old laptop will be able to run the game. It already does, but there is an odd bug in which all the edges are highlighted. It looks awesome with the proceduraly generated tunnel =) It also has the extension-checking bug, which should be fixed by adding the OpenGL module inside the pyCave distribution.

Tuesday, March 31, 2009

pyGame beta

I have had this little game brewing for too long. It's a small program. Around 2,000 lines of python.
I tried to keep it as "from scratch" as possible. I didn't have a dependency on pygame until I had to add music. I wouldn't hesitate to remove the pygame dependency if I found a lighter cross-platfrom sound library that was as simple to use as pycave.mixer.

I am not using shaders. Shadows were implemented after reading lots of papers with a depth map using OpenGL's fixed functionality. I want to write a more sophisticated shadowing algorithm but it would be too painful without using shaders. I may add the functionality to pyCave but it may be out of the scope of the project, and it would introduce too much pain (see below on porting).

A few weeks ago I released a first beta to generally good reviews. Some people find it too difficult, while others (myself included) find it easy until they reach hardcore mode.
I have learned a few things about pyCave. First, dealing with video card drivers is a pain. Most laptops with intel integrated cards won't work. They just render a black screen instead of showing the menu. I think it's because they don't support multi-texturing without using an opengl extension. I will prove it soon enough with the new interface code that I'm writing.
There was a very strange bug caused by trying to clear the stencil buffer. On my macbook it caused a bus error (mac-speak for segmentation fault). A friend running linux on a lap also had a segfault. I guess it was the same bug. I just removed the line since I am not using the stencil buffer =) It was just a left-over from when I was implementing shadow volumes.

Right now I'm working on making it more cross-platform. I wrote a little python module to check for available extensions. I am changing some abstractions in order to make the menu code less horrible. I will be happy when pyCave runs on my Macbook and another dying 5 year-old laptop.

For now, If you are on a desktop computer with an Nvidia graphics card, you can get pycave from

http://pycave.sourceforge.net/index.php
(svn version is better ;)

Thursday, March 12, 2009

Performance tips for Clojure

[This has been edited since I was wrong about some things]

Clojure is very fast.
It is also extremely easy to write slow Clojure code.

So here are 5 tips that will help you optimize Clojure code if you are suffering from performance problems.
I'm not going to add this as a point, since it has been written all over the Internet, but it is nonetheless the most important thing to keep in mind.
Make it work, then make it right, then make it fast.
Premature optimization is the root of all evil.
Any other famous frase?

Improving the algorithm is almost always better. No matter how well you micro-optimize an algorithm that runs in exponential time, performance is still going to suck. Some of the tips presented here however, deal with algorithmic properties of some Clojure functions and structures.

To measure speed bottlenecks, you need a profiler. A very good profiler is the Yourkit Java Profiler. Unfortunately it is closed-source and not cheap. They release free early-access versions of the software that you can download. As of the time of this writing, a new version just came out and there is no early-access program.

The profiler that I have been using is JVisualVM. It comes with recent JVM's. I don't like it quite as much as Yourkit, but it's very nice and more than enough.

Follow the following steps only if you have exhausted all possible algorithmic improvements.

1. Profile. Edit one thing. Profile again.

Profile your code with your chosen profiler. Identify the bottleneck. Formulate a theory of what is to blame, fix it, and test it. Don't try two things at once. Try one improvement at a time. This is a general optimization strategy that will serve you well if you don't know it already.

2. Beware of reflection

Use (set! *warn-on-reflection* true) so that Clojure warns you about reflection.

Look at the following function:


(defn str-split [s regex]
      (vec (. s split regex)))

Clojure doesn't know if 's' is a string, so it will do an actual search to find a matching method which name is "split" on whatever class 's' belongs to. This will be horribly slow.
It is easily fixed this way:


(defn str-split [#^String s regex]
      (vec (. s split regex)))

This new version tells Clojure that str-split expects 's' to be a string.

3. Understand the Clojure data structures' performance characteristics.
[Turns out I didn't understand them. I'll add something later]

4. Iterating is tricky.

When you are writing functions with side-effects, you will often resort to the good-old 'for' loop. The closest equivalent in Clojure is:

(dotimes [i 100])

This is macro-expanded to something like this



(let* [limit (clojure.core/int 100)]
      (loop [i (int 0)]           
           (when (< i limit)
        (recur (Unchecked-inc i)))))

..which will be as fast as Java.

If you need a more powerful 'for' loop, it is a very simple macro to write =).

5. Use java primitives.

If you need to do fast arithmetic operations, using encapsulated Integers, Doubles, etc.. is not good enough. You need Java primitives like int,double, etc..

Integer operators do overflow checking to convert Integers and Longs to BigIntegers. You can get around this check by using unchecked operators: unchecked-add, unchecked-dec... etcetera

The usual operators +,*,inc,/, etc.. do not do checking for floats and doubles, and as long as you use primitives they will be fast.

6. When all else fails...
... use Java classes. A Java array will be faster than a Clojure vector. A Java map will be faster than a Clojure map. Use this as a last resort. Clojure's data structures are very fast and should almost always be enough.

This are my tips so far. I may edit this post in the future to add more points if I come up with something or if people that read this suggest something else. I am pretty ignorant with regard to the performance of STM. Maybe someone can add a point about it.

Tuesday, March 10, 2009

Lisp after Python after Lisp.

(Or, Python vs Clojure rant)

This post assumes some familiarity with Lisp and Python from the reader.

Like loads of programmers, I have read most of Paul Graham's essays, and I always finish them with that feeling that if I'm not using Lisp then I must be an idiot. Like most programmers, I certainly don't want to be an idiot, so I went ahead and learned Lisp, Common Lisp.

At first, I was horrified by all those damned parentheses. Still, I kept at it until I was able to write code half-decently. I didn't get very far, and truly diving into Lisp was buried deep into my to-do list.

I didn't really got Lisp until I learned about Clojure.
Clojure is a new functional Lisp that runs on the JVM. It is a great language with brilliant design and it has the potential to bring some great research ideas into the mainstream.

Learning Clojure has gotten me to really appreciate Lisp for what it is.
Some months ago I wrote about how I loved LINQ. With Lisp, I can't complain if I don't have LINQ, I can just write a macro to implement it.
I understand that when I'm writing Lisp code, what I am writing is an abstract syntax tree (AST) in human-readable form. You can see code as the very data structure that represents your program, and you can modify it as such. That is why Lisp must be just the perfect language for Genetic Programming, where code needs to modify itself.

I love the fact that in Lisp everything is an s-expression. S-expressions have an advantage that I hadn't anticipated. They make editing easier in an editor like Emacs where you have commands that explicitly handle sexps.
Editing Lisp inside Emacs, after a while, is a very enjoyable experience. You can transpose,kill,navigate through and mark sexps as if they were letters or words. And since everything in Lisp is an s-expresion, it becomes very malleable.

So, after having my Lisp epiphany, I started seeing Python through a new light. I saw that Python was just like Lisp in so many ways, yet it was different in many others.

Python is dynamic and has a REPL. Both are huge advantages that Lisp has had for decades.
In Python, there is a difference between statements and expressions. This was my first big no-no on my comeback to Python.
I would like to write something like this:


a = if (x == 2):
           "Hello"
    else:
           "Goodbye"

Lisp code is elegant, and Lisp as a language is mathematically beautiful. It is also true that once you get used to all the parentheses you start not to notice them. However, Lisp is just not as pretty as Python. There's only so much you can do when your language requires the programmer to write the AST directly.

Python has the most beautiful syntax I have yet to see. Letting whitespace have syntactic meaning is a great design decision. The resulting code is indented the way people should indent their C-style programs anyway.
I have found downsides to this approach. In programs with lots of consecutive, horrible OpenGL API calls, things would look prettier if the language would let you indent at will. 99% of the time, however, it is a feature rather than a bug.

Every programmer has some personal pseudocode. This pseudolanguage is the language we think in. Python is as close to most programmers' pseudocode as you can get. I get this warm, fuzzy feeling when I code in Python. Its philosophy dictates that programming is so hard that the language should really just get out of the way of the programmer.

Programming in Clojure has definitely changed the way I program. More thanks to the fact that it is functional than to the fact that it is a Lisp. After you pass the brain-freeze that is inevitable when dealing with lack of state, you reach a point of enlightenment. That is that you realize that programming functionally frees you from worrying about post-conditions and pre-conditions. It is such a nice feeling when you know that a function isn't really changing the world, that it will change the way you code in other languages. Being stateless lets you stop worrying about a huge set of problems. It makes you smarter, since you are keeping less things in your head, and it reduces the bugs you can create. I am not saying you'll absolutely loathe state after spending time with a functional language. I love a for loop as much as the next dude, I am saying that you will avoid state when you can. I agree with Tim Sweeney in that the functional paradigm should be the default. And that modifying state should be made explicit. I think this concept is present in Clojure's STM approach, although it hasn't really been proven effective in huge, complex applications.

I still prefer to code in Python than to code in Clojure, just like I still prefer to speak in Spanish, my native language, than to speak in French.

When I code in Lisp, I write pseudo-code that is pretty similar to Python and then I translate to Lisp as I write. Yet now, when I'm writing in Python, I often say "This would could be easier to express with Clojure". I would like to know if my pseudolanguage is a product of having been exposed to imperative programming all these years. I don't know if there are programmers out there whose pseudo-language is functional. Maybe it's just human nature to think imperatively.
If my pseudolanguage ever starts to look like Clojure, I guess I'll have my answer

Sunday, February 22, 2009

3n+1 Programming challenge in clojure. (Clojure is fast!)

My cousin was solving a programming challenge involving the collatz conjecture. He was implementing his solution using dynamic programming in C++

I thought it would be fun to try and write it in Clojure.

Our solutions are pretty similar, and mine ended up being about 2 times slower than his.


(set! *warn-on-reflection* true)

(defn check-len*
  ([l cache] (check-len* l cache 1))
  ([k cache len]
  (loop [k (long k)
         len (long len)]
    (let [val (get cache k)
          is-even (even? k)]
      (if val
        (unchecked-dec (unchecked-add len (long val)))
        (if (== k (long 1)) len
              (recur (long (if is-even
                             (bit-shift-right (long k) (long 1))
                             (unchecked-add (long k) 
                                     (unchecked-add (long 1)
                                     (long (bit-shift-right k 1))))))
                     (long (if is-even
                             (unchecked-inc len)
                             (unchecked-add len (long 2)))))))))))

(defn check-max [a b]
    (loop [i (long a)
           max (long 0)
           cache {}]
      (if (>= i b) max
        (let [val (check-len* i cache)]
          (recur (inc i)
                 (long (if (> val max) val max))
                 (assoc cache i val))))))

(defn go [beg end]
  (time (check-max beg end)))

As of memory, for the range 1-10,000,000 I am using 1.7gb of heap memory, while the C++ implementation is using 1.2gb.
My implementation is 42% beefier. and 2x slower. But it's smaller!

Since almost half the time is spent putting and getting values from the clojure map, using a Java HashMap is an improvement, and actually makes the solution faster than the C++ version. (Memory-wise, it settles on less than 1.2gb after the solution, but just before it finishes it reaches 1.4gb)


(set! *warn-on-reflection* true)
(import '(java.util HashMap))

(defn check-len*
  ([l #^HashMap cache] (check-len* l cache 1))
  ([k #^HashMap cache len]
  (loop [k (long k)
         len (long len)]
    (let [val (. cache get k)
          is-even (even? k)]
      (if val
        (unchecked-dec (unchecked-add len (long val)))
        (if (== k (long 1)) len
              (recur (long (if is-even
                             (bit-shift-right (long k) (long 1))
                             (unchecked-add (long k) 
                                    unchecked-add (long 1)
                                   (long (bit-shift-right k 1))))))
                     (long (if is-even
                             (unchecked-inc len)
                             (unchecked-add len (long 2)))))))))))

(defn check-max [a b]
    (loop [i (long a)
           max (long 0)
           cache (new HashMap)]
      (if (>= i b) max
        (let [val (check-len* i cache)]
          (. cache put i val)
          (recur (inc i)
                 (long (if (> val max) val max))
                 cache)))))

(defn go [beg end]
  (time (check-max beg end)))

An all-around better solution is to use the optimization tips on the wikipedia article for the Collatz Conjecture, which would reduce it to some arithmetic trickery.

Thursday, January 22, 2009

Lack of CS interest in Google trends.

So the other day I was checking out Google trends, and I figured I'd check how my favorite editor, Emacs, was doing.

Turns out that there has been a steady decline in the number of google searches done for "emacs" from 2004 to Jan 2009.
I wanted to see how "eclipse" was doing, but I figured that most of the people that would search for eclipse must mean an actual eclipse, not the IDE. So I then proceeded to see how Java was doing:

As you can see, there is a very similar decline, though with less of a slope.

Intrigued, I checked the trend for the much more general term "Algorithm":

Almost the same decline as "emacs"!! Except for spikes that seem to correlate to school terms.

Now take a look at the trend for "computer science":

If you ask me, I would tell you that this is a sign of a steady decline of interest in computer science/programming, since the very general term "math" doesn't show the same slope:

(Note that there are very marked valleys correlating with school vacations).

Sergio's devlog