First Month as a Terracotta Developer
Somehow my first month at Terracotta has already flown by. Here’s what I’ve been up to.
Mostly I’ve been testing. We have a very cool distributed testing framework called “Droid”, and using it we’ve written a number of test scripts (in Groovy) to simulate some real-world use cases and see where we can make improvements. Droid, in a nutshell, allows us to easily and automatically run tests on multiple nodes, and provides means of synchronizing those nodes. In fact, Droid achieves this clustered synchronization using Terracotta, which just goes to show you that we’re not afraid to eat our own dog food.
In the process of working on this I’ve gotten to dive fairly deep into both Groovy and the java.util.concurrent classes, both of which I’ve enjoyed immensely. My humble contribution to Droid was to add some Groovy utilities enabling us to set up multiple test “phases”, which are simply points at which the nodes all synchronize by waiting on a CyclicBarrier. Our test scripts can now register closures to run during a phase, which cleans up the scripts themselves somewhat. Droid already had the ability to “kill” a test node, which was implemented rather ingeniously by using a cluster-wide (via Terracotta) Object to wait on and notify all - basically a daemon thread in each node would wait on this object, and when notified would check to see if it had been killed and should do a System.exit(-1). Building upon this, in my Groovy utilities I added methods to easily set up a kill phase, in which all testing would be suspended while a node was killed off - this leverages the Runnable CyclicBarrier action, which is run before any threads are released from the barrier.
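To make the barrier-action idea concrete, here is a minimal single-JVM sketch in plain Java. It is not Droid's code: the class name `PhaseSketch`, the method `runKillPhase`, and the log strings are all made up for illustration, and ordinary threads stand in for test nodes (in Droid the barrier would be shared cluster-wide via Terracotta). It only shows the one property the kill phase relies on: the Runnable passed to the CyclicBarrier runs once, after the last party arrives and before any waiting thread is released.

```java
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CyclicBarrier;

public class PhaseSketch {

    // Hypothetical helper: runs `nodes` threads through one barrier-synchronized
    // phase and returns the order in which events were logged.
    static List<String> runKillPhase(int nodes) throws InterruptedException {
        List<String> log = new CopyOnWriteArrayList<>();

        // The barrier action stands in for "kill off a node". CyclicBarrier runs it
        // exactly once per trip, after the last thread arrives and before any
        // waiting thread is released.
        CyclicBarrier barrier = new CyclicBarrier(nodes, () -> log.add("kill"));

        Thread[] workers = new Thread[nodes];
        for (int i = 0; i < nodes; i++) {
            workers[i] = new Thread(() -> {
                try {
                    log.add("arrive");   // testing pauses here
                    barrier.await();     // block until every node has arrived
                    log.add("resume");   // testing continues after the kill
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runKillPhase(3));
    }
}
```

Every "arrive" entry lands before the single "kill" entry, and every "resume" entry lands after it, which is exactly the "suspend all testing while a node is killed" behavior described above.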
Also as part of all of this, I’ve become reacquainted with Unix. Terracotta has an impressive array of “perf” machines dedicated entirely to continuous performance testing. Some are tuned to act as client L1 nodes; other, more powerful multicore machines are tuned to act as the L2 servers. These machines can all be reserved, scheduled and used remotely. So my third week was largely spent actually running some of these distributed tests on multiple perf machines, which involved ssh and sudo.
This week, based in part on some findings from running these tests, I have finally begun (along with Alex) diving into actual production code to see what improvements can be made. In particular, we are looking into the case where cache values consist of very large strings, for example large XML documents, and we are investigating how we can improve our string compression code, or whether we can avoid unnecessarily faulting String values into an L1 node.
I love working at Terracotta. I love it! I could go on and on about other cool stuff that’s happening, but I won’t. Suffice it to say, the technologies we use and the problems we are tackling are fascinating. All of my colleagues are both very sharp and very down-to-earth - no power trips going on that I can tell. Working at home is wonderful - it’s so nice to just take a 15 minute break and walk my son to preschool, or have lunch with my family. Or, in theory, work in my boxers, not that I’ve done that, yet. And working with Alex again rocks.
Alex Miller:
Great write-up! It is really nice to have you aboard and also to have someone to hack code with in person.
13 March 2008, 8:42 pm
Scott:
Thanks Alex. Yes indeed it is nice to be hacking code together again, and boy are we doing some crazy mind-bending hacking!
14 March 2008, 9:55 pm