How to increase your Ycombinator Karma without really trying

A New Blog

I’ve recently found myself not blogging because I didn’t wish to inflict the posts on Planet Atlassian. So now anything non-technical which I don’t think is of interest to subscribers to this blog (Hi to both of you!) appears here.

ICFP2008 Programming Contest

Along with some colleagues I entered the ICFP programming contest this year.

You can read some more about our experience at Matt’s blog.

I thought the problem — guiding a Mars rover to a goal amongst obstacles — was very well chosen. It was possible to navigate the example maps with very simple software, so everyone could get the satisfaction of seeing their rover succeed. If you had the time and skills, there were a vast number of directions to go in to improve your rover’s performance. This compares favourably with last year’s problem. That problem was very clever, and I had a lot of fun playing with it, but I never got close to having a worthwhile submission because the threshold for even partial success was quite high. So kudos to the ICFP2008 team for creating a problem which has a low barrier to entry while still having the ‘dynamic range’ needed to challenge the top teams.

I’ll definitely participate next year. Among the lessons we learnt:

  • Have all the team on the same premises, especially for the first few hours. When you are writing software this quickly you need to all be on the same page all the time.
  • For developing geometry based heuristics visualisation is very important. You need to be able to see what your code is trying to do — where it is trying to go, what it thinks is blocking it, when it starts to turn and so on.

We wrote our entry in Java, but over the next few weeks I plan to produce something just as sophisticated in CAL. It is a functional programming contest after all :-)

Take Two

I’m much happier with the second version:

out3 a =
    let
        renderGroup g =
            let 
                char = head g;
                count = length g;
                renderChars :: Int -> String;
                renderChars n =
                    (fromChar char) ++
                    (if n >1 then  " " ++ (renderChars $ n-1) else "");
            in
                if count > 2 then
                    (renderChars 2) ++ 
                    " <span>" ++ (renderChars (count - 2)) ++ "</span>"
                else
                    renderChars count;
        concatWithSep list = (head list) ++ concatMap (\s -> " " ++ s) (tail list);
    in
        concatWithSep $ map renderGroup (group a);

This only uses one ‘exotic’ list function, group:

group :: Eq a => [a] -> [[a]]
    Splits the specified list into a list of lists of equal, adjacent elements.

My program is longer than Matt’s — roughly twice as long — but I think this version is easier to understand.

Matt’s Javascript for loop has some complicated state, contained in some loop variables which are modified in the body of the loop as well as in the for statement itself. I think that’s a recipe for confusion.

Dustin’s Programming Problem Take One

My colleague Matt Ryall wrote about this simple algorithm for marking up a series of letters — which is complex enough to be interesting.

I wrote a version in CAL:

arr = ['a', 'b', 'c', 'c', 'd', 'e', 'e', 'e', 'e', 'e',
          'f', 'e', 'f', 'e', 'f', 'a', 'a', 'a', 'f', 'f', 'f'];

data State = State string :: String current :: (Maybe Char) count :: Int;

out =
    let
        initial = State "" Nothing 0;
        f :: State -> Char -> State;
        f state char =
            let
                State s current count = state;
                charStr = fromChar char;
                appendSameChar = 
                    if count == 2 then
                        s ++ " <span>" ++ charStr
                    else
                        s ++ " " ++ charStr;
                appendDiffChar =
                    if count > 2 then
                        s ++ "</span> " ++ charStr
                    else
                        s ++ " " ++ charStr;
            in
                case current of
                    Nothing -> State (fromChar char) (Just char) (1 :: Int);
                    Just c ->
                        if c == char then
                            State appendSameChar (Just char) (count+1)
                        else
                            State appendDiffChar (Just char) 1;
            ;
        finalState = foldLeftStrict f initial arr;
    in
        finalState.State.string ++(if finalState.State.count > 2 then
            "</span>"
        else
            "");

Positive proof that you can write verbose, confusing code in any language — but at least this code had no bugs — once it compiled it worked correctly first time.

As a follow up to this I will try to do better!

Project Euler for non-mathematicians?

This site has potential to be quite interesting, depending on the quality of the instructions.

It may also self select an interesting group of people — I wonder what a version of Slashdot which only those people knew about would be like?

Crowd Authentication for Google’s AppEngine

The launch of Google’s AppEngine has given me the obligation to find out what it’s all about, and the opportunity to learn a bit more about one of our products, Crowd — and, of course, pick up some Python along the way.

I began by working through Google’s Guestbook example, and then replaced its use of Google’s Users API with single sign on via Crowd.

Crowd Single Sign On

A Crowd client application authenticates via SOAP calls to a Crowd server. The sequence looks like this:

Crowd authentication sequence diagram

Duck Punching httplib for SOAP

I used the soaplib Python SOAP library to talk to Crowd. This library uses httplib to talk to the SOAP server, which is a problem as AppEngine only allows applications to use Google’s HTTP request mechanism.

I adapted soaplib to use Google’s fetch function by ‘Duck Punching’, also known as ‘Monkey Patching’. This globally overrides a library class with a class of your own:

import httplib

class AppEngineHTTPConnection(object):
... delegate the functions called by soaplib to Google's fetch ...

class Crowd(object):
def __init__(self, path, applicationName, applicationPassword):
    # do some monkeypatching
    httplib.HTTPConnection = AppEngineHTTPConnection

Simple, and safe in this case, as we want any client of httplib to use our replacement.

By default, soaplib puts an empty namespace on strings and arrays, so I cut, pasted and renamed these classes, changing them to explicitly set the correct namespace. A Pythonista could probably duck-punch their way out of that with less duplication

My modified Guestbook application adds login and logout URLs which authenticate with Crowd and create the appropriate cookie.

Caveats and Conclusions

AppEngine doesn’t support SSL, so your username and password are transmitted unencrypted to the application. To avoid this you could write a small authentication application to be hosted with your Crowd server under SSL.

Your Crowd server and your Google AppEngine application need to be on the same domain, in order to share the Crowd SSO cookie. This should be possible by assigning your domain to a Google Apps account, but I haven’t managed to add my Google Apps domain to my Google AppEngine application yet.

Python

I strongly dislike weakly typed languages such as Python. IDEs can’t sensibly provide code completion, and I found that I made many errors which were only caught at runtime. Most of these were in code which could easily have been type checked at compile time.

The sooner type inference is added to Python or AppEngine supports JVM languages, the happier I’ll be. I expect the latter to happen first.

Code

The code for this AppEngine app can be found in my crowd-appengine Git repository. As usual, get a snapshot if you don’t have a git client installed.

A CAL webapp with persistent data using GWT, STM and BDB

aka, attack of the TLAs.

This webapp’s architecture is depicted below:

Webapp Architecture

Any data structure can be built on top of TVars — and each TVar is a mutable reference, these are not functional data structures.

In this application a simple hashmap is used. Skiplists and relaxed balance btrees are other data structures which might allow reasonable concurrency too, while also providing features like in-order traversal.

This illustrates the relationship between the CAL objects in the CAL ExecutionContext and key-value pairs in the BDB:

cal-gwt-stm-persistence.png

The root of the persistent data structure is a TVar with a ‘well-known’ id — 1 in the example, which is created by a constant applicative form function. This TVar will retrieve its value from the BDB when it is created, or if no value exists for its id, it will be initialised with a default value, which in the case of a hashmap is an array of TVars, each containing an empty list of key-value pairs. A value stored in a TVar is persisted to the BDB by serialising it using CAL’s default output function and Xstream, which can serialize and deserialize instances which are not serializable and do not have accessible constructors.

TVars themselves have transient values, so only the id is persisted — the value is lazily loaded when required, using the id. So even though a TVar may persist a complicated tree of CAL algebraic values, this stops at the first TVar. (The root TVar is never persisted itself — only its value is stored).

You can get a snapshot here — just unpack it and run ant run, then point your browser at http://localhost:8080/caltest.html. Or browse the source.

The ant build script includes a target run-tests which runs some Selenium tests. Stop the server before running that target.

Note that the source code includes various bits of half-baked rubbish, in addition to that described above!

GWT as a CAL client

I’ve been interested in GWT as a way of building rich Internet applications since it appeared, and I’m very pleased to see it getting better and better.

So it’s natural that I’d want to try using it with CAL, a functional language quite similar to Haskell which runs on the JVM.

I used a similar approach to marshalling Javabeans to CAL algebraic types as I used before, but this time I haven’t used any bytecode manipulation — as the Java classes are needed at compile time for the GWT client there isn’t any point in generating them at runtime (although generating them as a separate build step might be useful). I’ve also extended the previous work to include mapping a Java 5 enum to a CAL algebraic type which has constructors with zero parameters.

So in our GWT client we can write:

CaltestServiceAsync service = GWT.create(CaltestService.class);
((ServiceDefTarget) service).setServiceEntryPoint(
    GWT.getModuleBaseURL() + "CaltestService");
service.processPerson(
    new Person(Salutation.MR, "Jim", "Earl", "Jones"), new MyAsyncCallback(...));

and call the CAL function:

public processPerson p =
    let Person s f m l = p; 
    in Person s (toUpperCase f) (lift toUpperCase m) (toUpperCase l);

where the types are:

data Salutation = MR | MRS deriving Inputable, Outputable;
data Person =
    Person salutation :: Salutation 
              firstName :: String 
              middleName :: (Maybe String) 
              lastName :: String 
    deriving Inputable, Outputable;

All this is done via three different annotations, and a special subclass of the GWT RemoteServiceServlet.

The first annotation, @Cal is applied to the GWT service interface, and indicates the CAL module to map the functions on the interface to:

@Cal(workspace = "myworkspace.cws", module = "TDavies.GwtTest")
public interface CaltestService extends RemoteService {
    @Cal
    Person processPerson(Person p);
...

The CAL types Person and Salutation need to be mapped to Java classes:

Person is a simple Javabean with getters and setters for each attribute:

@CalBean(workspace = "myworkspace.cws", module = "TDavies.GwtTest",
    constructorName = "Person")
public class Person implements IsSerializable {
    private Salutation salutation;
    private String firstName;
    private String lastName;
    private String middleName;
...

Note that middleName has the type Maybe String in the CAL type. A value of null maps to Nothing while a value of "x" maps to Just "x".

Salutation is an enum:

@CalEnum(workspace = "myworkspace.cws", module = "TDavies.GwtTest",
    type = "Salutation")
public enum Salutation {
    MR, MRS
}

The names of the enum’s values must be identical to the names of the CAL constructors.

A subclass of RemoteServiceServlet checks for the annotations and transforms the values in both directions.

The source code for this experiment is available via anonymous svn from http://tgdavies.beanstalkapp.com/eddy/browse/trunk/cal. Please note that this repository contains various other half-baked and half-finished experiments! Look at the build.xml file to see how to set up an environment — you’ll need to supply OpenQuark, GWT and Jetty.

In my next post I’ll describe how to persist information on the server.

Fuel Additive Fun

Fuel additives seem to be the most popular science-based scam outside of alternative medicine.

I’m not sure whether the promoters primarily make their money from investors or customers, but it seems that there’s enough money in it to keep a steady stream of ‘inventions’ coming.

The press (or ‘mainstream media’ as us bloggers call it) loves the ‘Aussie innovation’ angle, so I’m pleased that the Sydney Morning Herald has got a grip and has published a series of many critical articles.

Firepower have now threatened to sue blogger Daniel Rutter, a step which I think they will find counterproductive in the age of Google.

The Herald has treated another magic fuel additive completely differently — again, see Dan for the details. An initial uncritical puff-piece, followed by a mysterious two-stage withdrawal of the article, with no real description of the reasons for doing so.