“What’s faster – A supercomputer or EC2?” – Great insight by Ian Foster

I enjoyed reading Ian’s analysis of how to interpret “faster” when comparing infrastructure-as-a-service with a supercomputer.

Snippet:

For example, let’s say we want to run the LU benchmark, which (based on the numbers in Ed’s paper) when run on 32 processors takes ~25 secs on the supercomputer and ~100 secs on EC2. Now let’s add in queue and startup time:

  • On EC2, I am told that it may take ~5 minutes to start 32 nodes (depending on image size), so with high probability we will finish the LU benchmark within 100 + 300 = 400 secs.
  • On the supercomputer, we can use Rich Wolski’s QBETS queue time estimation service to get a bound on the queue time. When I tried this in June, QBETS told me that if I wanted 32 nodes for 20 seconds, the probability of me getting those nodes within 400 secs was only 34%, which is not good odds.

So, based on the QBETS predictions, if I had to put money on which system my application would finish first, I would have to go for EC2.
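
The arithmetic behind that bet can be written out as a minimal sketch. The figures are the ones quoted in the snippet. The function and constant names are my own, hypothetical ones, and the code does not call QBETS or EC2. It also treats the ~5 minute EC2 startup as a fixed number, which is a simplification of “with high probability”.

```python
# Figures quoted in the snippet above (seconds, except the probability).
EC2_RUN_SECS = 100         # LU benchmark on 32 EC2 processors
EC2_STARTUP_SECS = 300     # ~5 minutes to start 32 nodes
SUPER_RUN_SECS = 25        # LU benchmark on 32 supercomputer processors
P_NODES_WITHIN_400 = 0.34  # QBETS: chance of getting 32 nodes within 400 secs


def turnaround(wait_secs: int, run_secs: int) -> int:
    """Time to solution: waiting for the nodes plus running on them."""
    return wait_secs + run_secs


def queue_budget(deadline_secs: int, run_secs: int) -> int:
    """Longest queue wait that still lets a job finish by the deadline."""
    return deadline_secs - run_secs


# EC2 finishes after startup plus run time.
ec2_done = turnaround(EC2_STARTUP_SECS, EC2_RUN_SECS)

# To match EC2, the supercomputer job must leave the queue this early.
super_budget = queue_budget(ec2_done, SUPER_RUN_SECS)

print(f"EC2 finishes in about {ec2_done} secs")
print(f"Supercomputer must start within {super_budget} secs to match it")
print(f"Chance of that: at most {P_NODES_WITHIN_400:.0%}")
```

The last line says “at most” because the supercomputer has to get its nodes within 375 secs, not 400, to finish alongside EC2. The chance of a queue wait of 375 secs or less cannot exceed the chance of a wait of 400 secs or less, so the quoted 34% is an upper bound on the supercomputer finishing first.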