“What’s faster – A supercomputer or EC2?” – Great insight by Ian Foster

I enjoyed reading Ian’s analysis of how to interpret “faster” when comparing infrastructure-as-a-service with a supercomputer.

Snippet:

For example, let’s say we want to run the LU benchmark, which (based on the numbers in Ed’s paper) when run on 32 processors takes ~25 secs on the supercomputer and ~100 secs on EC2. Now let’s add in queue and startup time:

  • On EC2, I am told that it may take ~5 minutes to start 32 nodes (depending on image size), so with high probability we will finish the LU benchmark within 100 + 300 = 400 secs.
  • On the supercomputer, we can use Rich Wolski’s QBETS queue time estimation service to get a bound on the queue time. When I tried this in June, QBETS told me that if I wanted 32 nodes for 20 seconds, the probability of me getting those nodes within 400 secs was only 34%, not good odds.

So, based on the QBETS predictions, if I had to put money on which system my application would finish first, I would have to go for EC2.
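To make the arithmetic in the snippet explicit, here is a minimal sketch in Python. It uses only the figures quoted above; the function and variable names are mine, not Ian's, and it assumes the ~5 minute EC2 startup can be treated as a fixed delay. It shows how much queue wait the supercomputer could absorb and still match EC2.

```python
def time_to_solution(run_secs, wait_secs):
    """Total wall-clock time: waiting (boot or queue) plus the run itself."""
    return wait_secs + run_secs


# Figures quoted in the snippet (names are illustrative only).
EC2_RUN_SECS = 100              # LU benchmark on 32 EC2 processors
EC2_STARTUP_SECS = 300          # ~5 minutes to start 32 nodes (assumed fixed)
HPC_RUN_SECS = 25               # LU benchmark on 32 supercomputer processors
HPC_P_NODES_WITHIN_400 = 0.34   # QBETS estimate quoted in the snippet

ec2_total = time_to_solution(EC2_RUN_SECS, EC2_STARTUP_SECS)

# Longest queue wait the supercomputer can take and still finish by the
# time EC2 does.
hpc_queue_budget = ec2_total - HPC_RUN_SECS

print(ec2_total)         # 400
print(hpc_queue_budget)  # 375
```

The supercomputer only ties or beats EC2 if its nodes arrive within 375 secs. Since the quoted QBETS estimate gives a 34% chance of getting them within 400 secs, the chance of getting them within 375 secs can be no higher than that, which is why the bet goes to EC2 even though the raw run is four times slower there.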
