This is 3 times faster than the previously known record.
For those who don’t know, Jim Gray et al. established a series of tests, including the 1TB sort, to give database vendors a playground for honest comparisons. The results are maintained online. Here are the two related papers:
- “A Measure of Transaction Processing Power”. Note how it’s signed as “anon et al”. If I remember correctly, Jim didn’t want the paper to be associated with a particular vendor. I could be wrong.
- “A Measure of Transaction Processing 20 Years Later”.
Google managed to sort 1TB in 68 seconds using their MapReduce infrastructure on 1,000 machines. Then, they attempted to sort 1PB of data on 4,000 machines. It’s interesting that at the 1PB scale, hard disk failure rates start to get in the way of the sort.
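To put the 1TB number in perspective, here is a quick back-of-the-envelope calculation of the implied throughput. It is only a sketch: it assumes a decimal terabyte (10^12 bytes) and that the data is spread evenly across the 1,000 machines, neither of which is stated in the announcement. The function name `sort_throughput` is made up for this example.

```python
def sort_throughput(total_bytes, seconds, machines):
    """Return (aggregate, per-machine) throughput in bytes per second."""
    aggregate = total_bytes / seconds
    # Assumption: the work is divided evenly across all machines.
    return aggregate, aggregate / machines


TB = 10**12  # assumption: decimal terabyte, not 2**40 bytes

aggregate, per_machine = sort_throughput(1 * TB, 68, 1000)
print(f"aggregate:   {aggregate / 1e9:.1f} GB/s")    # 14.7 GB/s
print(f"per machine: {per_machine / 1e6:.1f} MB/s")  # 14.7 MB/s
```

Under those assumptions each machine only has to push roughly 15 MB/s of input through the sort, so the impressive part is the coordination across 1,000 machines rather than the speed of any single one.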
Interesting stuff. I am looking forward to the paper.