Amazon S3

Very interesting service. I am curious to see how it’s going to do commercially, especially given the free availability of GoogleBase (although mechanisms for programmatic interaction with the GoogleBase service have not been published yet, at least not ones that I am aware of).

It’s interesting to note the service’s design. They provide both HTTP-based (REST is an architectural pattern, not a technology) and SOAP-based access. In the SOAP case, they’ve decided not to couple the identity of the artifacts one can manage with their address. There is a single service endpoint and the identity of the artifacts is considered message payload. The HTTP-based version of the mechanisms for programmatic interaction (please note how I am avoiding the term ‘API’) does have that coupling, but they have to encode semantics in the structure of the URI (i.e. http://<service-address>/<bucket>/<object>). Also, they use PUT to create a new bucket. Hmmm! My limited understanding of REST suggests that their approach is not pure :-) I could be wrong.
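To make the contrast concrete, here is a minimal sketch of how the two request shapes might be built. It only constructs requests and sends nothing. The service address, the `/soap` path, and the `GetObject`, `Bucket` and `Key` element names are my own illustrative assumptions, not taken from the service's documentation.

```python
from dataclasses import dataclass
from urllib.parse import quote
from xml.etree import ElementTree as ET

SERVICE = "http://service.example"  # hypothetical service address
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace


@dataclass
class Request:
    """A plain description of an HTTP request; nothing is sent."""
    method: str
    url: str
    body: bytes = b""


def http_get_object(bucket: str, key: str) -> Request:
    # HTTP style: the identity of the artifact is encoded in the URI itself.
    return Request("GET", f"{SERVICE}/{quote(bucket, safe='')}/{quote(key)}")


def http_create_bucket(bucket: str) -> Request:
    # HTTP style: PUT against the bucket's URI creates the bucket.
    return Request("PUT", f"{SERVICE}/{quote(bucket, safe='')}")


def soap_get_object(bucket: str, key: str) -> Request:
    # SOAP style: one endpoint for everything; identity travels as payload.
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    operation = ET.SubElement(body, "GetObject")  # hypothetical message name
    ET.SubElement(operation, "Bucket").text = bucket
    ET.SubElement(operation, "Key").text = key
    return Request("POST", f"{SERVICE}/soap", ET.tostring(envelope))
```

In the HTTP case the address changes with every artifact. In the SOAP case the address stays fixed and only the message body changes.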

Back to Web Services… I guess if WS-Transfer or WS-RF were widely accepted standards/specifications, they could have modeled the interactions using those specifications given the resource-oriented nature of the S3 service. It’s interesting, however, how they can achieve the same functionality without any difficulty or compromise in the design. Also, if there were better tooling support for WS-Security (with the exception of Indigo of course :-) they could have used that instead of HTTPS. But I guess they are trying to remain compliant with the Basic Profile 1.0 or 1.1. It’d be great to see MTOM support there as well.

So, application domain-specific SOAP Get/Delete messages? A single service endpoint rather than per-resource endpoints? A very simple set of messages; nothing complicated; resource identity as payload. Hmmm… I wonder why no one has talked before about such a design approach within the context of the Grid community. Oh wait a minute, perhaps someone has :-)
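As a rough sketch of what "single endpoint, resource identity as payload" could look like on the service side, the dispatcher below routes on the message name and reads the identity out of the message body. Everything here is hypothetical: the `Get`/`Delete` message names, the `Bucket`/`Key` elements, and the in-memory dictionary standing in for the store. The SOAP envelope is left out to keep it short.

```python
from xml.etree import ElementTree as ET


def identity(message):
    # The resource identity is ordinary payload, not part of any address.
    return message.findtext("Bucket"), message.findtext("Key")


def handle_get(store, message):
    # Returns the stored bytes, or None if there is no such object.
    return store.get(identity(message))


def handle_delete(store, message):
    # Returns True if an object was removed, False otherwise.
    return store.pop(identity(message), None) is not None


# One handler per application-domain-specific message.
HANDLERS = {"Get": handle_get, "Delete": handle_delete}


def single_endpoint(store, xml_text):
    """Every message arrives here, whichever resource it concerns."""
    message = ET.fromstring(xml_text)
    handler = HANDLERS.get(message.tag)
    if handler is None:
        raise ValueError(f"unsupported message: {message.tag}")
    return handler(store, message)
```

For example, with `store = {("photos", "cat.jpg"): b"meow"}`, sending `<Get><Bucket>photos</Bucket><Key>cat.jpg</Key></Get>` to `single_endpoint` returns the stored bytes, and the same identity inside a `Delete` message removes the object. No per-resource endpoint is ever created.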

I think that the Grid community should have a look at the design. It shows that you can build interesting services for global-scale computing without having to build new infrastructure first, without wasting years making sure that everyone agrees with a particular way of doing things, and without complicating things. And notice how there is no support for lifetime management, renewable references/addresses, etc. I wonder why. Could it be because this is a very difficult problem to solve at global scale?

I am not suggesting we shouldn’t be trying to standardise on common messaging behaviours. If a specific set of infrastructure specifications is in place on how to do resource-orientation, then by all means we should use it (although not for everything). But we shouldn’t try to build on technologies until everyone has agreed on them. Now why hasn’t anyone said that before? Erm… perhaps these?… “WS-GAF: a framework for building Grid applications using Web Services”, “Web Service Grids: an evolutionary approach”, and “An evolutionary approach to realizing the Grid computing vision” :-)