Grid Evolution vs Grid Intelligent Design

(Disclaimer: As with everything that is posted on this blog, these are my personal opinions and not those of my employer or my direct manager 🙂)

As a result of the recent post on Web Fundamentalism by Ian Foster, with whom I had a very interesting conversation over breakfast at SC06, I have been considering whether my thoughts on the subject would make an interesting blog entry. You see, I identify myself as one of those fundamentalists who advocated in favor of stateless services/interactions a few years ago.

It's been a while now since I decided to stay away from the often religious arguments on distributed objects, stateful vs stateless interactions, REST vs service-orientation, SOAP, HTTP, Web Services... the list goes on. I am more interested in building good infrastructure, new functionality, fascinating applications. I am drawn by the exploration of new ideas rather than arguments on old ones. However, for old times' sake, a few words may be fun to write 🙂

Back in 2003, we (the Newcastle WS-GAF team) started talking about how Web Services technologies could be used in building service-oriented applications [1]. The industry and the Web were adopting practices related to building distributed applications that scaled to support thousands of computers or thousands/millions of interactions. The world of computing was looking at the lessons from decades of distributed system deployments in the real world. We had a good understanding of tight- vs loose-coupling no matter which distributed technology infrastructure was used.

The WS-GAF folks were merely highlighting the successful practices of the times. The distributed-object approach to building large-scale distributed applications was losing traction. Leaking abstractions, lifetime management across administrative boundaries, and pinning state plus arbitrary behavior to endpoints were becoming things of the past. We never suggested that state is absent from our distributed systems. Instead, we offered technical arguments and examples on how it could be managed differently. Yes, we did say that interactions were stateless. Remember WS-Context? Remember the use of globally unique identifiers (URIs) that were not tied to a particular communication infrastructure? All we were saying about state was that it shouldn't be treated any differently in the Grid domain from what the industry had adopted as common practice.

In business applications we usually include abstractions such as 'order numbers', 'credit card numbers', 'customer identifiers', etc. Our interactions don't include infrastructure-specific constructs such as "service instance handlers" or "endpoint references" to orders, credit cards, customers. The state identifiers become part of the semantics of our business interactions rather than the infrastructure. Any state-related functionality is the responsibility of the business tier and not the underlying infrastructure (e.g. "no such customer" or "your order has expired" are messages conveying business semantics). Introducing state management conventions into the infrastructure may help in particular domains, like systems management, but it does not scale to Internet-scale applications or to general-purpose and business-to-business scenarios, which were the spaces we were interested in.
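To make the distinction concrete, here is a minimal Python sketch of a stateless interaction in which the state identifiers travel as ordinary business data inside the message. Everything in it is hypothetical and only illustrative: the `Order` type, the `ORDERS` and `CUSTOMERS` in-memory stores, the `handle_order_status` function and the message field names are assumptions, not part of WS-GAF or of any real service.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Order:
    order_number: str   # business identifier, meaningful to both parties
    customer_id: str    # another business identifier, not an endpoint reference
    expires_at: datetime


# Hypothetical business-tier stores; a real service would use a database.
ORDERS = {}
CUSTOMERS = {"C-1001"}


def handle_order_status(message, now):
    """Stateless handler: each message carries every identifier it needs.

    No session or service-instance handle is kept between calls. Failures
    are reported as business-level messages, not infrastructure faults.
    """
    customer_id = message["customer_id"]
    order_number = message["order_number"]

    if customer_id not in CUSTOMERS:
        return {"status": "error", "reason": "no such customer"}

    order = ORDERS.get(order_number)
    if order is None or order.customer_id != customer_id:
        return {"status": "error", "reason": "no such order"}

    if now >= order.expires_at:
        return {"status": "error", "reason": "your order has expired"}

    return {"status": "ok", "order_number": order.order_number}


if __name__ == "__main__":
    now = datetime(2006, 12, 1, tzinfo=timezone.utc)
    ORDERS["ORD-42"] = Order("ORD-42", "C-1001", now + timedelta(days=1))
    request = {"customer_id": "C-1001", "order_number": "ORD-42"}
    print(handle_order_status(request, now))
```

The handler could run on any machine behind the service boundary, because nothing about the order is bound to a particular endpoint or object instance; the order number alone identifies the state, and what it means is decided by the business tier.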

Back then, we tried hard to demonstrate the ideas with examples, to support our views with technical arguments. We brought as examples the way Amazon and Google had built their Web services, Jim Gray's astronomy-related services, our own attempt with the "Searching for White Dwarfs" application, etc. I found out the hard way how such technical arguments/approaches could be seen as religious, as "community-dividing". At the end of the day, there are different approaches and we can all agree to just use the best tool for the job. There is no universal truth.

But let's leave the technical argument aside. It was unfortunate that back then everyone had focused on the details of the infrastructure. The WS-GAF team was also trying hard to highlight another, very important point related to the process the Grid community was following in defining its architecture/infrastructure. You see, we never believed in "Intelligent Design". (I remember Jim's CAT-5 cable Darwin "ichthys" symbol hanging over me in our office for 4 years 🙂)

Wikipedia defines "Intelligent Design" as

Intelligent design (ID) is the concept that "certain features of the universe and of living things are best explained by an intelligent cause, not an undirected process such as natural selection."

and "Evolution" as

evolution is change in the heritable traits of a population over successive generations, as determined by shifts in the allele frequencies of genes. Over time, this process can result in speciation, the development of new species from existing ones.

From an early stage, we realized that in order for any standards effort to be successful, in the Grid domain or not, it had to have a stable basis. Any new infrastructure/architecture had to evolve from a stable foundation. This is a topic about which we've written and talked extensively [1-4]. It's unfortunate that the discussions about the technical details always overshadowed the point about evolution, stability, industry endorsement, good platform support, user education. It's really unfortunate that OGSI became a "standard" only to be replaced by WSRF (nice Wikipedia overview), which also seems to be on its way to being replaced by WS-ResourceTransfer.

In the meantime, the Grid community has not made the most out of all these years of XML over HTTP or simple SOAP infrastructure support. We could have had a suite of standards for Grid domain-specific services at the disposal of scientists but instead we spent our time and effort in "intelligently" designing infrastructure. Vast resources could have been spent on building solutions and services rather than chasing unstable infrastructure and incomplete specifications.

All is not lost though. There are great examples of Grid/Internet-scale applications out there. Scientific/Technical Computing over the Internet is really gaining momentum and a lot of industry attention. It's a sign of the times that even Microsoft has a Technical Computing group now, led by none other than Mr. (is it Dr. or Prof. here?) "e-Science" himself, Tony Hey (I have to mention my manager's name... I know he's going to read this 🙂)

Perhaps it's time for a "Grid 2.0" that concentrates on the services and their interesting compositions rather than the technology and infrastructure 🙂 I know I am not original in using the term "Grid 2.0" but Tony and I have some interesting ideas in this space. Stay tuned.

It's all part of the fun!

 

[1-4] Papers on Grid evolution

  1. Savas Parastatidis, Jim Webber, Paul Watson, Thomas Rischbeck, WS-GAF: A Framework for Building Grid Applications Using Web Services, Journal of Concurrency and Computation: Practice and Experience, 17(2-4), p391-417, 2005
  2. Malcolm Atkinson, David DeRoure, Alistair Dunlop, Geoffrey Fox, Peter Henderson, Tony Hey, Norman Paton, Steven Newhouse, Savas Parastatidis, Anne Trefethen, Paul Watson, Jim Webber, Web Service Grids: An Evolutionary Approach, Journal of Concurrency and Computation: Practice and Experience, 17(2-4), p377-389, 2005
  3. Savas Parastatidis, Jim Webber, Assessing the Risk and Value of Adopting Emerging and Unstable Web Services Specifications, Proceedings of the 2004 IEEE International Conference on Services Computing (IEEE SCC'04), Shanghai, China, p65-72, 2004
  4. Marvin Theimer, Savas Parastatidis, Tony Hey, Marty Humphrey, Geoffrey Fox, An Evolutionary Approach to Realizing the Grid Vision, whitepaper to the Grid community, 2006

3 responses to “Grid Evolution vs Grid Intelligent Design”

  1. “Great examples of grid/internet scale applications out there” …

    In the context of this blog article, I can’t help wondering which examples you considered, and what architecture they used – whether stable, stateless, or not 🙂

  2. How about the Web, email, Amazon’s web services (EC2, S3, etc.), Google’s infrastructure, myGrid, SkyServer, etc, etc.

    They are all services that deal with state and data in one form or the other.

  3. No doubting these are great web/internet applications, but in what way are they grid infrastructures (using the Foster three-point definition of a grid)?

    (I’m not trying to be difficult, I’m trying to understand the point about what one could have been doing with the architectures available, had one not been trying to do ID).