Dataflow and Manycore architectures

I have been gaining a lot of program management experience managing my team’s activities in the Many/Multicore area. I am even organizing a workshop in this area in June (more in a week or so). We have a very close interaction with Intel, which I enjoy tremendously (I will talk about the outcome of this activity sometime around July). Not only do I get to experience first-hand the collaboration on key technologies between two of the largest companies in our discipline, but I also get to meet and interact with really clever people. I feel really honored to regularly exchange messages with Microsoft folks like Burton Smith, Jim Larus, and Michael Fortin. They have been leaders in this field for so many years! I get to participate in teleconferences with Andrew Chien and Intel folks like Wei Li, Jesse Fang, Jim Held, and Geoff Lowney. All huge names in our industry! I truly feel honored and humbled. BTW… they are all going to be at the workshop I mentioned earlier (plus many more big names in the industry and research communities).

I also get to interact on a regular basis with the people leading the research projects we fund at various universities, like CS @ University of Tennessee (Jack Dongarra and his team), CS @ Indiana University (Dennis Gannon and Geoffrey Fox and their teams), Barcelona Supercomputing Center (Mateo Valero and his team), and CS @ Rice University (John Mellor-Crummey and the team over there). It’s just fantastic.

So, I’ve been doing a lot of learning and thinking around the space of Manycore computing. I feel at home, given that High-Performance, Parallel Computing was the area of my PhD research. However, when it comes to hardware, I must admit that even though there is an opportunity to learn a lot, I don’t find it particularly fascinating. When it comes to programming models, however, I can read and chat for hours.

A regular topic of discussion between Tony Hey, Geoffrey Fox, and me is the past success and the future relevance of functional/dataflow programming. I am very much a supporter of the dataflow model for computation, while the “pragmatists” (having come from a physics background 🙂) keep telling me that the “pure” model will fail again as it has in the past (I must admit, they are not saying it in such strong words :-). Of course, the functional programming models didn’t succeed in demonstrating that large software projects could succeed or that performance could match the explicitly-managed parallel applications on massively parallel systems.

I believe, as is the case in so many areas, that the right tool should be used for the right job. However, if we are to succeed in making parallelism mainstream, and we have to, given the manycore architectures that will become available very fast, we need to provide tooling that implicitly manages parallelism instead of directly exposing it to the programmers. I totally agree that there are going to be cases where high-performance libraries with implementations of parallel algorithms/patterns will be programmed explicitly and made available for imperative programming.
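
To make the idea of implicitly managed parallelism a bit more concrete, here is a minimal sketch in Python, assuming only the standard `concurrent.futures` module. The names `square` and `sum_of_squares` are hypothetical and chosen just for illustration. The programmer writes a side-effect-free function and a map over the data; the executor decides how the calls are scheduled, and no threads or locks appear in the user code.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import add


def square(x):
    # Pure function (hypothetical example): no shared state is touched,
    # so the calls are independent and can run in any order.
    return x * x


def sum_of_squares(values, workers=4):
    # The executor decides how the calls to square() are scheduled.
    # The caller never creates a thread or takes a lock.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        squared = pool.map(square, values)  # results come back in input order
        return reduce(add, squared, 0)      # combine step, run sequentially here


print(sum_of_squares(range(10)))  # prints 285
```

This only illustrates the programming model. In standard CPython, threads will not speed up CPU-bound work like this because of the global interpreter lock, and for such tiny work items the scheduling overhead would dominate in any case.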

I am glad to see that the industry and the research community are seriously looking into this space again. The article on StreamIt reminded me of this.

Interesting times to be into parallelism 🙂

4 responses to “Dataflow and Manycore architectures”

  1. Yes, I think that functional languages are indeed an interesting space when it comes to multi-core processors. I think it’s why I’m seeing a lot of renewed (new?) interest in languages like Haskell and Erlang. Share as little as possible, and you’re good.

  2. Sava,

    on “Of course, the functional programming models didn’t succeed in demonstrating that large software projects could succeed or that performance could match the explicitly-managed parallel applications on massively parallel systems.”:

    I think Google’s MapReduce (as an example) is proof of the opposite.

    best regards, Stelios

  3. Funny you should say that, Stelios, but that’s exactly the example I am using in these discussions for a modern application of dataflow ideas 🙂 However, we didn’t have such a demonstration a few decades ago. I absolutely agree with you.

  4. Alexander

    Savas,

    I think pure dataflow is the wrong way, simply because only coarse grains can pay back the overhead of core synchronization.

    Best wishes,

    Alexander.