savas parastatidis

ReactGraph Part 5 – A user’s feed via “pull” and “push” graph queries

2016-05-11

Graphs, Programming, ReactGraph, Reactive Computing

An investigation into bringing together graph and reactive computing. This fifth post in the series showcases slightly more complex LINQ-to-graph and Rx-style queries for both “pull” and “push” computation.

Disclaimer

Remember that the goal of the investigation is to think about the abstractions. We don’t (yet) deal with implementation issues such as scale, reliability, hot spots, partitioning, locality of access, performance, etc. Those are very very important issues and the most difficult part of any implementation. If you are looking to work with graphs, you should really look at Neo4j.

Get the user’s feed

In part 4, we saw how we could get a user’s friends and posts. Now, let’s put these together in order to generate a user’s feed. We need to get the N most recent posts from his/her friends and only from those whom he/she follows.

var savas = await graph.GetNode("savas");

var feed = savas
  .Friends()
  .Where(friend => savas.Follows(friend))
  .SelectMany(friend => friend
    .Posts()
    .NodeValue<StatusUpdate>()
    .OrderByDescending(post => post.PublishedOn)
    .Take(5))
  .OrderByDescending(post => post.PublishedOn);

feed.ToList().ForEach(post => WriteLine($"{post.Text} - by {post.PostedBy}"));

//--- Output
// Status update 4 - by Jim Webber
// Status update 3 - by Erik Meijer
// ...

The above should be easy to read. From each of Savas’ friends whom he follows, we get the 5 most recent posts. The StatusUpdates from all the friends are then collected and ordered. Note the use of the NodeValue<StatusUpdate> operator, which projects the value of each INode in the IQueryable<INode> (returned by Posts()) to a StatusUpdate, producing an IQueryable<StatusUpdate>. And remember… the expression tree is submitted to the store, where the data resides, for evaluation.

A special note about post.PostedBy. As an observant reader, you will point out that PostedBy is a property of StatusUpdate. Shouldn’t we get the name of the author by following an edge (or the reverse of an edge)? Yes, that’s absolutely true, but it’s not the only way. In large graphs, it’s often necessary (for performance reasons) to cache properties or to calculate properties based on expressions against the data in the graph (think of Excel cells). We can use the reactive nature of the graph to keep such properties up-to-date. StatusUpdate.PostedBy is such a property. In a future post we will discuss these “calculated properties”, which are dynamically updated as a reaction to changes in the graph.

Let’s rewrite the above query to make use of anonymous types.

var feed = savas
  .Friends()
  .Where(friend => savas.Follows(friend))
  .SelectMany(friend => friend
    .Posts()
    .NodeValue<StatusUpdate>()
    .OrderByDescending(post => post.PublishedOn)
    .Take(5)
    .Select(post => new { PostedBy = $"{friend.FirstName} {friend.LastName}", Post = post }))
  .OrderByDescending(feedEntry => feedEntry.Post.PublishedOn);
  
feed.ToList().ForEach(feedEntry => WriteLine($"{feedEntry.Post.Text} - by {feedEntry.PostedBy}"));

Don’t you just love LINQ? 🙂

This approach doesn’t suffer from performance issues, but it does introduce an anonymous type, which might make life difficult for implementers since app domain/machine boundaries would have to be crossed. It’s difficult but not impossible. With software, everything’s possible after all 🙂 Nevertheless, we will stick with the previous version since it will give us an opportunity to talk about dynamically calculated properties.
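To make the shape of the first version of the pull query (the one without anonymous types) more concrete, here is a minimal sketch that applies the same operators with plain LINQ to Objects over in-memory data. The Person and StatusUpdate records, the BuildFeed helper, and the sample data are all assumptions made for this illustration. They are not ReactGraph types, and nothing here is submitted to a store as an expression tree.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins; these are NOT the ReactGraph types.
record StatusUpdate(string Text, string PostedBy, DateTime PublishedOn);

record Person(string Name, List<StatusUpdate> Posts);

static class PullFeedSketch
{
    // Takes the `perFriend` most recent posts of every followed friend,
    // then orders the combined result newest-first.
    public static List<StatusUpdate> BuildFeed(
        IEnumerable<Person> friends, ISet<string> follows, int perFriend) =>
        friends
            .Where(friend => follows.Contains(friend.Name))
            .SelectMany(friend => friend.Posts
                .OrderByDescending(post => post.PublishedOn)
                .Take(perFriend))
            .OrderByDescending(post => post.PublishedOn)
            .ToList();

    public static void Main()
    {
        var day = new DateTime(2016, 5, 1);
        var friends = new List<Person>
        {
            new Person("Jim Webber", new List<StatusUpdate>
            {
                new StatusUpdate("Status update 1", "Jim Webber", day),
                new StatusUpdate("Status update 4", "Jim Webber", day.AddDays(3)),
            }),
            new Person("Erik Meijer", new List<StatusUpdate>
            {
                new StatusUpdate("Status update 3", "Erik Meijer", day.AddDays(2)),
            }),
            new Person("Someone Else", new List<StatusUpdate>
            {
                new StatusUpdate("Not followed", "Someone Else", day.AddDays(4)),
            }),
        };

        // Only friends in this set contribute to the feed.
        var follows = new HashSet<string> { "Jim Webber", "Erik Meijer" };

        foreach (var post in BuildFeed(friends, follows, perFriend: 5))
            Console.WriteLine($"{post.Text} - by {post.PostedBy}");
    }
}
```

The inner OrderByDescending/Take pair limits each friend's contribution, while the outer OrderByDescending interleaves the posts of all friends by date.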

A reactive feed

Ok… Now we know how to pull the feed from the ReactGraph platform. If we were to build an application that renders the feed on a screen, we would have to query the store all the time to get the latest feed entries. We would also have to implement some cursor-like functionality so that we get only the posts since the last time we queried. But this may introduce additional overhead to our store or may require us to build extra layers of caching, further complicating our implementation.

There is a better way. As we saw in part 4, it is possible to express our interest in data in the ReactGraph store in terms of a LINQ/Rx expression. Relevant changes to the underlying data trigger an event with the data in which we are interested. Here’s a video of an end-to-end solution built on top of the ReactGraph platform that might help us visualize the type of fluent/reactive user experience we might want to build. Updates to the feed are based on subscriptions against reactive sources. (BTW… we are slowly building up towards the description of this end-to-end implementation.)

So… how do we describe our intent to receive a stream of StatusUpdate instances that represent the latest and greatest in one’s feed?

var savas = await graph.GetNode("savas");

var feed = savas
  .Friends()
  .Where(friend => savas.Follows(friend))
  .Select(friend => friend.PostStream().NodeValue<StatusUpdate>())
  .Merge();
  
feed.Subscribe(post => WriteLine($"New entry in the feed: {post.Text} - by {post.PostedBy}"));

It’s even simpler than the “pull” graph query. In fact, it starts with a “pull” query (the Friends() operator) and then projects to an IQueryable<IQbservable<StatusUpdate>>. The Merge() Rx operator gets the events from all the individual post streams and combines them into a single stream to which we subscribe. For a discussion on PostStream(), you can refer to part 4.

Once we start adding posts, they will appear in our console as per the Subscribe() above…

await AddNewPost("jim", "hello graph world");
await AddNewPost("erik", "hello reactive world");
// ... more posts

//--- Output
// New entry in the feed: hello graph world - by Jim Webber
// New entry in the feed: hello reactive world - by Erik Meijer
// ...

Awesome. All subscribers will now get notified whenever a new post is made by anyone who was one of Savas’ friends at the time the subscription was made. Please note the last part. This is the case because we didn’t express our interest in Savas’ feed as being dependent on the most recent state of his friends, only on his friends at the time of the subscription. We will address the issue in a future post.
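As a rough illustration of what Merge() does in the reactive feed query, here is a sketch written only against the System.IObservable<T> and IObserver<T> interfaces, so that it runs without the Rx library. SimpleSubject, CallbackObserver, Unsubscriber, and MergeAndSubscribe are hypothetical helpers invented for this example. The real Rx operator also deals with completion, errors, and concurrency, all of which this sketch ignores.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal subject: pushes each value to its current subscribers.
sealed class SimpleSubject<T> : IObservable<T>
{
    private readonly List<IObserver<T>> observers = new List<IObserver<T>>();

    public IDisposable Subscribe(IObserver<T> observer)
    {
        observers.Add(observer);
        return new Unsubscriber(() => observers.Remove(observer));
    }

    public void Publish(T value)
    {
        // Iterate over a copy so observers may unsubscribe while notified.
        foreach (var observer in observers.ToArray()) observer.OnNext(value);
    }
}

// Runs the given action when disposed.
sealed class Unsubscriber : IDisposable
{
    private readonly Action dispose;
    public Unsubscriber(Action dispose) => this.dispose = dispose;
    public void Dispose() => dispose();
}

// Forwards OnNext to a callback; errors and completion are ignored here.
sealed class CallbackObserver<T> : IObserver<T>
{
    private readonly Action<T> onNext;
    public CallbackObserver(Action<T> onNext) => this.onNext = onNext;
    public void OnNext(T value) => onNext(value);
    public void OnError(Exception error) { }
    public void OnCompleted() { }
}

static class MergeSketch
{
    // Subscribes to every source and funnels all values into one callback.
    public static IDisposable MergeAndSubscribe<T>(
        IEnumerable<IObservable<T>> sources, Action<T> onNext)
    {
        var subscriptions = new List<IDisposable>();
        foreach (var source in sources)
            subscriptions.Add(source.Subscribe(new CallbackObserver<T>(onNext)));
        return new Unsubscriber(() => subscriptions.ForEach(s => s.Dispose()));
    }

    public static void Main()
    {
        var jimPosts = new SimpleSubject<string>();
        var erikPosts = new SimpleSubject<string>();

        using var feed = MergeAndSubscribe(
            new IObservable<string>[] { jimPosts, erikPosts },
            post => Console.WriteLine($"New entry in the feed: {post}"));

        jimPosts.Publish("hello graph world");
        erikPosts.Publish("hello reactive world");
    }
}
```

Note that the sketch enumerates its sources exactly once, when the subscription is made. A stream added afterwards is never observed, which mirrors the “friends at the time of the subscription” limitation.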

Difference between “pull” and “push” feeds

While we have managed to express our interest in all changes to Savas’ feed, we will only observe changes after the subscription was created (i.e. Subscribe()). As we have already discussed, “pull” graph queries always consider the entire state of the graph while “push” graph queries (i.e. reactive) only surface the observations since a subscription was made. Obviously there is a gap. There might be situations where we want to get notified about changes to the graph but we also want to start with some data. How do we bridge that gap? An obvious way is to execute a pull graph query and then immediately subscribe for changes.
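The “pull, then immediately subscribe” idea can be sketched as follows. PostStore and Follow are hypothetical names used only for this illustration, with a plain in-memory list and a C# event standing in for the graph store and its post stream. In this single-threaded sketch nothing can happen between the two steps. Against a remote store, a post arriving between the pull and the subscription could be missed, which is one reason this is only the obvious way and not necessarily the best one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory store; it stands in for the graph store.
sealed class PostStore
{
    private readonly List<string> posts = new List<string>();

    // Raised after a post has been added (the "push" side).
    public event Action<string> PostAdded = delegate { };

    public void Add(string post)
    {
        posts.Add(post);
        PostAdded(post);
    }

    // The "pull" side: the most recent `count` posts, oldest first.
    public IReadOnlyList<string> Latest(int count) =>
        posts.Skip(Math.Max(0, posts.Count - count)).ToList();
}

static class PullThenPushSketch
{
    // Replays existing posts to the handler, then forwards new ones.
    public static void Follow(PostStore store, int initialCount, Action<string> onPost)
    {
        foreach (var post in store.Latest(initialCount)) onPost(post); // pull
        store.PostAdded += onPost;                                     // push
    }

    public static void Main()
    {
        var store = new PostStore();
        store.Add("status update 1");
        store.Add("status update 2");

        // The handler first sees the two existing posts...
        Follow(store, 5, post => Console.WriteLine(post));

        // ...and then every post added from now on.
        store.Add("hello graph world");
    }
}
```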

If you notice in the video above, Savas’ feed is rendered with some previous entries. However, the feed UI wasn’t implemented with a “pull” graph query. Then, how did we do that? Well, that’s the topic of a future post 🙂