GraphModel: A .NET Abstraction for Graphs

Just over a month ago, I published “Playing with graphs and Neo4j”. Back then, it was just a toy implementation, an experiment to help with what I have been doing for my startup. Well, that little experiment has grown into something much more substantial. As I started building on top of my Neo4j Graph Model implementation, I realized that I needed to pay more attention to it: make it more robust, test it extensively, document it properly, and so on. You know… treat it as production-quality middleware. It’s not quite there yet, but I feel that this new release is on the right track, hence the “1.0.0-alpha” moniker. If there is interest, I hope the community will help take it to v1.

This post explores the new features and how the middleware is layered.

Update: I started creating the Python equivalent of this library. In the process, I discovered that Neo4j supports JSON path queries via “APOC” functions and also dictionaries on nodes. This means that my complicated “complex” property logic might not really be necessary. The two approaches differ in the types of queries they can support; my approach offers full expression support. Stay tuned for updates on this topic.

What Started as a Simple Wrapper…

In my original post, I described building a basic abstraction to avoid the tedium of mapping domain models to/from Neo4j’s data structures. The core frustration was simple: “Why should I worry about serializing/deserializing objects when middleware can do that automatically?” That proof-of-concept has evolved into a comprehensive library that not only solves the original mapping problem but introduces powerful features I hadn’t even imagined back then.

What GraphModel Has Become

GraphModel is now a type-safe, LINQ-enabled abstraction layer that provides:

  • Compile-time validation through code analyzers
  • Sophisticated graph traversal with depth control and filtering
  • Support for complex properties
  • Transaction management with full ACID support
  • Attribute-based configuration for clean domain modeling
  • Automatic code generation for domain types to support performant serialization/deserialization, together with a data structure that represents in-memory object graphs for implementers of Graph Model providers
  • Advanced LINQ querying with graph-specific extensions
  • Performance evaluation suite
  • Extensive test suite

Clean Architecture from the Ground Up

The current architecture follows a clean, layered approach:

┌─────────────────────────────────┐
│          Your Application       │
├─────────────────────────────────┤
│        Graph.Model (Core)       │  ← Abstractions & LINQ
├─────────────────────────────────┤
│     Graph.Model.Neo4j           │  ← Provider Implementation  
├─────────────────────────────────┤
│         Neo4j Database          │  ← Storage Layer
└─────────────────────────────────┘

The provider architecture means you’re not locked into Neo4j – the core abstractions can be implemented for any database. In fact, I am considering implementations on top of some of the cloud databases.

Domain Modeling Made Elegant

One of the biggest improvements is how clean domain modeling has become. It also supports many more types and collections. Here’s an example of a domain data model.

public enum State { WA, CA, OR, NY, Unknown, }

public record Address
{
    public string Street { get; set; } = string.Empty;
    public City City { get; set; } = new City();
    public int ZipCode { get; set; }
    public string Country { get; set; } = string.Empty;
    public List<string> Aliases { get; set; } = [];
}

public record Person : Node
{
    public string FirstName { get; set; } = string.Empty;
    public string LastName { get; set; } = string.Empty;
    public string? Email { get; set; } = string.Empty;
    public int Age { get; set; }
    public Address? HomeAddress { get; set; } = null;
    public List<string> Skills { get; set; } = [];
    public List<DateTime> KeyDates { get; set; } = [];
    public List<int> SomeNumbers { get; set; } = [];
}

[Relationship(Label = "FRIEND_OF")]
public record Friend : Relationship
{
    public Friend() : base(string.Empty, string.Empty) { }
    public Friend(string startNodeId, string endNodeId) : base(startNodeId, endNodeId) { }
    public Friend(Person p1, Person p2, DateTime since) : base(p1.Id, p2.Id)
    {
        Since = since;
    }

    public DateTime Since { get; set; } = DateTime.MinValue;
}

public record City
{
    public string Name { get; set; } = string.Empty;
    public State State { get; set; } = State.Unknown;
    public int Population { get; set; }
    public List<string> Aliases { get; set; } = [];
}

That’s it. The middleware generates code at build time for the serialization and deserialization of these domain types. You may notice that Address and City are not primitive types. Neo4j does not support properties whose type isn’t a primitive or a collection of primitives. This was one of the most difficult features to support, and it works great now. Read on for an example of the Cypher generated for types that include such “complex” properties.
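To make the idea of “complex” properties more concrete, here is a minimal sketch of how a nested object could be flattened into a node that holds only simple values, plus separate nodes for the nested parts. This is not the library’s generated code: the Flatten helper, the dictionary-based representation, and the integer node ids are all assumptions made for illustration. Only the `__PROPERTY__<Name>__` relationship naming is taken from the Cypher shown later in this post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-ins for stored nodes and relationships.
// These are NOT GraphModel's actual data structures.
var nodes = new List<(int Id, Dictionary<string, object> Props)>();
var rels = new List<(int ParentId, string Type, int ChildId)>();

// Keeps the simple values of `obj` on a new node and turns every nested
// dictionary (a "complex" property) into its own node linked to the parent.
int Flatten(Dictionary<string, object> obj)
{
    var id = nodes.Count;
    var simple = new Dictionary<string, object>();
    nodes.Add((id, simple));

    foreach (var (name, value) in obj)
    {
        if (value is Dictionary<string, object> nested)
        {
            var childId = Flatten(nested);
            rels.Add((id, $"__PROPERTY__{name}__", childId));
        }
        else
        {
            simple[name] = value; // simple values stay on the node itself
        }
    }
    return id;
}

var person = new Dictionary<string, object>
{
    ["FirstName"] = "Savas",
    ["HomeAddress"] = new Dictionary<string, object>
    {
        ["Street"] = "456 Elm St",
        ["City"] = new Dictionary<string, object> { ["Name"] = "Seattle" },
    },
};

var rootId = Flatten(person);
Console.WriteLine($"root node: {rootId}, total nodes: {nodes.Count}");

foreach (var (parentId, relType, childId) in rels)
    Console.WriteLine($"({parentId})-[:{relType}]->({childId})");
```

In this sketch the person ends up as three nodes: one for the person, one for the address, and one for the city, connected by two property relationships.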

LINQ That Actually Understands Graphs

The biggest leap forward is the LINQ provider. This isn’t just basic LINQ-to-SQL style querying – it has graph-specific extensions for traversal. There are many more LINQ operators for graph traversal/exploration that can be added in the future.

// Find Alice's friends who live in Seattle, within 2 degrees of separation
// Note the filter expression using a member of a "complex"
//   property: HomeAddress.City
var seattleFriends = await graph.Nodes<Person>()
    .Where(p => p.FirstName == "Alice")
    .Traverse<Person, Friend, Person>()
    .WithDepth(1, 2)  // 1-2 hops away
    .Where(friend => friend.HomeAddress.City.Name == "Seattle")
    .OrderBy(friend => friend.LastName)
    .ToListAsync(); // Everything is async

// Complex property queries work seamlessly
var portlandUsers = await graph.Nodes<Person>()
    .Where(p => p.Age < 30)
    .Where(p => p.HomeAddress != null && p.HomeAddress.City.Name == "Portland")
    .ToListAsync();

Behind the scenes, this generates optimized Cypher queries using an Expression Visitor architecture that understands both standard LINQ operations and graph-specific patterns.
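For readers who have not worked with expression trees, the following is a minimal sketch of the general technique: walking a LINQ predicate and emitting a parameterized Cypher WHERE clause. It is a plain recursive function rather than an ExpressionVisitor subclass, and it is not GraphModel’s implementation. The ToCypher, Op, AddParameter, and Predicate helpers are hypothetical names, and an anonymous type stands in for Person so that the snippet needs no extra type declarations.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Parameter values collected during translation ($p0, $p1, ...).
var parameters = new Dictionary<string, object?>();

// Recursively turns a predicate body into a Cypher boolean expression.
// Constants become parameters so values are never inlined into the query text.
string ToCypher(Expression e, string alias) => e switch
{
    BinaryExpression b =>
        $"({ToCypher(b.Left, alias)} {Op(b.NodeType)} {ToCypher(b.Right, alias)})",
    MemberExpression { Expression: ParameterExpression _ } m => $"{alias}.{m.Member.Name}",
    ConstantExpression c => AddParameter(c.Value),
    _ => throw new NotSupportedException($"Unsupported expression: {e.NodeType}"),
};

// Maps the handful of operators this sketch supports to Cypher.
string Op(ExpressionType t) => t switch
{
    ExpressionType.Equal => "=",
    ExpressionType.NotEqual => "<>",
    ExpressionType.LessThan => "<",
    ExpressionType.GreaterThan => ">",
    ExpressionType.AndAlso => "AND",
    ExpressionType.OrElse => "OR",
    _ => throw new NotSupportedException($"Unsupported operator: {t}"),
};

string AddParameter(object? value)
{
    var name = $"p{parameters.Count}";
    parameters[name] = value;
    return "$" + name;
}

// Lets the compiler infer T from a sample object (the sample itself is unused).
Expression<Func<T, bool>> Predicate<T>(T shape, Expression<Func<T, bool>> predicate) => predicate;

var filter = Predicate(
    new { FirstName = "", Age = 0 },
    p => p.FirstName == "Alice" && p.Age < 30);

var where = ToCypher(filter.Body, "src");
var cypher = $"MATCH (src:Person) WHERE {where} RETURN src";
Console.WriteLine(cypher);
```

Running this prints `MATCH (src:Person) WHERE ((src.FirstName = $p0) AND (src.Age < $p1)) RETURN src`, with "Alice" and 30 stored under p0 and p1. A real provider also has to handle nested member access, method calls, captured variables, and the graph-specific operators, which is where most of the work lies.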

Transaction Management

Transactions work as expected.

await using var transaction = await graph.GetTransactionAsync();

try
{
    // Multiple operations in a single transaction
    await graph.CreateNodeAsync(person, transaction: transaction);
    await graph.CreateRelationshipAsync(friendship, transaction: transaction);
    await graph.UpdateNodeAsync(company, transaction: transaction);
    await transaction.Commit();
}
catch
{
    await transaction.Rollback();
    throw;
}

The async/await patterns are first-class citizens, and the transaction scope automatically manages connection lifecycle.

Compile-Time Validation

One of the most powerful additions is the code analyzer package that provides compile-time validation to ensure that your domain types adhere to the Graph Model requirements/constraints. If you are using an IDE such as VS Code or Visual Studio, you will get immediate feedback as you type your code.

// ❌ Analyzer error: Missing parameterless constructor
public record BadNode : Node
{
    public BadNode(string requiredParam) { }  // GM001 error
}

// ❌ Analyzer error: Graph interface types not allowed as properties
public record BadNode : Node
{
    public INode RelatedNode { get; set; }  // GM003 error
}

// ❌ Analyzer error: Circular reference without nullable
public record BadNode : Node
{
    public Foo Foo { get; set; }  // GM010 error - should be Foo?
}

public record Foo
{
    public Foo? Next { get; set; }  // "Next" is a placeholder name; the original snippet omits it
}

These analyzers catch common mistakes at build time, preventing runtime surprises.

A simple example

Given the Person and Friend records from earlier in this post, let’s create a simple graph:

// city1 and city2 are City instances created beforehand (not shown here)
var savas = new Person
{
    FirstName = "Savas",
    LastName = "Parastatidis",
    Email = "savas@techcorp.com",
    Age = 35,
    HomeAddress = new Address
    {
        Street = "456 Elm St",
        City = city1,
        ZipCode = 98001,
        Country = "USA",
        Aliases = ["Home", "Personal"]
    },
};

var jim = new Person
{
    FirstName = "Jim",
    LastName = "Webber",
    Email = "jim@techcorp.com",
    Age = 40,
    HomeAddress = new Address
    {
        Street = "123 Maple St",
        City = city2,
        ZipCode = 90210,
        Country = "USA",
        Aliases = ["Home", "Work"]
    },
};

await graph.CreateNodeAsync(savas);
await graph.CreateNodeAsync(jim);

var savasFriendJim = new Friend(savas.Id, jim.Id)
{
    Since = new DateTime(1996, 10, 1)
};
await graph.CreateRelationshipAsync(savasFriendJim);

The above generates a graph that looks like this:

[Figure: the resulting graph]

Examples of LINQ to Cypher translation

Consider this simple LINQ expression:

var peopleLivingInSpecificCities = await graph.Nodes<Person>()
   .Where(p => p.HomeAddress!.City.Name == "Tech City" ||
               p.HomeAddress!.City.Name == "Innovation Town")
   .Select(p => p.FirstName)
   .ToListAsync();

This generates a projection Cypher query with the appropriate navigation expression for the “complex” property:

MATCH (src:Person)
MATCH (src)-[:__PROPERTY__HomeAddress__]->(src_homeaddress)
WHERE (src_homeaddress.City.Name = $p0 OR src_homeaddress.City.Name = $p1)
RETURN src.FirstName

Now consider this LINQ expression. It returns entire Person objects, which means that we have to load all the complex properties so that the generated deserialization code can rebuild them.

var savasFriends = await graph.Nodes<Person>()
   .Where(p => p.FirstName == "Savas")
   .Traverse<Person, Friend, Person>()
   .ToListAsync();

The middleware will generate a Cypher query such as this one:

MATCH (src:Person)-[r:FRIEND_OF]->(tgt:Person)
WHERE src.FirstName = $p0

// Complex properties from target node
OPTIONAL MATCH tgt_path = (tgt)-[trels*1..]->(tprop)
WHERE ALL(rel in trels WHERE type(rel) STARTS WITH '__PROPERTY__')
WITH src, r, tgt, CASE
  WHEN tgt_path IS NULL THEN []
  ELSE [i IN range(0, size(trels)-1) | {
    ParentNode: CASE
      WHEN i = 0 THEN tgt
      ELSE nodes(tgt_path)[i]
    END,
    Relationship: trels[i],
    Property: nodes(tgt_path)[i+1]
  }]
END AS tgt_flat_property
WITH src, r, tgt,
  reduce(flat = [], l IN collect(tgt_flat_property) | flat + l) AS tgt_flat_properties
WITH src, r, tgt, apoc.coll.toSet(tgt_flat_properties) AS tgt_flat_properties

RETURN {
  Node: tgt,
  ComplexProperties: tgt_flat_properties
} AS Node

It took me a while to get this right but I think it’s going to be useful.
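The flat list returned by a query like the one above still has to be turned back into nested objects. The sketch below shows one way that could work, under simplifying assumptions: each row is reduced to a parent id, a relationship type, a child id, and the child’s simple properties, and the result is a nested dictionary rather than a typed Person. The row shape, the Props alias, and the PropertyName and Rebuild helpers are hypothetical and are not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Props = System.Collections.Generic.Dictionary<string, object>;

// Hypothetical flat rows, loosely shaped like the {ParentNode, Relationship, Property}
// maps in the query above, but reduced to ids and plain dictionaries.
var rows = new List<(int ParentId, string RelType, int ChildId, Props ChildProps)>
{
    (0, "__PROPERTY__HomeAddress__", 1, new Props { ["Street"] = "456 Elm St" }),
    (1, "__PROPERTY__City__", 2, new Props { ["Name"] = "Seattle" }),
};

// "__PROPERTY__HomeAddress__" -> "HomeAddress"
string PropertyName(string relType) => relType["__PROPERTY__".Length..^2];

// Copies the node's simple properties, then attaches each child under its
// property name, recursing so nested complex properties are rebuilt too.
// Assumes the property relationships form a tree (no cycles).
Props Rebuild(int nodeId, Props props)
{
    var result = new Props(props);
    foreach (var row in rows.Where(r => r.ParentId == nodeId))
        result[PropertyName(row.RelType)] = Rebuild(row.ChildId, row.ChildProps);
    return result;
}

var person = Rebuild(0, new Props { ["FirstName"] = "Savas" });
var address = (Props)person["HomeAddress"];
var city = (Props)address["City"];
Console.WriteLine($"{person["FirstName"]} lives in {city["Name"]}");
```

A typed deserializer would do the same walk but assign into generated property setters instead of dictionary entries.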

The suite of packages

The library has grown into a multi-package ecosystem:

  • Cvoya.Graph.Model – Core abstractions and interfaces
  • Cvoya.Graph.Model.Neo4j – Neo4j provider implementation
  • Cvoya.Graph.Model.Analyzers – Compile-time validation
  • Cvoya.Graph.Model.Serialization – Serialization/deserialization logic, a data structure for representing in-memory object graphs, and a schema data structure for describing IEntity types (IEntity is the base interface for INode and IRelationship).
  • Cvoya.Graph.Model.Serialization.CodeGen – A code generator to ensure performant serialization/deserialization for your domain types.

If you reference the Neo4j package, you will get all the necessary dependencies for your project except the Analyzers package, which is optional but highly recommended.

What’s next?

The next major areas I’m exploring include:

  • Additional graph database providers
  • Advanced path analysis and graph algorithms
  • Declarative configuration of indexes
  • Enhanced spatial and temporal data support

🤝 Open Source and Community

The project is fully open source under the Apache 2.0 license and available on GitHub. The architecture is designed to be extensible, and I’d love to see community contributions – whether that’s additional providers, performance optimizations, or new LINQ extensions. What started as a simple weekend project to avoid mapping headaches has become a comprehensive graph data framework. If you’re working with graph databases in .NET, I’d encourage you to check it out and let me know what you think!


Have feedback or want to contribute? Feel free to open an issue or submit a pull request!

Savas Parastatidis

Savas Parastatidis works at Amazon as a Sr. Principal Engineer in Alexa AI. Previously, he worked at Microsoft where he co-founded Cortana and led the effort as the team's architect. While at Microsoft, Savas also worked on distributed data storage and high-performance data processing technologies. He was involved in various e-Science projects while at Microsoft Research where he also investigated technologies related to knowledge representation & reasoning. Savas also worked on language understanding technologies at Facebook. Prior to joining Microsoft, Savas was a Principal Research Associate at Newcastle University where he undertook research in the areas of distributed, service-oriented computing and e-Science. He was also the Chief Software Architect at the North-East Regional e-Science Centre where he oversaw the architecture and the application of Web Services technologies for a number of large research projects. Savas worked as a Senior Software Engineer for Hewlett Packard where he co-led the R&D effort for the industry's Web Service transactions service and protocol. You can find out more about Savas at https://savas.me/about
