How do you shard a graph database or graph data?

Question

Imagine I have graph data that is beyond the size of a single machine.

How would you shard a graph database?

I could partition the graph across replicas and query all replicas to aggregate answers.
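To make the idea concrete, here is a minimal in-memory sketch of that approach, assuming edges are hash-partitioned by their source vertex ID. The names `NUM_SHARDS`, `shard_of`, `Shard`, and `scatter_gather_in_neighbors` are hypothetical and exist only for illustration. It covers a query that cannot be routed to a single shard (who points at a given vertex?), so every shard is asked in parallel and the partial answers are merged.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_SHARDS = 4  # hypothetical cluster size


def shard_of(vertex_id: str) -> int:
    # Use a stable hash; Python's built-in hash() is salted per process.
    return zlib.crc32(vertex_id.encode("utf-8")) % NUM_SHARDS


class Shard:
    """Stands in for one machine holding a slice of the edge list."""

    def __init__(self):
        self.out_edges = {}  # source vertex -> list of destination vertices

    def add_edge(self, src, dst):
        self.out_edges.setdefault(src, []).append(dst)

    def in_neighbors(self, dst):
        # Local scan: sources on this shard that have an edge to dst.
        return [src for src, dsts in self.out_edges.items() if dst in dsts]


def build_shards(edges):
    shards = [Shard() for _ in range(NUM_SHARDS)]
    for src, dst in edges:
        # Each edge lives on the shard that owns its source vertex.
        shards[shard_of(src)].add_edge(src, dst)
    return shards


def scatter_gather_in_neighbors(shards, dst):
    # Scatter: ask every shard in parallel. Gather: merge the partial lists.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(lambda shard: shard.in_neighbors(dst), shards)
    return sorted(v for part in partials for v in part)


EDGES = [("alice", "bob"), ("carol", "bob"), ("bob", "dave"), ("dave", "alice")]
shards = build_shards(EDGES)
result = scatter_gather_in_neighbors(shards, "bob")
```

With this layout, an out-neighbor lookup touches only the shard that owns the source vertex, while an in-neighbor lookup like the one above has to fan out to all shards.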

But what about joins across machines?

Do I need to colocate join keys on all replicas?

Or do I just send the query to all replicas in parallel and aggregate the responses?
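The two options can be compared with a small sketch. It treats a two-hop traversal as a self-join of the edge list (the destination of hop 1 equals the source of hop 2) and assumes edges are hash-partitioned by source vertex, so the join key of the second hop is colocated with the edges it needs. `NUM_SHARDS`, `owner`, `two_hop_routed`, and `two_hop_broadcast` are hypothetical names, and counting "requests" is only a rough stand-in for network round trips.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical cluster size


def owner(vertex_id: str) -> int:
    # Stable hash so that every node agrees on which shard owns a vertex.
    return zlib.crc32(vertex_id.encode("utf-8")) % NUM_SHARDS


def partition(edges):
    # One adjacency map per shard, keyed by source vertex.
    shards = [defaultdict(list) for _ in range(NUM_SHARDS)]
    for src, dst in edges:
        shards[owner(src)][src].append(dst)
    return shards


def two_hop_routed(shards, start):
    """Route each hop only to the shards that own the join keys."""
    frontier = shards[owner(start)].get(start, [])
    requests = 1
    # Group the intermediate vertices by owning shard: one request per shard.
    by_shard = defaultdict(set)
    for v in frontier:
        by_shard[owner(v)].add(v)
    result = set()
    for shard_id, vertices in by_shard.items():
        requests += 1
        for v in vertices:
            result.update(shards[shard_id].get(v, []))
    return result, requests


def two_hop_broadcast(shards, start):
    """Send each hop to every shard and merge whatever comes back."""
    requests = 0
    frontier = set()
    for shard in shards:
        requests += 1
        frontier.update(shard.get(start, []))
    result = set()
    for shard in shards:
        requests += 1
        for v in frontier:
            result.update(shard.get(v, []))
    return result, requests


EDGES = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("c", "e")]
shards = partition(EDGES)
routed_result, routed_requests = two_hop_routed(shards, "a")
broadcast_result, broadcast_requests = two_hop_broadcast(shards, "a")
```

Both functions return the same vertices. The routed version contacts at most one shard per distinct owner of the frontier, while the broadcast version always contacts every shard on every hop. Routing only works for joins on the partitioning key; a join on any other attribute would need either a broadcast or a secondary index partitioned by that attribute.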

You haven't mentioned a database vendor, but I imagine sharding should be mostly transparent to you. Joining, querying — all those operations should be handled by the database so your code doesn't need to. Can you [edit] your question to include more information about the database vendor you are using? Bear in mind that questions about how to configure a database are likely off-topic. Tool questions belong on StackOverflow or another relevant community. — Greg Burghardt, May 25 '23 at 11:21
I am the database vendor. In other words, I am experimenting with database internals and am curious about the implementation techniques for sharding graph data. — Samuel Squire, May 26 '23 at 11:17
I see. This is more a database architecture question. I see two close-votes for "needing focus". Don't let multiple question marks fool you. You cannot properly shard data unless you consider the other questions in this post. — Greg Burghardt, May 26 '23 at 12:11

0 Answers