Our team is considering implementing a simple declarative DSL that lets users query the enterprise's domain model through a single interface, without caring which microservices have to be called for each portion of the data or how the results are then related and combined.
The suggested syntax is based on SQL, but:
- It is much more limited: no grouping or aggregation, no explicit subqueries, no functions, etc.
- Joins cannot be specified explicitly; they are implied by the predefined schema (entities and relations).
Example:
SELECT entityTypeOne.name, entityTypeTwo.value, entityTypeTwo.date
WHERE entityTypeOne.name LIKE 'Sample%'
AND entityTypeTwo.date BETWEEN (2015-05-01, 2015-05-31)
Expected result:
╔════════╦═══════╦════════════╗
║ name   ║ value ║ date       ║
╠════════╬═══════╬════════════╣
║ London ║ 1000  ║ 01/05/2015 ║
║ London ║ 2000  ║ 02/05/2015 ║
║ London ║ 3000  ║ 03/05/2015 ║
║ Moscow ║ 2000  ║ 02/05/2015 ║
║ Moscow ║ 9000  ║ 05/05/2015 ║
║ Tokyo  ║ 1000  ║ 30/05/2015 ║
╚════════╩═══════╩════════════╝
The underlying entity-relation schema knows that the entities are related by entityTypeOne.id = entityTypeTwo.parentId, which creates an implicit join.
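To make the schema idea concrete, here is a minimal Python sketch of how we imagine storing relations as an undirected graph and looking up the implicit join path between two entities with a breadth-first search. The names Relation, build_graph and join_path are hypothetical, and BFS is only one possible way to find a path; it says nothing yet about the best order of querying.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One predefined relation: left_entity.left_field = right_entity.right_field."""
    left_entity: str
    left_field: str
    right_entity: str
    right_field: str


# The single relation from the example; a real schema would list many.
RELATIONS = [Relation("entityTypeOne", "id", "entityTypeTwo", "parentId")]


def build_graph(relations):
    """Index relations by entity so each one can be traversed in both directions."""
    graph = {}
    for rel in relations:
        graph.setdefault(rel.left_entity, []).append((rel.right_entity, rel))
        graph.setdefault(rel.right_entity, []).append((rel.left_entity, rel))
    return graph


def join_path(graph, start, goal):
    """Return the shortest list of relations linking start to goal, or None."""
    if start == goal:
        return []
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        entity, path = queue.popleft()
        for neighbour, rel in graph.get(entity, []):
            if neighbour in seen:
                continue
            if neighbour == goal:
                return path + [rel]
            seen.add(neighbour)
            queue.append((neighbour, path + [rel]))
    return None
```

With this, join_path(build_graph(RELATIONS), "entityTypeOne", "entityTypeTwo") yields the one relation that the engine would have to apply as a join.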
The "query engine" should work out that it must first query the entityTypeTwo microservice, applying the date range filter on the server side, and then query the entityTypeOne microservice, filtering by the ids obtained from the first query's result.
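A rough Python sketch of that two-step execution, followed by the denormalizing join, might look like the code below. The two query_* functions are hypothetical in-memory stand-ins for the real microservice calls, and the rows are made-up sample data, not our real data.

```python
from datetime import date

# Made-up rows standing in for what each microservice would hold.
ENTITY_ONE_ROWS = [
    {"id": 1, "name": "Sample A"},
    {"id": 2, "name": "Sample B"},
    {"id": 3, "name": "Other"},
]
ENTITY_TWO_ROWS = [
    {"parentId": 1, "value": 1000, "date": date(2015, 5, 1)},
    {"parentId": 2, "value": 2000, "date": date(2015, 5, 2)},
    {"parentId": 3, "value": 3000, "date": date(2015, 5, 3)},
    {"parentId": 1, "value": 4000, "date": date(2015, 6, 1)},
]


def query_entity_two(start, end):
    """Stand-in for the entityTypeTwo service: filters by date range 'on the server'."""
    return [row for row in ENTITY_TWO_ROWS if start <= row["date"] <= end]


def query_entity_one(ids, name_prefix):
    """Stand-in for the entityTypeOne service: filters by id set and name prefix."""
    return [
        row for row in ENTITY_ONE_ROWS
        if row["id"] in ids and row["name"].startswith(name_prefix)
    ]


def run_query(name_prefix, start, end):
    """Query entityTypeTwo first, then entityTypeOne by id, then flatten the result."""
    children = query_entity_two(start, end)
    parent_ids = {child["parentId"] for child in children}
    parents = {p["id"]: p for p in query_entity_one(parent_ids, name_prefix)}
    # Denormalize: one flat (name, value, date) row per matching child.
    return [
        (parents[child["parentId"]]["name"], child["value"], child["date"])
        for child in children
        if child["parentId"] in parents
    ]
```

Calling run_query("Sample", date(2015, 5, 1), date(2015, 5, 31)) returns flat tuples shaped like the expected result table. Whether entityTypeTwo really is the better service to call first depends on how selective each filter is, which is exactly the ordering question.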
The problems we currently see:
- Representing the entity-relation schema.
- Figuring out the optimal order of querying.
- Denormalizing resulting data.
Is this a known problem, and are there any existing algorithms worth looking at (maybe something from graph theory)?
This is the closest thing I could find so far: "What is a heterogeneous query?"
If it makes things simpler, we can assume that the microservices expose their data via OData.
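Under that assumption, the server-side filters from the example could be pushed down as OData $filter expressions. The Python sketch below assumes OData v4 syntax (date literals written as 2015-05-01, the startswith function, the ge/le/eq operators). The helper names and the example.invalid URL are hypothetical.

```python
from urllib.parse import quote


def date_range_filter(field, start, end):
    """Inclusive date range, e.g. for BETWEEN (2015-05-01, 2015-05-31)."""
    return f"{field} ge {start} and {field} le {end}"


def id_filter(field, ids):
    """Match any of the given ids, spelled out as a chain of 'eq ... or ...' terms."""
    return " or ".join(f"{field} eq {i}" for i in sorted(ids))


def prefix_filter(field, prefix):
    """Equivalent of LIKE 'prefix%'; single quotes are doubled as OData requires."""
    escaped = prefix.replace("'", "''")
    return f"startswith({field},'{escaped}')"


def odata_url(base_url, filter_expr):
    """Attach a percent-encoded $filter expression to an entity set URL."""
    return f"{base_url}?$filter={quote(filter_expr)}"


# Step 1: the entityTypeTwo request with the date range applied on the server.
step_one = odata_url(
    "http://example.invalid/entityTypeTwo",
    date_range_filter("date", "2015-05-01", "2015-05-31"),
)

# Step 2: the entityTypeOne request, restricted to ids returned by step 1.
step_two = odata_url(
    "http://example.invalid/entityTypeOne",
    f"({id_filter('id', {1, 2})}) and {prefix_filter('name', 'Sample')}",
)
```

A long id list makes the second URL grow quickly, so a real engine would probably need to batch the ids or fall back to another strategy.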