I'm building an application that monitors specific events happening in a user's area, and I'm stuck on how to serve specific real-time data to my users. I am using MongoDB for the database and Node.js for my API.

(I expect it to reach about 1,000 users within the first 30 days after release, so it is crucial to set up an efficient way to handle this.)

I have a collection whose records look like this:

// FooCollection Record 1 (out of X.XXX.XXX)
[
  {
    "info": "Some crucial information",
    "regionId": 1,
    "vehicleId": 1,
    "codes": [ "code1", "code2", "code3" ],
    "place": "New York"
  }
]

Now, I have 26 different regionIds and 8 different vehicleIds; each region has its own vehicles, for example:

- Region 1
  - Vehicle 1
  - Vehicle 2
  // all the way to vehicle 8
- Region 2
  - Vehicle 1
  - Vehicle 2
  // all the way to vehicle 8
// all the way to region 26

These are fixed and supplied to users through an API. The codes (105,000 records) and places (2,425 records) are supplied too. The user is free to type their own place and code, but they simply won't get a result if it doesn't exist in the collection. They will of course be helped by an autocomplete dropdown, but let's leave UX aside for now.

This means that there are A LOT of possible combinations the user can enter. However, the fields regionId and vehicleId will be filtered on frequently, while codes and place won't be requested that much.
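To put a rough number on that, here is a quick calculation. It assumes a filter picks exactly one region, one vehicle, one code and one place, which is a simplification: a filter holding several values per field would make the count even larger.

```javascript
// Counts taken from the description above.
const regions = 26;
const vehiclesPerRegion = 8;
const codes = 105000;
const places = 2425;

// Region/vehicle pairs: the combinations that get filtered on most often.
const baseCombinations = regions * vehiclesPerRegion;

// Every single-value filter: one region, one vehicle, one code, one place.
const allCombinations = baseCombinations * codes * places;

console.log(baseCombinations); // 208
console.log(allCombinations);  // 52962000000
```

So the frequently used part is small (208 pairs), while the full space is far too large to pre-build a stream for.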

I have thought of a few different approaches:

  1. Use Node.js Server-Sent Events: create a stream for every single possibility (instantiated automatically in a loop at application start) and put a real-time query on each one. That would fry the database with millions of requests per second because of all the possibilities, so I'm not too fond of this approach.
// Pseudo code point [1]
- Base stream (required params for query etc)
  -> Vehicle, Region inherit, supply params
  -> Loop through codes, if Foo `codes` array contains a `code` (for example "code1") apply that to the stream
  -> Loop through places and make a new instance per place

  2. Make a WebSocket to which the user can emit the filter parameters, and give the user the requested stream back (instantiating a new stream based on the given parameters), for example:
// Pseudo code point [2]
- User connects with websocket
  -> User emits data which he/she wants to see, for example: 
     {
       regionIds: [ 1, 4, 5 ], 
       vehicleIds: [ 22, 25 ], 
       codes: [ "code2", "code5024" ], 
       places: ["New York", "Las Vegas" ]
     }
    -> New stream gets generated on the given parameters
      -> Scenario A: User disconnects and there are no other users connected to that specific filter, terminate that instance
      -> Scenario B: Another user applies the same filter, don't make a new instance with the same params, connect to the instance user A created

  3. Always keep the basic region/vehicle streams running, and generate a new one when a user asks for it. Drop a stream when no users are connected to it (see point 2 for reference).
// Pseudo code point [3]
- on app start
  -> for each region ->
    -> for each vehicle ->
      -> new Stream(region, vehicle)
- on user request code / place
  -> new CustomStream(code, place)


  4. Another one you dear developers and engineers can help me out with? :-)
// Pseudo code point [4]
- StackOverflow user reads this question
  -> The user's brain does the cool magic thing
    -> Gets genius idea
      -> Posts it in an answer
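
Points 2 and 3 could share one mechanism. The following is only a minimal sketch under stated assumptions, not a finished design: `StreamRegistry`, `openStream` and the user ids are hypothetical names, and `openStream(filter)` stands in for whatever actually starts a real-time source and returns an object with a `close()` method. Vehicle ids 1 to 8 per region are assumed from the listing above.

```javascript
// Hypothetical registry: one shared stream per distinct filter.
class StreamRegistry {
  constructor(openStream) {
    this.openStream = openStream; // assumed factory: filter -> { close() }
    this.entries = new Map();     // key -> { stream, subscribers, pinned }
  }

  // Normalise a filter so equal filters give the same key, regardless of
  // property order or the order of the values inside each array.
  static keyFor(filter) {
    const normalised = {};
    for (const field of Object.keys(filter).sort()) {
      normalised[field] = [...filter[field]].sort();
    }
    return JSON.stringify(normalised);
  }

  // Look up the entry for a filter, opening a stream only if none exists.
  acquireEntry(filter) {
    const key = StreamRegistry.keyFor(filter);
    let entry = this.entries.get(key);
    if (!entry) {
      entry = { stream: this.openStream(filter), subscribers: new Set(), pinned: false };
      this.entries.set(key, entry);
    }
    return entry;
  }

  // Point 3: keep a stream alive even when nobody is listening.
  pin(filter) {
    this.acquireEntry(filter).pinned = true;
  }

  // Point 2, scenario B: reuse the stream if the filter is already active.
  subscribe(filter, userId) {
    const entry = this.acquireEntry(filter);
    entry.subscribers.add(userId);
    return entry.stream;
  }

  // Point 2, scenario A: close the stream when the last user leaves.
  unsubscribe(filter, userId) {
    const key = StreamRegistry.keyFor(filter);
    const entry = this.entries.get(key);
    if (!entry) return;
    entry.subscribers.delete(userId);
    if (entry.subscribers.size === 0 && !entry.pinned) {
      entry.stream.close();
      this.entries.delete(key);
    }
  }
}

// Usage with a stand-in factory that only counts open streams.
let openCount = 0;
const registry = new StreamRegistry(() => {
  openCount += 1;
  return { close: () => { openCount -= 1; } };
});

// Point 3: pin every region/vehicle pair on app start.
for (let regionId = 1; regionId <= 26; regionId++) {
  for (let vehicleId = 1; vehicleId <= 8; vehicleId++) {
    registry.pin({ regionIds: [regionId], vehicleIds: [vehicleId] });
  }
}

const custom = { codes: ["code2"], places: ["New York"] };
const streamA = registry.subscribe(custom, "userA");
const streamB = registry.subscribe({ places: ["New York"], codes: ["code2"] }, "userB");
console.log(streamA === streamB); // true
console.log(openCount);           // 209

registry.unsubscribe(custom, "userA");
registry.unsubscribe(custom, "userB");
console.log(openCount);           // 208
```

With this shape, the number of open streams follows the number of distinct filters in use rather than the number of possible combinations.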

TL;DR

I want to achieve a dynamic way of supplying real-time data to users, which can have loads of possible combinations. The most used combinations will be the regions and vehicles, but there can also be specific combinations for codes and places.

I would also like to avoid hammering the CPU and database with queries from endpoints that keep polling but aren't being used, since that would make the application slow.

Example of a query (with a real-time update listener on the database):

db.getCollection("foos").find({ regionId: 1, vehicleId: 1, codes: "code1" /* matches records whose codes array contains "code1" */, place: "Las Vegas" })
// Now, imagine this but with every single possible combination polling the database, not so great huh?
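
One alternative to a query per combination would be to listen for new records once (for example through a single change stream, which the official Node.js driver exposes as `collection.watch()`) and decide in-process which subscribers should receive each record. Below is a sketch of just that matching step, with no database involved. `matchesFilter` and `recipientsFor` are hypothetical helpers, the filter shape follows the WebSocket example in point 2, and treating a missing or empty field as "no restriction" is an assumption.

```javascript
// Hypothetical helper: does a new record satisfy a user's filter?
function matchesFilter(record, filter) {
  // A missing or empty list means the user did not restrict that field.
  const allows = (allowed, value) =>
    !allowed || allowed.length === 0 || allowed.includes(value);

  // For codes, one overlap between the record and the filter is enough.
  const codeOk =
    !filter.codes || filter.codes.length === 0 ||
    record.codes.some((code) => filter.codes.includes(code));

  return (
    allows(filter.regionIds, record.regionId) &&
    allows(filter.vehicleIds, record.vehicleId) &&
    allows(filter.places, record.place) &&
    codeOk
  );
}

// Fan one record out to every subscriber whose filter matches.
function recipientsFor(record, subscriptions) {
  return subscriptions
    .filter(({ filter }) => matchesFilter(record, filter))
    .map(({ userId }) => userId);
}

const record = {
  info: "Some crucial information",
  regionId: 1,
  vehicleId: 1,
  codes: ["code1", "code2", "code3"],
  place: "New York",
};

const subscriptions = [
  { userId: "userA", filter: { regionIds: [1, 4, 5] } },
  { userId: "userB", filter: { regionIds: [2], codes: ["code2"] } },
  { userId: "userC", filter: { codes: ["code2", "code5024"], places: ["New York", "Las Vegas"] } },
];

console.log(recipientsFor(record, subscriptions)); // [ 'userA', 'userC' ]
```

The database then sees one listener instead of one per filter, at the cost of doing the matching on the API server.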

I'm really stuck here, and I am really hoping you helpful software architects, developers, etc. can help me out and share your knowledge.

Thimma
  • send all the data and let the clients filter? – Ewan Aug 22 '20 at 21:05
  • @Ewan that is about 10GB of data, and you shouldn't really let the client do all the heavy work imo – Thimma Aug 22 '20 at 21:08
  • per second? The advantage to letting the client do the work is as the clients scale up, so does the cpu power – Ewan Aug 22 '20 at 21:11
  • @Ewan so basically you are saying, feed the API the client's CPU power? How would one implement that? – Thimma Aug 22 '20 at 21:16
  • each client has a CPU, right? Filtering requires CPU power. If you filter on the server, it saves bandwidth but costs CPU; if you filter on the client, you save CPU but (potentially) use more bandwidth. The optimal division of labour depends on the details of your data. – Ewan Aug 22 '20 at 21:19
  • Divide the data into multiple streams so clients can pick and choose exactly what "all the data" is for them. Then let them filter on their end. – candied_orange Aug 24 '20 at 06:37

0 Answers