I have multiple files (one per CountryCode), each of which gets ~5,000 entries appended to it per day.
Each entry in the file looks like this (256 chars max):
```
{countryCode_customerId:{"ownerId": "PDXService","notificationId": "0123456789-abcdef","requestDate": "1970-01-01T00:00:00Z","retentionDate": "2020-08-13T14:02:35Z"}}
```
My API receives around 4K TPS, and each request carries a CountryCode and a CustomerID. For every request I must look up countryCode_customerId in the corresponding file; if the key is present, I must reject the request. (I have to use a local file to avoid the latency overhead of a remote lookup.)
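To make the current setup concrete, here is a minimal Python sketch of the lookup I described, assuming one newline-delimited file per CountryCode; the directory path and file naming (`US.log`, etc.) are hypothetical placeholders, not my actual layout:

```python
import json
import os

DATA_DIR = "/var/data/notifications"  # hypothetical location, one file per CountryCode


def is_duplicate(country_code: str, customer_id: str) -> bool:
    """Scan the country's local file and report whether the key already exists."""
    key = f"{country_code}_{customer_id}"
    path = os.path.join(DATA_DIR, f"{country_code}.log")  # hypothetical naming scheme
    if not os.path.exists(path):
        return False
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            # Each line is assumed to be a one-entry JSON object keyed by countryCode_customerId.
            entry = json.loads(line)
            if key in entry:
                return True
    return False


# A request is rejected when its key is already present in the file:
# is_duplicate("US", "12345") -> True means reject, False means accept.
```

A linear scan like this clearly won't hold up at 4K TPS as the files grow, which is what motivates the question below.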
My concern is that these files are unbounded and can grow quite large. Which compression algorithm (or compact data structure) would fit such a file best while still allowing fast lookups?
I have considered a Trie and a DAWG (see the sketch below). If you can suggest better options, that would be greatly appreciated!
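For reference, this is roughly what I mean by the Trie option: a minimal membership-only trie sketch in Python. The sample keys are hypothetical, and a real version would still need persistence, rebuild-on-append, and concurrency handling:

```python
class TrieNode:
    __slots__ = ("children", "terminal")

    def __init__(self):
        self.children = {}   # one child per character
        self.terminal = False  # True if a full key ends at this node


class Trie:
    """Membership-only trie over countryCode_customerId keys."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, key: str) -> None:
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.terminal = True

    def __contains__(self, key: str) -> bool:
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.terminal


# Usage: load all existing keys once at startup, then each request is an
# O(len(key)) membership check instead of a file scan.
index = Trie()
index.add("US_12345")           # hypothetical key
print("US_12345" in index)      # True  -> reject the request
print("DE_99999" in index)      # False -> accept the request
```

Shared prefixes (the CountryCode and any common CustomerID prefixes) are stored only once, which is the compression effect I'm after; a DAWG would additionally merge shared suffixes.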