
For example, web application monitoring services like New Relic RUM or Bucky use JavaScript to collect data on a user's session and eventually send that data back to an endpoint. For New Relic, the data is sent to an endpoint like "http://bam.nr-data.net" with a large number of parameters. With Bucky, a similar payload is sent to an endpoint of your choosing.

How do these services prevent malicious users from abusing those endpoints? Couldn't a user easily mess with the JavaScript to send junk data? And, with a little more work, might they be able to send many payloads of that junk data, polluting the data these services are meant to collect?

I'm not a front-end developer myself (in DevOps), but from what I understand it is essentially impossible to completely prevent abuse of anything accessible in client-side code. Other answers here on SE and elsewhere suggest the same.

Are these services simply biting the bullet and hoping nobody abuses them? Do they rely on obscurity/obfuscation to minimize the chances? Or have they figured out a way to mitigate the implications of the client-accessible nature?

edaemon
  • Here are links to [New Relic RUM](https://blog.newrelic.com/2011/05/17/how-rum-works/) and [Bucky](http://github.hubspot.com/bucky/); couldn't include them in the post as I don't have enough reputation. – edaemon Nov 16 '16 at 18:07
  • What's your threat model for this abuse? What harm might I, as a malicious user, do that you are concerned with preventing? – Jonah Nov 16 '16 at 18:31
  • It's less that I'm concerned with preventing it -- it's more of a theoretical question. I use services like these and I'm wondering what they do about the issue. For an example of harm one could cause with NR/Bucky, you could mess up the victim's data to suggest that their page load time is 3-4 times as long as it really is, making their data useless and causing them to spend time and money on improving something that doesn't need their attention. Does that make sense? – edaemon Nov 16 '16 at 18:37
  • I think the point that @Jonah is trying to make is: if there is no real benefit for the attacker, why would they take the effort to read your obfuscated code and modify the API requests so that they are not obviously wrong (they have to use a valid cookie token and your API key) just to mislead your analytics? When there is no incentive, there is no threat. – Juanmi Rodriguez Nov 17 '16 at 08:40
  • @JuanmiRodriguez - there's certainly very little incentive and essentially no financial incentive, but in my experience "for the lulz" is all the incentive that some malicious users need. – edaemon Nov 17 '16 at 20:10

2 Answers


Analytics tools usually have some sort of transaction which identifies the request, page load, session or whatever unit we want to measure performance characteristics for. We could use that to make it difficult for me to report fake metrics about your site.

If metrics will only be accepted for recent transactions and transaction ids are some sort of GUID then the search space of all transaction ids can be enormous while the set of valid transaction ids is relatively small. I can guess transaction ids all day but can probably only get the system to accept metrics for transactions I know about and initiated.
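A minimal sketch of that idea in Python (the function names, in-memory store, and TTL are illustrative, not any vendor's actual implementation):

```python
import secrets
import time

# In-memory store of issued transaction ids; a real service would use
# something like Redis with TTL-based expiry instead of a dict.
issued = {}

TTL_SECONDS = 300  # metrics are accepted only for recent transactions

def start_transaction():
    """Called when the page is served: issue an unguessable id."""
    tx_id = secrets.token_urlsafe(16)
    issued[tx_id] = time.time()
    return tx_id

def accept_metric(tx_id, payload):
    """Accept a metric only for a known, recent transaction id."""
    started = issued.get(tx_id)
    if started is None or time.time() - started > TTL_SECONDS:
        return False  # unknown or expired id: drop silently
    del issued[tx_id]  # allow only one report per transaction
    return True
```

Because the ids are 128-bit random tokens, guessing a valid one is impractical; an attacker can only report metrics for transactions they started themselves.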

That means I can send you fake metrics about my own use of your site but if I want to flood you with distorted data I probably need to make a large number of requests myself. At that point my attack starts to look more like a denial of service flood than a sneaky manipulation of metric data.

Any client reported data needs to be treated as somewhat unreliable anyway. Ad blockers and unreliable networks mean a fair number of client requests will never reach your backend so while you get a hopefully useful sample of what average performance looks like you're rarely ever able to get a complete picture of every client's performance.
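One consequence is that aggregation tends toward robust statistics: a median, for example, is far less sensitive to a handful of fabricated outliers than a mean. A toy illustration (the numbers are made up):

```python
from statistics import mean, median

# Mostly honest page-load times (ms) plus a few fabricated outliers.
reports = [310, 295, 330, 305, 320, 9000, 9500]

print(round(mean(reports)))    # dragged far upward by the fake values
print(round(median(reports)))  # stays near the honest cluster
```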

Jonah
  • That makes sense. It sounds like the general approach is to make it difficult to abuse and eliminate any low-hanging fruit, but since it's not possible to prevent completely you have to approach the data with a small degree of uncertainty. Would that be an accurate summary? – edaemon Nov 17 '16 at 20:18
  • That matches my understanding. – Jonah Nov 17 '16 at 20:34

One way that I know of is using a pre-shared key for a particular domain.

A very good example of this is the Google Maps API.

The Maps API allows access to map services from the client side using an API key (sent with each request).

Google's servers know that a given API key is associated with a particular domain (the one from which the request originated). If some other domain uses the same API key to get the data, the request will be blocked by the server. Still, an abuser can spoof the request headers and get the data. The servers also track the number of queries received per key and can be configured to flag unusual activity or an unusually large number of requests.
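A rough sketch of that server-side check (the key registry, domain matching, and quota threshold are all made up for illustration; real providers do this with far more nuance):

```python
from collections import Counter

# API key -> domain it was registered for (illustrative data).
KEY_REGISTRY = {"abc123": "example.com"}

QUOTA = 1000       # requests per key before flagging/blocking
usage = Counter()  # running request count per key

def authorize(api_key, referer_domain):
    """Allow a request only if the key exists, matches its
    registered domain, and is still under its quota."""
    if KEY_REGISTRY.get(api_key) != referer_domain:
        return False  # unknown key or wrong domain
    usage[api_key] += 1
    if usage[api_key] > QUOTA:
        return False  # unusual volume: block or flag for review
    return True
```

Note that the `referer_domain` here comes from request headers, which (as the answer points out) the client can spoof, so this check raises the bar rather than eliminating abuse.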

So the service provider can only take preventive measures; it cannot fully prevent the abuse.

ihimv