32

I have a client that is looking to get a website/mobile apps/desktop apps built that deal with very sensitive data (more sensitive than bank/card details). Because of the sensitive nature of the data, they do not want to save it in a central database but they still want their apps to synchronise (let's say I add some data into my mobile app, I then want to be able to go to my desktop app and see the same data).

I cannot think of a nice, reliable way of doing this, and I am not sure there is one, which is why I am here. Does anyone know how I could deal with this data?

One solution I was thinking about was to have a client-side database on each app that would somehow sync between apps, but I can see this being very unreliable and getting messy.

user2424495
  • 463
  • 4
  • 3
  • 2
    If you want the data to be synchronized, it still has to be somewhere accessible, so the data can be pulled into your application. You could divide the data among more databases, thus if one of them gets somehow breached, you wouldn't leak all your data. If this satisfies the customer, just add more database connections to your application and pull your data from them. – Andy Oct 20 '15 at 15:41
  • 2
    Is this a peer to peer problem? or just 1 desktop talking to 1 smart phone (for each data space)? – ebyrob Oct 20 '15 at 15:45
  • 7
    You could ensure confidentiality in the database by encrypting the data on the server with a key which is only known to the user. – Philipp Oct 20 '15 at 15:48
  • @ebyrob It could be any number of devices that access the app/web site. So it is possible that a user could work on their phone, then switch to a tablet, then a laptop using a desktop app, then laptop using the website. – user2424495 Oct 20 '15 at 16:28
  • So the data is just user-related, and the only one who should get access to it is the user to whom the data belongs (but maybe using different devices). Is that correct? – Doc Brown Oct 20 '15 at 16:30
  • @Doc Brown yes that is correct. – user2424495 Oct 20 '15 at 17:30
  • 26
    This sounds like a scheme that somebody who doesn't understand security thought of. Whoever came up with this requirement should be formulating a question about securing data on [Security.SE](http://security.stackexchange.com/). – jpmc26 Oct 20 '15 at 22:11
  • 2
    If peer-to-peer, use a decentralized database implementing automatic [Shamir's Secret Sharing](https://en.wikipedia.org/wiki/Shamir%27s_Secret_Sharing), if not then use a database with [Zero Knowledge](https://en.wikipedia.org/wiki/Non-interactive_zero-knowledge_proof) in a similar fashion to [SpiderOak](https://spideroak.com/). – gaborous Oct 20 '15 at 22:31
  • 4
    @user2424495: If the data needs to be available via a website, the data must be available where your website is served from - which is typically a central server. Or you'd need to write a browser plugin that supplies the data client-side. – Bergi Oct 21 '15 at 00:31
  • What's wrong with having it stored in two places? 1) Localstorage on the client 2) In an encrypted session on the server which gets cleared, let's say, every 7 days or something. @bergi you definitely don't need to store the data in any meaningful way, and you wouldn't need a browser plugin to do this level of crypto (I understand client-side crypto isn't always good, but in this case you're encrypting the data on the server, decrypting it on the client). – Ryan O'Donnell Oct 21 '15 at 15:44
  • 1
    Security requires Confidentiality, Integrity and Availability. I.e. Keeping secrets, preventing unauthorised modification, and preventing destruction. The last requires backup of some sort. – Ben Oct 21 '15 at 17:03
  • Look at bittorrent sync. You could put all the data in a synced folder, and either use BT sync directly, or use the bittorrent protocol in the same way it does. – Nathan MacInnes Oct 21 '15 at 17:09

5 Answers

60

Plenty of sensitive information gets stored in databases. In fact, a central database is probably the most secure way to store this data. Large enterprise databases have tons of functionality to do things like encrypt sensitive information, to audit who accesses it, to limit or prevent people including DBAs from viewing the data, etc. You can have professional security experts monitoring the environment and professional DBAs overseeing backups so that you don't lose data. It would almost certainly be much easier to compromise data stored on some random user's mobile device or laptop than to penetrate a well designed security infrastructure and compromise a proper central database.

You could design the system with a central database that stores only encrypted data and store the user's private key on the user's device. That way even if the central database is completely compromised, the data is usable only by the user. Of course, that means that you can't restore the user's data if they lose their key (say the only copy was on their phone and their phone was damaged). And if someone compromises the key and, presumably, their login credentials, they would be able to see the data.
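As a rough illustration of that design (not the answerer's code), the sketch below encrypts on the device and hands the server only an opaque blob. The names `CentralStore`, `encrypt`, and `decrypt` are hypothetical. The cipher is a stand-in built from the standard library's HMAC so the example runs without dependencies; a real app should use a vetted authenticated cipher such as AES-GCM from an audited library instead.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """HMAC-SHA256 in counter mode. Illustrative stand-in for a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def _subkeys(key: bytes):
    """Derive separate encryption and authentication keys from the device key."""
    enc_key = hmac.new(key, b"enc", hashlib.sha256).digest()
    mac_key = hmac.new(key, b"mac", hashlib.sha256).digest()
    return enc_key, mac_key


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Runs on the device. Returns nonce || ciphertext || tag."""
    enc_key, mac_key = _subkeys(key)
    nonce = secrets.token_bytes(16)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Runs on the device. Raises ValueError on a wrong key or altered blob."""
    enc_key, mac_key = _subkeys(key)
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


class CentralStore:
    """Hypothetical server side: stores opaque blobs and never holds a key."""

    def __init__(self):
        self._blobs = {}

    def put(self, user_id: str, blob: bytes) -> None:
        self._blobs[user_id] = blob

    def get(self, user_id: str) -> bytes:
        return self._blobs[user_id]


device_key = secrets.token_bytes(32)  # generated and kept on the user's device
server = CentralStore()
server.put("alice", encrypt(device_key, b"very sensitive note"))
restored = decrypt(device_key, server.get("alice"))
```

The trade-off described above shows up directly: anyone holding `device_key` can read the blob, and if every copy of `device_key` is lost, nothing on the server can recover the data.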

Justin Cave
  • 12,691
  • 3
  • 44
  • 53
  • Yes, I agree that with encryption, the database would technically be secure, but I've been told that simply storing the data online would deter customers from using the product. So ideally, I don't want to store any sensitive data. If the data was on the user's phone/laptop then it would be less of an issue as it would not be my fault if it was 'hacked'. – user2424495 Oct 20 '15 at 16:20
  • 24
    @user2424495 - If the goal is actual security, having the data stored centrally almost certainly wins. From a marketing standpoint, it may not be your fault if someone's phone is hacked. But it will certainly reflect poorly on the app if word gets out that it's relatively easy to hack (since most people's systems are very poorly secured). I'd rather explain to people that their data is stored encrypted using military grade security than to hope that they don't blame me when their poorly secured phone gets hacked. – Justin Cave Oct 20 '15 at 16:39
  • 27
    This is the only answer so far that truly addresses the question and provides the best possible security outcome. The requirements the OP was given are ludicrous. If the data is so sensitive that the idea of the data even being available over a public network is offensive to users, then the app idea is not realistic. Full stop. Client devices are not secure and cannot be trusted. – maple_shaft Oct 20 '15 at 16:48
  • It sounds like user is concerned about access from US security agencies like NSA and FBI. If stored in a US-based database, that could be accessed (with or without subpoena, that is another discussion). But if stored only "within" the app, that would make this access more difficult, if not impossible. I don't have enough information for an answer, but @user2424495 might look at letting your user store data in file in their choice of file-sharing service like Dropbox, etc. That might satisfy your client. – mharr Oct 20 '15 at 20:21
  • 2
    @mharr if the database stores only encrypted data (encrypted before leaving the device) it doesn't matter what a court order says, it physically can't be decrypted without the encryption keys, which only the user has. – Richard Tingle Oct 20 '15 at 21:21
  • 9
    @RichardTingle Unless said government agency has broken the encryption already. – Bob Oct 21 '15 at 01:18
  • @maple_shaft don't you think that is very silly of you? How can you dismiss every possible application based upon a few lines of description? This application design is actually VERY interesting: it'd apply to ANY scenario where: 1) I want a webapp to do some calculations for me, but not store my information and 2) I want to access that information, if only for a short period, on multiple devices. There are plenty of situations where I'd consider my home PC a 'safe' place for my data, but not a random server. This is an interesting problem: programmers should be interested in problems :) – Ryan O'Donnell Oct 21 '15 at 15:49
  • 3
    I never said that the problem wasn't "interesting", I find the question and answers so far very intriguing and thought provoking. This is EXACTLY the kind of question that makes this site great. I am really questioning the requirements and some of the assumptions that are possibly being made about the data. My spidey sense just screams to me that these requirements and assumptions about the importance of the data are the musings of a bloviated and self worshipping company that fancies itself informed and insightful but in reality cont... – maple_shaft Oct 21 '15 at 16:29
  • 1
    (cont).. is painfully ignorant to anything about information security. These kinds of requirements from these kinds of businesses are the worst because they presuppose architectural constraints to system design with incomplete and invalid information. This certainly makes for an interesting thought exercise, because in reality I can see that perhaps there are legitimate architectural constraints. – maple_shaft Oct 21 '15 at 16:33
  • 1
    @RyanO'Donnell "1) I want a webapp to do some calculations for me, but not store my information and 2) I want to access that information, if only for a short period, on multiple devices." This is basically a logical contradiction. You want the application to transmit information asynchronously without storing it anywhere. That isn't possible. If you don't want it done asynchronously, then there's no point in involving a web application; load it by connecting your devices with USB. Hence the requirement being ludicrous. – jpmc26 Oct 21 '15 at 18:45
  • @jpmc26 I feel it is the same as what happens with passwords: at some point, you do have the plaintext available. However, we make the general trust that it isn't always stored. Second, a web application offers a lot of benefits over a normal application. I agree that the requirements have a lot of things silly going on, but I don't think the problem should be immediately dismissed. – Ryan O'Donnell Oct 21 '15 at 18:56
  • @maple_shaft I agree, I apologize for being hostile. You're definitely correct: 99.9% of the time requirements such as these are born out of misunderstanding of technology, security, or various other related elements. You're right to suggest that a conversation needs to be opened with the OP's benefactor, and I'd be interested in the outcome of that one! – Ryan O'Donnell Oct 21 '15 at 18:58
  • 1
    @RyanO'Donnell Of course the problem shouldn't be dismissed, but it does absolutely need a realistic, sane approach. This is an XY-problem if I've ever seen one, and XY-problems should always be addressed at the root cause, not the first problem asked about. – jpmc26 Oct 21 '15 at 19:37
38

You need to back up a couple steps and, in consultation with your client, work out a threat model. (Yes, that's a link to a 600-page book; yes, I am seriously recommending you read the entire thing.)

A threat model starts by asking questions like

  • Why does the app need to store this sensitive data in the first place?
    • Can you avoid storing it at all?
    • Can it be thrown away after a short time?
    • Does it truly need to be accessible to more than one device?
    • If it must be accessible on more than one device, does it need to be stored on more than one device?
  • Who are the people who are allowed to see each user's sensitive data?
    • Can this list be made shorter?
  • Who are the people who may come in contact with each user's sensitive data while trying to do their jobs, but have no need to know it?
    • Can this list be made shorter?
    • Can the data be rendered inaccessible to them without harming their ability to do their jobs?
    • If it can't be inaccessible, can it at least be made incomprehensible? (This is what encryption does, in the abstract: it renders data incomprehensible.)
  • Who are the people who want to see the sensitive data, but are not allowed?
    • What opportunities do they have to get at the data?
    • What do they want to do with the data once they have it?
    • How angry will they be if they don't get what they want?
    • How much money, time, CPU cycles, and human effort are they willing to spend?
    • Do they care if anyone knows they have seen the data?
    • Do they want to access specific users' sensitive data, or will anyone's do?
    • What do they already know?
    • What do they already have access to?

Once you know the answers to these questions you will be in a much better place to figure out what to do.

Keep in mind that there might be more than one answer to each set of questions, especially the ones dealing with the attackers (the people who want the sensitive data but are not allowed to have it). If you can't think of at least half a dozen different archetypal attackers, with different motivations, goals, and resources, you've probably missed something.

Also keep in mind that the attackers who cause you (and/or the client) the most trouble, who are the most likely to make a giant splash in the media if their attack succeeds, or who do the largest amount of aggregate damage are probably not the attackers who can cause the greatest harm to individual users if their attack succeeds. Your client's company rationally cares more about aggregate damage, but the users rationally care more about harm to themselves.

zwol
  • 2,576
  • 1
  • 16
  • 16
  • 4
    This doesn't really try to answer the question or disprove it, but this really is an awesome answer to a question that wasn't asked. – maple_shaft Oct 21 '15 at 02:54
  • 12
    @maple_shaft: Well, it answers the question that the OP meant to ask. Since the question could well be seen to suffer from [X-Y problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem/66378), this seems a good answer. – sleske Oct 21 '15 at 08:27
8

One option to do the synchronization would be to do it peer-to-peer. This will still require a central server, but that server won't handle any of the data.

When a device goes online, a central server gets a notification with the user-id. When a second device of the same user goes online, the server sends each device the IP address of the other. The devices can then exchange data directly. Caveat: one device needs to act as a server, so at least one of them cannot be behind a NAT router.

Don't forget that you will need strong authentication and encryption for both the notification mechanism and for the peer-to-peer exchange.
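The matchmaking role of the central server can be sketched roughly as follows. This is a minimal in-memory illustration, not a network service: `RendezvousServer`, `announce`, and `leave` are hypothetical names, the addresses are documentation-range examples, and the authentication and encryption called for above are deliberately left out.

```python
from collections import defaultdict


class RendezvousServer:
    """Hypothetical matchmaking server: tracks who is online, never sees user data."""

    def __init__(self):
        # user_id -> {device_id: (ip, port)}
        self._online = defaultdict(dict)

    def announce(self, user_id, device_id, address):
        """Register a device and return the user's other online devices."""
        peers = {
            other_id: other_addr
            for other_id, other_addr in self._online[user_id].items()
            if other_id != device_id
        }
        self._online[user_id][device_id] = address
        return peers

    def leave(self, user_id, device_id):
        """Forget a device that went offline."""
        self._online[user_id].pop(device_id, None)


server = RendezvousServer()
# The phone comes online first and finds no peers yet.
first = server.announce("alice", "phone", ("203.0.113.7", 40001))
# The laptop comes online and learns where the phone is.
second = server.announce("alice", "laptop", ("198.51.100.9", 40002))
```

In this sketch only the newly arrived device learns about its peers. A real server would also have to push the laptop's address to the phone, and the devices would then open a direct, authenticated, encrypted connection to exchange the data itself.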

Philipp
  • 23,166
  • 6
  • 61
  • 67
  • 1
    Sounds like a versioning scheme would also be required to avoid sending all the data back and forth all the time between the two devices... – ebyrob Oct 20 '15 at 15:51
  • The p2p exchange would be a great solution, if it weren't for the extra setup forced onto the end user, which, in my opinion, would make the application less user friendly. Then there is the question of whether the customer wants to choose between data vulnerability and a bit of a hassle when setting up the app, which depends a lot on how sensitive the data actually is and how much the users care. – Andy Oct 20 '15 at 15:51
  • 1
    @DavidPacker assuming you set up and maintain the first server, what are the additional setup steps? – ebyrob Oct 20 '15 at 15:56
  • @ebyrob I may have misunderstood, but as I understand it, the server provided by the app creator does not contain anything but the procedure for p2p synchronization. But the data has to be pulled through this server from one of the clients' devices - and the client has to make himself, or his data, accessible - this is the setup I have been talking about. – Andy Oct 20 '15 at 16:06
  • @DavidPacker All configuration which would be needed would be the username and password. A key derivation function can be used to derive a private key from these. – Philipp Oct 20 '15 at 16:39
  • 1
    @David, Philipp is suggesting peer-to-peer exchange of the sensitive data, thus no sending of that to or even thru the central server. The central server is only there to facilitate one peer finding another peer; then it gets out of the way. – Erik Eidt Oct 20 '15 at 17:24
5

Make it somebody else's problem.

Store the data locally in each app, then give users the option to enable synchronization using their own account with a third-party service (Dropbox, Google Drive, etc). Also, think about encrypting any data uploaded to the third-party service (there are pros and cons to doing that).

This gives the appearance that users own their own data, since they have to opt-in to data synchronization. It makes the apps useful for people who don't want any sharing to happen. And it makes someone else responsible (technically and, potentially, legally) for the ongoing headaches of keeping any shared data safe.
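If the uploaded data is encrypted, every device needs the same key without that key ever reaching the third-party service. One possible approach, assumed here rather than stated in the answer, is to derive the key on each device from a passphrase the user types in, plus a random salt that lives in the synced folder. The helper names, the `salt.bin` file name, and the iteration count are all illustrative.

```python
import hashlib
import secrets
import tempfile
from pathlib import Path

ITERATIONS = 200_000  # assumed work factor; tune it for the slowest target device


def load_or_create_salt(sync_dir: Path) -> bytes:
    """The salt is not secret, so it can sit in the synced folder."""
    salt_file = sync_dir / "salt.bin"
    if not salt_file.exists():
        salt_file.write_bytes(secrets.token_bytes(16))
    return salt_file.read_bytes()


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256. The key itself is never stored."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, ITERATIONS, dklen=32
    )


# A temporary directory stands in for the user's Dropbox or Google Drive folder.
with tempfile.TemporaryDirectory() as tmp:
    sync_dir = Path(tmp)
    phone_key = derive_key("correct horse battery staple", load_or_create_salt(sync_dir))
    laptop_key = derive_key("correct horse battery staple", load_or_create_salt(sync_dir))
    wrong_key = derive_key("a bad guess", load_or_create_salt(sync_dir))
```

Both devices end up with the same key because they share the salt and the passphrase, while the sync provider only ever sees the salt and whatever encrypted files the app writes next to it. One of the cons is that a forgotten passphrase makes the synced data unrecoverable.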

James Mason
  • 159
  • 3
1

Your client's concern seems to be about the visibility of this data. The first question to ask your customer is: if the data were encrypted, where could it be stored? Then, ask your customer what kinds of access controls they want in place before the data can be decrypted and processed. Where can the decryption key be stored? Is it a separate key per user? And so on.

If your customer doesn't want the data stored anywhere, do they want the user to enter it by hand each time?

Michael Shaw
  • 9,915
  • 1
  • 23
  • 36