
I'm interested in developing a large-scale user-facing website that is written in Java.

As for design, I'm thinking of developing independent, modular services that can act as data providers to my main web application.

As for writing these modular services (data providers), I can leverage an existing framework like Spring and develop these services following the RESTful design pattern, and expose resources via HTTP with a message format like JSON...or I can leverage an existing network framework like Netty (http://netty.io/) and serialization format like Protobufs (https://developers.google.com/protocol-buffers/docs/overview) and develop a TCP server that sends back and forth the serialized protobuf payload.

When should you choose one over the other? Would there be any benefit of using a serialization format like Protobufs and sending stream of bytes over the wire? Would there be overhead in just using JSON? How much overhead is there between using TCP/IP and using HTTP? When should you use Spring over Netty, and vice versa to build such a service?

HiChews123
  • It sounds like you're thinking more about the technology stack than you are about the actual requirements. How could any of us possibly answer this question without knowing what it is that you need to *do*? Are you creating a multiplayer game that's supposed to have near-zero latency? Or a social bookmarking application where most of the access is already via HTTP and you might be caching data for hours at a time and don't even care about freshness, let alone latency? – Aaronaught Oct 12 '13 at 13:34
  • I don't think OP is asking us to make a choice for him. He is simply asking a high-level question about how such choices are made and what factors are considered. I don't think there's anything wrong with providing a high-level answer to that... and I did. – DXM Oct 12 '13 at 15:39
  • I'm generally opposed to using binary formats unless you really have to. No binary file formats, no binary serializations, etc. For example, in Java, binary serializations cause incompatibilities between Java versions and between versions of your own software, but I believe that XML doesn't, at least not nearly as much. I would think of the following layering: TCP/IP > HTTP > XML. Of course, it would depend upon what you are doing. I think that JSON is an alternative to XML. I don't know much about Spring or Netty, though I do read that people are using Spring. – Kaydell Oct 14 '13 at 03:05
  • +1 DXM, I am asking high-level questions as food for thought when thinking about making such a decision. – HiChews123 Oct 16 '13 at 00:29

2 Answers


There are definitely pros and cons to JSON over HTTP (REST) versus straight TCP/IP with a binary protocol, and I think you already suspect that the binary protocol will be faster. I can't tell you exactly how much faster (that depends on a lot of factors), but I would guess maybe 1-2 orders of magnitude.

At first glance, if something is 10-100 times slower than something else, you might have a knee-jerk reaction and go for the "fast thing". However, this speed difference is only in the protocol itself. If there's database or file access on the server side, that won't be affected by your choice of transfer layer, and in some cases it makes the transfer layer's speed much less significant.

HTTP REST and JSON are good for a number of reasons:

  • They are easily consumable by just about anyone. You can write your web app, then turn around and publish your API for the rest of the world to use. Now anyone can hit the same endpoints and get to your services.
  • They are easy to debug. You can open a packet sniffer or simply dump incoming requests to text files and see what's going on. That is much harder with binary protocols.
  • They are easy to extend. You can add more attributes and data at a later time without breaking compatibility with old clients.
  • They are consumable by JavaScript clients (I'm not sure a protobuf parser for JavaScript exists yet; I don't believe there is one).

Protobufs over TCP/IP:

  • they are faster
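
To give a rough feel for where the size part of that difference comes from, here is a minimal sketch that encodes the same record as JSON text and as a hand-rolled binary layout. It is not protobuf: `DataOutputStream` with fixed-width fields is only a stand-in using the JDK, and real protobuf output (varints plus field tags) will have different sizes. The record fields (`id`, `name`, `score`) and the class name are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class PayloadSizeDemo {

    // Hypothetical record: a user id, a display name, and a score.
    static byte[] encodeJson(long id, String name, int score) {
        // Hand-built JSON; assumes `name` contains nothing that needs escaping.
        String json = "{\"id\":" + id + ",\"name\":\"" + name + "\",\"score\":" + score + "}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    // Stand-in for a binary format (NOT protobuf): fixed-width numbers, length-prefixed string.
    static byte[] encodeBinary(long id, String name, int score) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(id);    // 8 bytes, fixed width
            out.writeUTF(name);   // 2-byte length prefix + modified UTF-8 bytes
            out.writeInt(score);  // 4 bytes, fixed width
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] json = encodeJson(1103L, "alice", 24);
        byte[] binary = encodeBinary(1103L, "alice", 24);
        System.out.println("JSON bytes:   " + json.length);
        System.out.println("binary bytes: " + binary.length);
    }
}
```

The binary form carries no field names and no quoting, which is where most of the saving comes from. It is also why it cannot be read in a packet dump without knowing the layout.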

If it were my choice, I would hands down go with HTTP REST and JSON. There's a reason that so many other companies and websites went that route. Also keep in mind that in the future you could always support two endpoints. If your design is correct, your endpoint choice should be completely decoupled from your server-side business logic and the database. So if you realize later on that you need more speed for some or all requests, you should be able to add protobufs with minimal fuss. Right off the bat, however, REST/JSON will get you off the ground faster and get you further.
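
One way to picture that decoupling is sketched below, under some loud assumptions: `GreetingService`, `Codec`, `TextCodec`, and `BinaryCodec` are hypothetical names, the "JSON" is hand-built, and the binary codec uses JDK length-prefixed strings as a stand-in for a protobuf message. The point is only that the service never sees the wire format, so a second endpoint is just a second codec.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class EndpointDecouplingDemo {

    /** Business logic: no knowledge of HTTP, TCP, JSON, or binary formats. */
    static final class GreetingService {
        String greet(String userName) {
            return "Hello, " + userName;
        }
    }

    /** Wire-format boundary. Each endpoint plugs in its own codec. */
    interface Codec {
        String decodeRequest(byte[] payload);
        byte[] encodeResponse(String greeting);
    }

    /** Text codec: the request is a plain UTF-8 name, the response is a small JSON object. */
    static final class TextCodec implements Codec {
        public String decodeRequest(byte[] payload) {
            return new String(payload, StandardCharsets.UTF_8);
        }
        public byte[] encodeResponse(String greeting) {
            // Assumes the greeting contains nothing that needs JSON escaping.
            return ("{\"greeting\":\"" + greeting + "\"}").getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Binary codec: length-prefixed strings, standing in for a protobuf message. */
    static final class BinaryCodec implements Codec {
        public String decodeRequest(byte[] payload) {
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload))) {
                return in.readUTF();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        public byte[] encodeResponse(String greeting) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                out.writeUTF(greeting);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return buffer.toByteArray();
        }
    }

    /** Shared by every endpoint: decode, call the service, encode. */
    static byte[] handle(Codec codec, GreetingService service, byte[] request) {
        return codec.encodeResponse(service.greet(codec.decodeRequest(request)));
    }

    public static void main(String[] args) {
        GreetingService service = new GreetingService();
        byte[] textReply = handle(new TextCodec(), service, "alice".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(textReply, StandardCharsets.UTF_8));

        BinaryCodec binary = new BinaryCodec();
        // Requests and responses share one string layout here, so the encoder builds the request too.
        byte[] binaryReply = handle(binary, service, binary.encodeResponse("alice"));
        System.out.println(binary.decodeRequest(binaryReply));
    }
}
```

In a real service the HTTP endpoint (for example a Spring controller) and the TCP endpoint (for example a Netty handler) would each call something like `handle` with their own codec.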

As for Netty vs. Spring: I haven't used Netty directly, but I believe it is just a lightweight networking layer that you can build an HTTP server on, whereas Spring is a framework that provides a lot more for you than just that. It has data access layers, background job scheduling and (I think) an MVC model, so it is much more heavyweight. Which one to choose? If you decide to go the HTTP way, then the next question is probably: how standard is your app? If you are about to write some crazy custom logic that doesn't fit the standard mold and all you need is an HTTP server layer, go with Netty.

However, I suspect your app isn't that special and could probably benefit from a lot of the things that Spring has to offer. But that means that you should structure your app around Spring and do things the way it expects you to, which means learning more about Spring before diving into your product. Frameworks in general are great because, again, they get you off the ground faster, but the downside is that you have to fit into their mold instead of doing your own design and then expecting the framework to just work.

(*) - in the past it was pointed out that my posts do not reflect opinions of the entire world, so I'll go on the record and just add that I have limited experience with either Netty (I've used Play framework before which is based on Netty) or Spring (I've only read about it). So take what I say with a grain of salt.

DXM
  • +1, especially for "this speed difference is only in the protocol itself. If there's database/file access on the server side, that won't get impacted by your choice of the transfer layer". 99% that's exactly how it will be, and premature optimization (in the wrong place) won't help with that at all. – Shivan Dragon Oct 12 '13 at 20:48
  • Thank you for your lengthy response and in-depth analysis comparing the two. I understand the benefits of building a RESTful application because it's easily consumable by public clients. However, in the case where I want to keep everything in-house and do not want to expose the service (I take care of serialization/deserialization), I can't see why a custom binary protocol wouldn't be the first pick. Yes, you can get off the ground faster with existing frameworks, but at the expense of being locked into their APIs and having less fine-grained control. – HiChews123 Oct 16 '13 at 00:32
  • REST is easy to consume by ALL clients, not just public ones, but they are certainly included. My company has a product that we've been building for about a year now. We had a "proprietary" protocol that happened to be REST. We just opened it up to others. One thing they teach you in business school is "options thinking": make decisions that leave you as many options as possible, so that you can decide at a later date. So, all else being equal, I'd pick REST not because I have JS clients or API access today, but because I have the option of having them in the future if I need them. Then again, if... – DXM Oct 16 '13 at 03:38
  • ... you are set on using binary protocol, go for it. 96% chance your choice of protocol won't have any effect on your final application, so I wouldn't sweat that decision too much. And as I said in the answer, with decent design, you should be able to swap protocols at a later date anyway. Another thing I like to do is try both cases, if I'm on the fence on making a decision, I flip a coin and pick option A. Next time I do a similar project I pick option B just so I can then go back and compare/contrast my experience. Sometimes, that's the only way you decide for yourself which is better – DXM Oct 16 '13 at 03:41
  • @DXM, great responses, bravo! – HiChews123 Oct 16 '13 at 06:17

This is actually a non-question. According to the Internet protocol suite, TCP is a protocol in the transport layer and HTTP is a protocol in the application layer. You're comparing totally different things with each other. (See more here: http://en.wikipedia.org/wiki/Internet_protocol_suite)

In fact, most HTTP runs over TCP/IP. So to answer your question: yes, you should use TCP/IP. Then you want to add an application layer protocol on top of that (like HTTP), and then a data format (like JSON, XML, or HTML). Netty lets you use HTTP, and protobuf plays the same role as JSON, XML, or HTML.

It all depends on what your requirements are and what type of data you'll need to transport. Do you need sessions in your protocol? Can a handshake improve your protocol configuration? How much data will you send at once? Do you need encryption? These are questions you need to answer when choosing an application protocol.
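
As one concrete instance of what an application protocol has to decide: TCP delivers a byte stream, not messages, so anything layered directly on it needs some framing rule. HTTP already supplies one (for example the `Content-Length` header). The sketch below shows a common alternative, a 4-byte length prefix, using in-memory streams in place of a socket. The class name, the `roundTrip` helper, and the 1 MiB cap are assumptions for illustration, not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class FramingDemo {

    // Hypothetical cap so a corrupt or hostile length cannot force a huge allocation.
    static final int MAX_FRAME_BYTES = 1 << 20;

    /** Writes one message as a 4-byte big-endian length followed by the payload. */
    static void writeFrame(OutputStream stream, byte[] message) throws IOException {
        DataOutputStream out = new DataOutputStream(stream);
        out.writeInt(message.length);
        out.write(message);
        out.flush();
    }

    /** Reads one message, or returns null if the stream ends before a complete length header. */
    static byte[] readFrame(InputStream stream) throws IOException {
        DataInputStream in = new DataInputStream(stream);
        int length;
        try {
            length = in.readInt();
        } catch (EOFException endOfStream) {
            return null;
        }
        if (length < 0 || length > MAX_FRAME_BYTES) {
            throw new IOException("bad frame length: " + length);
        }
        byte[] message = new byte[length];
        in.readFully(message); // throws EOFException if the payload is cut short
        return message;
    }

    /** Sends all messages through one in-memory "connection" and reads them back. */
    static List<String> roundTrip(List<String> messages) {
        try {
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            for (String message : messages) {
                writeFrame(wire, message.getBytes(StandardCharsets.UTF_8));
            }
            InputStream in = new ByteArrayInputStream(wire.toByteArray());
            List<String> received = new ArrayList<>();
            for (byte[] frame = readFrame(in); frame != null; frame = readFrame(in)) {
                received.add(new String(frame, StandardCharsets.UTF_8));
            }
            return received;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(List.of("hello", "", "world")));
    }
}
```

With a real connection, the same two methods would take the socket's output and input streams. Sessions, handshakes, and encryption would still have to be added on top, which is part of what an existing protocol like HTTP saves you.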

For choosing a data representation format (JSON, XML, HTML, protobuf, etc.), it depends on your bandwidth, readability needs, language/tool support, etc.

You can't compare HTTP to TCP.

Remember that speed isn't everything. Speed is of no use if you can't express yourself in a sensible way.

iveqy
  • There is nothing in his question that suggests he doesn't know the difference between layers of the networking stack. He asked should he use HTTP (the fact that HTTP is a layer above TCP/IP is assumed) or use TCP/IP with his own custom protocol. There is nothing wrong with his question. – Michael Oct 14 '13 at 16:14
  • I disagree of course. That's not how I understood him – iveqy Oct 14 '13 at 17:35
  • Yes, I understand that HTTP is at a layer above TCP/IP, my question is indeed about thinking about making a decision in terms of tradeoffs - latency, speed of development, etc. Thanks for posing questions for me to think about, though! – HiChews123 Oct 16 '13 at 00:34
  • @iveqy re-understand him then. Your answer is a non-answer and just pollutes the thread. You're trying to answer a question that was never asked rather than simply conceding that you misunderstood the question. – David Cowden Oct 16 '13 at 01:46
  • @acspd7 I would avoid creating your own protocol. There are plenty of proven protocols out there, and unless your protocol is something that will give you an advantage over your competitors, you're probably better off with a standard one. I've implemented a custom protocol, and it was plenty of fun! However, encryption, hole punching, keep-alives, handshaking (different networks require different frame lengths), etc. are a lot of work to do well. Not to mention all the documentation you'll need. Think about which features you really need before doing something custom. – iveqy Oct 16 '13 at 04:34
  • @DavidCowden I couldn't know that I had misunderstood him before he said so. Just because Michael understood his question in another way doesn't mean that I was wrong. As it turns out I was, and that's fine; however, I do enough support work to have learned never to assume anything. – iveqy Oct 16 '13 at 04:36
  • @iveqy, I'm not interested in writing my own custom protocol actually, sorry for the misunderstanding. "Custom" in this context means something that isn't so mainstream as XML or JSON. Google Protocol Buffers (GPB) are getting more recognition, though. Check it out: https://developers.google.com/protocol-buffers/ What I'm actually concerned about is speed and compactness of the messaging. GPB seems to offer benefits in both speed and compactness of the message, so why not opt for it besides the fact that it's not a "well-known" format? – HiChews123 Oct 16 '13 at 06:07
  • GPB is well documented and used by many others, so I can't really see any problem with using it. Being more terse than XML and JSON should be great! (You might lose human readability, but if that's not a requirement...) However, aren't you missing a layer? Usually you have a layer between TCP and XML, JSON, or protobuf: something like HTTP, SSH, etc. – iveqy Oct 16 '13 at 06:31