
I'm trying to decide who is right in the following argument:

How to effectively process data in MSSQL? Which one is faster?

Opinion 1: Data requests (this is especially true for complex ones) should be handled by SQL queries and the result be returned to the requesting C# code.

Opinion 2: First, raw data should be requested by simple SQL queries (SELECT * over all the concerned (joined) tables), and selection by conditions should be handled in C# with LINQ or set operations involving lists, maps, arrays and so on, combined with iterations and local variables. This way, the heavy lifting happens in memory, making it faster than the disk-intensive operations of the SQL server.

Opinion 3: A smart mix of the above.

(Personally I think Opinion 2 just explains how Opinion 1 works under the hood, of course we have to replace the terms C#, sets, loops etc. with low level constructs that SQL uses.)
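To make the two opinions concrete, here is a minimal sketch. It uses Python with sqlite3 so the example is self-contained and runnable; the schema and data are hypothetical, but the same contrast applies to MSSQL queried from C#:

```python
import sqlite3

# Hypothetical in-memory demo database, just for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "alice", 50.0), (2, "bob", 200.0), (3, "alice", 300.0)])

# Opinion 1: let the database do the filtering and sorting.
big_spenders_sql = conn.execute(
    "SELECT customer FROM orders WHERE total > 100 ORDER BY customer"
).fetchall()

# Opinion 2: pull everything with SELECT * and filter in application code.
all_rows = conn.execute("SELECT * FROM orders").fetchall()
big_spenders_app = sorted({row[1] for row in all_rows if row[2] > 100})

print(big_spenders_sql)  # [('alice',), ('bob',)]
print(big_spenders_app)  # ['alice', 'bob']
```

Both paths produce the same answer; the difference is that in the second one every row crosses the wire to the application before any condition is applied.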

stevie
  • A SELECT * against a database is a real performance killer. And what disk-intensive operations are you talking about? Before your SQL server gives you your data, it also has to be loaded into memory, so you don't win anything on that point; it's a non-argument. – Pieter B Jan 12 '17 at 13:05
  • SQL Server is very good at joining related data. Unless you really know what you're doing, trying to replicate joins in C# with different sets of data will be much slower than in SQL, so I disagree with Opinion 2 in general. – Graham Jan 12 '17 at 19:39
  • The answer to this question is dependent on where the latency in your infrastructure exists, how your database is designed, the type of processing you need to do to the data, and probably other factors that I can't think of off the top of my head. – David Cloutman Jan 12 '17 at 20:10

3 Answers

4

Option 3 - a smart mix of application server and SQL server queries.

Both options 1 and 2 are dogmatic, and each has reasonable exceptions that depend on things like:

  • SQL Server load
  • Application server load
  • Contention on the SQL Server side (waits, locking etc...)
  • Licensing costs of the different servers (perhaps it is cheaper to do something on one server and not the other)
  • Actual logic of the query

In general, it is true that the simpler the queries, the less work SQL Server has to do, but this depends so much on the DB design, what indexes exist, how much data is involved and the relative "expensiveness" of the operations (time and space considerations, as well as actual money - costs of the servers and running them).
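As an illustration of how much an index alone can change the work the database does, here is a quick sketch (Python with sqlite3 for a runnable demo; MSSQL exposes the same idea through its execution plans):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, float(i)) for i in range(1000)])

# Without an index, the range filter is a full table scan.
before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE total > 990").fetchall()

conn.execute("CREATE INDEX idx_orders_total ON orders(total)")

# With the index, the same query becomes a range search.
after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE total > 990").fetchall()

print(before[0][-1])  # e.g. "SCAN orders" (wording varies by SQLite version)
print(after[0][-1])   # e.g. "SEARCH orders USING INDEX idx_orders_total (total>?)"
```

Same query, same data, very different plan, which is exactly why "simpler queries are cheaper" can't be stated as a blanket rule.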


For what it's worth - at Stack Overflow we do 3.

Oded
1

Opinion 3: A smart mix of the above.

The definition pretty much says it all. The only question is whether a smart mix can be achieved, which depends on those writing the code in the first place.

0

Option 1 is "faster" for an individual request, but scales badly if the selection logic is complex, because usually one SQL box is shared among many app boxes.

Option 2 as written will entail huge data transfers and possibly running out of memory on the app boxes. I assume you mean having at least SOME WHERE clauses against indexed columns?

I don't think you can sum up everything else as Option 3 ("be smart!"). There is a right and a wrong answer to "should clause X be in the SQL?", but you can only be sure if you measure the performance: not just of the query, but of the system as a whole, under load.

Generally I have found it to be closer to option 2 than option 1. I guess this question tends to come up when a DBA proposes doing ALL logic on the database because "it's faster", which is definitely not correct.
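To see why replicating a join client-side costs extra round trips and transfers, here's a hypothetical sketch (Python with sqlite3 so it runs anywhere; the tables and values are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'alice'), (2, 'bob');
INSERT INTO orders VALUES (10, 1, 50.0), (11, 2, 200.0);
""")

# The database joins and filters in one round trip.
in_db = conn.execute(
    "SELECT c.name, o.total FROM orders o "
    "JOIN customers c ON c.id = o.customer_id WHERE o.total > 100"
).fetchall()

# Replicating the join in application code: two full fetches,
# then a hash join and the filter on the client side.
customers = dict(conn.execute("SELECT id, name FROM customers"))
orders = conn.execute("SELECT customer_id, total FROM orders").fetchall()
in_app = [(customers[cid], total) for cid, total in orders if total > 100]

print(in_db)   # [('bob', 200.0)]
print(in_app)  # [('bob', 200.0)]
```

Both give the same rows here, but the client-side version fetched every row of both tables first, which is the part that blows up under load.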

Ewan
  • Yeah, I asked the guy proposing Option 2 and he admitted that pre-filtering the query with at least some WHERE clauses is preferable :-P – stevie Jan 12 '17 at 13:41