1

I'm writing a program for a company that will generate a daily report for them. All of the data that they use for this report is stored in a local SQLite database. For this report, the utilize pretty much every bit of the information in the database. So currently, when I query the datbase, I retrieve everything, and store the information in lists. Here's what I've got:

using (var dataReader = _connection.Select(query))
{
    if (dataReader.HasRows)
    {
        while (dataReader.Read())
        {
            _date.Add(Convert.ToDateTime(dataReader["date"]));
            _measured.Add(Convert.ToDouble(dataReader["measured_dist"]));
            _bit.Add(Convert.ToDouble(dataReader["bit_loc"]));
            _psi.Add(Convert.ToDouble(dataReader["pump_press"]));
            _time.Add(Convert.ToDateTime(dataReader["timestamp"]));
            _fob.Add(Convert.ToDouble(dataReader["force_on_bit"]));
            _torque.Add(Convert.ToDouble(dataReader["torque"]));
            _rpm.Add(Convert.ToDouble(dataReader["rpm"]));
            _pumpOneSpm.Add(Convert.ToDouble(dataReader["pump_1_strokes_pm"]));
            _pumpTwoSpm.Add(Convert.ToDouble(dataReader["pump_2_strokes_pm"]));
            _pullForce.Add(Convert.ToDouble(dataReader["pull_force"]));
            _gpm.Add(Convert.ToDouble(dataReader["flow"]));
        }
    }
}

I then utilize these lists for the calculations. Obviously, the more information that is in this database, the longer the initial query will take. I'm curious if there is a way to increase the performance of the query at all? Thanks for any and all help.

EDIT

One of the report rows is called Daily Drilling Hours. For this calculation, I use this method:

// Retrieves the timestamps where measured depth == bit depth and PSI >= 50
public double CalculateDailyProjectDrillingHours(DateTime date)
{
    var dailyTimeStamps = _time.Where((t, i) => _date[i].Equals(date) &&
                                        _measured[i].Equals(_bit[i]) &&
                                        _psi[i] >= 50).ToList();
        return
            _dailyDrillingHours =
            Convert.ToDouble(Math.Round(TimeCalculations(dailyTimeStamps).TotalHours, 2, MidpointRounding.AwayFromZero));
}

// Checks that the interval is less than 10, then adds the interval to the total time
private static TimeSpan TimeCalculations(IList<DateTime> timeStamps)
{
    var interval = new TimeSpan(0, 0, 10);
    var totalTime = new TimeSpan();
    TimeSpan timeDifference;

    for (var j = 0; j < timeStamps.Count - 1; j++)
    {
        if (timeStamps[j + 1].Subtract(timeStamps[j]) <= interval)
        {
            timeDifference = timeStamps[j + 1].Subtract(timeStamps[j]);
            totalTime = totalTime.Add(timeDifference);
        }
    }

    return totalTime;
}
PiousVenom
  • 247
  • 2
  • 9
  • 2
    Have you given thought to moving the calculations to the database? – System Down Oct 23 '12 at 19:42
  • 2
    you need to learn SQL, a report shouldn't need much if any code other than SQL to be generated. – Ryathal Oct 23 '12 at 19:44
  • One way would be to write a stored procedure that basically creates the exact table you need, and pull that from the database as a C# object. Do all of the calculation and transformation on the database... – aserwin Oct 23 '12 at 19:48
  • 1
    @aserwin [SQLite doesn't support stored procedures.](http://stackoverflow.com/q/3335162/41071) – svick Oct 23 '12 at 19:54
  • D'OH! Then forget my idea. – aserwin Oct 23 '12 at 19:57
  • 1
    I don't think we can evaluate the query without seeing the table structure, the SQL of the query, and maybe some actual benchmark results and an execution plan. And in that case, you would be better off having this migrated to http://dba.stackexchange.com/questions or http://stackoverflow.com/questions. – FrustratedWithFormsDesigner Oct 23 '12 at 20:01
  • I've never really worked with SQL before now, and am just learning as I go. Having experienced two vastly different sized databases, and having seen the drastic slowness of the bigger one, I was just wondering if there was anyway to make it faster. I realize this may not be a very elegant way of doing things, but it gets the job done, and that's all my employer really cares about. The speed issue he's not worried about. But I am. I just feel it could be better. – PiousVenom Oct 23 '12 at 20:03
  • @Prayos: Without knowing any specifics, all I can say is: look into indexing. Once you can provide actual SQL, DDL for the table(s), and maybe some timing numbers and execution plans, post to the other two sites I mentioned in my last comment. – FrustratedWithFormsDesigner Oct 23 '12 at 20:10

1 Answers1

6

If I understand what you're doing correctly, then you're actually copying the whole database into memory and querying it using Linq. This is extremely wasteful since you now have the data both in memory and in the database itself. What you need to do it to query the database itself for the data you need, which is what a database was designed for. For that you'll need to learn SQL. For instance, look at this Linq query:

var dailyTimeStamps = _time.Where((t, i) => _date[i].Equals(date) &&
                                        _measured[i].Equals(_bit[i]) &&
                                        _psi[i] >= 50).ToList();

You're pulling a list of all timestamps just to filter it to a few. However, using a SQL query like this:

SELECT timestamp 
FROM table
WHERE date = @date
AND measured_dist = bit_loc
AND pump_press >= 50

This will immidiately give you a list of the timestamps you need, without having to load the whole data in memory.

System Down
  • 4,743
  • 3
  • 24
  • 35
  • I had assumed, and probably rather incorrectly, that it was better to have 1 query get all the data and work with that, than to run individual queries for each calculation. I'm guessing that this is not the case? – PiousVenom Oct 23 '12 at 20:19
  • 2
    @Prayos - DBs are optimized for that sort of data querying, so it's best to leave it to them. See this recent question http://programmers.stackexchange.com/questions/171024/never-do-in-code-what-you-can-get-the-sql-server-to-do-well-for-you-is-this – System Down Oct 23 '12 at 20:25
  • This might sound incredibly dumb, but is it possible/smart to thread the queries to have them all running simultaneously? – PiousVenom Oct 23 '12 at 20:40
  • 1
    @Prayos - Again, the DB usually does that better than you can. Also, multi threading isn't as straight forward as most people might think, and have their own overhead. – System Down Oct 23 '12 at 20:41