
Using Python and sqlite3. This simple print loop takes a few seconds to execute. I don't know much about databases, so I'm guessing I'm not using these queries correctly. Surely sqlite should be faster? The entire db has 5k or so rows and just a few columns if that matters.

import sqlite3 as lite

con = lite.connect('gamelist.db')
with con:
    cur = con.cursor()
    cur.execute("SELECT * FROM GameList WHERE processed IS NULL LIMIT 4")
    for i in range(0, 4):
        print cur.fetchone()
    Have you done any research on query optimization? Start with 'Select *' – JeffO May 04 '14 at 21:34
  • and have you considered that a query that has a limit of 4 may return less than 4 rows? – jwenting May 05 '14 at 06:37
  • and not just query optimisation, but database optimisation. And network performance, and all the other things that can cause a database to appear slow. – jwenting May 05 '14 at 06:38
  • You're very helpful guys, but I have very little experience with sqlite and was just wondering if I was abusing the queries horribly. Turns out my memory was just hogged by a process that wasn't terminated. It works fine now. I really don't think there's all that much wrong with clumsy sub-optimal make-shift loops when you're just trying to get a feel for what you're doing... – Benjamin Lindqvist May 05 '14 at 11:03
  • Use EXPLAIN QUERY PLAN in sqlite3 utility prompt to understand what is your query actually doing. – lowtech May 22 '14 at 21:44
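
For reference, here is a minimal sketch of that EXPLAIN QUERY PLAN suggestion run from Python rather than the sqlite3 shell, assuming the gamelist.db file and GameList table from the question:

import sqlite3

con = sqlite3.connect('gamelist.db')
cur = con.cursor()
# Prefixing the query with EXPLAIN QUERY PLAN returns the plan instead of the data.
cur.execute("EXPLAIN QUERY PLAN "
            "SELECT * FROM GameList WHERE processed IS NULL LIMIT 4")
for row in cur.fetchall():
    # Each row describes one step, e.g. 'SCAN TABLE GameList' means a full
    # table scan; an index on the processed column would show up here instead.
    print(row)
con.close()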

1 Answer


Don't guess, measure

When you have an issue related to performance, don't guess. Grab a profiler, measure your code, and decide what to optimize based on the results, not your or somebody else's guesses.

Your first guess is that sqlite3 is slow. Your second guess is that you have memory leaks. Chances are, you'll find when profiling your app that neither of your two guesses is even close to the actual issue. It happens to every developer; that's why one shouldn't guess.

The Python documentation lists a few profilers for Python. If you've never used one, grab a profiler of your choice and see where it gets you. Bonus points for coming back and posting the results by editing your question.
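
As an illustration, here is a minimal sketch using cProfile from the standard library; it wraps the snippet from the question in a function (the function name is made up for the example):

import cProfile
import sqlite3 as lite

def fetch_and_print():
    con = lite.connect('gamelist.db')
    with con:
        cur = con.cursor()
        cur.execute("SELECT * FROM GameList WHERE processed IS NULL LIMIT 4")
        for row in cur.fetchmany(4):
            print(row)

# Sort the report by cumulative time so the most expensive calls appear first.
cProfile.run('fetch_and_print()', sort='cumulative')

The report shows how much of the total time is spent in connect, execute, fetching and printing, which is exactly the data you need before deciding what to optimize.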

Be precise

Also, when dealing with performance, be as precise as you can. Terms like "a few seconds", "slow", and "standard queries" mean nothing when it comes to performance, profiling and optimization.

  • There is no such thing as "a few seconds". There is statistical data about the performance of a given piece of code run in a very specific context. Something which takes five seconds on a Pentium IV we threw away yesterday may run in a few milliseconds on the new server we just purchased.

  • There is no such thing as code that is "too slow": there is a piece of code which performs at an average of 327 ms on given hardware under a given load, while a non-functional requirement (see also this answer, points 2 and 8) specifies that it should complete in under 300 ms in 95% of cases.

  • There is no such thing as a "standard query". For one person, a standard query may include joins; for another, it's a select, insert, update or delete with nothing else, not even a where clause.

You specified the number of rows in the database. That's the precision I'm talking about. Measuring a select on ten rows is not the same as on one billion rows.

Reduce the measured code to a minimum

When you can't use a real profiler, it can help to strip out all the code you don't need, and then measure. In your particular case, that means removing the conversion of each row to a string and the printing of that string to the console.

Try actually measuring the performance of the same code without printing the results and see how that affects the timing.
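
As a rough sketch of that comparison, timeit from the standard library can time the query with and without the printing (the helper function names are invented for the example; the query is the one from the question):

import timeit
import sqlite3 as lite

def query_only():
    # Run the query and fetch the rows, but do not print anything.
    con = lite.connect('gamelist.db')
    cur = con.cursor()
    cur.execute("SELECT * FROM GameList WHERE processed IS NULL LIMIT 4")
    rows = cur.fetchall()
    con.close()
    return rows

def query_and_print():
    # Same query, but with the console output included in the measurement.
    for row in query_only():
        print(row)

print(timeit.timeit(query_only, number=100))
print(timeit.timeit(query_and_print, number=100))

If the two numbers are close, the query itself is the cost; if the second one is much larger, the time is going into console output rather than into sqlite.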

Arseni Mourzenko