11

One programmer is testing and comparing the same application which uses the same database structure, and the same data, only in two separate databases, one with Oracle 8 and one with Oracle 9.

The app runs a query with no ORDER BY clause.

He claims that the ORDER-BY-less query should return the rows in the same order in both databases.

I tell him there's no warranty of the same row order unless you explicity provide an ORDER BY clause.

The database has the same indexes and keys. But the explain plan shows that in one of the databases the engine is using the key of one of the joined tables whereas in the other database it's using another's.

He insinuates that the two DB environtments are not equal, which is so because they have different statistics, different rdbms engines, etc., but not because I failed to replicate every index the original database has.

I tell him he must explicity provide an ORDER BY clause if the order is really that important.

The question

So I can explain him better:

In what order does a query fetch rows when you don't explicity provide an ORDER BY clause, and why doesn't that query return the rows in the same order ?

Tulains Córdova
  • 39,201
  • 12
  • 97
  • 154
  • 3
    It's undefined. I don't believe that the SQL specification states the specific order that records are to be returned, so it's going to be implementation dependent. – Robert Harvey Aug 09 '13 at 21:34
  • 1
    @RobertHarvey That's exactly why I tell him. In fact: by definitions relational tables must have no particular order. – Tulains Córdova Aug 09 '13 at 21:36
  • 1
    It might make sense even on the exact same version of software to use a different index because it may be more selective based on the index statistics it has gathered for the exact data in that database. So you *really* can't rely on the order if you don't specify it. – psr Aug 09 '13 at 23:21
  • You can realistically expect the first query by unindexed column to follow "insert time" order; by indexed "update time". Consecutive requests may likely be "tainted" by cached results and so quite randomized; nevertheless never depend on it - it may change from version to version, by parameters, by update operations and by bad weather on the full moon. "Undefined" is the correct answer, and anything else is at best informed guesses. – SF. Aug 10 '13 at 17:44
  • 1
    Result sets returned by RDMBS-es are just that: **sets**, which by definition don't have a particular order. So the RDBMS can return them in any order it likes and change the order again on the next query execution. Relying on a particular order without an ORDER BY clause would be a mistake. I always try to explain this to my coworkers, but I'm only sucessful half the time :D . – Radu Murzea Aug 11 '13 at 08:52
  • Duplicate of this http://stackoverflow.com/questions/3764475/oracle-sql-returns-rows-in-arbitrary-fashion-when-no-order-by-clause-is-used – MarkJ Aug 11 '13 at 14:40

2 Answers2

25

From Wikipedia:

The ORDER BY clause identifies which columns are used to sort the resulting data, and in which direction they should be sorted (options are ascending or descending). Without an ORDER BY clause, the order of rows returned by an SQL query is undefined.

So it's undefined.

The SQL specification doesn't state the specific order that records are to be returned, so it's going to be implementation dependent.

With no indexes on the table, the sensible order would be the order in which the records were inserted. With a Primary Key defined, the sensible order would be the order of the Primary Key. But since the ANSI spec doesn't require a specific order, it's up to the vendor, and their sensibilities may differ from yours or mine.

Since the order is not stated in the specification, it is unwise to rely on the behavior of a particular vendor's implementation, since it can vary from one vendor to another, and the vendor may change the order any time they wish, without warning.

As you said, just include the ORDER BY clause, if order is important.

Radu Murzea
  • 1,810
  • 2
  • 18
  • 24
Robert Harvey
  • 198,589
  • 55
  • 464
  • 673
  • The query has several joined tables. So one DB engine sorts the resultset by one criteria and the other DB used another criteria. Obvioulsy, when an ORDER BY clause is provided, both queries return the rows in the specified order. – Tulains Córdova Aug 09 '13 at 21:40
  • +1 AFAICR the order is even allowed to vary each time you execute any given query on a particular database instance. – MarkJ Aug 11 '13 at 07:31
  • 2
    I would say that, without an ORDER BY clause, the only sensible result order is whatever creates the least overhead. For simple DB storage engines and queries, that will often be insertion order (and for generated primary keys the same as primary key order). But as soon as you have hash joins in there, I'd expect the order to be essentially random. – Michael Borgwardt Aug 11 '13 at 12:58
2

Obviously pointing out that the spec doesn't say what order the data comes in, hasn't worked. Probably because he knows the data is on a disk or in memory somewhere and thus has that as an order. Ask him what is the order of calculated data from several tables. I.e. create an example where you join 4 tables, doing a calculation on 2 of them and returning just the calculated value.

The engine returns the data in the order it finds it (absent an order by), but how it finds it depends upon factors that can change - index's, stats, caches. Generally the data will be in a consistent order -- but if you depend upon the order, you need to ask for it.

jmoreno
  • 10,640
  • 1
  • 31
  • 48