41

Is varchar just a remnant from before text came around, or are there use cases where you would want to use a varchar? (Or char for that matter..)

(I use Postgres and MySQL (MyISAM) daily, so those are what I'm most interested in, but answers for other databases are of course welcome. ^_-)

Oded
  • 53,326
  • 19
  • 166
  • 181
Izkata
  • 6,048
  • 6
  • 28
  • 43
  • 6
    At least for [SQL Server](http://msdn.microsoft.com/en-us/library/ms187993.aspx), `text` is deprecated. There are also considerations of usage that are related to where the data is stored and how it therefore gets accessed. – Oded Jul 09 '12 at 19:08
  • On some DBMSs you may not be able to use a text column in a sort or where clause. I'm not familiar with Postgres but check your documentation. – jqa Jul 10 '12 at 00:12
  • 1
    [This StackOverflow question](http://stackoverflow.com/questions/4848964/postgresql-difference-between-text-and-varchar-character-varying) might provide some more info. – J0ANMM Jan 20 '17 at 08:40

3 Answers3

38

In General

text columns are non standard and implementation specific. In many cases, depending on the database they may have a combination of one or more of the following restrictions: not indexable, not searchable and not sortable.

In Postgres

All these types are internally saved using the same C data structure..

In MySQL

The text column is a specialized version of BLOB and has restrictions on indexing.

Just these two examples can be extrapolated to the other SQL RDBMS systems and should be reason enough to understand when to choose one type over the other.

Just to make it implicitly clear, you should never use TEXT as it is proprietary and non-standard. Any SQL you write against it will not be portable and will guaranteed to cause you problems in the future. Only use types that are part of the ANSI Standard.

  • Use CHAR when you know you have a fixed number of characters for every entry.
  • Use VARCHAR when you have a variable number of characters for every entry.
  • If you need more storage than VARCHAR can provide, CLOB with UTF-8 encoding or equivalent standard type.
  • NEVER use TEXT as it is non-standard.
  • 1
    Accepted for `non standard and implementation specific` and `not indexable, not searchable and not sortable`, which I didn't realize. I was under the impression `text` _was_ standardized. – Izkata Aug 08 '12 at 02:48
  • 1
    do you mean the ASCII `text` standard or the UNICODE `text` standard :-) or one of the other half dozen `text` encoding standards? –  Aug 08 '12 at 03:19
  • Nono, I really mean as a database datatype - one that was, perhaps, added after `char` and `varchar` =P – Izkata Aug 08 '12 at 03:26
  • 1
    if you go digging through the SQL standards documents I don't think you will find anything about `text` as a character type. I haven't seen anything, some vendors call it `long char` and the like, it is basically a BLOB with an encoding attached to it. –  Aug 08 '12 at 15:21
  • Just to confirm: In Postgres, there is no point of ever using `CHAR` or `VARCHAR` instead of `TEXT`, right? So why do those field types then still exist / why are they not deprecated? – Martin Thoma Jun 01 '19 at 08:29
  • @MartinThoma How do you read this answer as *always use the non-standard option*. –  Jun 01 '19 at 17:22
  • 2
    @JarrodRoberson to be honest there are plenty of reputable resources that do conclude (when in Postgres environment) that "always use `TEXT`". If you're going to migrate to a different database, that's hardly a deal breaker, especially since you'll have to consider that postgres' unlimited `VARCHAR` (due to TOAST there's no row limit like for example with MySQL) may not translate to unlimited `VARCHAR` in other databases anyway. – Kayaman Jun 11 '19 at 08:17
  • 2
    ...and as [Postgres doesn't support CLOB](https://www.postgresql.org/docs/current/unsupported-features-sql-standard.html), the second to last point doesn't hold. You'll never be able to support drop-in replacements *even* if adhering to the standard. As well as writing ANSI SQL isn't a viable option in the real world, unless you're writing toy SQL. – Kayaman Jun 11 '19 at 09:20
13

text, varchar and char are all used for different reasons. There are of course implementation differences (how much size they occupy .. etc), but also there are usage and intent considerations. What type you use also tells you something about the kind of data that will be stored in it (or we'd all use text for everything). If something has a fixed length, we use char. If it has variable length with a well defined upper limit then use varchar. If it's a big chunk of text that you have little control over then text would be probably your best bet.

System Down
  • 4,743
  • 3
  • 24
  • 35
  • I asked a similar question a little while ago, and got a similar answer. I guess the MAIN problem is trying to decide when varchar is no longer more efficient than TEXT. I saw a formula for how a varchar is stored but I didn't sit down and ponder it. Is there a rule of thumb, like, after [256,000] characters it is better to use a TEXT? (btw, if you want to answer my question along this line better, find it here: http://bit.ly/Ot2JJF, or just answer it here...) – BillyNair Jul 10 '12 at 09:33
  • 3
    Sooooooo, the only real difference is to duplicate the bounds-checking that should probably be in the program code anyway? – Izkata Jul 10 '12 at 15:10
  • 2
    @Izkata - There are implementation differences as well. It isn't about bounds checking, its about data *type*. A (US) zip code is always a 5 digit code, so using something like 'char' becomes part of the definition of this piece of data. If it was only stuff like bound checking we could all just use one data type for *everything* and do our checking and casting code side. – System Down Jul 10 '12 at 15:57
  • 7
    @SystemDown As far as I know, `char`, `varchar`, and `text` are all designed for storing the same _type_ of data. So both answers here are about bounds checking. If there are efficiency differences, what are they? _Why would I use `varchar` over `text`?_ – Izkata Jul 10 '12 at 16:23
  • 1
    float and double are also used for the same type of data, yet they have differences and are used differently. As for implementation differences, I'm not acquainted enough with Postgres to answer that I'm afraid. – System Down Jul 10 '12 at 16:51
  • 4
    @SystemDown Although storing postal codes as a char(5) may bite you if you start internationalizing. UK post codes vary in length and 5 characters is almost never enough. I don't know if the space in a UK post code is relevant for the parsing, though. – Vatine Aug 05 '12 at 13:05
5

Databases are intensely concerned with performance--speed and minimizing storage. In most other parts of the computer world, you are not going to be bothered about how many characters are in your character string; it could be one, it could be the entire contents of an encyclopedia; it's all just a string. In fact, a lot of languages don't even bother you about whether it's a string or a number.

But as computers get faster and gain more memory, people put more data into their databases and do fancier queries. For a database CPU and memory are just as limiting today as they were in the days of 64Kb main memory and 10Mb hard drives (on mainframe computers).

A fixed number of bytes is a lot easier to deal with than a variable length number. 10 bytes is a lot easier to deal with than 1,000,000. So your database wants you to give it a clue so it can give you a gigabyte of results from terrabytes of data in microseconds. If you're not using your database that hard, you won't need the speed it's offering and will be annoyed at the needless questions. But if you do need the performance, you'll be happy to give it some hints.

As noted in the other answers, use char if it always uses a certain number of characters, varchar if the length can vary but it doesn't get too large (my guess is most DB's treat it as a char or text depending on size), and text if it could be any length. If your SQL tries to use a text column, it might be best to summarize it somehow and put it in a char or small varchar column also, then do where's and order by's on that. Of course, that's only if performance matters to you.

RalphChapin
  • 3,270
  • 1
  • 14
  • 16