Why do we need to put N before strings in Microsoft SQL Server?

Question

I'm learning T-SQL. From the examples I've seen, to insert text into a varchar() column, I can write just the string to insert, but for nvarchar() columns, every example prefixes the strings with the letter N.

I tried the following query on a table which has nvarchar() columns, and it works fine, so the prefix N is not required:

insert into [TableName] values ('Hello', 'World')

Why are the strings prefixed with N in every example I've seen?

What are the pros or cons of using this prefix?

`N` means National, as in "National Varying Character", see [Equivalent ANSI SQL Data Types](http://office.microsoft.com/en-us/access-help/equivalent-ansi-sql-data-types-HP001032229.aspx). — ErikE, Jun 17 '14 at 20:16
I agree with this question and nobody has answered it so far, AFAICT. Maybe it could be restated as “why is it bad to let SQL implicitly convert my `VARCHAR` to `NVARCHAR` when my string literal is ASCII?”. — binki, Oct 14 '14 at 14:11
This question was already asked and answered here: [What is the difference between varchar and nvarchar?](http://stackoverflow.com/q/144283/439793) — , Oct 14 '14 at 17:29

score 29 · Answer 1 · edited Oct 09 '18 at 09:35

NVarchar is used for Unicode. If your database does not store multilingual data, you can keep using Varchar. As an example, N'abc' simply makes the string literal a Unicode (nvarchar) value.

edited Oct 09 '18 at 09:35 by Lightness Races in Orbit

answered Jul 06 '12 at 14:52 by Pieter B

Why don't you have to prefix with U instead of N then? – Attila Kun Jul 06 '12 at 21:42
U could be confused for unsigned as a guess – JB King Jul 06 '12 at 23:22
`U&'abc'` is the right way to specify Unicode strings. See the [SQL 2003 BNF](http://savage.net.au/SQL/sql-2003-2.bnf.html#Unicode%20character%20string%20literal) – ceving Oct 26 '15 at 13:27
The N actually stands for "National Language Character" set. – Mike Bovenlander Apr 18 '18 at 09:47

score 25 · Answer 2 · answered Jul 06 '12 at 16:02

By default, SQL Server uses the Windows-1252 code page for varchar. It covers most characters of Western European languages (English, German, French, etc.) but lacks characters needed by other languages (Polish, Russian, etc.). As stated by @Pieter B, nvarchar gets around that issue because it stores Unicode, which contains those missing characters. This comes at a cost: nvarchar takes twice as much space to store as varchar.

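Putting N in front of your string makes it a Unicode literal, so the characters are already Unicode when they are placed into an nvarchar column. Without it, the literal is first interpreted in the database's code page, and any character that code page lacks is lost (typically replaced with ?) before the implicit conversion to nvarchar. Most of the time you will be okay leaving the N off, but I wouldn't recommend it. It is a lot better to be safe than sorry.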
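A rough illustration of why the prefix matters, sketched in Python rather than T-SQL. This is only an analogy for the code-page step, not SQL Server's actual conversion routine; it assumes a Windows-1252 code page, and `through_code_page` is a hypothetical helper name.

```python
def through_code_page(text: str, code_page: str = "cp1252") -> str:
    """Round-trip text through a single-byte code page (analogy only).

    This mimics what can happen to a literal written without N when it is
    interpreted in a code page before being converted to Unicode.
    """
    # Characters the code page lacks are replaced with '?'.
    raw = text.encode(code_page, errors="replace")
    return raw.decode(code_page)


ascii_text = through_code_page("Hello")     # survives unchanged
russian_text = through_code_page("Привет")  # every letter is lost
print(ascii_text, russian_text)
```

Under these assumptions, pure ASCII text comes through intact, which is why leaving the N off usually appears to work, while text outside the code page is silently damaged.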

answered Jul 06 '12 at 16:02 by bwalk2895

Just a clarification: "By default" SQL server uses the encoding corresponding to the Varchar field's collation, which is overridable at the time of creation of the field, generally based on the default collation for your instance. The default collation for your instance can be set at install time, but generally corresponds to the system default locale's CP_ACP. That will be Windows 1252 on a US-English machine, but 932 on a machine with a Japanese system locale, 1251 on a Russian machine, etc. The moral of the story? Use NVarchar :) – JasonTrue Jul 14 '12 at 04:12
So far this is the only answer that addresses the question as asked "Why use the N prefix on literal strings since SQL will implicitly transcode?". The other answers are all for a different question "What's the difference between nvarchar vs. varchar?" – Tim Sparkles Jul 31 '18 at 21:30

dan04 · Answer 3 · 2021-10-19T23:28:06.267

Edit: Microsoft finally introduced UTF-8 support in SQL Server 2019 (15.x). So this answer, originally posted in 2012, is outdated.

Because MS SQL Server has poor support for UTF-8 compared to other RDBMS.

MS SQL Server follows the convention, used within Windows itself, that "narrow" strings (char in C++, CHAR or VARCHAR in SQL) are encoded in a legacy "code page". The problem with code pages is that they have a limited number of characters (most are single-byte encodings, which limits the repertoire to 256 characters) and are designed around a single language (or group of languages with similar alphabets). This makes it hard to store multilingual data. For example, you can't store both Russian and Hebrew data in the same column because Russian uses code page 1251 and Hebrew uses code page 1255.
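The Russian/Hebrew conflict can be checked with a small sketch. It uses Python's built-in `cp1251` and `cp1255` codecs as stand-ins for the Windows code pages; `fits` is a hypothetical helper, and the sample words are arbitrary.

```python
def fits(text: str, code_page: str) -> bool:
    """Return True if every character of text exists in the code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


russian, hebrew = "Привет", "שלום"
# Each legacy code page covers one script; UTF-16 covers both.
for code_page in ("cp1251", "cp1255", "utf-16-le"):
    print(code_page, fits(russian, code_page), fits(hebrew, code_page))
```

Neither single-byte code page can hold both strings, while a Unicode encoding holds both.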

Unicode solves this problem by using a single giant coded character set with room for more than a million characters, enough to represent every language in the world. There are several Unicode encoding schemes; Microsoft prefers to use UTF-16, for historical reasons. Because UTF-16 represents strings as a sequence of 16-bit code units instead of the traditional 8-bit, a separate character type is needed. In MSVC++, this is wchar_t. And in MS SQL, it's NCHAR or NVARCHAR. The N stands for "national", which seems backwards to me because Unicode is about inter-nationalization, but that's the ISO terminology.

Other SQL implementations let you store UTF-8 text in a VARCHAR column. UTF-8 is a variable-length (1-4 bytes per character) encoding that's optimized for the case when your data is mostly in the Basic Latin range (which is represented as the same 1 byte per character as ASCII), but can represent any Unicode character. Thus, you'd avoid the "twice as much space" problem mentioned by bwalk2895.
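The space trade-off is easy to measure. A minimal sketch, where `byte_sizes` is a hypothetical helper and the sizes are for the raw encoded text only (no byte-order mark, no database storage overhead):

```python
def byte_sizes(text: str) -> tuple:
    """Return (UTF-8 size, UTF-16 size) in bytes for text."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


print(byte_sizes("Hello"))   # (5, 10): UTF-16 doubles ASCII text
print(byte_sizes("Привет"))  # (12, 12): no saving for Cyrillic
```

So the saving from UTF-8 applies to text that is mostly ASCII; for other scripts the two encodings can come out the same size.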

Unfortunately, MS SQL Server does not support UTF-8 VARCHAR, so you have to either use UTF-16 instead (and waste space for ASCII text), use a non-Unicode code page (and lose the ability to represent foreign characters), or store UTF-8 in a BINARY column (and deal with inconveniences like SQL string functions not working properly, or having to view the data as a hex dump in your GUI DB manager).

In versions earlier than SQL Server 2012, they were using UCS-2 encoding, which is strictly 2 bytes per character. In newer versions, they are using UTF-16, which is variable-length and uses up to 4 bytes per character (similar to UTF-8 but starting at 2 bytes). — j123b567, Jan 19 '15 at 13:05
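To see the variable-length behaviour of UTF-16 described in this comment, here is a short sketch. It illustrates the encoding itself, not SQL Server's storage, and `utf16_code_units` is a hypothetical helper.

```python
def utf16_code_units(text: str) -> int:
    """Count the 16-bit code units UTF-16 needs for text."""
    return len(text.encode("utf-16-le")) // 2


print(utf16_code_units("A"))           # 1 code unit (2 bytes)
print(utf16_code_units("\U0001F600"))  # 2 code units (4 bytes, a surrogate pair)
```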
