30

I've just read this page http://weblogs.asp.net/scottgu/archive/2010/06/10/jquery-globalization-plugin-from-microsoft.aspx

One of the things they did was to convert the date to the Arabic calendar for the Arabic culture. I'm wondering whether that is a good idea at all. Would it actually be annoying or confusing for the user (even if the user is Arabic)?

Also, my second question: do we really need to change 33,899.99 to 33.899,99 for some cultures like German? It doesn't hurt to do so, since the library already does it for us, but wouldn't this actually cause more confusion for the user (even if he is German, etc.)?
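To make the conversion concrete, here is a minimal sketch of what such a library does when it switches separators. The helper name `format_number` is made up for illustration and is not the plugin's API; it only swaps the grouping and decimal characters and assumes groups of three digits.

```python
def format_number(value: float, group_sep: str, decimal_sep: str) -> str:
    """Format with two decimals, using the given separators (hypothetical helper)."""
    # Start from the "1,234.56" style that Python produces natively.
    us_style = f"{value:,.2f}"
    # Swap both characters in a single pass so they cannot clobber each other.
    return us_style.translate(str.maketrans({",": group_sep, ".": decimal_sep}))


print(format_number(33899.99, ",", "."))  # English-style separators
print(format_number(33899.99, ".", ","))  # German-style separators
print(format_number(33899.99, "'", "."))  # Swiss-style separators
```

A real application would normally take the separators from locale data rather than hard-coding them.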

I'm sure that whatever culture these people come from, if I give you the number 33,899.99 there's no way you'd get it wrong, right? (Unless my website/application is the first one you've ever used in your entire life, which is arguably possible, but the probability is just that low.)

By "universal" I mean a format that everyone will see and know what it means. It doesn't have to be a standard written down in black and white. As long as everyone can read it and know straight away, without confusion, what the text represents, that's universal. To be sure, 1.234,00 is definitely not universal. I'm very sure you can find someone who has been using computers their entire life yet has never come across this number format at all. Since most websites/apps have been using 1,234.00 without any changes to accommodate localization, I believe it has become the de facto standard (the universal format that everyone will see and know what it means).

As for dates, if we write 01/02/03 I'm sure there's no way anyone will know (straight away, without ambiguity) what date it is. But no one can get Jan 2 2003, Feb 1 2003, or Feb 3 2001 wrong if we write them out like that, can they?
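The ambiguity is easy to demonstrate. The sketch below parses the same string under three conventions; the labels in `conventions` are illustrative names, not official locale identifiers.

```python
from datetime import datetime

ambiguous = "01/02/03"

# Three plausible readings of the same string (labels are illustrative).
conventions = {
    "month/day/year": "%m/%d/%y",
    "day/month/year": "%d/%m/%y",
    "year/month/day": "%y/%m/%d",
}

readings = {
    label: datetime.strptime(ambiguous, fmt).date()
    for label, fmt in conventions.items()
}

for label, day in readings.items():
    # isoformat() prints the date as YYYY-MM-DD.
    print(f"{label:16} -> {day.isoformat()}")
```

The three results correspond to the three spelled-out dates mentioned above.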

Btw this question is targeting localization, don't tell me stuff like "Hey not everyone reads English alright!" because that is a matter of internationalization (which is beyond this topic). Let's stick to the discussion on localization.

Pacerier
  • 66
    You will soon find just how few things are "universal" in this world. – Roman May 10 '11 at 04:50
  • 8
    This question and the answers it's sure to get (and already has) is a great learning experience for everyone. – Jordan May 10 '11 at 05:16
  • 2
    @R0MANARMY: Amen! Lots of "universals" -- including some surprising ones -- aren't even remotely. – JUST MY correct OPINION May 10 '11 at 05:23
  • 4
    While we're at it, we should also get rid of all those time zones. Everybody should be able to understand what 17:12 GMT means, right? – nikie May 10 '11 at 07:04
  • 35
    Of course 1.234,00 is not as much of a problem as 1,234 or 1.234. – reinierpost May 10 '11 at 07:18
  • 2
    I hate localization, and even though I'm Italian the 1.234 format confuses the hell out of me (I assume it's 1,234, i.e. one thousand etc.). Same with dates: if it's 04/05/2010 I have no idea whatsoever what it might be – Andreas Bonini May 10 '11 at 10:05
  • 51
    You are right, localization is unnecessary. But the big problem is that the anglo-saxon world continues to use their specific conventions instead of the universal french ones... – mouviciel May 10 '11 at 10:38
  • 5
    We can do that, but only if you Americans stop calling a lame Milliard a Billion :) – Ingo May 10 '11 at 12:03
  • 4
    You're right. We should begone with the meter and gram, too, so that the rest of the world can enjoy the nutty experience of measuring things in feet and pounds. ;-) – Denis de Bernardy May 10 '11 at 12:05
  • Possible duplicate: http://stackoverflow.com/questions/1583999/do-numbers-need-to-be-localized – Erik B May 10 '11 at 12:17
  • 2
    http://stackoverflow.com/questions/5930714/delphi-invalid-float-value, http://stackoverflow.com/questions/5851647/x-xxxx-is-not-a-valid-floating-point-converting-between-languages-locals, etc. I wish the world would agree on using the period (.) as the decimal separator. A lot of things would be so much easier then. (I am from Sweden, where we use comma (,) as the decimal separator, but many mathematicians and scientists, myself included, insist on using period instead.) Still, the worst problem of them all, I think, is that Americans use these strange units that I have no familiarity with... – Andreas Rejbrand May 10 '11 at 12:30
  • If I recall correctly, Mr. Burns is to blame for the latter problem. – Andreas Rejbrand May 10 '11 at 12:39
  • 27
    How about a nice compromise? English numbers, metric units, European time (24h), driving on the right, and Asian dates (year/month/day). – Mike Dunlavey May 10 '11 at 13:25
  • 1
    @Ingo: "Americans stop calling a lame Milliard a Billion" Right. And Brits stop with "colour", "theatre", pronouncing "junta" like "giuhnta", and in general applying the English vowel-shift to everything. (We could learn that lesson too. As in Eye-Rack. :) – Mike Dunlavey May 10 '11 at 13:42
  • 2
    I was born and taught to use 1.234,56; but I had come to love 1 234.56 (the space is supposed to be half-width space where the typography allowed). I was born and taught with 03/04/2005 but I have come to love 2005-04-03. – Lie Ryan May 10 '11 at 14:17
  • 3
    You can't rely on how big the number is or how many decimal digits it has to make the format unambiguous. About the date, I think the year-month-day with the year's 4 digits is pretty unambiguous as the descending order of magnitude is pretty self explanatory as well as sort order which doesn't change even if you sort it alphabetically. – Petruza May 10 '11 at 14:28
  • @Mike Dunlavey I just seem unable to get am/pm working. My brain can't manage to keep remembering which one is which. But then, left and right were hard for it too. – Andy May 10 '11 at 14:45
  • 3
    I did some research and actually using comma for decimal separator seems to be more "universal" than point. – Petruza May 10 '11 at 15:06
  • 1
    Well, I don't know if Pacerier is sorry now (s)he asked the question, but I'm pretty sure (s)he's a lot wiser now than (s)he was 11 hours ago! – TrojanName May 10 '11 at 16:08
  • @Mike Dunlavey, the English vowel shift predates the USA, so if you haven't caught up yet it's not our fault. – Peter Taylor May 10 '11 at 17:48
  • @Kop: amusing, I would interpret both 1,234 and 1.234 the same way, ie 1 + 234/1000. In France we don't use a separator for thousands, just a space, and calculators either have a comma or a dot for marking decimals... oh well ;) – Matthieu M. May 10 '11 at 18:44
  • 1
    Ah, number localization and time zones... My two personal favorites. :( – Andreas Johansson May 11 '11 at 10:55
  • @Mike Dunlavey's unification proposal FTW – Petruza May 11 '11 at 17:51
  • Isn't correct spelling also unnecessary, and getting UI elements to line up, and timezones? – hippietrail Sep 15 '17 at 03:33

12 Answers

121

Why should non-Anglos have to decode dates, numbers, etc. while Anglos can just read them? Numerical and date localization is absolutely necessary if you want non-Anglos to feel, you know, welcome as users and customers. Why should a German user have to work out what your number is instead of, you know, getting it in his or her own language's format?

Further, your view of number formats (and dates: q.v. below) is hopelessly simplistic. For example undoubtedly you'd find numbers like 1,234,567 "natural" and "obvious" and "logical" ... but what about people who come from cultures with myriad-based numbering schemes? My students (Chinese), for example, are always confused about numbers over 1000 because they group numbers differently. A more "natural" grouping for their thought processes (which include a myriad above the thousand point) is 123,4567. Further there are many contexts in which the European number systems in general are simply not suited. It would be nice in those circumstances to be able to write the all-Chinese 一百二十三万四千五百六十七 or even various hybrid systems that are in common use here.
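To illustrate the grouping difference, here is a small sketch. `group_digits` is a hypothetical helper written for this example, not a library function; it only changes where the separators fall.

```python
def group_digits(n: int, size: int, sep: str = ",") -> str:
    """Insert sep every `size` digits, counting from the right (illustrative helper)."""
    digits = str(abs(n))
    groups = []
    while digits:
        # Peel off the rightmost `size` digits and put them at the front of the list.
        groups.insert(0, digits[-size:])
        digits = digits[:-size]
    return ("-" if n < 0 else "") + sep.join(groups)


print(group_digits(1234567, 3))  # thousand-based grouping
print(group_digits(1234567, 4))  # myriad-based grouping
```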

Your idea for dates is wrong-headed too. You've correctly pointed out how 01/02/03 is ambiguous (if only because Americans refuse to comply with standards on dates) and suggest instead that Feb 3 2001 is unambiguous. I'm not sure, however, if you've noticed something there. It's unambiguous and unambiguously English. Going back to my students, I'm pretty damned certain that they'd far prefer to see 2001年2月3日 (or even 二〇〇一年二月三日) which is both unambiguous and, get this, something they can read without having to decode.

The bottom line on i18n and l10n: Do you want money and/or users? You make what your users want. Your users want things in their own language, not in yours. End of story.


edited to add

It gets even worse than myriad-based systems. Take a look at Indian numbering for this lovely progression:

1
10
100
1000
10,000
1,00,000
10,00,000
1,00,00,000

...and so on up to:

100,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,00,000

See that grouping by three at the end? See that grouping by two after the grouping by three? See the sudden reintroduction of a group by three again?
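A rough sketch of the basic pattern, for anyone who wants to play with it. `group_indian` is a made-up helper that handles only the simple "last three digits, then pairs" rule; it does not reproduce the extra group of three that shows up in the very large number above, and it does put a separator in 1,000.

```python
def group_indian(n: int) -> str:
    """Group the last three digits together, then pairs (simplified sketch)."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    if len(digits) <= 3:
        return sign + digits
    head, tail = digits[:-3], digits[-3:]
    pairs = []
    while head:
        # Take two digits at a time from the right of what is left.
        pairs.insert(0, head[-2:])
        head = head[:-2]
    return sign + ",".join(pairs + [tail])


for value in (100, 10_000, 100_000, 1_000_000, 10_000_000):
    print(group_indian(value))
```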


further edited to add (I just can't keep off this subject it seems!)

Even the assumption of decimal number systems being universal is wrong. There are native numbering systems that are 4-based, 5-based, 8-based (octal), 10-based (decimal), 12-based, 20-based and even 60-based. These are all systems which have been in active use by real people (as in not made up for science fiction stories). Not all of these are still living (although we can see, for example, vestiges of 12- and 60-based numerical systems in English terminology).

As for dates, let us not forget the lunar calendars still in active use in much of the world. The Muslim world tends to use a lunar calendar where the dates can drift throughout the whole year while the Chinese use one with a complicated system that keeps the dates never more than a month away from true. (And that's just naming two off the top of my head.)

Dante May Code
JUST MY correct OPINION
  • 3
    Very nice examples. – Roman May 10 '11 at 04:48
  • 18
    I'll add that for French, the comma (',') is actually used as a separator for decimals (1,5 = 3/2). The period is not used at all, and spaces act as group separators (150 000,53 dollars...) when the writer feels like doing so. Whenever I find myself writing sums of money into an English website, I have no idea what format I should write or which one is going to be accepted. Are spaces okay? If I use a period instead of a comma, am I going to transfer 100 times the amount of money I actually expected to transfer? Proper l10n is pretty important to your users. – I GIVE TERRIBLE ADVICE May 10 '11 at 04:49
  • @JUST MY correct OPINION. Please do not cite internationalization examples because my question was specifically (as stated in my post above) a localization question. Putting it simply, I'm asking a localization question and not an internationalization question. Your answers are targeted towards internationalization, not localization, and hence not targeted towards my question – Pacerier May 10 '11 at 05:41
  • @Pacerier: an input parser or string-based input format validator needs to process whatever localized input it was fed. In this context, the internationalization of that parser depends on the universal localizability (configurability?) of it. – rwong May 10 '11 at 05:54
  • 4
    @Pacerier: i18n and l10n work hand in hand. Try to localize dates without internationalizing the glyphs sometime. (Hint: this is not possible.) Hell, try to localize **numbers** without internationalization sometime. (Hint: again this is not possible.) – JUST MY correct OPINION May 10 '11 at 06:04
  • 1
    These examples are great, and I agree with you in principle... but in real life, I HATE HATE HATE number localization. Like when some overly smart program decides that I'm Russian and therefore should use commas as the decimal separator, and I use dots (because I CAN'T use commas in programming, even in Russia), and then I have a bug that I spend the better part of a day fixing... I personally would prefer a single standard, however non-intuitive it might seem to me at first. – Nevermind May 10 '11 at 06:27
  • 4
    I'm of two minds on this one, Nevermind. On the one hand, yes, my life as a programmer would be made easier if i18n and l10n didn't happen. On the other hand it's not the user's job to adapt to the computer's quirks just because I'm lazy. (This also doesn't even start with the whole vitality of the ecosystem of thought thing!) In the end, though, the bottom line is just what your customers and/or users want. And what they want is stuff in their language. That some geeks here and there don't like it won't change that basic, simple fact. – JUST MY correct OPINION May 10 '11 at 06:42
  • 2
    See http://stackoverflow.com/questions/1583999/do-numbers-need-to-be-localized for some more examples of number localization issues. I'd also like to point out that correct localization becomes even more important when you start dealing with accessibility, because there's another layer of machine translation between the original data and the human being. A human might be able to figure out that your data formats are Anglocentric, but their screen reader might not be able to make that judgement call. – E.Z. Hart May 10 '11 at 06:55
  • 1
    @JUST MY correct OPINION I hate this not as a programmer, but as a user. Some programs use l10n, some do not, and in the end the user is never sure whether to use comma or dot. – Nevermind May 10 '11 at 09:06
  • Oh, yeah, in that regard I'm with you 100%, Nevermind. But the reason it's so inconsistent that way is because of, basically, questions like these. Programmers -- lazy programmers (as if there's any other kind!) -- assume that everybody else is perfectly willing to adapt to their system unless forced at gunpoint to acknowledge i18n/l10n issues. The result is a horrible mish-mash of programs with varying levels of support ranging from "what do you mean by 'another language'?" to full support for ancient Babylonian. – JUST MY correct OPINION May 10 '11 at 10:20
  • +1 for the Indian number sample, which I intended to add if it was not already present. – Fredrik Mörk May 10 '11 at 11:57
  • I'm assuming Indians use a period as a decimal separator? – David Murdoch May 10 '11 at 14:12
  • 1
    Base 10 is universal. I don't care if some inbred family out in the boonies has 13 fingers on each hand and devised a numbering system with a base of 26; they are wrong. Wrongity wrong wrong wrong. The only thing that will change that is if some little green alien comes to visit and tell us otherwise. – David Murdoch May 10 '11 at 14:20
  • @David Murdoch I always wonder why time deviated from this. Why not call 1 day = 100 hours, then 100 minutes per hour and 100 seconds per minute? – Andy May 10 '11 at 14:48
  • 1
    @David Murdoch: Base 10 is **not** universal. The Maori, for example, use a quaternary system. Quinary systems have been used all over the world throughout history and are still in place in Celtic and Inuit languages. The Yuki and Pamean languages use octal numbering schemes (they based counting on the spaces between fingers instead of fingers themselves). A few languages you may have heard of have duodecimal configurations: German, English, Indonesian (among other less familiar families). Vigesimal systems live on in French and Basque (among others) with vestiges in English and German. – JUST MY correct OPINION May 10 '11 at 15:24
  • @Andy: Our time keeping is the way it is because we got it via the Sumerians and Babylonians who had the Gonzo Raving Loon systems (note the plural!) of numbers including quinary (base 5) duodecimal (base 12) and sexagesimal (base 60) all interacting in bizarre ways. We also get our geometric systems (360 degrees, etc.) from that source. – JUST MY correct OPINION May 10 '11 at 15:27
  • right, 360 degrees. i forgot that one. Still, we kept that time scale but got rid of roman numbers (just imagine still using the roman number system though). – Andy May 10 '11 at 15:33
  • @JUST MY correct OPINION, but it doesn't matter. Just because it is in use doesn't mean it *should* be __localized__. Sure, if you are catering to a group that may *only understand a certain counting system* then of course things should be localized. But just because I know and use the term "dozen" doesn't mean I want everything counted in a duodecimal system. Heck, a majority of users here at programmers.se know binary but never would the site be localized with binary just because we understand it. – David Murdoch May 10 '11 at 15:47
  • Well there's nothing intrinsic to the sexagesimal system that renders it difficult to do maths in. The Babylonians were doing all kinds of sophisticated calculations using sexagesimal. Just the mere fact that it doesn't map to digits doesn't make it unusable. (Evidence: hexadecimal....) – JUST MY correct OPINION May 10 '11 at 15:48
  • 4
    @David Murdoch: This goes right back to the start of my post. If you want people to give you money, you have to give them what they want. What they want is software that's in their language and/or that displays numbers, dates, currencies, etc. in the format they're used to. If you don't give it to them (properly!), they just won't buy your software. It really is as simple as that. – JUST MY correct OPINION May 10 '11 at 15:50
  • @David and @JUST - comments aren't designed for extended discussions. If there's additional information then it should be edited into the answer. If you want to continue the discussion please take it to chat. – ChrisF May 10 '11 at 15:54
  • 1
    ChrisF, technically they WERE designed for any type of discussion...because comments existed before the chat system. :-p – David Murdoch May 10 '11 at 16:05
  • 1
    @David - however, more than 20 comments on a post is considered noise. Any useful information obtained should be added to the post and the comments deleted. For more on this see this meta post - http://meta.programmers.stackexchange.com/questions/1457/how-do-our-moderators-and-community-members-feel-about-cleaning-up-comment-nois – ChrisF May 10 '11 at 19:35
  • Don't forget that for numeric data later meant to be graphed, swapping commas and periods will make said chart inaccurate by orders of magnitude. Folks from that locality will assume that your software is broken, not just improperly localized. Prepare for extra support tickets. – ford Mar 06 '12 at 23:50
58

Also, my second question is that do we really need to change 3,899.99 to 3.899,99 for some cultures like German? I mean it doesn't hurt to do so since the library already does it for us but wouldn't this actually cause more confusion to the user (even if he is German).

Even if the user is German, he'll be confused with German notation? I'm pretty certain I don't agree with that. No, what would cause more confusion to users is the peppering of their localised sites with non-localised data. Numbers and dates are a key issue in this respect.

If you're localising a service, you need to localise it properly. Skimping on number format is lazy and ignorant.

3,899.99 is not the universal format, and the assumption that "everyone will just understand it" is intellectually lazy. It's also comparatively rude. You have the facility to do it (we are, after all, discussing the particular link you posted), so why not just do it and show some cultural awareness at least?

temptar
  • 27
    +1 for noting the rudeness – nikie May 10 '11 at 08:12
  • +1 for "what would cause more confusion to users is the peppering of their localised sites with non-localised data. Numbers and dates are a key issue in this respect." The devil is in the detail. – StuperUser May 10 '11 at 12:24
  • 1
    In America we might say "Fifty-two __point__ seventy-one" to mean 52.71 (or 52,71 if you are reta-uh, accustomed to another locale [j/k...I kid]). Since a comma is in no way related to a "point" what is the shorthand way of expressing the decimal character in comma-for-decimal locales? I really hope you aren't stuck saying "Fifty-two and seventy-one hundredths". – David Murdoch May 10 '11 at 14:11
  • 6
    @David : in America (which is a continent, I remind you) we also might say "fifty-two **comma** seventy-one" (if we spoke English). Was replacing the word `point` with `comma` so hard to imagine? – Petruza May 10 '11 at 14:44
  • @David Murdoch sorry if I am chasing you through the answers here, but 2,3 would be "twee komma drie" and not "twee punt drie". When I say this in my head the comma comes out a lot more naturally, due to upbringing, than a dot (although when I am thinking in English, dot is a lot nicer than comma) – Andy May 10 '11 at 14:54
  • 3
    @Petruza, Actually, America is not a continent. There is a North America and a South America, but no continent America (or at least that is what is taught in US schools). Yes it was extremely hard to imagine. I just can't get my head around it. Seriously. You have just blown my mind. I sit here in utter disbelief. /sarcasm. The reason replacing "point" with "comma" didn't spring to mind is that "comma" is used in English as a punctuation mark whereas "point" is not (we'd use the term "period" instead); so there would be a little more ambiguity when using "comma" instead of "point". – David Murdoch May 10 '11 at 15:19
  • @David Yes I know how continents are taught in USA, but again, the "american" way is not the universal way. – Petruza May 10 '11 at 15:34
  • @Petruza Nothing is universal, not even the numerals. – apaderno May 10 '11 at 15:44
  • @Petruza, Just curious, where are you from and what continents do they teach? – David Murdoch May 10 '11 at 15:51
  • 6
    David, comma, or Komma, is used as a punctuation mark in German, and virgule, the French equivalent, is used as a punctuation mark in French. I'd also add that in English in Europe, we do not use "period" to designate the end of a sentence; we use "full stop". A key complaint about the original post was the lack of cultural awareness displayed. Just because things are done one way in the US does not mean it is the same elsewhere. It is possible that english.stackexchange is a suitable location for a question on the etymology of "point" as used in decimal notation. – temptar May 10 '11 at 16:07
35

American <> Universal.

Reading a date like "Jan 2 2003" takes some time to decode. We always put the day before the month. So it would have to be "2 Jan 2003". Sure we get it but we have to think 5 seconds to decode it.

Show a number like 1,234 and most Europeans will be thinking of a decimal number. 1,345.00 just "feels wrong".

It's OK if you don't localize your app. Just don't expect it to be a hit outside of your country.

Carra
  • 8
    What's worse is Americans rarely write "Sep 11", they write 9/11/01, which is inherently confusing. In fact, in Canada, we sometimes use the equally confusing convention of 11/9/01, and until the September 11 attacks, I was always confused over which country used which notation and it always annoyed me. After the attacks, all the hoopla over "9/11" at least gave me an easy way to remember which country used which convention. In any case, anyone who doesn't live in the US typically understands that every place in the world uses a different way of writing dates. – Scott Whitlock May 10 '11 at 13:19
  • Here in Sweden, lots of people talk (or rather, write) about "9/11" too. Except in common Swedish usage, a date specified as "9/11" means the 9th day of the 11th month; that is, November 9. Sometimes I want to give dates backwards like that just because, and when people get confused, point out to them that I'm just using the exact same date format that they do. It's either yyyy-mm-dd, yy-mm-dd, d/m or d/m/y, all of which most Swedes will readily understand. – user May 10 '11 at 13:31
  • 1
    The default here is 11/9/01; it's used in all our calendars. For example, Outlook uses it. But since the media covered it as "nine eleven", we remember it as that. – Carra May 10 '11 at 14:38
  • 1
    For an English speaker Jan 2 2003 should be just as easy to decode as 2 Jan 2003; no matter your locale. – David Murdoch May 10 '11 at 15:26
  • 3
    @Carra, "Outlook uses it". I'm sure it is localized. Outlook and Excel show 9/11/01 for me. – David Murdoch May 10 '11 at 15:27
  • 9/11 is less a date point than a cultural event label however; European date usage, even in English is generally in the order Day Month Year, whether it's all numeric, or mixed form, eg 18 Feb 2011 or 18/2/2011. That we understand non-local usage is not an excuse to ignore that it is non-standard in our locale. – temptar May 10 '11 at 16:10
  • 1
    @temptar - the unwillingness of people to replace old standards with obviously better new ones is just as short-sighted as our lingering grasp on the British imperial measurement system over here in Canada and the U.S. The date problem is comparatively easy to fix... just stop using the confusing numeric-only variants (mm/dd and dd/mm) in favour of MMM dd or dd MMM. – Scott Whitlock May 10 '11 at 16:39
  • I don't use the numeric only ones as a general rule, just to be clear on that. :-) – temptar May 10 '11 at 16:43
  • @temptar - wasn't accusing you :) but how can we get everyone else to switch? – Scott Whitlock May 10 '11 at 17:36
  • @Scott: "It will be 9/11 times 2356." Chris: "My god, that's... I don't even know what that is." - Team America. – Andrew Grimm May 12 '11 at 07:44
  • Speaking of decoding things, whatever `<>` means, it's not in a language that *I* speak. ;) – Ben Zotto Sep 14 '12 at 07:46
29

Your problem here seems to be a bad assumption. There is no "universal format" for numbers or dates. 3,899.99 is valid in some places, and confusing in others. Same for the converse. People can frequently figure out what they need to, but that's not the point. The same goes for the date formats you talk about. The formats themselves are distinct between locales. There is no "universal" here.

Except in certain scientific and technical domains that general software doesn't usually address, there's no universal format for any of these things. If you want your software to be accepted on native terms anywhere but your own place, you'll need to work for it.

Can you shove some notion of a de facto standard down people's throats? Sure. But localization of numbers (per your question) could never be considered unnecessary in any professional, internationalized software.

Ben Zotto
  • 7
    Indeed, the idea that there's a universal format is actually insulting to some. Yes, perhaps they're a bit touchy, but some people value having a unique local identity and the divisiveness that follows from that. – S.Lott May 10 '11 at 10:02
  • Actually the metric system is universal, except for America (and some other former english colonies) – Petruza May 10 '11 at 14:47
  • @Petruza Even if all countries would adopt the metric system, dates and numbers could use different formats. The metric system is about the measuring units, not the formats to use for the numbers; I could write "10.1 kg" or "10,1 kg" and I would still be using a measuring unit of the metric system. – apaderno May 10 '11 at 15:49
  • 1
    @Petruza, FYI, I think most US Americans hate the Imperial units and would rather be taught metric. I know I do. – David Murdoch May 10 '11 at 15:54
18

You seem to assume that what you are used to reading is universal, but it is not.

Where I live, comma denotes the decimal separator, and a dot is used (sparingly) as thousands separator. It is unnatural for me to parse $3,004.25. But if you give me $5,535 I'd probably read it as about 5 dollars and a half. Reading 3.899,99 would not be confusing at all to me, and I just can't see why you think so.

For dates, I can certainly read Feb 3 2001, as well as you can read 3 feb 2001.

So, users can usually parse most numbers and dates localized for en-US. Sometimes there can be ambiguity, as in $5,535. In any case, I just don't get why you think that would be clearer than their own locale.
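The $5,535 case can be shown in a few lines. This is only a sketch: `parse_number` is a hypothetical helper that is told which separators to expect, which is exactly the information a bare string does not carry.

```python
from decimal import Decimal


def parse_number(text: str, decimal_sep: str, group_sep: str) -> Decimal:
    """Parse text using explicitly stated separators (hypothetical helper, no guessing)."""
    # Drop the grouping characters first, then normalise the decimal mark to ".".
    cleaned = text.replace(group_sep, "").replace(decimal_sep, ".")
    return Decimal(cleaned)


# Same characters, two different amounts.
as_english = parse_number("5,535", decimal_sep=".", group_sep=",")
as_continental = parse_number("5,535", decimal_sep=",", group_sep=".")
print(as_english, as_continental)
```

The two results differ by a factor of a thousand, which is why guessing the convention from the string alone is risky.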

Andrea
  • 2
    It's "more clear" because humans tend to be very strongly ethnocentric and whatever you're used to is "logical" and "clear" and "obvious" while what you're not used to is "illogical", "fuzzy" and "inobvious". This pervades our industry and, indeed, pretty much every industry. Wanna have some fun? Look at paper sizing.... – JUST MY correct OPINION May 10 '11 at 07:02
  • I think you have misunderstood my claim. I contested the OP's sentence "but wouldn't this actually cause more confusion to the user (even if he is German)." So I think we agree. – Andrea May 10 '11 at 09:34
  • No, I understood your claim perfectly. I was amplifying, not contradicting. ;) – JUST MY correct OPINION May 10 '11 at 10:23
13

I don't have much knowledge on number and currency localisation, but dates are covered by ISO 8601 (http://en.wikipedia.org/wiki/ISO_8601) in the format YYYY-MM-DD e.g. 2011-05-10 for 10th May.
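As a small illustration using Python's standard `datetime` module (the sample dates are arbitrary), ISO 8601 calendar dates round-trip cleanly and also sort chronologically when compared as plain strings:

```python
from datetime import date

d = date(2011, 5, 10)
iso = d.isoformat()                  # YYYY-MM-DD
roundtrip = date.fromisoformat(iso)  # parses back to the same date

# Because the most significant field comes first, text order equals date order.
stamps = ["2011-05-10", "2003-02-01", "2010-12-31"]
by_text = sorted(stamps)
by_date = sorted(stamps, key=date.fromisoformat)
print(iso, roundtrip == d, by_text == by_date)
```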

StuperUser
  • Unfortunately those moro-- users cannot be bored with ISO standards it seems and keep clinging to their old ways! /sarcasm – Matthieu M. May 10 '11 at 18:38
  • 1
    a standard which the average user doesn't know is useless IMHO. – Pacerier May 11 '11 at 02:40
  • Useless for UI/UX only, but there's more to standard formats than that. Most formats for date input will be specified per application (hopefully), so a developer won't have to worry about anything other than parsing it properly and putting a string label to notify moro-- users of the expected format. Or implement a date picker. – StuperUser May 13 '11 at 12:02
12

Also, my second question is that do we really need to change 3,899.99 to 3.899,99 for some cultures like German? I mean it doesn't hurt to do so since the library already does it for us but wouldn't this actually cause more confusion to the user (even if he is German).

I am German, and speaking personally, it is indeed confusing, because I read so many English texts and then German texts again, all mixed together. That's why I have to pause briefly at each number I encounter to figure out which notation rule it follows and what magnitude it actually means.

In my localization settings I have therefore configured the digit grouping character to be a single quote ('), as in 1'234.00, which is the system used in Switzerland.

I'm sure whatever culture these people come from, if i give you a number 3,899.99 there's no way you'd get that wrong right? (since he'd probably learned the universal format anyway)

You can probably figure it out, but it takes some time.

I meant "universal" as the format that everyone will see and know what it means. To be sure, 1.234,00 is definitely not universal. I mean, I'm very sure you can find someone who has never seen this number format in their whole life. Since most apps have been using 1,234.00 without localization, I believed that it had become the de facto standard (the universal format that everyone will see and know what it means).

You can definitely find a lot of people here in Germany who have never seen the 1,234.00 notation.

Also, there are legal implications in this matter. Imagine a bank account balance written with the wrong decimal point. All sorts of trouble can arise from that.

As for dates, if we write 01/02/03 I'm sure there's no way anyone will know what date it is.

Yeah, I hate that! And so many people still do that, leaving me puzzled each time what date is meant, even more so as there's apparently a conflict between British and American notation, where one of them writes the day before the month and the other the month before the day.

If localization is not possible (like in a written text), I recommend the standardized format 2003-02-01 for anything that might potentially cross your national borders. You may not be able to tell at first glance whether Feb 01 or Jan 02 is meant, but this notation is standardized to put the more significant units first (year, then month, then day).

T-Bull
  • 1
    So far as I can tell *nothing* has ever used year-day-month as a format, so 2016-12-06 can't really be anything *other* than year-month-day. – supercat Dec 06 '16 at 19:30
10

There are ways to achieve what you want, but they all involve using formats that are natural to no one. For example, we store timestamps in our product in a human-readable format: YYYYMMDDHHMMSS. There's no confusion there, but no one uses that format in their daily lives.

What I think you need is an underlying format that is universal, say, seconds since 1 January 1970. Once you have that, you can display the time in any format you like, and libraries and OSes can redisplay the correct time any way you like.
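As a rough sketch of that separation in Python: one stored value, several renderings. The helper `render` and the example value are assumptions for illustration, and UTC is assumed throughout so that no time zone conversion gets in the way.

```python
from datetime import datetime, timezone

# Stored value: seconds since 1970-01-01 00:00:00 UTC (example value).
stored_seconds = 1044057600


def render(seconds: int, pattern: str) -> str:
    """Format a stored epoch value for display (hypothetical helper)."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime(pattern)


iso_style = render(stored_seconds, "%Y-%m-%d")         # "2003-02-01"
german_style = render(stored_seconds, "%d.%m.%Y")      # "01.02.2003"
us_style = render(stored_seconds, "%m/%d/%Y")          # "02/01/2003"
compact_style = render(stored_seconds, "%Y%m%d%H%M%S")  # "20030201000000"
```

Only the display pattern changes between the four results; the stored number never does, which is what keeps localization a presentation concern.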

gbjbaanb
  • 5
    So true about having a underlying format that can then be displayed in any way. Get this right and then localisation is more straightforward. – ChrisF May 10 '11 at 09:28
  • Actually, I cannot represent most birth dates or past events with your epoch. Why don't we use Julian dates instead, and throw in enough precision to measure down to microseconds? It fits into a 64-bit integer, so no problem! Ah wait, astrophysicists tell me they care about dates before -100000 BC, so they'd prefer... – Matthieu M. May 10 '11 at 18:40
  • @Matthieu, d'oh, someone forgot to put those lights in people's hands that turn red when they're 30 :) We'll just have to choose one or more different underlying representations depending on the problem domain. – gbjbaanb May 12 '11 at 13:40
8

More a comment than an answer I guess, but...

It Doesn't Come Naturally

"since he'd probably learned the universal format anyway"

Pardon me, but I didn't really "learn" that "universal" format until fairly late at school, because that's the thing: no one cares. Or at least, until they need to (e.g. for official documents, for business, etc...). So it is confusing when you see formats that do not use the conventions of your culture. The usual MM/DD/YYYY or DD/MM/YYYY is probably a big issue as well.

Learning is a Privilege

Also, quite a lot of people don't get the chance to "learn" those things, unfortunately. Not everybody gets to go to school (even primary school) or finish it, even in developed countries. And I cannot stress this enough: we design software for them as well.

Look at the way Google does it. They apparently try really hard to work on language-related products internally, and they also try their best to provide localized services, for this very reason.

While a lot of us programmers here usually prefer our "common" language (English, and encoded in ASCII, pretty please), a great many people feel more welcome when they are greeted in their own language. It's like stepping out of the gate in a foreign airport where no one speaks your language and the writing is very remote from yours. It can be a bit scary if you're not used to it. And to be honest and bring it back to the IT world, even opening a source file containing Unicode characters or a phonetically approximate version of a word from another language is quite distressing.

I do get your premise on the confusion though: you end up wondering every time what the system designer's intent was and what format they aimed for. But it's not nearly as confusing for you when that happens as it is for someone who won't be able to understand other things.

Culture Matters

Finally, I think there's another thing at the core of the problem here: if we were to follow your suggestions, then we would cannibalize a lot of other great cultures, which all deserve to exist, be remembered and celebrated. And they live through their writings and these tiny - seemingly irrelevant - details as well: they are part of the cultures, and they exist in each culture for a reason.

I don't want to turn this into a tirade on the expansion of the big mean evil western empire, but it's the sort of thing that does trigger people's harsh reactions and comments. That's simply because it seems like you want to discard other people's opinions, tastes and cultures (I'm sure you don't) and even suggest that you know better than them what they should use and prefer. Or, in any case, that you have little respect for their ways and don't consider them part of your userbase, don't want to make an effort to welcome them, and don't really care much for them in general.


EDIT - Personal Anecdote: my partner yells at me every time I write "apologize" or "authorization" with an American spelling instead of the British/Australian spelling, so guess what she'd do to me if I started using dates and numbers in a funny way... :)

haylem
  • I had an opportunity to relate to someone in a central Chinese city (九江) something similar to your airport example. He was complaining that he couldn't get foreign tourists to come to the city in any great numbers no matter how much he advertised wanting them. I pointed out that in stepping off the train I was faced by billions of signs with every last one in 汉字 instead of even Pinyin (not to mention English!) and that in comparison to other Chinese tourist-oriented cities 九江 was coming across as really brusque and unwelcoming. He didn't understand, sadly, so he still has his problem. – JUST MY correct OPINION May 10 '11 at 13:36
  • @JUST MY correct OPINION: funny, I was specifically remembering some personal experiences in China while writing this, but I thought there was no point in singling out a country in particular :) When travelling to eastern Europe, my partner also often complains that "they don't speak English here". Well, yeah, that happens... I think your example about the Chinese date format is the most telling though. A lot of Chinese people really wouldn't get another format (or would need time to process it). And matters get a lot worse if you use script/cursive instead of typewritten characters. – haylem May 10 '11 at 21:24
  • I wouldn't single out China for this either. A lot of countries (or cities within countries) claim that they're very interested in foreign tourism and then seemingly go out of their way to feel hostile to foreign tourists. – JUST MY correct OPINION May 11 '11 at 01:26
  • I wouldn't call anybody who uses geoip instead of the browser language preferences good at language related stuff. – CodesInChaos May 11 '11 at 11:03
  • +1 for the partner edit. me not knowing that fl. oz was fluid ounce and not flour ounce got me in big trouble :P. – Andy May 11 '11 at 13:09
  • 1
    Words like "apologize" are the correct spelling in British/Australian spelling. http://en.wikipedia.org/wiki/Oxford_spelling So don't apologize, and definitely don't apologise, for it! – Andrew Grimm May 12 '11 at 07:34
  • @Andrew Grimm: Interesting! But it's only a dictionary after all, and most spell-checkers will mark a -ize ending as wrong when using a en-gb or en-au locale. For Australia in particular, The Macquarie Dictionary seems to give precedence to the -ise. (http://en.wikipedia.org/wiki/Macquarie_Dictionary). Learnt something though: didn't even know that ending type was related to the French grammar :) – haylem May 12 '11 at 11:23
5

Most of the world uses , as the decimal separator (green on the image below). I don't see why the majority should adapt to the minority.

vartec
  • 1
    But are they the minority in terms of population? China has a population of about 1.5 trillion and India has a population of about 1.2 trillion, and thus make up about 40% of the 6.7 trillion people on the planet. – rjzii May 10 '11 at 16:17
  • 1
    The blue seems to look a lot like the old British colonies. – Andy May 10 '11 at 16:19
  • 1
    @Rob z I believe you mean billion, not trillion? – Jimmy Collins May 10 '11 at 19:23
  • @Jimmy C - A yes, those trillions should be billions. – rjzii May 10 '11 at 19:28
  • @Andy, missing a lot of Africa, but I suppose China and Japan are a fair trade. – Peter Taylor May 10 '11 at 22:56
  • Problem is, the number syntax is entrenched in most computer languages and some widely-used file formats, like comma-separated-values. I've had to wrestle with this stuff, and nobody's really happy with the result. – Mike Dunlavey May 11 '11 at 01:27
  • Also, not to be picky, but the area of the green parts are much larger in proportion to their population. (Greenland, Brazil, Siberia) Of course, Canada and Australia also have a lot of empty space. – Mike Dunlavey May 11 '11 at 01:30
  • @Mike Dunlavey but if we keep expanding like we do in population, we will have to fill them green emptys. the situation might turn around. vartec might have been thinking ahead to that. – Andy May 11 '11 at 07:20
  • @Andy: I'm afraid you're right. Afraid because we will have bigger problems than commas. – Mike Dunlavey May 11 '11 at 13:03
  • @ Mike Dunlavey but thats a diffrent se sites problem. and we might be at the problem stage. – Andy May 11 '11 at 13:07
  • I think the decimal comma is more popular when counting countries, but with the USA, China & India in, it definitely loses its place to the point in terms of population – Petruza May 11 '11 at 17:48
  • 3
    For those curious about the red states countries, they use something called a Momayyez. – Andrew Grimm May 12 '11 at 07:41
  • `Of course, Canada and Australia also have a lot of empty space.` Actually, it's worse than that. In Canada, the format is not a country wide standard: it is province specific. In Québec, we would write `1 000,00 $` and use ISO 8601 as date format. I imagine this could be the case in other countries that have differents level of government. – authchir Nov 03 '12 at 03:38
3

I think I saw a proposed alternative format for thousands separator: 1'000'000
Although I don't know if there's a proposed decimal separator.
I don't think localization is unnecessary right now.

Usually all of us programmers, and people who often wander around the internet, are very likely to be aware of English number formats and measures, but there are lots of people who don't know that much and yet use computers, and they can get confused if they see a format that looks like their own but means the opposite.

I do think that a unification of numbering and measuring standards would be a good thing to do at a global scale, in which the 1,000.00 format could be used as the standard, why not, but we should definitely lose inches and pounds, and call 1,000,000,000 a billion.

PS: I watch a lot of American documentaries on Discovery and the History Channel, and they always give measures with football fields as a unit. Do they realize that no one outside the USA has a clue about how long an (American) football field is?

Petruza
  • 2
    well, american football field is roughly the same size as normal football field (±5%), so actually I don't see nothing wrong with that particular example ;-) more annoying is "Library of Congress" as a measure of amount of information. – vartec May 10 '11 at 15:59
  • 1
    but footbal is the one with the egg ball right? does that compare to the one with the round ball which sometimes has the same name? – Andy May 10 '11 at 16:23
  • I have no idea how large a soccer field is either. – CodesInChaos May 11 '11 at 11:05
  • @vartec ok, but someone who doesn't know the sport, probably doesn't know that its field is as long as soccer's either. (e.g. me) As a side note, a soccer field is more or less between 90 and 100 meters long, google it for yards. – Petruza May 11 '11 at 17:44
  • @Andy: no, football, as you can guess from the name, is the one you play with your feet ;-) The fields are roughly the same size. – vartec May 11 '11 at 18:23
  • @Petruza: true, but these kinds of comparisons aren't supposed to actually provide you with any valuable information; it's just so the viewer can say *"wow, it's huge"*. After all, it's **popular** science. – vartec May 11 '11 at 18:25
  • @vartec Right, after all now I know how long a football field is but I still ignore yards and pounds, so I'll just say wow and munch some popcorn – Petruza May 11 '11 at 18:52
  • @vartec Actually, *foot* in *football* doesn't mean that the ball is played with the foot, but that the sport is played on your feet as opposed to playing on a horse. http://en.wikipedia.org/wiki/Football#Etymology – Petruza May 11 '11 at 18:56
  • Where I come from, Football fields are [130 - 145m](http://www.gaa.ie/coaching-and-games-development/rules-and-specifications/rules-of-specifications/). – TRiG Dec 28 '13 at 07:14
-1

My personal belief is

Software localization is harmful

Not only is it poorly executed most of the time, but it also cuts off users from the English-speaking community, which has far more resources for getting help. This might not be such a big deal for German, but if you take Slovak for example, it's more like shooting your users in the foot, even assuming you get a decent translation. You have to know the local terminology, and if it is not suitable to express what you want, you will need to extend it (which risks causing havoc, because somebody else might choose another term for the same thing at about the same time, or might give the same term a different meaning).
I once had to fix something on a French Mac. It was a pain, because my French IT terminology was worse than that of the owner of said Mac. I had to translate back and forth between guesses of corresponding terminology.

I am quite certain that it is not significantly harder to pick up English terminology than the native one, and learning English to a degree where you can operate a computer is not much of a challenge. It also opens up a world to you. Instead, as of now, many software vendors make it increasingly hard to obtain an English version of their software if you're from Germany, and even worse, many sites automatically pick your language. Microsoft goes so far that if you google something in English and open a result on their site, the resource is automatically shown to you in your locale. ARGGH!!!

IMHO, if an interface requires localization to be understood, you got the interface wrong and the flaws won't go away if you translate it, they rather multiply.

Software localization must be done right

If you localize software, there's absolutely no point in doing it, if you don't try to get it right. There's little value in a German interface a German can barely understand.

You would agree that, in the course of translation, you would translate "house" to "Haus" even though just about everybody in Germany would understand the meaning of the former.
Likewise, you should translate 9/11/2001 to 11.9.2001. A German would very likely figure out that the former notation represents a date, but would probably interpret it as the 9th day of November 2001.
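A minimal Python sketch of that translation step, assuming the input is known to be in month/day/year order. The helper name `us_to_german` is hypothetical; the point is that the source convention has to be stated explicitly, because the string itself does not carry it.

```python
from datetime import datetime


def us_to_german(text: str) -> str:
    """Convert a month/day/year string to day.month.year (hypothetical helper)."""
    parsed = datetime.strptime(text, "%m/%d/%Y").date()
    # Build the output by hand to avoid platform-specific strftime flags.
    return f"{parsed.day}.{parsed.month}.{parsed.year}"


converted = us_to_german("9/11/2001")  # "11.9.2001"

# What a reader assuming day/month/year would understand instead.
misread = datetime.strptime("9/11/2001", "%d/%m/%Y").date()
```

`converted` names the 11th of September, while `misread` is the 9th of November 2001: the same characters, two months apart.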

The reason why I picked just this historical date is to illustrate something with my personal experience. German media sometimes referred to the tragedy as "nine eleven". My English was not very good at that time, especially not my knowledge about date notation, so I visualized this in my head as "911" and figured this was some sort of metaphor using the phone number to associate the event with the feeling of emergency. Looking back at my train of thought, I agree it is extremely far-fetched, but for someone used to a different date notation it did seem plausible, because it didn't ring a bell, as "eleven nine" would have. This should show you how ingrained certain conventions are in our thinking and how this can lead to misinterpretation of obvious data.

As for numbers, this is not such a big problem for humans, but a far bigger problem for programs. German localized software will output 1234.56 as "1.234,56" (or "1234,56" if you're lucky, or "1'234,56" if you're not lucky at all). If you directly feed that to a forgiving parser assuming English format, it will be interpreted as 1.234 and an unforgiving one will choke on the comma. This is about the same problem, the other way around.
It is also important to note that on the numpad of the German keyboard layout there's a comma where you have a dot. So you are likely to get German-format input from Germans. Being able to handle this is way more important than translating "Cancel" to "Abbrechen".
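The parsing problem can be sketched in Python as follows. Both `forgiving_parse` and `parse_number` are hypothetical helpers written for this illustration: the first imitates a parser that reads a leading English-style number and silently ignores the rest, the second requires the caller to state which separators the input uses.

```python
import re


def forgiving_parse(text: str) -> float:
    """Read a leading English-style number and ignore the rest (stand-in for a lenient parser)."""
    match = re.match(r"[0-9]*\.?[0-9]+", text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return float(match.group(0))


def parse_number(text: str, decimal_sep: str, group_seps: str = "") -> float:
    """Parse text using explicitly stated separators (hypothetical helper)."""
    cleaned = text.strip()
    # Drop grouping characters first, then normalize the decimal separator.
    for sep in group_seps:
        cleaned = cleaned.replace(sep, "")
    return float(cleaned.replace(decimal_sep, "."))


german_text = "1.234,56"

wrong = forgiving_parse(german_text)         # 1.234, off by a factor of 1000
right = parse_number(german_text, ",", ".")  # 1234.56

# Python's built-in float() is the unforgiving kind: it rejects the comma.
try:
    float(german_text)
    strict_rejected = False
except ValueError:
    strict_rejected = True
```

The lenient parser returns a plausible-looking but wrong value without any warning, which is arguably worse than the strict parser's error. Passing the separators explicitly also covers numpad input such as `"1234,56"`.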

So if you really head down the "I don't care for formats"-road, do you and your users a favor by not translating anything, because that will at least alert them of an English context. You inflict less damage by forcing people to use a dictionary for words they don't understand, than by printing out information they misinterpret, because they have a different understanding of its meaning than you.

haylem
back2dos
  • 8
    Your opinions are utterly and completely ignorant of so many key features of plain old **humanity**, not to mention i18n and l10n, that I don't even know where to begin. As an analogy, it's like you dumped a bunch of cogs, wheels, wires and pieces of random steel on the table and called it your car. – JUST MY correct OPINION May 10 '11 at 10:28
  • 3
    I do believe his bringing the keyboard into this is new though. When typing numbers on my native keyboard, I have a , not a . – Andy May 10 '11 at 11:20
  • 1
    @JUST MY correct OPINION: If there are "so many", how come, you do not point out even one. Your analogy is also poor, because I don't understand it. If you disagree and really care to actually tell me anything, you should try to make yourself clear, rather than throwing some stuff at me in an inappropriate tone. – back2dos May 10 '11 at 12:43
  • 3
    While I only partly agree, nevertheless +1 for the courage. And, indeed, I (being german) too like original english messages, buttons etc. more than the mutilation of my mothers tongue that is so often the result of "localization" done by - ähem - incapable idiots. – Ingo May 10 '11 at 13:05
  • 2
    Ingo: This is not evidence against l10n. It's evidence against idiots. I submit that if we applied your standards for l10n to programming in general the "logical" argument would be to stop using software at all. – JUST MY correct OPINION May 10 '11 at 13:11
  • Sure, @OPINION, this is only how I feel. But you'll probably admit that for some things the following applies: if done badly, it is worse than if not done at all. In German, we have a saying: *Gut gemeint ist das Gegenteil von gut gemacht*, meaning (according to Google translator): well meant is the opposite of well done. Hence, I feel back2dos has a point. – Ingo May 10 '11 at 19:23
  • 7
    Being force fed localization is really annoying. For example many websites ignore my language/culture preference set in the browser and instead use geoip. And then some sites think that an auto translated page is useful... – CodesInChaos May 11 '11 at 11:17
  • 3
    @CodeInChaos I can't vote up the geoip enough! For example, here's a way to make fried eggs, courtesy of an ancient version of Google Translate (and translated back into English): Take two chicken testicles, crack their shells and pour the contents into frying pan coated with crude oil. Flame mediumly. It was hilarious, but what if the user didn't actually know English and used crude oil instead of sunflower oil? – AndrejaKo May 12 '11 at 11:38