A Globally Unique Identifier (GUID) is written as a grouped string with a specific format, which I assume exists for a security reason.

A GUID is most commonly written in text as a sequence of hexadecimal digits separated into five groups, such as:

3F2504E0-4F89-11D3-9A0C-0305E82C3301

Why aren't GUID/UUID strings just random bytes encoded using hexadecimal of X length?

This text notation contains the following fields, separated by hyphens:

| Hex digits | Description                    |
|------------|--------------------------------|
| 8          | Data1                          |
| 4          | Data2                          |
| 4          | Data3                          |
| 4          | Initial two bytes from Data4   |
| 12         | Remaining six bytes from Data4 |
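
To make the table concrete, here is a minimal sketch that splits the text form into the fields listed above. The helper name `split_guid` and the regular expression are my own for illustration; they are not part of any GUID API.

```python
import re

# Hypothetical helper (not a standard API): match the 8-4-4-4-12 text form.
GUID_RE = re.compile(
    r"^([0-9a-fA-F]{8})-([0-9a-fA-F]{4})-([0-9a-fA-F]{4})-"
    r"([0-9a-fA-F]{4})-([0-9a-fA-F]{12})$"
)

def split_guid(text):
    """Return the Data1..Data4 fields of a GUID written as 8-4-4-4-12."""
    match = GUID_RE.match(text)
    if match is None:
        raise ValueError(f"not in 8-4-4-4-12 form: {text!r}")
    data1, data2, data3, data4_head, data4_tail = match.groups()
    return {
        "Data1": data1,
        "Data2": data2,
        "Data3": data3,
        # Data4 is 8 bytes, printed as a 4-digit group plus a 12-digit group.
        "Data4": data4_head + data4_tail,
    }

fields = split_guid("3F2504E0-4F89-11D3-9A0C-0305E82C3301")
print(fields)
```

Note that the last two printed groups together make up the single `Data4` field, so the five groups correspond to only four fields.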

There are also several versions of the UUID standard.

Version 4 UUIDs are generally internally stored as a raw array of 128 bits, and typically displayed in a format something like:

uuid:xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
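
As a rough check of that layout, the sketch below generates a version 4 UUID with Python's standard `uuid` module and looks at the two fixed positions. The variable names are mine, and I am assuming the `y` digit encodes the variant (one of `8`, `9`, `a`, `b` for the RFC 4122 variant).

```python
import uuid

u = uuid.uuid4()          # random (version 4) UUID
text = str(u)             # lowercase 8-4-4-4-12 form

# Stored internally as 128 bits, i.e. 16 raw bytes.
raw = u.bytes

# Position 14 is the first digit of the third group (the "4" in 4xxx).
version_digit = text[14]
# Position 19 is the first digit of the fourth group (the "y" in yxxx).
variant_digit = text[19]

print(text)
print("version digit:", version_digit, "variant digit:", variant_digit)
print("version:", u.version, "variant:", u.variant)
```

So even a "random" UUID has six bits that are not random: four for the version and two for the variant.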

Xeoncross
  • 4
    No, it probably isn't for security reasons; the bitstring has the same entropy with or without the dashes. I would think it is so that GUIDs can be recognized at a glance instead of going "here's a bunch of hex characters, is that md5.. or perhaps sha1.. no, wait, it could be..." and so on. Also, GUIDs are usually not just random bytes. –  Oct 14 '12 at 02:35
  • http://blogs.msdn.com/b/oldnewthing/archive/2008/06/27/8659071.aspx – Daniel Little Nov 10 '13 at 23:19
  • Similar Question from SO [UUID format: 8-4-4-4-12 - Why?](http://stackoverflow.com/q/10687505/1671639) – Praveen Sep 12 '14 at 10:55
  • [specific to the latest version](https://stackoverflow.com/q/47230521/1739000) (version 4) – NH. Jan 19 '18 at 22:01

2 Answers

From RFC 4122, "A Universally Unique IDentifier (UUID) URN Namespace":

The formal definition of the UUID string representation is provided by the following ABNF:

UUID                   = time-low "-" time-mid "-"
                         time-high-and-version "-"
                         clock-seq-and-reserved
                         clock-seq-low "-" node

So those are simply the different fields of the original time- and MAC-address-based UUID. The RFC says the format originates from the Apollo Network Computing System.
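
As an illustration, Python's standard `uuid` module exposes these same six fields through `UUID.fields`, so the grouping can be rebuilt by hand. This is only a sketch using the example value from the question; the `groups` and `rebuilt` names are mine.

```python
import uuid

u = uuid.UUID("3F2504E0-4F89-11D3-9A0C-0305E82C3301")

# UUID.fields is a 6-tuple matching the ABNF field names.
(time_low, time_mid, time_hi_version,
 clock_seq_hi_variant, clock_seq_low, node) = u.fields

groups = [
    f"{time_low:08x}",                                  # time-low
    f"{time_mid:04x}",                                  # time-mid
    f"{time_hi_version:04x}",                           # time-high-and-version
    f"{clock_seq_hi_variant:02x}{clock_seq_low:02x}",   # two fields, one group
    f"{node:012x}",                                     # node
]
rebuilt = "-".join(groups)

print(rebuilt)      # 3f2504e0-4f89-11d3-9a0c-0305e82c3301
print(u.version)    # 1, i.e. a time-based UUID
```

The fourth printed group is the only one that joins two ABNF fields (`clock-seq-and-reserved` and `clock-seq-low`) without a hyphen between them.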

ivan_pozdeev
Jörg W Mittag

The text representation uses dashes to separate the four fields of the GUID/UUID into five groups (the last field is itself split after its first two bytes).

The representation has nothing to do with security: there are several different methods of computing a GUID, and it is intended to be a unique identifier, not necessarily a secure one.

The most likely reason the fields are split (even though the standard doesn't mention it) is for readability/separation of the component parts.

Turnkey
  • 2
    That tells us what the format is, information that was already in the question. It doesn't explain *why*, which is what the OP was asking. – Keith Thompson Oct 14 '12 at 19:49
  • 1
    It is just separating them into the fields, likely for better readability and identification. Maybe the last one was split further because of its length. – Turnkey Oct 14 '12 at 20:06
  • 1
    logical. Same reason phone numbers, credit card numbers, and many other long numbers are frequently split up in groups when printed or written down. – jwenting Oct 15 '12 at 05:39