32

When programming, sometimes things break: you make a mistake and your program tries to read from a wrong address.

One thing that stands out to me is that those exceptions often look like this:

Access violation at address 012D37BC in module 'myprog.exe'. Read of address 0000000C.

Now I see a lot of error logs, and what stands out to me is the 0000000C. Is this a "special" address? I see other access violations with bad reads where the addresses just seem random, but this one keeps coming back in totally different situations.

Pieter B
  • I've also noticed that `0000000C` is _way_ more common than `00000008`, but none of the answers seem to address that at all :/ – Mooing Duck Jan 21 '15 at 00:41
  • Perhaps that `System.Runtime.CompilerServices.RuntimeHelpers.OffsetToStringData` is `12=0x0C` is a reason why this offset is more common. – Mark Hurd Jan 21 '15 at 09:47
  • @MarkHurd That's scary. Do you really think that there are so many unmanaged applications that read / write .NET strings on purpose that this would be a major source of access violations? – Luaan Jan 21 '15 at 13:28

3 Answers

57

00000000 is a special address (the null pointer). 0000000C is just what you get when you add an offset of 12 to the null pointer, most likely because someone tried to get the z member of a structure like the one below through a pointer that was actually null.

```cpp
struct Foo {
    int w, x, y; // or anything else that takes 12 bytes including padding
    // such as: uint64_t w; char x;
    // or: void *w; char padding[8];
    // all assuming an ordinary 32 bit x86 system
    int z;
};
```
  • Or maybe because some small integral value was mistakenly dereferenced as if it were a pointer. Small values are much more common than huge values, so this tends to produce illegal addresses like 0x0000000C rather than, e.g., 0x43FCC893. – Kilian Foth Jan 20 '15 at 08:58
  • ...and the specific reason it breaks is because it's easiest to protect the whole page following `NULL`, or something to that effect..? – Alex Celeste Jan 20 '15 at 10:10
  • The reason I asked this question is because 0000000C comes back so often compared to other addresses. Why is offset 12 a magnitude more common than offset 4, 8 or 16? – Pieter B Jan 20 '15 at 10:32
  • I am surprised that 12 is the offset; I would have suspected offset 8 to be far more likely in a language like C++ with the vtable pointer taking up the first 8 bytes (assuming 64-bit pointers) – Michael Thorpe Jan 20 '15 at 10:39
  • After further investigation this answer is totally correct. In my source the "tag" property of classes is used extensively (either good or bad, I have to deal with it). The tag property in my case is part of a low-level base class and it always got created at that offset. – Pieter B Jan 20 '15 at 10:58
  • Excellent point. Perhaps the null pointer case was covered, but null pointer ++ is just a normal (and in this case invalid) address, thus it fails only upon accessing it. – Neil Jan 20 '15 at 11:47
  • @Leushenko Yes, memory protection usually works on whole pages, and even if it was possible to only catch 0, it's preferable to also protect the following addresses because they're likely to be accessed if pointer arithmetic with a null pointer happens (as in OP's case). –  Jan 20 '15 at 14:39
  • Just a guess here--could the first bytes be used with tracking memory allocations, thus making 12 the first real address in some language? – Loren Pechtel Jan 20 '15 at 19:16
  • @LorenPechtel I'm not even sure whether a language implementation (read: reasonably privileged library) could change the protection of page 0. Even then, I'm not aware of anyone doing this (and I'm very certain no C++ implementation does it). Catching null pointers is just far too useful, and the consequences if that metadata is overwritten by some wayward code are very severe. It doesn't even free any significant amount of memory (only < 100 KiB of address space and *no* physical memory at all). So, in brief, your guess is totally off. –  Jan 20 '15 at 20:19
  • @delnan You misunderstand. I'm wondering if this is a reference to zero where the first 12 bytes are something other than user data. – Loren Pechtel Jan 20 '15 at 21:02
  • @LorenPechtel I was reading this as "addresses 0 and following are used for allocations rather than protected, and something with a 12 byte header is allocated at address 0". Please elaborate if that is not the case, as I can't muster any other interpretation of either of your comments. –  Jan 20 '15 at 21:05
  • @delnan Remember that some systems put some pointers in front of a memory allocation in order to keep track of the heap. – Loren Pechtel Jan 20 '15 at 22:18
  • @LorenPechtel I understood that part perfectly well. What I'm saying is that there almost certainly won't be a memory allocation at 0000000C, or at any other address below at least 4096 (decimal), because that range of addresses is reserved and not mapped to any physical memory (see also: other comments and answers). –  Jan 20 '15 at 22:38
  • @delnan If your pointer is actually a null and you skip over some header data to get to the body... – Loren Pechtel Jan 20 '15 at 23:17
  • @LorenPechtel Oh, now I get it. You're saying there's a null pointer, as my answer suggested, but the 12 bytes being skipped over don't come from other parts of the struct but from allocator metadata. I don't think that's particularly likely either. If such metadata exists, it's *before* the pointer that's used as a handle on the object (i.e. if an object pointer points at 0, we'd expect the metadata at -12). Stashing the metadata and adding that number of bytes to the pointer is generally done once, by the allocator — the program's more robust against allocator changes that way. –  Jan 20 '15 at 23:25
  • @delnan Good point, but there can still be object metadata. – Loren Pechtel Jan 20 '15 at 23:37
  • @LorenPechtel: Which is _exactly_ what the answer you're commenting on suggested in the first place. I just can't get over how much more often I see crashes with `0000000C` compared to `00000008`. – Mooing Duck Jan 21 '15 at 00:44
11

In Windows it is illegal to dereference anything in the entire first and last page, in other words the first or last 64 KiB of the process memory (the ranges 0x00000000 to 0x0000ffff and 0xffff0000 to 0xffffffff in a 32-bit application).

This is to trap the undefined behavior of dereferencing a null pointer or indexing into a null array. And the page size is 64 KiB, so Windows just has to prevent the first and last pages from being assigned a valid range.

This won't guard against uninitialized pointers that could have any value (including valid addresses).

user
ratchet freak
  • The 64KB is true, but it's not about pages. Pages on x86 are 4KB. – ElderBug Jan 20 '15 at 11:37
  • @ElderBug windows rounds up the page size to 64 kB (probably to make the page table smaller) – ratchet freak Jan 20 '15 at 11:38
  • Windows can't really do that. Page table is a structure defined and required by x86, and small pages are fixed at 4KB. It's set in stone (more precisely, in silicon). The 64KB is probably for convenience. – ElderBug Jan 20 '15 at 12:01
  • I'd rather write 64 KiB instead of 65 kB in this case, since the power-of-two size is relevant. – CodesInChaos Jan 20 '15 at 12:11
  • The 64KB range is a leftover from the Alpha version of NT. And it's not the page size, but the allocation granularity. http://blogs.msdn.com/b/oldnewthing/archive/2003/10/08/55239.aspx – shf301 Jan 20 '15 at 17:25
  • @CodesInChaos: While uppercase "M", "G", and "T" are ambiguous, I see no reason to deprecate the use of "k" for 10^3 and "K" for 2^10. – supercat Jan 20 '15 at 19:07
  • There **is** a reason a whole whopping 64K of address space is left unassigned. Remember that lots of Windows code uses resource IDs and the like *OR* string pointers, interchangeably as the same parameter, without a flag or such to select which it is. The number would be <64K, any pointer would be >=64K, thus easy to distinguish anyway. (In 16-bit days, it was the invalid 0 selector instead.) – Deduplicator Jan 20 '15 at 20:27
  • @ElderBug: http://en.wikipedia.org/wiki/Page_%28computer_memory%29#Huge_pages Pages are normally 4 KiB, but can also be 4 MiB or 2 MiB on x86 and x64 respectively, or any of 8 KiB, 64 KiB, 256 KiB, 1 MiB, 4 MiB, 16 MiB, 256 MiB for Itanium. http://msdn.microsoft.com/en-us/library/windows/desktop/aa366720%28v=vs.85%29.aspx – Mooing Duck Jan 20 '15 at 20:33
  • @MooingDuck Yes indeed, that's why I specified small pages. Most x64 CPUs also support 1GiB pages. As far as I know, Windows always pages with 4KB pages, unless allocated with special APIs. – ElderBug Jan 20 '15 at 21:00
  • @shf301: Though for the first 64K being a hole, that's ancient 16-bit Windows history, not the Alpha's fault. – Deduplicator Jan 20 '15 at 21:38
  • @supercat The answer wasn't using `K` for 1024 here, it was (correctly) using `k` for 1000. I generally prefer decimal prefixes over binary prefixes, but since this size is *exactly* 64 KiB, using the approximation 65 kB isn't a good idea here. – CodesInChaos Feb 18 '15 at 20:36
  • @CodesInChaos: My point was that I'd write it 64KB, using uppercase K and uppercase B, or else just 64K. In speaking, I would pronounce lowercase "k" "kilo" and the uppercase one "kay". For 2^20, and 2^30, the spoken forms are "meg(s)" and "gig(s)". IMHO, trying to adopt the new prefixes now will do more to create ambiguity in existing literature than resolve ambiguities going forward. – supercat Feb 18 '15 at 20:49
2

As for why 0x0C seems more common than 0x08 (is it really? I don't know; and in what kinds of applications?), this might have to do with virtual method table pointers. This is really more of a comment (wild mass guessing :), but it's somewhat larger, so here goes... If you've got a class with virtual methods, its own fields are going to be shifted by 0x04. For example, a class that inherits from another virtual class might have a memory layout like this:

0x00 - VMT pointer for parent
0x04 - Field 1 in parent
0x08 - VMT pointer for child
0x0C - Field 1 in child

Is this a common scenario, or even close? I'm not sure. However, note that in a 64-bit application, this could get even more interestingly shifted towards the 0x0C value:

0x00 - VMT parent
0x08 - Field 1 parent
0x0C - VMT child
0x14 - Field 1 child

So there are actually a lot of cases where applications might have significant overlap in null-pointer offsets. It might be the first field in a child class, or its virtual method table pointer, which is needed whenever you call any virtual method on an instance; so if you're calling a virtual method on a null pointer, you'll get an access violation on its VMT offset. The prevalence of this particular value might then have something to do with some common API that provides a class with a similar inheritance pattern, or more likely, a particular interface (quite possible for some classes of applications, like DirectX games). It might be possible to track down some simple common cause like this, but I tend to get rid of applications that do null dereferencing pretty quickly, so...

Luaan
  • If you look through the comments, you can reduce the guessing considerably. – Deduplicator Jan 21 '15 at 13:23
  • @Deduplicator Well, I find the idea that managed .NET strings are used in unsafe code with manual pointer operations scary, and the thought that this would be the major cause for access violations even more so. "Yeah, this is totally memory safe, don't worry, we used C#. We just manually modify the memory from C++, but it's safe in C#." – Luaan Jan 21 '15 at 13:26