1

A Map (or HashMap) takes constant time for insert, remove and retrieve, while all the other data structures I know so far do not take constant time: their time for those operations depends on the size of the input.

So why do we ever need the other data structures? Isn't HashMap a universal data structure?

SimpleGuy
  • Possible duplicate of [Is micro-optimisation important when coding?](http://softwareengineering.stackexchange.com/questions/99445/is-micro-optimisation-important-when-coding) – gnat Jan 21 '17 at 09:52
  • Some languages (PHP, JavaScript) essentially only have hash maps and no other built-in collections. In many dynamic languages (JavaScript, Python, Perl), objects are usually represented as hash maps. This generally works and is quite flexible, but isn't exactly the most efficient way to do something. In these languages, working with arrays tends to be rather weird, e.g. a foreach-loop over an array where you deleted elements or didn't insert elements in the “correct” order leads to unexpected results. – amon Jan 21 '17 at 11:52

2 Answers

6
  1. A hash map has constant time for most operations only on average; its worst-case complexity is much higher. A list or a tree can guarantee fast (even if not constant) time in every case, even for maliciously crafted input.
  2. Element lookup in an array is usually a single CPU instruction; a hash map adds a lot of overhead on top of that, even though both have constant average complexity.
  3. It doesn't keep any order of elements (see the sketch after this list). Inserting an element at a particular position requires rebuilding the whole hash map, and iterating over the elements in a particular order requires converting to a different data structure.
  4. A hash map has a much higher memory footprint than a tightly packed array or a carefully designed linked list, and it usually has much worse cache locality.
  5. Not all data types are easily hashable.
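
A minimal Java sketch of point 3, assuming the standard java.util classes (these particular classes are just one way to illustrate it): a hash map forgets the order in which keys arrived, while a list keeps it and can insert at an arbitrary position.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OrderDemo {
    public static void main(String[] args) {
        String[] keys = {"banana", "apple", "cherry"};

        Map<String, Integer> map = new HashMap<>();
        List<String> list = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], i);
            list.add(keys[i]);
        }

        System.out.println(map.keySet()); // iteration order is unspecified
        System.out.println(list);         // [banana, apple, cherry] - insertion order kept

        // Inserting at a chosen position is natural for the list...
        list.add(1, "date");              // [banana, date, apple, cherry]
        // ...but a plain HashMap has no notion of "position 1" at all.
    }
}
```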
Banthar
  • 6. Sometimes, you simply don't need it. If I want a stack- or queue-like behavior, I'll use a stack or a queue. If I need to walk through a sequence step by step in both directions, I'll use a doubly-linked list. – Arseni Mourzenko Jan 21 '17 at 09:22
  • Re #3: it is trivial to combine a map with a linked list such that the insertion order will be preserved. It is still O(1) amortized worst-case step complexity. (The expensive part of a linked list is finding stuff, but in this hybrid data structure you can use the map to find the element.) – Jörg W Mittag Jan 21 '17 at 10:07
  • @JörgWMittag That fixes the indeterminate iteration order (at the cost of more memory, and in the case of open addressing far more complicated rehashing), but it doesn't allow arbitrary indexing. For that, you'd need a second hash map keyed by the insertion index. And keep it up to date when you remove something. – Sebastian Redl Jan 21 '17 at 10:29
  • @SebastianRedl: Sure, there's a cost. The specific statement in #3 is "it doesn't keep **any** order", and I was pointing out that at least insertion order is not that hard to preserve. When Ruby moved from unordered to insertion-ordered dictionaries, none of the execution engines exhibited any significant performance degradation. – Jörg W Mittag Jan 21 '17 at 11:09
  • So, if I put it this way: `If I need to store and then retrieve elements based on a key, HashMap is an ideal choice`, would I be right? – SimpleGuy Jan 21 '17 at 11:44
  • 3
    @SimpleGuy No, #1, #2, #4 and #5 might still apply in such a case. A hash map is usually a good choice, but that doesn't in any way make other data structures obsolete. – Banthar Jan 21 '17 at 12:01
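
The map-plus-linked-list hybrid discussed in the comments above is essentially what java.util.LinkedHashMap provides (used here as one concrete example): a hash table whose entries are also threaded onto a doubly-linked list, so iteration follows insertion order while put/get/remove stay amortized O(1).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class InsertionOrderDemo {
    public static void main(String[] args) {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("third", 3);
        m.put("first", 1);
        m.put("second", 2);

        m.remove("first");
        m.put("first", 1); // removed and re-added, so it now iterates last

        System.out.println(m.keySet());      // [third, second, first]
        System.out.println(m.get("second")); // still an ordinary O(1) hash lookup: 2
    }
}
```
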
2

For example, an array and a hash map both have constant access time (on average), but for an array the constant is much, much smaller. The array's memory usage is also much smaller, because a hash map has to store the keys (the indices) as well and keeps many unused slots (hashing gets slow if the percentage of used slots gets too high).
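
A rough sketch of that gap in Java terms, assuming an int[] against a HashMap<Integer, Integer>; the per-entry costs in the comments are ballpark figures for a typical 64-bit JVM, not measurements.

```java
import java.util.HashMap;
import java.util.Map;

public class FootprintSketch {
    public static void main(String[] args) {
        int n = 1_000_000;

        // Array: the index is implicit, each element is ~4 bytes, laid out contiguously.
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = i * 2;
        }
        int fromArray = values[123_456]; // a bounds check and a single load

        // Hash map: each entry needs a boxed Integer key, a boxed Integer value and an
        // entry node (hash, key, value, next pointer), plus a partly empty bucket array -
        // easily an order of magnitude more memory per element than the int[].
        Map<Integer, Integer> map = new HashMap<>(n);
        for (int i = 0; i < n; i++) {
            map.put(i, i * 2);
        }
        int fromMap = map.get(123_456); // hash, bucket lookup, key compare(s), unboxing

        System.out.println(fromArray + " " + fromMap);
    }
}
```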

And then there are vector instructions. A modern Intel processor can read 8 float values or 32 byte values from an array in a single instruction. Those 32 byte values would instead have to go through the (constant-time but slow) hashing code 32 times, and I'd say that can easily add up to a total slowdown by a factor of 1000.
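
A sketch of what those vector units can and cannot help with, again in Java; whether the JIT really emits SIMD instructions for the array loop depends on the JVM and CPU, so treat that comment as an assumption rather than a guarantee.

```java
import java.util.HashMap;
import java.util.Map;

public class SumSketch {
    // Tight loop over contiguous floats: a JIT compiler can typically unroll this and,
    // on suitable hardware, use SIMD loads that process several floats per instruction.
    static float sumArray(float[] data) {
        float sum = 0f;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // The same values behind a hash map: every element costs a hash computation,
    // a bucket lookup, pointer chasing and an unboxing step, so there is nothing
    // left for vector instructions to speed up.
    static float sumMap(Map<Integer, Float> data, int n) {
        float sum = 0f;
        for (int i = 0; i < n; i++) {
            sum += data.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        float[] array = new float[n];
        Map<Integer, Float> map = new HashMap<>(n);
        for (int i = 0; i < n; i++) {
            array[i] = i;
            map.put(i, (float) i);
        }
        System.out.println(sumArray(array) + " vs " + sumMap(map, n));
    }
}
```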

gnasher729