5

I want to find the oldest/newest element added into a heap of size k. At any given point of time if I have to find say the oldest element in the heap is there any approach with O(1) space and O(1) time.

I was thinking of an approach of having a queue of the same size as that of the heap and add all the element into the queue as they are added to the heap. This yields a constant time solution but this requires an extra O(k) space. I would like to know if this problem can be solved with constant space.

redDragon
  • 95
  • 1
  • 7

3 Answers3

3

You can use a Double-End Priority Queue for this. There are a number of implementation methods available and the structure is routinely available in many libraries. Most Java libraries call it a MinMaxPriorityQueue though. It offers O(1) retrieval of Min and Max (Newest/Oldest) in O(1) time.

In order to fulfill your space requirement, you'd need to use an interval heap where each node has two values. Book-keeping is a little tricky but no slower than normal.

2

If you maintain a separate data structure along side your heap, this would likely solve your problem. This is the approach that the LinkedHashMap uses (a linked list along side the HashMap) to allow for an LRU cache and well known ordering of the entries in the LinkedHashMap (insertion order) (grepcode LinkedHashMap.Entry).

This data structure would look like a Deque or something similar and is ordered by insertion order. No other ordering is imposed on this structure.

The biggest problem with doing this though, is that the two structures (the heap and the deque) are not synchronized, one would have to do book keeping to remove from the deque when something is removed from the heap.

One approach to this would be to use a weak reference (Java Reference, C# WeakRefrence class). When you walk the deque you would poll the Reference for get() to see if its actually there, and if so return that, otherwise discard that element in the deque and keep walking the list until you find a Reference that has an object attached to it.

The gotcha here is that if you are pointing to the object the heap is pointing to, it is possible that some other part of the code has taken a strong hold on the object and you'll think that is still in the Heap.

To avoid this, the deque should instead point to the heap element rather than the object the heap element is pointing to, and be very clean with your removal of the element from the heap when it is disposed (and not letting it leak outside of the heap - except to the deque (which should probably be inside the heap)). I will admit to not knowing how long the Reference will still point to something and what conditions are necessary to deterministically clear it.

This then would allow the garbage collector to do the work for you of book keeping. The trade off is instead of each time the insert or removal is being done form the heap and book keeping being done then, the book keeping is done when the deque is polled or peeked. Which way the trade off goes is something to consider based on the insertion, removal, and polling the application does of the heap.

A thought of sticking yet another data structure inside the heap may allow for issues of deterministic references. If you have a HashMap in the heap that holds all the objects too (though note that its not actually holding copies of them juts its own strong references to the objects). When something is added or removed from the heap, it is also added or removed from the HashMap. Thus, the deque, instead of holding weak references to the object holds a strong reference. When the deque is polled, it checks the HashMap to see if the object is still in the heap.

Either way, what this ultimately provides is a deque such that the first element returned from the deque is the oldest in the heap, and the last element returned from the deque is the newest object in the heap.

1

I've copied my answer from this question: What data structure can I use to implement a double ended priority queue with log(n) insertion and find? I've repeated it here because these questions don't really appear to be duplications, but simply two questions that can be -- but don't have to be -- solved with the same technique:

Here is what I did for a block allocator free-list implementation that needed a double-ended priority queue:

  • Use standard red-black tree algorithms, with one small enhancement:
  • During insertion keep track of the minimum and maximum element in the tree as you do insertions or deletions. This only adds at most O(log n) comparisons as overhead during the already O(log n) insertion process, so doesn't change the overall algorithmic complexity.

This yields:

  • O(log n): insertion
  • O(1): find-min and delete-min
  • O(1): find-max and delete-max

Since those were the only operations I needed for my particular application, this was pretty much the best possible data structure. Even if you needed other priority queue operations (like changing priorities, etc) pretty much everything is still going to be O(log n) worst case because of the red-black tree.

The reason you get O(1) for both find-min/max as well as delete-min/max is that you've kept track of the minimum and maximum element during insertion, so you can immediately find it, eliminating the O(log n) for the search. Then the red-black tree algorithm guarantees O(1) operations to rebalance the tree after the element is removed. (This is a well known and very important property of red-black trees. This wouldn't be the case with most other binary search trees, such as AVL trees, which require O(log n) operations during rebalancing.)

wjl
  • 257
  • 3
  • 10